This page is offered as a service of Bristle Software, Inc. New tips are sent to an associated mailing list when they are posted here. Please send comments, corrections, any tips you'd like to contribute, or requests to be added to the mailing list, to tips@bristle.com.
Original Version: 2/10/2012
Last Updated: 2/10/2012
Applies to: All NoSQL databases
What are NoSQL databases? How do they differ from SQL databases? What are they good for?
"NoSQL" is an umbrella term that is better interpreted as "Not Only SQL" than as "No SQL". NoSQL databases, also known as "NoSchema" databases, are any data store that is not a traditional relational database (Oracle, DB2, MySQL, Postgres, Sybase, MS SQL Server, etc.) consisting of a well-defined set of tables, each with a well-defined set of columns. They include key-value pairs, document stores, graph databases, XML databases, object stores, etc.
They are generally useful for applications that:
Various types of NoSQL databases have been invented in recent years to support Web sites with massive numbers of users. Thus, we have Google's BigTable, Facebook's Cassandra (now open-sourced to Apache), Amazon's DynamoDB, as well as several that were created by smaller companies or community efforts to address the same types of needs: MongoDB, CouchDB, etc.
Different NoSQL databases use different strategies to achieve their speed, scalability, and flexibility. Some common approaches are:
For more info, see the NoSQL row of my links page:
http://bristle.com/~fred/#nosql_db
Thanks to Jonathan Addelston for prompting me to write this tip!
--Fred
Original Version: 5/31/2010
Last Updated: 4/13/2012
Applies to: MongoDB 1.8+
Someone asked me recently about my experience with NoSQL databases, so I figured it was time to finally write up these notes about my use of MongoDB on a year-long project.
My experience with MongoDB was all good.
Free open source software.
Very well supported by a very active mailing list, with questions being answered within minutes by the company (10gen) that produced the product, and sometimes extensive back-and-forth dialogs between them and various users as they drill down patiently into the newbie mistakes that the users make. Also, good on-line docs and printed O'Reilly books and local (Philly, DC, NY) MongoDB conferences. You can also buy a paid support contract, if you need it.
Very easy to administer. No install required. Simply download
and unzip. Run a MongoDB server on any Mac, Linux, Windows
or Solaris box, whether a laptop, desktop or server, by simply
typing:
mongod
at a command line. Run a client to access it by typing:
mongo
Drivers available for Java, JavaScript, C#, C++, and many others.
Very easy replication and sharding. For example, to create a
master:
mongod --master
and to create a slave:
mongod --slave --source master_ip:master_port
Sharding is just as easy, automatically slicing up the data so that different key ranges are stored on different servers. So are replica sets, where the nodes automatically monitor each other, share data, notice when the primary vanishes, elect a new primary from among the secondaries, and if the primary re-appears, it makes itself a secondary, etc.
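To make "different key ranges are stored on different servers" concrete, here is a minimal Python sketch of range-based routing. It is not MongoDB code and does not use any MongoDB API; the shard names, the split points, and the shard_for helper are all hypothetical, chosen only to illustrate the idea.

```python
from bisect import bisect_right

# Hypothetical split points (assumption, not from MongoDB):
#   key <  "h"          -> shard0
#   "h" <= key < "q"    -> shard1
#   key >= "q"          -> shard2
SPLIT_POINTS = ["h", "q"]
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(key):
    """Return the name of the shard whose key range contains `key`."""
    return SHARDS[bisect_right(SPLIT_POINTS, key)]


for name in ["alice", "henry", "zoe"]:
    print(name, "->", shard_for(name))
```

A real sharded cluster also moves ranges between servers as data grows; this sketch only shows the lookup step.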
Very flexible. The entire database structure is one or more collections of objects with each object being a simple JSON document. No fixed schema. Each collection can contain a mixture of different structures of JSON documents. We stored people, companies, addresses, etc., all in the same collection. Each object in a collection can have its own unique set of fields if you like.
In our case, we wanted more control over the structures of the documents. So we used JSON-Schema to define an explicit dynamic schema that we could easily enforce as needed, while still allowing our users to define their own fields of our objects or to define their own objects.
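As a rough idea of what enforcing a schema "as needed" can look like, here is a small Python sketch. It is not our actual code and not a real JSON-Schema implementation: the PERSON_SCHEMA document, the TYPE_MAP table, and the validate helper are hypothetical, and only the "required" and per-property "type" keywords are checked. Fields the schema does not mention are allowed, which is what lets users add their own.

```python
# Hypothetical schema for a "person" document, written in JSON-Schema style.
PERSON_SCHEMA = {
    "type": "object",
    "required": ["name"],
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
}

# Maps the JSON-Schema type names used above to Python types.
TYPE_MAP = {"string": str, "integer": int, "object": dict}


def validate(doc, schema):
    """Return a list of problems; an empty list means the document passed.

    Only `required` and per-property `type` are checked. Fields that the
    schema does not mention are accepted as user-defined extras.
    """
    errors = []
    for name in schema.get("required", []):
        if name not in doc:
            errors.append(f"missing required field: {name}")
    for name, rules in schema.get("properties", {}).items():
        if name in doc and not isinstance(doc[name], TYPE_MAP[rules["type"]]):
            errors.append(f"field {name} should be of type {rules['type']}")
    return errors


# A user-defined extra field ("nickname") passes; a wrong type does not.
print(validate({"name": "Fred", "nickname": "F"}, PERSON_SCHEMA))
print(validate({"age": "old"}, PERSON_SCHEMA))
```

For real use, a full JSON-Schema validator library would be the better choice; the point here is only that validation lives in the application and can be applied selectively.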
Very fast, and scalable, and robust. The name "Mongo" is from "humongous". The CERN Large Hadron Collider throws data into MongoDB as fast as it can collect it from its near light-speed experiments. It is trivially easy to define indexes on the data, which happily ignore documents that have no such fields, and rapidly find those that do.
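The idea of an index that skips documents lacking the indexed field can be sketched in a few lines of plain Python. This is only an illustration, similar in spirit to what MongoDB calls a sparse index; it does not use MongoDB, and the sample documents and the build_index helper are made up for the example.

```python
from collections import defaultdict

# One "collection" holding documents of different shapes (sample data).
collection = [
    {"_id": 1, "type": "person", "name": "Ann", "city": "Philadelphia"},
    {"_id": 2, "type": "company", "name": "Acme"},  # no "city" field
    {"_id": 3, "type": "person", "name": "Bob", "city": "New York"},
    {"_id": 4, "type": "address", "city": "Philadelphia"},
]


def build_index(docs, field):
    """Map each value of `field` to the _ids of the documents that have it.

    Documents without the field are simply skipped.
    """
    index = defaultdict(list)
    for doc in docs:
        if field in doc:
            index[doc[field]].append(doc["_id"])
    return index


city_index = build_index(collection, "city")
# Look up the ids of all documents whose city is Philadelphia.
print(city_index["Philadelphia"])
```

The lookup touches only the index, not every document, and the company document with no city never appears in it.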
No joins though, so it is a totally different mindset. You tend to embed related subdocs in a parent doc rather than setting up foreign key relationships between multiple normalized tables. That turned out to be very easy and very natural. The JSON documents in the DB map very nicely to domain objects. No need for an ORM because there is no R (relational database) and no need for any M (mapping). There are only O (objects). Nice!
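Here is a small Python sketch of the embedding idea, using a made-up person document with its addresses stored inside it. The Person and Address classes, the sample JSON, and the person_from_doc helper are all hypothetical. The point is that one read returns the whole nested object, with no join across tables.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    street: str
    city: str


@dataclass
class Person:
    name: str
    addresses: List[Address] = field(default_factory=list)


# A hypothetical stored document: the addresses are embedded, not joined.
raw = """
{"name": "Ann",
 "addresses": [{"street": "1 Main St", "city": "Philadelphia"},
               {"street": "9 Broad St", "city": "New York"}]}
"""


def person_from_doc(doc):
    """Build a Person directly from the nested document."""
    return Person(
        name=doc["name"],
        addresses=[Address(**a) for a in doc.get("addresses", [])],
    )


ann = person_from_doc(json.loads(raw))
print(ann.addresses[1].city)
```

In a relational design the same data would typically live in two tables linked by a foreign key, and loading Ann would take a join or a second query.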
No transactions either, but each insert or update of a single document is atomic, even for very complex documents, so once you've structured the data to not need joins, you don't miss transactions much either.
For more details of our experience, see the video of a 45-minute talk by Mike Brocious, our tech lead:
http://screencasts.chariotsolutions.com/webpage/how-mongo-db-helps-visibiz-tackle-social-crm
Or just read the slides:
http://www.slideshare.net/mikebrocious/mongodb-at-visibiz
You can also download the audio as an MP3 and play it in the car:
http://techcast.chariotsolutions.com/philly-ete-2011-podcast-3-how-mongo-db-helps-visibiz-tackle-social-crm
Also, see the MongoDB row of my links page:
http://bristle.com/~fred/#mongodb
Thanks to Thor Collard for prompting me to finally write this up!
--Fred
Original Version: 2/10/2012
Last Updated: 2/10/2012
Applies to: Cassandra
I haven't used Cassandra yet, but I know a little about it.
It was originally created by Facebook for their use, to handle massive amounts of data, and then open-sourced as an Apache project. It's one of the new NoSQL databases, like MongoDB, CouchDB, etc.
See my links at:
http://bristle.com/~fred/#nosql_db
Especially, you might want to start with the brief summary at:
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
and the quick explanation from Facebook at:
http://www.facebook.com/note.php?note_id=24413138919
before going to the much more detailed comparison (100 pages) at:
http://www.christof-strauch.de/nosqldbs.pdf
and the official Apache docs at:
http://cassandra.apache.org/
--Fred
Original Version: 2/10/2012
Last Updated: 2/10/2012
Applies to: CouchDB
I haven't used CouchDB yet. It's an open source Apache project, one of the new NoSQL databases, like MongoDB, Cassandra, etc. See my links page for comparisons of it with other NoSQL databases:
http://bristle.com/~fred/#nosql_db
Especially, you might want to start with the brief summary at:
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
and the brief comparison (by the MongoDB team) of CouchDB with MongoDB and MySQL:
http://www.mongodb.org/display/DOCS/MongoDB,+CouchDB,+MySQL+Compare+Grid
http://www.mongodb.org/display/DOCS/Comparing+Mongo+DB+and+Couch+DB
before going to the much more detailed comparison (100 pages) at:
http://www.christof-strauch.de/nosqldbs.pdf
and the official Apache docs at:
http://couchdb.apache.org/
--Fred
©Copyright 2010-2021, Bristle Software, Inc. All rights reserved