- http://www.thoughtworks.com/articles/nosql-comparison
-
Why now?
- Database scaling issues
- Availability of options (open source)
- Nature of data
- Schema keeps changing and/or the data needs unconventional treatment
-
Characteristics
- Easy to use in conventional load-balanced clusters
- Persistence (not just cache)
- Scale to available memory
- No fixed schema; schema migration w/o downtime
- Individual query system; no std query language
- ACID within a node of cluster; eventually consistent across cluster
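
A minimal sketch of what "eventually consistent across cluster" can look like, assuming a last-write-wins rule keyed on a version number. The `Replica` class, its methods, and the versioning scheme are hypothetical illustrations, not the mechanism of any particular product; real stores use various reconciliation strategies.

```python
class Replica:
    """Toy replica: each write is atomic locally; replicas converge on sync."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        # Last-write-wins (an assumption): keep the entry with the higher version.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        # Pull every entry from the other replica and apply the same rule.
        for key, (version, value) in other.data.items():
            self.write(key, value, version)


a, b = Replica("a"), Replica("b")
a.write("cart:7", ["book"], version=1)
b.write("cart:7", ["book", "pen"], version=2)

stale = a.read("cart:7")   # replica a has not seen the newer write yet
a.sync_from(b)
b.sync_from(a)
fresh = a.read("cart:7")   # after syncing, both replicas agree
```

Between the second write and the sync, a reader hitting replica `a` sees the older value; that window is the trade-off the bullet above describes.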
-
Use cases
- Relational DB does not scale
- Data distributed in time dimension; Can't save all in RDBMS
- Performance issues
- Lots of BLOB/CLOB
- Temporary data (e.g. shopping cart)
- Non-crucial data; Durability not reqd
- Non-conventional queries
-
Query issues
- No SQL uniformity
- SPARQL for RDF triples
- Need something uniform
-
Deployment situations
- Decoupling helps
- Know what must be RDBMS; what could be NoSQL
-
Implementations
-
Key-value stores
- Tokyo Cabinet, Voldemort
- Cache-type data
- Fast retrieval
- No schema
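
A rough sketch of the key-value model, assuming a plain in-memory dictionary stands in for the store. `KeyValueStore` and its methods are hypothetical names for illustration and do not reflect the API of Tokyo Cabinet or Voldemort.

```python
class KeyValueStore:
    """Toy key-value store: values are opaque to the store, so no schema applies."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup is by exact key only; there is no query on value contents.
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("session:42", b'{"cart": ["book", "pen"]}')  # raw bytes
store.put("page:/home", "<html>cached page</html>")    # plain string

cached = store.get("page:/home")
missing = store.get("page:/about")  # absent key falls back to the default
```

Retrieval is fast because it is a single lookup by key; the cost is that anything beyond "fetch by key" has to be done by the application.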
-
Document databases
- MongoDB, CouchDB
- Can handle incomplete data; Document-type web apps
- Query performance can be poor; no standard query language
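
To illustrate "incomplete data" and the lack of a query language, here is a small sketch that models documents as Python dictionaries and queries them with a predicate function. The `find` helper and the sample documents are hypothetical; MongoDB and CouchDB each expose their own query mechanisms.

```python
# Documents in one collection need not share the same fields.
docs = [
    {"_id": 1, "title": "Intro", "author": "ann", "tags": ["nosql"]},
    {"_id": 2, "title": "Draft"},                        # no author, no tags
    {"_id": 3, "author": "bob", "tags": ["graph", "nosql"]},  # no title
]


def find(collection, predicate):
    """Scan every document and keep those matching the predicate."""
    return [doc for doc in collection if predicate(doc)]


# The query is ordinary code and must tolerate missing fields itself.
tagged = find(docs, lambda d: "nosql" in d.get("tags", []))
tagged_ids = [d["_id"] for d in tagged]
```

Without an index, a query like this is a full scan of the collection, which is one reason query performance can suffer.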
-
Graph databases
- Neo4j, InfiniteGraph
- Graph-type data and algorithms easy to handle
- Can be slow; some queries need to traverse the entire graph
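
A sketch of why graph-shaped questions are natural in this model and why they can be costly: a breadth-first search over an adjacency list. The `shortest_path` function and the `friends` data are hypothetical and are not the traversal API of Neo4j or InfiniteGraph.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the first shortest path found, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # every reachable node was visited without finding the goal


friends = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}

path = shortest_path(friends, "alice", "erin")
no_path = shortest_path(friends, "erin", "alice")
```

In the worst case (no path exists, or the goal is far away) the search touches every reachable node, which is the traversal cost noted above.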
-
XML databases
- Oracle, MarkLogic
- Document workflow systems
- XML validation, XPath/XSL tools
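
A small sketch of XPath-style querying over workflow documents, using Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset. The `workflow`/`doc` element names and the `status` attribute are invented for illustration; a real XML database would offer full XPath and schema validation.

```python
import xml.etree.ElementTree as ET

xml_text = """
<workflow>
  <doc id="1" status="draft"><title>Spec</title></doc>
  <doc id="2" status="approved"><title>Contract</title></doc>
  <doc id="3" status="draft"><title>Memo</title></doc>
</workflow>
"""

root = ET.fromstring(xml_text)

# Select the titles of all documents still in the "draft" state.
draft_titles = [t.text for t in root.findall("./doc[@status='draft']/title")]

# Look up a single document by its id attribute.
approved = root.find("./doc[@id='2']")
approved_status = approved.get("status")
```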
-
Distributed peer stores
- HBase, Cassandra, Riak
- Distributed file system type apps
- Fast retrieval; excellent distributed approach
- Low-level retrieval API
-
Object stores
- db4o, ObjectStore
- Uses OO paradigm
- Low-latency ACID
- Querying is an issue