Mr.Ba.Code: NoSQL

Data Model	Performance	Scalability	Flexibility	Complexity	Functionality	Security	Matching DB
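Key–value Stores	high	high	high	none	variable	Weak authentication; weak authorization; no audit function; no encryption	Redis, Memcached, DynamoDB
Column Store	high	high	moderate	low	minimal	Weak authentication; weak authorization; no audit function; no encryption	Cassandra, HBase
Document Store	high	variable	high	low	low	Weak authentication; weak authorization; no audit function; no encryption	MongoDB, CouchDB, CouchBase
Graph Database	variable	variable	high	high	graph theory	Weak authentication; weak authorization; no audit function; no encryption	Neo4j, Titan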
Relational Database	variable	variable	low	moderate	high	Strong	Oracle, MS SQL, MySQL
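
To make the four NoSQL data models in the table above more concrete, here is a minimal Python sketch that represents the same small record in each shape using plain dictionaries. It is only an illustration: the record, the key naming scheme, the field names, and the `followed_by` helper are all hypothetical and are not taken from any of the listed products.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"user:42:name": "Ada", "user:42:city": "London"}

# Column store: row key -> column family -> column -> value.
column_store = {"42": {"profile": {"name": "Ada", "city": "London"}}}

# Document store: one self-describing, nested document per record.
document = {"_id": "42", "name": "Ada", "address": {"city": "London"}}

# Graph database: nodes plus explicit, typed relationships (edges).
nodes = {"42": {"name": "Ada"}, "7": {"name": "Alan"}}
edges = [("42", "FOLLOWS", "7")]


def followed_by(node_id):
    """Return the names of the nodes that node_id has a FOLLOWS edge to."""
    return [
        nodes[dst]["name"]
        for src, rel, dst in edges
        if src == node_id and rel == "FOLLOWS"
    ]
```

The graph form is the only one where the relationship itself is a first-class piece of data, which is why the table rates it highest on complexity.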

Name	Redis	Cassandra	HBase	MongoDB	Neo4j	MS SQL Server
Description	In-memory database with a configurable trade-off between performance and persistence	Wide-column store based on ideas from BigTable and Amazon Dynamo	Wide-column store based on Apache Hadoop and on concepts from BigTable	One of the most popular document stores	Open source graph database	Microsoft relational DBMS
Website	redis.io	cassandra.apache.org	hbase.apache.org	www.mongodb.org	neo4j.org	www.microsoft.com/sqlserver
Server operating systems	BSD, Linux, OS X, Windows	BSD, Linux, OS X, Windows	Linux, Unix, Windows(*)	Linux, OS X, Solaris, Windows	Linux, OS X, Windows	Windows
Database model	Key-value store	Wide column store	Wide column store	Document store	Graph DBMS	Relational DBMS
Data schema	schema-free	schema-free	schema-free	schema-free	schema-free	yes
Predefined data types	no	yes	no	yes	yes	yes
Secondary indexes	no	restricted	no	yes	yes	yes
APIs and other access methods	proprietary protocol	Cassandra Query Language (CQL), Thrift	Java API, RESTful HTTP API, Thrift	proprietary protocol using JSON	Cypher query language, Java API, RESTful HTTP API	OLE DB, Tabular Data Stream (TDS), ADO.NET, JDBC, ODBC
Supported programming languages (C#, JavaScript, Java, Python, Ruby, PowerShell)	C, C#, C++, Java, JavaScript, Objective-C, Perl, PHP, Python, Ruby, ……	C#, C++, Java, JavaScript, Perl, PHP, Python, Ruby, ……	C, C#, C++, Java, PHP, Python, ……	C, C#, C++, Java, JavaScript, Perl, PHP, PowerShell, Python, Ruby, ……	C#, Java, JavaScript, Perl, PHP, Python, Ruby, ……	C#, Java, PHP, Python, Ruby, Visual Basic
Server-side scripts (Stored Procedure)	Lua	no	Coprocessor in Java	JavaScript	Server Plugin in Java	Transact-SQL and .NET languages
Triggers	no	no	yes	no	yes	yes
Partitioning methods	none	Sharding	Sharding	Sharding	none	tables can be distributed across several files (horizontal partitioning), but no sharding
High Availability	Replication; automatic failover via Redis Sentinel	No automatic failover; relies on client-side failover (like MS SQL mirroring)	Implemented via HDFS and ZooKeeper; automatic failover	Replication; automatic failover	Replication; automatic failover	Cluster, Replication, Mirroring, Always On, etc.; automatic failover
Replication methods	Master-slave replication	Selectable replication factor; peer-to-peer replication	Replication based on HDFS	Master-slave replication	Master-slave replication (Enterprise only)	Snapshot, transactional, and peer-to-peer replication
MapReduce	no	yes	yes	yes	no	no
Consistency in distributed system	n/a	Eventual consistency, immediate consistency	Immediate consistency	Eventual consistency, immediate consistency	Eventual consistency	n/a
Foreign keys	no	no	no	no	yes	yes
Transaction concepts	optimistic locking	no	no	no	ACID	ACID
Concurrency	yes	yes	yes	yes	yes	yes
Durability (data persistence)	yes	yes	yes	yes	yes	yes
Access Control	Very simple password-based access control; no native LDAP support, but a third-party component can be used	Access rights for users can be defined per object; no native LDAP support	Access Control Lists (ACLs); depends on Hadoop and ZooKeeper	Users can be defined with full access or read-only access; supports LDAP authentication (v2.5)	IP-level restrictions; no native LDAP support, but a third-party component can be used	Users with a fine-grained authorization concept
Specific characteristics	Redis strongly emphasizes performance: in any design decision, performance has priority over features or memory requirements.	Supports multi-data center replication			Can also be used server-less as an embedded Java database.	One of the "Big 3" commercial database management systems, alongside Oracle and DB2
Best used	For rapidly changing data with a foreseeable size (it should fit mostly in memory).	When you write more than you read; writes are faster than reads.	The best way to run Map/Reduce jobs on huge datasets.	For dynamic queries, and if you prefer to define indexes rather than Map/Reduce jobs. Good performance on big data.	For graph-style, rich or complex, interconnected data.	…..
Typical application scenarios	Applications that can hold all data in memory and that have high performance requirements: stock prices, real-time data collection, analytics, communication.	Distributed databases with many write operations: logging, data analysis	Search engines, log analysis	Most things you would do with MySQL, but where predefined columns hold you back.	Searching routes in social relations, public transport links, road maps, or network topologies	…..
Disadvantages	· Redis is an in-memory store: all your data must fit in memory. An RDBMS usually stores data on disk and caches part of it in memory, so it can manage more data than you have memory; Redis cannot. · Redis is a data structure server. There is no query language and no support for relational algebra, so there are no ad-hoc queries. All data accesses must be anticipated by the developer, and proper data access paths must be designed. A lot of flexibility is lost. · Redis offers 2 options for persistence: regular snapshotting and append-only files. Neither is as safe as a real transactional server providing redo/undo logging, block checksumming, point-in-time recovery, flashback capabilities, etc. · Redis only offers basic security at the instance level. · A single Redis instance is not scalable. It runs on only one CPU core in single-threaded mode. To scale, several Redis instances must be deployed and started. Distribution and sharding are done on the client side.	· No server-side automatic failover. Client applications need to handle it. · Weak security control · No join or subquery support, limited support for aggregation · Sorting of data is a design decision: it can be done in one of the predefined ways and data can be retrieved in that same order, but there is nothing like ORDER BY or GROUP BY · A single column value may not be larger than 2 GB (some users have unofficially stored blob files larger than 2 GB).	· NameNode is a single point of failure. · Map/Reduce jobs are less efficient · Relies on Hadoop and HDFS	· If something crashes while it is updating 'table-contents', all data can be lost. Repair takes a long time and, if you are unlucky, usually ends with 50-90% data loss, so the only way to be fully safe is to keep 2 replicas in different data centers. · Indexes take up a lot of RAM. They are B-tree indexes, and if you have many of them you can run out of system resources really fast. · Data size in MongoDB is typically larger because, for example, each document stores its field names · Less flexibility with more complex querying (e.g. no joins) · No support for transactions · Global lock for either one write or multiple reads, which makes concurrency less efficient.	· Lack of tool and framework support · No support for ad-hoc queries	…..
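
The Redis column above notes that distribution and sharding are done on the client side. A minimal sketch of that idea follows, assuming a fixed list of instances and simple hash-modulo routing. The instance addresses and the `shard_for` helper are hypothetical, and no Redis client library is involved; the function only decides which instance a key would be sent to.

```python
import zlib

# Hypothetical instance addresses; a real deployment would list its own.
INSTANCES = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def shard_for(key, instances=INSTANCES):
    """Pick the instance responsible for a key using a stable CRC32 hash."""
    # zlib.crc32 gives the same result in every process, unlike the
    # built-in hash(), which is randomized per process for strings.
    return instances[zlib.crc32(key.encode("utf-8")) % len(instances)]
```

Every client has to use the same instance list and the same hash for this to work. With plain modulo routing, adding or removing an instance remaps most keys; consistent hashing is a common way to reduce that.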