Monday, October 7, 2013

NoSQL–Basic Concepts

NoSQL Common Traits

1. Non-relational

2. Non-schematized/schema-free

3. Eventual consistency

4. Open source

5. Distributed

6. Web scale

7. Developed at big Internet companies (Yahoo, Google, Facebook, LinkedIn…)

 

NoSQL Eventual Consistency

CAP Theorem: a distributed database can fully guarantee only two of the following three attributes at the same time: Consistency, Availability and Partition Tolerance

1. Relational databases favor Consistency & Partition Tolerance

2. NoSQL databases typically favor Availability & Partition Tolerance

 

Most NoSQL databases do not offer “ACID” guarantees (Atomicity, Consistency, Isolation and Durability), whereas relational databases do.

NoSQL instead offers “eventual consistency”: replicas may briefly disagree after a write but converge over time, similar to DNS propagation
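To make “eventual consistency” concrete, here is a minimal sketch in Python. It assumes a last-write-wins rule based on timestamps, which is only one of several possible conflict-resolution strategies; the Replica class, the sync function, and the sample keys are all hypothetical names made up for illustration, not any particular product's API.

```python
class Replica:
    """Hypothetical replica that stores key -> (timestamp, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        # Last-write-wins (an assumption): keep only the newest version.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Exchange versions in both directions so the replicas converge."""
    for key, (ts, value) in list(a.data.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.data.items()):
        a.write(key, value, ts)


east, west = Replica(), Replica()
east.write("user:1", "alice@old.example", timestamp=1)
sync(east, west)

# A newer write lands on one replica only.
west.write("user:1", "alice@new.example", timestamp=2)

stale = east.read("user:1")  # east has not seen the update yet
sync(east, west)
fresh = east.read("user:1")  # after syncing, both replicas agree
```

Between the second write and the second sync, a reader hitting the east replica sees the old value. That window is the price paid for staying available while replicas are out of touch.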

 

NoSQL Indexing

1. Most NoSQL databases are indexed by key.

2. Some allow so-called “secondary” indexes

3. Often the primary key indexes are clustered

4. HBase uses the Hadoop Distributed File System (HDFS), which is append-only

  - Writes are logged

  - Logged writes are batched

  - File is re-created and sorted
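The append-only write path in item 4 can be sketched roughly as follows. This is a simplified, in-memory illustration and not HBase's actual implementation: the AppendOnlyStore class, its batch_size parameter, and the single sorted "file" are assumptions chosen to mirror the three steps above.

```python
import bisect


class AppendOnlyStore:
    """Hypothetical store: log each write, batch it, then rewrite a sorted file."""

    def __init__(self, batch_size=3):
        self.log = []          # step 1: every write is appended to a log
        self.buffer = {}       # step 2: logged writes are batched in memory
        self.sorted_file = []  # step 3: (key, value) pairs sorted by key
        self.batch_size = batch_size

    def put(self, key, value):
        self.log.append((key, value))
        self.buffer[key] = value
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Never edit the old file in place: build a new sorted one instead.
        merged = dict(self.sorted_file)
        merged.update(self.buffer)
        self.sorted_file = sorted(merged.items())
        self.buffer.clear()
        self.log.clear()  # the flushed writes no longer need replaying

    def get(self, key):
        if key in self.buffer:
            return self.buffer[key]
        # The file is sorted by key, so a binary search finds the entry.
        i = bisect.bisect_left(self.sorted_file, (key,))
        if i < len(self.sorted_file) and self.sorted_file[i][0] == key:
            return self.sorted_file[i][1]
        return None


store = AppendOnlyStore(batch_size=2)
store.put("b", 1)
store.put("a", 2)  # the batch is full, so the sorted file is rewritten
store.put("b", 3)  # the newer value waits in the buffer until the next flush
```

Because the file is always sorted by key, the primary key index is effectively clustered, which matches item 3.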

 

NoSQL Queries

1. Typically no query language

2. Instead, queries are written as procedural programs

3. Sometimes SQL is supported

4. Sometimes MapReduce code is used
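As a rough illustration of item 2, here is what a simple filter-and-sort might look like when written procedurally rather than as a query. The users dictionary stands in for a key-value store, and names_older_than is a hypothetical helper; in SQL the same request would be a single SELECT with WHERE and ORDER BY clauses.

```python
# Hypothetical key-value store: record key -> schema-free document.
users = {
    "user:1": {"name": "Carol", "age": 41},
    "user:2": {"name": "Alice", "age": 35},
    "user:3": {"name": "Bob"},  # no "age" field at all
}


def names_older_than(store, min_age):
    """Scan every record, filter by age, and return the names sorted."""
    matches = []
    for doc in store.values():
        age = doc.get("age")  # documents may lack the field entirely
        if age is not None and age > min_age:
            matches.append(doc["name"])
    return sorted(matches)


result = names_older_than(users, 30)
```

The program has to handle the missing field itself, since a schema-free store does not promise that every record has the same shape.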

 

NoSQL MapReduce

1. Map step: split the query up so each node processes its own portion of the data

2. Reduce step: merge the partial results

3. Most typical of Hadoop and used with Wide Column Stores, esp. HBase

4. Amazon Web Services’ Elastic MapReduce (EMR) can read/write DynamoDB, S3, Relational Database Service (RDS)

5. Hive offers a HiveQL (SQL-like) abstraction over MapReduce

  - Use with Hive tables

  - Use with HBase
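The map and reduce steps can be sketched with the classic word-count example. This runs in a single Python process and does not use Hadoop; the function names map_step, shuffle, and reduce_step and the sample chunks are assumptions for illustration. In a real cluster each chunk would be mapped on a different node.

```python
from collections import defaultdict


def map_step(chunk):
    """Map: turn one chunk of input into (key, value) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_pairs):
    """Group all values by key so each key can be reduced independently."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_step(key, values):
    """Reduce: merge the values for one key into a single result."""
    return key, sum(values)


chunks = ["NoSQL scales out", "SQL scales up", "NoSQL shards data"]
mapped = [pair for chunk in chunks for pair in map_step(chunk)]
counts = dict(reduce_step(k, v) for k, v in shuffle(mapped).items())
```

A HiveQL query with GROUP BY and COUNT expresses roughly the same computation declaratively, which is the abstraction item 5 refers to.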

 

NoSQL Sharding

1. A partitioning pattern where separate servers store partitions

2. Fan-out queries supported

3. Partitions may be duplicated, so replication is also provided, which is good for disaster recovery

4. Since “shards” can be geographically distributed, sharding can act like a CDN

5. Good for keeping data close to processing. Reduces network traffic when MapReduce splitting takes place
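A minimal sketch of sharding with a fan-out query follows. It assumes hash-based partitioning, which is only one option (range-based partitioning is another), and models each server as a plain dictionary; ShardedStore and its methods are hypothetical names.

```python
import hashlib


class ShardedStore:
    """Hypothetical store that spreads keys across several shards."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate server holding one partition.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash sends the same key to the same shard every time.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key):
        # A key lookup touches exactly one shard.
        return self._shard_for(key).get(key)

    def fan_out(self, predicate):
        # Fan-out query: ask every shard, then merge the partial results.
        results = []
        for shard in self.shards:
            results.extend((k, v) for k, v in shard.items() if predicate(v))
        return sorted(results)


cluster = ShardedStore(shard_count=3)
for name, age in [("alice", 35), ("bob", 28), ("carol", 41), ("dave", 19)]:
    cluster.put(name, age)

over_30 = cluster.fan_out(lambda age: age > 30)
```

Lookups by key stay cheap because they hit a single shard, while queries on anything other than the key have to visit all of them.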
