NoSQL Common Traits
1. Non-relational
2. Non-schematized/schema-free
3. Eventual consistency
4. Open source
5. Distributed
6. Web scale
7. Developed at big Internet companies (Yahoo, Google, Facebook, LinkedIn…)
NoSQL Eventual Consistency
CAP Theorem: a distributed database can fully guarantee at most two of the following three attributes: Consistency, Availability and Partition Tolerance
1. Relational databases typically favor Consistency & Partition Tolerance
2. NoSQL databases typically favor Availability & Partition Tolerance
Most NoSQL databases do not offer “ACID” guarantees (Atomicity, Consistency, Isolation and Durability), whereas relational databases do.
NoSQL instead offers “eventual consistency”: replicas may briefly disagree after a write but converge over time, similar to DNS propagation
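To make the idea of eventual consistency concrete, here is a minimal sketch in Python. It is not how any particular NoSQL product works: the `Replica` class, the `anti_entropy` function and the integer version numbers are all hypothetical names chosen for illustration, and conflicts are resolved with a simple "highest version wins" rule.

```python
class Replica:
    """A toy replica that stores key -> (version, value)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, version):
        # Accept the write only if it is newer than what we already hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def anti_entropy(replicas):
    """Copy every entry to every replica; the highest version wins."""
    for source in replicas:
        for key, (version, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, version)


a, b = Replica("a"), Replica("b")
a.write("user:1", "alice@old.example", version=1)
anti_entropy([a, b])                 # both replicas now agree

a.write("user:1", "alice@new.example", version=2)
stale = b.read("user:1")             # b has not seen the update yet
anti_entropy([a, b])                 # background sync catches b up
fresh = b.read("user:1")
```

Between the second write and the second sync, a read from replica `b` returns the old value. That window of staleness is the price paid for staying available without coordinating every write.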
NoSQL Indexing
1. Most NoSQL databases are indexed by key.
2. Some allow so-called “secondary” indexes
3. Often the primary key indexes are clustered
4. HBase uses the Hadoop Distributed File System (HDFS), which is append-only
- Writes are logged
- Logged writes are batched
- File is re-created and sorted
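The log, batch and re-sort cycle described for append-only storage can be sketched as follows. This is a toy in-memory model, not HBase's actual implementation: `AppendOnlyStore`, `batch_size` and `sorted_file` are hypothetical names, and a Python list stands in for the file on HDFS.

```python
import bisect


class AppendOnlyStore:
    """Toy store: writes are logged, batched, then merged into a sorted file."""

    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self.log = []          # logged writes not yet flushed
        self.sorted_file = []  # stand-in for the sorted, immutable file

    def put(self, key, value):
        # Never update in place: just append to the log.
        self.log.append((key, value))
        if len(self.log) >= self.batch_size:
            self.flush()

    def flush(self):
        # Re-create the file: merge old contents with the batch, then sort.
        merged = dict(self.sorted_file)
        merged.update(self.log)  # later writes override earlier ones
        self.sorted_file = sorted(merged.items())
        self.log = []

    def get(self, key):
        # The newest data lives in the log, so check it first (latest first).
        for k, v in reversed(self.log):
            if k == key:
                return v
        # Otherwise binary-search the sorted file by key.
        keys = [k for k, _ in self.sorted_file]
        i = bisect.bisect_left(keys, key)
        if i < len(keys) and keys[i] == key:
            return self.sorted_file[i][1]
        return None


store = AppendOnlyStore(batch_size=3)
store.put("row3", "c")
store.put("row1", "a")
store.put("row2", "b")    # third write triggers a flush
store.put("row1", "a2")   # stays in the log until the next flush
```

Keeping the file sorted by key is what makes the primary key index behave like a clustered index: a lookup or range scan by key reads neighbouring entries.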
NoSQL Queries
1. Typically no query language
2. Instead, write a procedural program against the database's API
3. Sometimes SQL is supported
4. Sometimes MapReduce code is used
NoSQL MapReduce
1. Map Step: split the query up
2. Reduce Step: merge the results
3. Most typical of Hadoop and used with Wide Column Stores, esp. HBase
4. Amazon Web Services’ Elastic MapReduce (EMR) can read/write DynamoDB, S3, Relational Database Service (RDS)
5. Hive offers a HiveQL (SQL-like) abstraction over MapReduce
- Use with Hive tables
- Use with HBase
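The map and reduce steps can be illustrated with a word count, the usual introductory example. The sketch below runs in a single Python process and does not use Hadoop or any of its APIs; `map_step`, `shuffle`, `reduce_step` and `map_reduce` are hypothetical names. In a real cluster the map calls would run in parallel on the nodes holding the data.

```python
from collections import defaultdict


def map_step(document):
    """Split one input record into (key, value) pairs."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all values by key, as the framework does between the steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_step(key, values):
    """Merge the values collected for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    mapped = [pair for doc in documents for pair in map_step(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_step(k, v) for k, v in grouped.items())


counts = map_reduce(["NoSQL scales out", "SQL scales up", "nosql shards"])
```

Here `counts["nosql"]` and `counts["scales"]` are both 2, and every other word maps to 1. A HiveQL query such as a `GROUP BY` with `COUNT(*)` expresses the same computation declaratively.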
NoSQL Sharding
1. A partitioning pattern where separate servers store partitions
2. Fan-out queries supported
3. Partitions may be duplicated, so replication is also provided; this is good for disaster recovery
4. Since “shards” can be geographically distributed, sharding can act like a CDN
5. Good for keeping data close to processing. Reduces network traffic when MapReduce splitting takes place
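As a rough illustration of sharding and fan-out queries, the sketch below routes each key to one of several in-memory "servers" by hashing the key. It assumes simple hash-modulo placement, which is only one of several possible strategies; `ShardedStore`, `_shard_for` and `fan_out` are hypothetical names, and replication is left out for brevity.

```python
import hashlib


class ShardedStore:
    """Toy sharded key-value store; each dict stands in for one server."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash, so the same key always lands on the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        # A lookup by key touches exactly one shard.
        return self.shards[self._shard_for(key)].get(key)

    def fan_out(self, predicate):
        # A query that is not by key must ask every shard and merge results.
        results = []
        for shard in self.shards:
            results.extend((k, v) for k, v in shard.items() if predicate(k, v))
        return sorted(results)


store = ShardedStore(shard_count=3)
for user, age in [("user:1", 28), ("user:2", 35), ("user:3", 41), ("user:4", 19)]:
    store.put(user, age)

over_30 = store.fan_out(lambda key, age: age >= 30)
```

A lookup by key is cheap because only one shard is involved, while the fan-out query costs one request per shard. This is why the choice of shard key matters so much in practice.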