NoSQL
It’s about making intelligent choices
The Relational Model
• Simplicity and Elegance
• Well Understood
• Very Powerful Abstraction
• Solves Many Storage Problems (Persistent Data)
• Concurrency
• Integration
• A Mostly Standard Model
• …
• But It Also Has Its Limitations…
Business Database
Issues With Implementing A Relational Database
• Agility and Programmability (Impedance Mismatch)
• Flexibility
• Performance and Scalability
• Availability
NoSQL Business Drivers
NoSQL
• No SQL
• Not Only SQL
• Non-relational Database
Key/Value Store
Typical Usage
• Image Stores
• Key-Based File Systems
• Object Cache
• Systems Designed to Scale
Key/Value Store
• BerkeleyDB
• LevelDB
• Memcached
• Project Voldemort
• Redis
• Riak
Document Database
Typical Usage
• Web Crawler Results
• Big Data Problems That Can Relax Consistency Rules
Document Database
• CouchDB
• MongoDB
• OrientDB
• RavenDB
• Terrastore
Column Family
Typical Usage
• High-Variability Data
• Document Search
• Integration Hubs
• Web Content Management
• Publishing
Column Family
• Amazon SimpleDB
• Cassandra
• HBase
• Hypertable
Graph Database
Typical Usage
• Social Networks
• Fraud Detection
• Relationship-Heavy Data
Graph Database
• FlockDB
• HyperGraphDB
• InfiniteGraph
• Neo4j
• OrientDB
Common Features of NoSQL Databases
• Designing Aggregations
  • An aggregate in a NoSQL database is similar to a row in a table in a relational database
• Materializing Summary Data
  • Map/Reduce
• Implementing High Availability
  • Clusters
• Improving Scalability and Reducing Network Latency
  • Sharding
• Improving Consistency
  • Data Versioning
• Schemas and Non-Uniformity
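Three of the features above (aggregates, sharding, and data versioning) can be sketched together in a few lines. This is a minimal in-memory illustration, not how any particular product works: the class name `ShardedStore`, the hash-modulo routing, and the version-check write are all assumptions made for the example.

```python
import hashlib


class ShardedStore:
    """Toy in-memory store (hypothetical): each aggregate lives on one shard."""

    def __init__(self, shard_count):
        # Each shard is a dict mapping key -> (version, aggregate).
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, key):
        # Use a stable hash so the same key always routes to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def get(self, key):
        # Returns (version, aggregate), or (0, None) when the key is absent.
        return self.shards[self.shard_for(key)].get(key, (0, None))

    def put(self, key, aggregate, expected_version):
        # Data versioning: reject the write if the stored version has moved
        # on since the caller last read the aggregate.
        shard = self.shards[self.shard_for(key)]
        current_version, _ = shard.get(key, (0, None))
        if current_version != expected_version:
            return False
        shard[key] = (current_version + 1, aggregate)
        return True


# The whole order (customer plus line items) is stored as one aggregate.
store = ShardedStore(shard_count=4)
order = {"customer": "alice", "items": [{"sku": "A1", "qty": 2}]}
store.put("order:1001", order, expected_version=0)
```

Because the order and its line items travel as one unit, a read touches a single shard, which is the scalability and latency argument made for sharding above.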
NoSQLCase Studies
LiveJournal’s Memcache
• Driver
  • Need to increase performance of database queries.
• Finding
  • By using hashing and caching, data in RAM can be shared. This cuts down the number of read requests sent to the database, increasing performance.
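The finding above can be illustrated with a small read-through cache. This is a sketch under stated assumptions, not the Memcached API: `CachedReader`, `load_from_db`, and the CRC32-modulo choice of cache node are hypothetical names and choices made for the example.

```python
import zlib


class CachedReader:
    """Hypothetical read-through cache spread across several RAM nodes."""

    def __init__(self, load_from_db, node_count):
        self.load_from_db = load_from_db  # the slow database lookup
        self.nodes = [{} for _ in range(node_count)]  # stand-ins for cache servers
        self.db_reads = 0  # counts how often the database is actually hit

    def node_for(self, key):
        # Hashing the key picks the node, so every client agrees on
        # where a given item is cached and the RAM is shared.
        return self.nodes[zlib.crc32(key.encode("utf-8")) % len(self.nodes)]

    def get(self, key):
        node = self.node_for(key)
        if key in node:
            return node[key]  # cache hit: no database read
        self.db_reads += 1  # cache miss: fall back to the database
        value = self.load_from_db(key)
        node[key] = value
        return value


reader = CachedReader(load_from_db=lambda key: key.upper(), node_count=3)
reader.get("user:42")
reader.get("user:42")  # served from RAM; reader.db_reads is still 1
```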
Google’s MapReduce
MapReduce Example – Word Count
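A single-process sketch of the word-count example follows. It only mimics the map, shuffle, and reduce steps in memory; the function names are chosen for illustration and do not come from Google's implementation, which distributes these steps across many machines.

```python
from collections import defaultdict


def map_phase(document):
    # Map: emit a (word, 1) pair for every word in one document.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by their key (the word).
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word, counts):
    # Reduce: collapse the list of counts for one word into a total.
    return word, sum(counts)


def word_count(documents):
    # Each document could be mapped on a different machine; here the
    # map calls simply run one after another.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


totals = word_count(["the quick fox", "the lazy dog", "The end"])
```

Since each map call sees only its own document and each reduce call sees only one word, both phases can run in parallel without coordination.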
Google’s MapReduce
• Driver
  • Need to index billions of web pages for search using low-cost hardware.
• Finding
  • By using parallel processing, indexing billions of web pages can be done quickly with a large number of commodity processors.
Google BigTable
• Driver
  • Need to flexibly store tabular data in a distributed system.
• Finding
  • By using a sparse matrix approach, users can think of all data as being stored in a single table with billions of rows and millions of columns without the need for up-front data modeling.
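The sparse matrix idea can be sketched as a dictionary of dictionaries, where only the cells that hold a value take up space. `SparseTable` and its methods are hypothetical names for this illustration and are not BigTable's API.

```python
class SparseTable:
    """Hypothetical sparse table: only populated cells are stored."""

    def __init__(self):
        # row key -> {column name -> value}; absent cells cost nothing.
        self.rows = {}

    def put(self, row_key, column, value):
        # No schema is declared up front: any row may use any column.
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column, default=None):
        return self.rows.get(row_key, {}).get(column, default)

    def cell_count(self):
        # Number of cells actually stored, not rows times columns.
        return sum(len(columns) for columns in self.rows.values())


table = SparseTable()
table.put("com.example/index", "title", "Example")
table.put("com.example/index", "language", "en")
table.put("org.sample/home", "title", "Sample")  # this row has no "language" cell
```

Logically the table has two rows and two columns, but only three cells exist, which is why very wide tables remain practical.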
Amazon’s Dynamo
• Driver
  • Need to accept a web order 24 hours a day, 7 days a week.
• Finding
  • A key-value store with a simple interface can be replicated even when there are large volumes of data to be processed.
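To make the finding concrete, here is a toy key-value store whose only operations are `put` and `get`, with every key copied to several nodes. It is a deliberately simplified sketch: the class `ReplicatedStore`, the CRC32 placement, and the `up` flags are assumptions for illustration, and it omits the techniques a production system would need, such as conflict resolution between replicas.

```python
import zlib


class ReplicatedStore:
    """Hypothetical key-value store that writes each key to several nodes."""

    def __init__(self, node_count, replicas):
        self.nodes = [{} for _ in range(node_count)]
        self.up = [True] * node_count  # simulated node health
        self.replicas = replicas  # copies per key; must be <= node_count

    def _preference_list(self, key):
        # The key's hash picks a starting node; the following nodes
        # (wrapping around) hold the other replicas.
        start = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        # Write to every replica node that is currently reachable.
        written = 0
        for index in self._preference_list(key):
            if self.up[index]:
                self.nodes[index][key] = value
                written += 1
        return written > 0  # the order is accepted if any replica took it

    def get(self, key):
        # Read from the first reachable replica that has the key.
        for index in self._preference_list(key):
            if self.up[index] and key in self.nodes[index]:
                return self.nodes[index][key]
        return None


cart = ReplicatedStore(node_count=5, replicas=3)
cart.put("cart:alice", ["book", "lamp"])
```

Because the interface is just a key and a value, the store can keep accepting orders while some nodes are down, which is the availability driver stated above.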
Polyglot Persistence
Key Points
• Relational databases have been a successful technology for twenty years, providing persistence, concurrency control, and an integration mechanism.
• Application developers have been frustrated with the impedance mismatch between the relational model and the in-memory data structures.
• There is a movement away from using databases as integration points towards encapsulating databases within applications and integrating through services.
• The most important result of the rise of NoSQL is Polyglot Persistence.
• The vital factor for a change in data storage was the need to support large volumes of data by running on clusters. Relational databases are not designed to run efficiently on clusters.
• NoSQL is an accidental neologism. There is no prescriptive definition—all you can make is an observation of common characteristics.
• The common characteristics of NoSQL databases are:
  • Not using the relational model
  • Running well on clusters
  • Open source
  • Built for the 21st-century web estates
  • Schemaless