Summarizing Literature Who’s Afraid of the Big Bad Wolf, the Big Bad Wolf, the Big Bad Wolf…
Big Data and Bad Analogies
-
Upload
mark-madsen -
Category
Internet
-
view
3.149 -
download
0
Transcript of Big Data and Bad Analogies
-
Big Data, Bad Analogies
GOTO Copenhagen September, 2014 Mark Madsen www.ThirdNature.net @markmadsen
http://www.thirdnature.net/
-
Copyright Third Nature, Inc.
The problem with bad framing
s
Leads to bad assumptions about use, inappropriate features,
poor understanding of substitutability and the impacts it will have.
-
Copyright Third Nature, Inc.
The data lake
-
Copyright Third Nature, Inc.
The data lake after a little while
-
Copyright Third Nature, Inc.
Data Exhaust
-
Copyright Third Nature, Inc.
Data is the new oil
-
Copyright Third Nature, Inc.
Reality: data is a choice
-
Copyright Third Nature, Inc.
There is nothing new under the sun but there are lots of old things we don't know.
Ambrose Bierce
http://www.goodreads.com/photo/author/14403.Ambrose_Bierce
-
Copyright Third Nature, Inc.
Looking at past ways of organizing data
-
Copyright Third Nature, Inc.
The Elizabethan Era
Commercial printing presses
Data management tech:
Perfect copies
Topical catalogs
Font standardization
Taxonomy ascends
Information explosion:
8M books in 1500
200M by 1600
Commoditization
Overload
-
Copyright Third Nature, Inc.
Elizabethan Era Storage and Retrieval
-
Copyright Third Nature, Inc.
Elizabethan Era Storage and Retrieval
-
Copyright Third Nature, Inc.
Elizabethan Era Storage and Retrieval
-
Copyright Third Nature, Inc.
The Georgian Era: The Explosion of Natural Philosophy
-
Copyright Third Nature, Inc.
The Victorian Era
The powered printing information explosion:
Card catalogs, cross-referencing, random access metadata
Universal classification
Extended information management debates
Trading effort and flexibility for storage and retrieval
Stereotyping
-
Copyright Third Nature, Inc.
Melvil Dewey
Dewey Decimal System
Top down orientation
Static structure
Descriptive rather than explanatory
Taxonomic classification
-
Copyright Third Nature, Inc.
Cutter Expansive Classification System (~1882)
Bottom up orientation
More flexible structure
Explanatory, descriptive
Faceted classification
Charles Ammi Cutter
-
Copyright Third Nature, Inc.
So why did Dewey beat Cutter?
Pragmatism
Good enough wins the day
It wasnt solving the problem you thought it was.
X In every choice, something is lost when something is gained.
-
Copyright Third Nature, Inc.
What lessons does this history teach us?
1. Information requires organizing principles at multiple levels from items to collections.
2. Differences in scale require different principles.
3. At a key point in the adoption cycle, emphasis shifts from management of information to its dissemination and consumption.
First we record, then we use and share.
Like transaction processing and analysis.
-
Copyright Third Nature, Inc.
What has this to do with data and persistence?
schema is a broad term, a way of organizing and making something relatable and findable.
Data (or object) is to Database
as
Books are to Library
-
Copyright Third Nature, Inc.
The printed became more important than the printer.
The book outlived generations of presses.
Just like data is now.
Which means we should pay attention to the broader organization and use of data, and persistence layers.
-
Copyright Third Nature, Inc.
Order Entry
Order Database
Customer
Service
Interface Program
Inventory Database
Distribution
Interface Program
Receivables Database
Accounts
Receivable
Data Warehouse
Analysts &
users
Someone else always wants to use your data
-
Copyright Third Nature, Inc.
Context (one company)
"In an infinite universe, the one thing sentient life cannot afford to have is
a sense of proportion. Douglas Adams
-
Copyright Third Nature, Inc.
Monthly
Production plans
Weekly pre-
orders for
bulk cheese
Availability
confirmation
and location
In store system
Store
Stock
Management
Store EPOS
dataCategory
Supervis
or
Stock
adjustments/
order
interventions
Order
adjustment
Stock/order
interventions
*
*
Orders
(based on 6
day
forecast)
Dallas
Distrib Centre
WMS
Picking/load
teams
Pos/Pick
lists/Load
sheetsConfirmed
Deliveries/
Confirmed
picks +
loads
FarmersMilk intake/
silos Cheese plant
Plant
Processor
In-house Cheese
store
Contract Cheese
store
Processor
Packing plant
Processor
National
Distribution
Centre
Retailer
RDC
Retailer Stores
(550)
Retailer HQ
Consolidated
Demand
Ordering
Processor NDC
Customer
Services
Daily order -
SKU/Depot/
Vol
Sent @ 12.30-13:00
Delivery
orders
Processor HQ
Sales
Team/
Account
Manager
Processor HQ
Forecasting
Team
Processor HQ
Bulk Planning
Team
Cheese plant
Planner/Stock
office
Processor HQ
Milk
Purchasing
Team
Cheese plant
Transport
Manager
Actual
daily
delivery
figures
Daily
collection
planning
Weekly order for delivery to
Packing plant
Daily &
weekly Call-
off
Daily Call-off
15/day
22 pallet loads 15/day
A80
Shortages/
Allocation
instructions
Annual
Buying plan
Milk Availability
Forecast
Annual
prediction
of milk
production
Shortages/
Allocation
instructions
Daily milk
intake
Weekly milk
shortages
shortages
Spot mkt or
Processor
ingredients
Packing plant
Planning
Team
Processor HQ
JBA Invoicing
and
Sales Monitor
FGI and Last 5
weeks sales
Expedite
Changes
to existing
forecast -
exceptions
Retailer HQ
Retailer Buyer
Meeting
every 6
weeks
Packing plant
Cheese
ordering
10 day stock
plan
On line
stock info
7 day order
plan for bulk
cheese
Arrange
daily
delivery
schedule
Emergency
call-off
Daily
optimisation
of loads
Service
Monitor
Despatch and
delivery
confirmations
Processor NDC
Transport
Planning
Transport
Plan
Processor NDC
Inventory
MonitoringStock and delivery
monitoring
Processor NDC
Warehouse
management
syatem
Operation
Instructions
Key
Shaded Boxes = Product flow system
Un-shaded boxes = Information flow system
Retailer Cheese ProcessorFarms
Schedule
weekly &
Daily
10 Day
plan(wed) and
daily plan
15/day
Changes
to existing
forecast -
exceptions
Stock
availabilityMonthly
review
Annual
f/cast
Source: IGD Food Chain Centre, February 2008
Context (multiple company supply chain) A value chain diagram, showing the data supply chain for cheese.
The side effects of a single bug can be massive.
-
Copyright Third Nature, Inc.
Aside: technical debt: what those diagrams show
tek-ni-kuh l det: the cost that accrues due to decisions made in software design and coding.
Look at the choices and mistakes in development:
Intentional Unintentional
Purposeful
choices to
optimize
schedule,
budget,
satisfaction
Missed
requirements,
poor code
quality, poor
design
-
Copyright Third Nature, Inc.
It is a poor carpenter who blames his tools*
*but sometimes it is the tools
-
Copyright Third Nature, Inc.
Technical Debt
The cost of some choices can be dealt with in the short term (e.g. the next sprint) and some only in the long term (redesign, start over)
Short term
Long term
Mostly about the
application code
Mostly about architecture,
design, and infrastructure
-
Copyright Third Nature, Inc.
Code flaws
(i.e. bugs)
Design flaws
If you enter into decisions knowing the true nature of your coding alternatives, you will be better off
Green: these are deliberate, the tradeoffs known
Yellow : these are minor defects
Red: these are the things that kill a system
Short term
Long term
Intentional Unintentional
Code choices
Design
choices
-
Copyright Third Nature, Inc.
Technical Debt cant be avoided
Development
methods
Experience,
education
(but it can be managed)
Sometimes you think its intentional: incremental design
Long term debts can only be dealt with through planning
Short term
Long term
Intentional Unintentional
Agile
methods
Redesign
-
Copyright Third Nature, Inc.
Technical Debt cant be avoided
Lets move this
to the left
What you believe about the technology underlying your system has a big influence on design choices, so the focus of this part is on architecture and design with the hope it will help reduce or avoid long term debt.
Short term
Long term
Intentional Unintentional
-
Copyright Third Nature, Inc.
How did we get here?
Theres a difference between having no past and actively rejecting it.
-
Copyright Third Nature, Inc.
MultiValue, Hierarchical
PICK
IMS
IDS
ADABAS
Relational
CODASYL
System R (SEQUEL)
SQL/DS
INGRES (QUEL)
Mimer
Oracle
RDBMS, SQL standard
DB2
Teradata
Informix
Sybase
Postgres
OODBMS, ORDBMS
Versant
Objectivity
Gemstone
Informix*
Oracle*
MPP Query, NoSQL
Netezza
Paraccel
Vertica
MongoDB
CouchBase
Riak
Cassandra
NewSQL
SciDB
MonetDB
NuoDB
CitusDB
1960s 1980s 2000s
1970s 1990s 2010s
Spanner
F1
-
Copyright Third Nature, Inc.
In the beginning: RMSs and pre-relational DBs
At first, common code libraries so there was reusability for file ops.
Problems:
Portability across languages, OSs
Queries of more than one file
No metadata, whats in there? Who wrote it?
Concurrency
The databases brought things like recoverability, durability, ACID transactions. But they were rigid, prone to breakage.
Operations:
First
Next
Prev
Last
-
Copyright Third Nature, Inc.
What is
best in life?
-
Copyright Third Nature, Inc.
To crush the vendors,
see them driven
before you, and hear
the lamentation of
their salespeople.
-
Copyright Third Nature, Inc.
Wrong!
Loose coupling.
Scalability.
Reusability.
-
Copyright Third Nature, Inc.
The miracle of pre-relational DB: schema
Loose coupling the physical model of data structures and physical placement are no longer a programs responsibility
Reusability More than one program can access the same data, and no more custom coding for each application or OS
Scalability Constraints of schema and typing reduce resource usage, have finer granularity for concurrent access, multiple online users.
-
Copyright Third Nature, Inc.
Key schema flexibility tradeoff for data management
Global validation vs
contextual validation
=
Strict rules vs
lenient rules =
Write rules vs
read rules
-
Copyright Third Nature, Inc.
Schema on write vs schema on read
Match the shape to the hole
or
Match the hole to the shape
Predicate schemas for write flexibility (agility) and speed
-
Copyright Third Nature, Inc.
Flexibility a recent experience in query
The problem with many of these pr-relational databases is tight coupling between a program and data structures.
The physical model leaks into the logical with potentially career-ending effects if the DB is used for the wrong thing.
And its happening again.
Its a poor carpenter who blames his tools. Or the users.
-
Copyright Third Nature, Inc.
Trading for consistency for performance: CAP theorem
http://blog.nahurst.com/visual-guide-to-nosql-systems
-
Copyright Third Nature, Inc.
Services provided
Standard API/query layer*
Transaction / consistency
Query optimization
Data navigation, joins
Data access
Storage management Database
Database
Tradeoffs: In NoSQL the DBMS is in your code
SQL database NoSQL database
Application Application
Anything not done by the DB becomes a developers task.
Welcome to 1985!
-
Copyright Third Nature, Inc.
Simplifying ACID vs BASE
Remember: its a poor carpenter who blames his tools.
-
Copyright Third Nature, Inc.
Google on eventual consistency:
F1: A Distributed SQL Database That Scales, Proceedings of the VLDB Endowment, Vol. 6, No. 11, 2013
-
Copyright Third Nature, Inc.
Party like its 1985
Fastest TPC-B benchmark in 1985 was IMS running on an IBM 370, 100 TPS, 400 iops, 30 iops/disk
The best relational vendors could muster was 10 TPS
25 years later, SQLServer on an Intel box ran the TPC-B at 25,000 TPS, 100,000 iops, 300 iops/disk
It looks like youre
running a benchmark...
-
Copyright Third Nature, Inc.
Joins. Wait, theres more than one?
Wait, theres more
than one?
1986: SQL, joins!
-
Copyright Third Nature, Inc.
Hipster bullshit
I cant get MySQL to scale
therefore
Relational databases dont scale
therefore
We must use NoSQL* for everything *including Hadoop and related
-
Copyright Third Nature, Inc.
Enumerate logically equivalent plans by applying
equivalence rules
For each logically equivalent plan, enumerate all
alternative physical query plans
Estimate the cost of each of the alternative physical
query plans
Run the plan with lowest estimated overall cost
2
1
3
4
Logical vs physical is an important thing. The CBO turns a SQL query into an optimal* execution plan for a parallel pipelined dataflow engine.
Diagram: David J. DeWitt
The relational gift: declarative language + CBO
-
Copyright Third Nature, Inc.
A simple 3 table join: a programmers job?
SELECT C.name, O.num
FROM Orders O, Lines L, Customers C
WHERE C.City = Copenhagen AND L.status = X
AND O.num = L.num AND C.cid = O.cid
Number of logical plans: 9
Ways to join (hash, merge, nested): 3
For each plan, there are multiple physical plans: 36
That makes a total of 324 physical plans, the efficiency of which changes based on cardinality.
-
Copyright Third Nature, Inc.
Scalability? ZOMG just add nodes!
"The most amazing achievement of the computer software industry is its continuing cancellation of the steady and staggering gains made by the computer hardware industry. Henry Peteroski
After all, your database is web scale, isnt it?
-
Copyright Third Nature, Inc.
Just add hardware?
No amount of hardware will make incorrectly coded software run in parallel.
Declarative languages make this easier by turning the problem over to the computer to resolve.
Guess which runs in parallel:
Open cursor C
Loop (Fetch row C)
Open cursor O
Loop (Fetch row O)
Open cursor L
Output (Do-things)
End loop
End loop ...
SELECT C.name, O.num
FROM Orders O, Lines L,
Customers C
WHERE C.City = Aarhus AND
C.cid = O.cid AND O.num =
L.num AND L.status = X
-
Copyright Third Nature, Inc.
A more realistic example: TPC-H query #8 Select o_year,
sum(case
when nation = 'BRAZIL' then volume
else 0
end) / sum(volume)
from
(
select YEAR(O_ORDERDATE) as o_year,
L_EXTENDEDPRICE * (1 - L_DISCOUNT) as volume,
n2.N_NAME as nation
from PART, SUPPLIER, LINEITEM, ORDERS, CUSTOMER, NATION n1,
NATION n2, REGION
where
P_PARTKEY = L_PARTKEY and S_SUPPKEY = L_SUPPKEY
and L_ORDERKEY = O_ORDERKEY and O_CUSTKEY = C_CUSTKEY
and C_NATIONKEY = n1.N_NATIONKEY and n1.N_REGIONKEY = R_REGIONKEY
and R_NAME = 'AMERICA and S_NATIONKEY = n2.N_NATIONKEY
and O_ORDERDATE between '1995-01-01' and '1996-12-31'
and P_TYPE = 'ECONOMY ANODIZED STEEL'
and S_ACCTBAL
-
Copyright Third Nature, Inc.
0%
100%
200%
300%
400%
500%
600%
700%
800%
900%
1000%
1 2 3 4 5 6 7 8 9 10
10% overhead Linear
Just add hardware?
Number of processors
Speedup
(amdahls law)
-
Data Warehouse +
Behavioral
Singularity
Data Warehouse
Semi-Structured
SQL++
Structured
SQL
Low End Enterprise-class System
Contextual-Complex Analytics
Deep, Seasonal, Consumable Data Sets
Production Data Warehousing
Large Concurrent User-base
+ + 150+
concurrent users
500+ concurrent users
Enterprise-class System
5-10 concurrent users
Unstructured
Java/C Structure the Unstructured
Detect Patterns
Commodity Hardware System
6+PB 40+PB 20+PB
Hadoop
Analyze & Report Discover & Explore
Parallel Efficiency and Platform Costs
-
Platform Metrics for Table Scan and Sum, Hadoop vs Teradata
-
Copyright Third Nature, Inc.
There are really three workload classes to consider
1. Operational: OLTP systems
2. Analytic: Query systems
3. Scientific: Computational systems
Unit of focus:
1. Transaction
2. Query
3. Computation
Different problems require different platforms
-
Copyright Third Nature, Inc.
Workloads
OLTP BI Analytics
Access Read-Write Read-only Read-mostly
Predictability Fixed path Unpredictable All data
Selectivity High Low Low
Retrieval Low Low High
Latency Milliseconds
-
Copyright Third Nature, Inc.
A key point worth remembering:
Performance over size performance over complexity
OLTP performance is mostly related to transaction coordination and writing under high concurrency.
BI performance is mostly related to data volume and query complexity.
Analytics performance is about the intersection of these with computational complexity.
-
Copyright Third Nature, Inc.
A history of databases in No notation
1970s: NoSQL = We have no SQL
1980s: NoSQL = Know SQL
2000s: NoSQL = No SQL!
2005s: NoSQL = Not only SQL
2010s: NoSQL = No, SQL!
(R)DB(MS)
-
Copyright Third Nature, Inc.
TANSTAAFL
Technologies are not perfect replacements for one another. Often not better, only different.
When replacing the old with the new (or ignoring the new over the old) you always make tradeoffs, and usually you wont see them for a long time.
-
Copyright Third Nature, Inc.
Unintended consequences
Unintended consequences
-
Copyright Third Nature, Inc.
Away from one throat to choke, back to best of breed
Tight coupling leads to slow changes. The market is not in the tight coupling phase
In a rapidly evolving market, componentized architectures, modularity and loose coupling are favorable over monolithic stacks, single-vendor architectures and tight coupling.
-
Copyright Third Nature, Inc.
Dont follow the market
Some people cant resist getting the next new thing because its new and new is always better.
Many IT organizations are like this, promoting a solution and hunting for the problem that matches it.
Better to ask What is the problem for which this technology is the answer?
Copyright Third Nature, Inc.
-
Copyright Third Nature, Inc.
How we develop best practices: survival bias
We dont need best practices, we need worst failures. Copyright Third Nature, Inc.
-
Copyright Third Nature, Inc.
Summarizing some key points
1. Data lives longer than code. It pays to focus on data when choosing and using persistence layers.
2. Different persistence layer approaches have different tradeoffs (like performance vs app complexity in BASE models).
3. Dont throw away 30 years of engineering unless know why it was put there.
4. Decoupling, reusability (of data) and scalability are nice things to have.
-
Copyright Third Nature, Inc.
References (things worth reading on the way home) A relational model for large shared data banks, Communications of the ACM, June, 1970, http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf
Column-Oriented Database Systems, Stavros Harizopoulos, Daniel Abadi, Peter Boncz, VLDB 2009 Tutorial http://cs-www.cs.yale.edu/homes/dna/talks/Column_Store_Tutorial_VLDB09.pdf
Nobody ever got fired for using Hadoop on a cluster, 1st International Workshop on Hot Topics in Cloud Data ProcessingApril 10, 2012, Bern, Switzerland.
A co-Relational Model of Data for Large Shared Data Banks, ACM Queue, 2012, http://queue.acm.org/detail.cfm?id=1961297
A query language for multidimensional arrays: design, implementation and optimization techniques, SIGMOD, 1996
Probabilistically Bounded Staleness for Practical Partial Quorums, Proceedings of the VLDB Endowment, Vol. 5, No. 8, http://vldb.org/pvldb/vol5/p776_peterbailis_vldb2012.pdf
Amorphous Data-parallelism in Irregular Algorithms, Keshav Pingali et al
MapReduce: Simplified Data Processing on Large Clusters, http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/mapreduce-osdi04.pdf
Dremel: Interactive Analysis of Web-Scale Datasets, Proceedings of the VLDB Endowment, Vol. 3, No. 1, 2010 http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/36632.pdf
Spanner: Googles Globally-Distributed Database, SIGMOD, May, 2012, http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/es//archive/spanner-osdi2012.pdf
F1: A Distributed SQL Database That Scales, Proceedings of the VLDB Endowment, Vol. 6, No. 11, 2013, http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/41344.pdf
http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdfhttp://cs-www.cs.yale.edu/homes/dna/talks/Column_Store_Tutorial_VLDB09.pdfhttp://cs-www.cs.yale.edu/homes/dna/talks/Column_Store_Tutorial_VLDB09.pdfhttp://cs-www.cs.yale.edu/homes/dna/talks/Column_Store_Tutorial_VLDB09.pdfhttp://queue.acm.org/detail.cfm?id=1961297http://vldb.org/pvldb/vol5/p776_peterbailis_vldb2012.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/archive/mapreduce-osdi04.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/archive/mapreduce-osdi04.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/archive/mapreduce-osdi04.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/archive/mapreduce-osdi04.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/pubs/archive/36632.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/pubs/archive/36632.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/es/archive/spanner-osdi2012.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/es/archive/spanner-osdi2012.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/es/archive/spanner-osdi2012.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/es/archive/spanner-osdi2012.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/41344.pdfhttp://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/41344.pdf
-
Copyright Third Nature, Inc.
About the Presenter
Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business use of data and analytics. Mark is an award-winning author, architect and CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institute. He is an international speaker, a contributor to Forbes Online and on the OReilly Strata program committee. For more information or to contact Mark, follow @markmadsen on Twitter or visit http://ThirdNature.net
-
Copyright Third Nature, Inc.
About Third Nature
Third Nature is a research and consulting firm focused on new and
emerging technology and practices in business intelligence, analytics and
performance management. If your question is related to BI, analytics,
information strategy and data then youre at the right place.
Our goal is to help companies take advantage of information-driven
management practices and applications. We offer education, consulting
and research services to support business and IT organizations as well as
technology vendors.
We fill the gap between what the industry analyst firms cover and what IT
needs. We specialize in product and technology analysis, so we look at
emerging technologies and markets, evaluating technology and hw it is
applied rather than vendor market positions.
-
Copyright Third Nature, Inc.
CC Image Attributions
Thanks to the people who supplied the creative commons licensed images used in this presentation:
bookshelf by spectrum.jpg - http://flickr.com/photos/santos/1704875109/
Vatican library - http://www.flickr.com/photos/paullew/1550844955
round hole square peg - https://www.flickr.com/photos/epublicist/3546059144
text composition - http://flickr.com/photos/candiedwomanire/60224567/
twitter_network_bw.jpg - http://www.flickr.com/photos/dr/2048034334/
round hole square peg - https://www.flickr.com/photos/epublicist/3546059144
glass_buildings.jpg - http://www.flickr.com/photos/erikvanhannen/547701721 CAP diagram - http://blog.nahurst.com/visual-guide-to-nosql-systems
refinery-hdr.jpg - http://www.flickr.com/photos/vermininc/2477872191/
refinery-night.jpg - http://www.flickr.com/photos/vermininc/2485448766/
http://flickr.com/photos/santos/1704875109/http://flickr.com/photos/santos/1704875109/http://www.flickr.com/photos/paullew/1550844955http://www.flickr.com/photos/paullew/1550844955https://www.flickr.com/photos/epublicist/3546059144https://www.flickr.com/photos/epublicist/3546059144http://flickr.com/photos/candiedwomanire/60224567/http://www.flickr.com/photos/dr/2048034334/https://www.flickr.com/photos/epublicist/3546059144https://www.flickr.com/photos/epublicist/3546059144http://www.flickr.com/photos/erikvanhannen/547701721http://www.flickr.com/photos/erikvanhannen/547701721http://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://blog.nahurst.com/visual-guide-to-nosql-systemshttp://www.flickr.com/photos/vermininc/2477872191/http://www.flickr.com/photos/vermininc/2477872191/http://www.flickr.com/photos/vermininc/2485448766/http://www.flickr.com/photos/vermininc/2485448766/