An Introduction to NOSQL, Graph Databases and Neo4j
![Page 1: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/1.jpg)
NOSQL Databases and Neo4j
![Page 2: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/2.jpg)
Database and DBMS
• Database - Organized collection of data
• The term database is correctly applied to the data and their supporting data structures.
• DBMS - Database Management System: a software package with computer programs that controls the creation, maintenance and use of a database.
![Page 3: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/3.jpg)
Types of Happening Databases
• Relational database – nothing new, but still in use, and it seems it will always be a happening one.
• Cloud databases – everything is cloudy.
• Data warehouse – Huge! Huge! Huge! archives.
• Embedded databases – you can’t see them :P
• Document oriented database – the In thing.
• Hypermedia database – WWW.
• Graph database – Facebook, Twitter, social networks.
![Page 4: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/4.jpg)
NOSQL is simply…
Not Only SQL
![Page 5: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/5.jpg)
Why NOSQL now?
Driving trends
![Page 6: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/6.jpg)
Trend 1: Data Size
![Page 7: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/7.jpg)
Trend 2: Connectedness
[Figure: axis “Information connectivity”; labels in order: Text Documents, Hypertext, Feeds, Blogs, Wikis, UGC, Tagging, Folksonomies, RDFa, Ontologies, GGG]
![Page 8: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/8.jpg)
Trend 3: Semi-structured information
• Individualisation of content
– 1970’s salary lists, all elements exactly one job
– 2000’s salary lists, we need many job columns!
• Store more data about each entity
• Trend accelerated by the decentralization of content generation
– Age of participation (“web 2.0”)
![Page 9: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/9.jpg)
Trend 4: Architecture
1980’s: Single Application
[Diagram: one Application on one DB]
![Page 10: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/10.jpg)
Trend 4: Architecture
1990’s: Integration Database Antipattern
[Diagram: three Applications sharing one DB]
![Page 11: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/11.jpg)
Trend 4: Architecture
2000’s: SOA
[Diagram: three Applications, each with its own DB]
RESTful, hypermedia, composite apps
![Page 12: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/12.jpg)
Side note: RDBMS performance
[Figure labels: Salary list, Most Web apps, Social Network, Location-based services]
![Page 13: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/13.jpg)
Four NOSQL Categories
![Page 14: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/14.jpg)
Four NOSQL Categories
![Page 15: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/15.jpg)
Key-Value Stores
• “Dynamo: Amazon’s Highly Available Key-Value Store” (2007)
• Data model:
– Global key-value mapping
– Highly fault tolerant (typically)
• Examples:
– Riak, Redis, Voldemort
![Page 16: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/16.jpg)
Column Family (BigTable)
• Google’s “Bigtable: A Distributed Storage System for Structured Data” (2006)
• Data model:
– A big table, with column families
– Map-reduce for querying/processing
• Examples:
– HBase, HyperTable, Cassandra
![Page 17: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/17.jpg)
Document Databases
• Data model:
– Collections of documents
– A document is a key-value collection
– Index-centric, lots of map-reduce
• Examples:
– CouchDB, MongoDB
![Page 18: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/18.jpg)
Graph Databases
• Data model:
– Nodes with properties
– Named relationships with properties
– Hypergraph, sometimes
• Examples:
– Neo4j (of course), Sones GraphDB, OrientDB, InfiniteGraph, AllegroGraph
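The data model above (nodes with properties, connected by named relationships that also carry properties) can be sketched in plain Java. This is only an illustrative in-memory model: the class names `SketchNode` and `SketchRelationship` and their methods are hypothetical and are not part of the Neo4j API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a property graph; NOT the Neo4j API.
class SketchNode {
    final Map<String, Object> properties = new HashMap<>();
    final List<SketchRelationship> outgoing = new ArrayList<>();

    // Sets a property and returns this node so calls can be chained.
    SketchNode with(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    // Creates a named, directed relationship from this node to 'end'.
    SketchRelationship relateTo(SketchNode end, String type) {
        SketchRelationship rel = new SketchRelationship(this, end, type);
        outgoing.add(rel);
        return rel;
    }
}

class SketchRelationship {
    final SketchNode start;
    final SketchNode end;
    final String type;
    final Map<String, Object> properties = new HashMap<>();

    SketchRelationship(SketchNode start, SketchNode end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }

    // Relationships carry properties too, just like nodes.
    SketchRelationship with(String key, Object value) {
        properties.put(key, value);
        return this;
    }
}
```

With these classes, `alice.relateTo(bob, "KNOWS").with("since", 2010)` would record both the connection and a fact about the connection itself, which is the part a plain foreign key cannot express directly.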
![Page 19: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/19.jpg)
Why Graph Databases?
• Schema-less, with efficient storage of semi-structured information.
• No O/R mismatch – it is very natural to map a graph to an object-oriented language like Ruby.
• Express queries as traversals: fast deep traversal instead of slow SQL queries that span many table joins.
• Very natural to express graph-related problems with traversals (recommendation engines, finding the shortest path, etc.).
• Seamless integration with various existing programming languages.
• ACID transactions with rollback support.
• Whiteboard friendly – you use the language of nodes, properties and relationships to describe your domain (instead of e.g. UML), and there is no need for a complicated O/R mapping tool to implement it in your database. You can say that Neo4j is “Whiteboard friendly”! (http://video.neo4j.org/JHU6F/live-graph-session-how-allison-knows-james/)
![Page 22: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/22.jpg)
Social Network “path exists” Performance
• Experiment:
• ~1k persons
• Average 50 friends per person
• pathExists(a,b) limited to depth 4

| Database | # persons | Query time |
|---|---|---|
| Relational database | 1000 | 2000ms |
| Neo4j | 1000 | 2ms |
| Neo4j | 1000000 | 2ms |
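The slides do not show how `pathExists(a,b)` was implemented in the experiment. One plausible reading is a depth-limited breadth-first search, sketched below over a plain adjacency map. The class name, the `friends` map and the `maxDepth` parameter are assumptions made for illustration, and this code is not the benchmark itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a depth-limited "path exists" check; not the benchmark code.
class PathExistsSketch {
    // Returns true if 'to' can be reached from 'from' in at most maxDepth hops.
    static boolean pathExists(Map<String, List<String>> friends,
                              String from, String to, int maxDepth) {
        if (from.equals(to)) {
            return true;
        }
        Set<String> visited = new HashSet<>();
        visited.add(from);
        List<String> frontier = new ArrayList<>();
        frontier.add(from);
        // Expand one level of friends per iteration, up to maxDepth levels.
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String person : frontier) {
                for (String friend : friends.getOrDefault(person, Collections.emptyList())) {
                    if (friend.equals(to)) {
                        return true;
                    }
                    if (visited.add(friend)) {
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return false;
    }
}
```

Note that the search only touches people within `maxDepth` hops of the start, so the work it does depends on the local neighbourhood rather than on the total number of persons stored.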
![Page 23: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/23.jpg)
What are graphs good for?
• Recommendations
• Business intelligence
• Social computing
• Geospatial
• Systems management
• Web of things
• Genealogy
• Time series data
• Product catalogue
• Web analytics
• Scientific computing (especially bioinformatics)
• Indexing your slow RDBMS
• And much more!
![Page 24: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/24.jpg)
Graphs
![Page 25: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/25.jpg)
Directed Graphs
![Page 26: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/26.jpg)
Breadth First Search
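The slide itself only shows a picture. As a minimal sketch of the idea, breadth-first search visits a start node, then all of its neighbours, then their neighbours, using a queue. The adjacency-map representation and the names below are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative breadth-first search over a directed graph given as an adjacency map.
class BfsSketch {
    // Returns the nodes reachable from 'start' in the order BFS visits them.
    static List<String> breadthFirst(Map<String, List<String>> graph, String start) {
        List<String> order = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.poll();   // oldest discovered node first
            order.add(node);
            for (String neighbour : graph.getOrDefault(node, Collections.emptyList())) {
                if (visited.add(neighbour)) {   // add() is false if already seen
                    queue.add(neighbour);
                }
            }
        }
        return order;
    }
}
```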
![Page 27: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/27.jpg)
Depth First Search
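Again the slide is only a picture. Depth-first search, by contrast, follows one branch as far as it goes before backtracking. A minimal recursive sketch over an adjacency map (an assumed representation, with illustrative names) could look like this:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative depth-first search (pre-order) over a directed graph given as an adjacency map.
class DfsSketch {
    // Returns the nodes reachable from 'start' in the order DFS first visits them.
    static List<String> depthFirst(Map<String, List<String>> graph, String start) {
        List<String> order = new ArrayList<>();
        visit(graph, start, new HashSet<>(), order);
        return order;
    }

    private static void visit(Map<String, List<String>> graph, String node,
                              Set<String> visited, List<String> order) {
        if (!visited.add(node)) {
            return;   // already visited, so stop here (this also handles cycles)
        }
        order.add(node);
        for (String neighbour : graph.getOrDefault(node, Collections.emptyList())) {
            visit(graph, neighbour, visited, order);   // go deep before moving to the next sibling
        }
    }
}
```

On a graph where A points to B and C, B to D, and C to E, this visits A, B, D, C, E, whereas a breadth-first search would visit A, B, C, D, E.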
![Page 28: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/28.jpg)
Graph Databases
• A graph database stores data in a graph, the most generic of data structures, capable of elegantly representing any kind of data in a highly accessible way.
![Page 29: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/29.jpg)
Graphs
• “A Graph —records data in→ Nodes —which have→ Properties”
![Page 30: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/30.jpg)
Graphs
• “Nodes —are organized by→ Relationships —which also have→ Properties”
![Page 31: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/31.jpg)
Query a graph with Traversal
• “A Traversal —navigates→ a Graph; it —identifies→ Paths —which order→ Nodes”
![Page 32: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/32.jpg)
Indexes
• “An Index —maps from→ Properties —to either→ Nodes or Relationships”
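To make the mapping concrete, here is a small sketch of an index from a property (key and value) to node ids. The class `PropertyIndexSketch` and its methods are hypothetical and only illustrate the lookup direction; Neo4j's own index API is different.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical exact-match index: (property key, property value) -> node ids.
class PropertyIndexSketch {
    private final Map<String, Map<Object, Set<Long>>> entries = new HashMap<>();

    // Registers that the node with this id has the given property.
    void add(long nodeId, String key, Object value) {
        entries.computeIfAbsent(key, k -> new HashMap<>())
               .computeIfAbsent(value, v -> new LinkedHashSet<>())
               .add(nodeId);
    }

    // Returns the ids of all nodes indexed under key=value (empty if none).
    Set<Long> get(String key, Object value) {
        Map<Object, Set<Long>> byValue = entries.get(key);
        if (byValue == null) {
            return Collections.<Long>emptySet();
        }
        Set<Long> ids = byValue.get(value);
        return ids == null ? Collections.<Long>emptySet() : ids;
    }
}
```

A lookup such as `index.get("name", "Neo")` gives a starting point in the graph; from there a traversal follows relationships instead of consulting the index again.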
![Page 33: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/33.jpg)
Neo4j is a Graph Database
• “A Graph Database —manages a→ Graph and —also manages related→ Indexes”
![Page 34: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/34.jpg)
Neo4j – Hey! This is why I am a Graph Database.
• The fundamental units that form a graph are nodes and relationships.
• In Neo4j, both nodes and relationships can contain properties.
• Nodes are often used to represent entities, but depending on the domain relationships may be used for that purpose as well.
![Page 35: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/35.jpg)
Node in Neo4j
![Page 36: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/36.jpg)
Relationships in Neo4j
• Relationships between nodes are a key part of Neo4j.
![Page 37: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/37.jpg)
Relationships in Neo4j
![Page 38: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/38.jpg)
Twitter and relationships
![Page 39: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/39.jpg)
Properties
• Both nodes and relationships can have properties.
• Properties are key-value pairs where the key is a string.
• Property values can be either a primitive or an array of one primitive type. For example, String, int and int[] values are valid for properties.
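The rule above can be expressed as a small check. The sketch below is an assumption-laden illustration, not Neo4j code: it accepts a String, a boxed Java primitive, or an array whose element type is a primitive or String, and rejects everything else. The slide only confirms String, int and int[]; the rest of the accepted set is assumed.

```java
// Hypothetical validity check for property values; the accepted set beyond
// String, int and int[] is an assumption, not taken from the slides.
class PropertyValueSketch {
    static boolean isValidPropertyValue(Object value) {
        if (value == null) {
            return false;
        }
        Class<?> type = value.getClass();
        if (type.isArray()) {
            // An array must hold exactly one primitive type (or Strings).
            Class<?> element = type.getComponentType();
            return element.isPrimitive() || element == String.class;
        }
        return isScalar(type);
    }

    // True for String and the boxed forms of Java's primitive types.
    private static boolean isScalar(Class<?> type) {
        return type == String.class
                || type == Boolean.class || type == Character.class
                || type == Byte.class || type == Short.class
                || type == Integer.class || type == Long.class
                || type == Float.class || type == Double.class;
    }
}
```

Under these assumptions `"Neo"`, `42` and `new int[] {1, 2}` pass, while a mixed `Object[]` or a `List` does not.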
![Page 40: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/40.jpg)
Properties
![Page 41: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/41.jpg)
Paths in Neo4j
• A path is one or more nodes with connecting relationships, typically retrieved as a query or traversal result.
![Page 42: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/42.jpg)
Traversals in Neo4j
• Traversing a graph means visiting its nodes, following relationships according to some rules.
• In most cases only a subgraph is visited, as you already know where in the graph the interesting nodes and relationships are found.
• Traversal API
• Depth first and Breadth first.
![Page 43: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/43.jpg)
Starting and Stopping
![Page 44: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/44.jpg)
Preparing the database
![Page 45: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/45.jpg)
Wrap mutating operations in a transaction.
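The code on this slide is not in the transcript. To show the all-or-nothing idea behind wrapping writes in a transaction, here is a self-contained sketch over a plain map. The class `TransactionSketch` and its `inTransaction` method are hypothetical and do not mirror Neo4j's transaction API; they only illustrate commit on success and rollback on failure.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical all-or-nothing store; illustrates commit/rollback, not Neo4j's API.
class TransactionSketch {
    private final Map<String, Object> committed = new HashMap<>();

    // Runs 'work' against a private copy of the data. The copy replaces the
    // committed state only if 'work' returns normally; if it throws, the
    // committed state is left untouched, which acts as a rollback.
    void inTransaction(Consumer<Map<String, Object>> work) {
        Map<String, Object> working = new HashMap<>(committed);
        work.accept(working);        // may throw: nothing below runs in that case
        committed.clear();
        committed.putAll(working);   // commit
    }

    Object get(String key) {
        return committed.get(key);
    }
}
```

If the work throws halfway through, none of its writes become visible, so readers never observe a half-built graph.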
![Page 46: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/46.jpg)
Creating a small graph
![Page 47: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/47.jpg)
Print the data
![Page 48: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/48.jpg)
Remove the data
![Page 49: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/49.jpg)
The Matrix Graph Database
![Page 50: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/50.jpg)
Traversing the Graph
![Page 51: An Introduction to NOSQL, Graph Databases and Neo4j](https://reader033.fdocuments.net/reader033/viewer/2022061223/54c6c7424a795938448b4580/html5/thumbnails/51.jpg)
Resources & References
• Neo4j website: http://neo4j.org/
• Neo4j learning resources: http://neo4j.org/resources/
• Videos about Neo4j: http://video.neo4j.org/
• Neo4j tutorial: http://docs.neo4j.org/chunked/snapshot/tutorials.html
• Neo4j Java API documentation: http://api.neo4j.org/current/