Data stores: beyond relational databases
javier-garcia-magna
• Relational vs. NoSQL
• Definitions and examples
• Other database classifications
• 9 Databases in 40 minutes!
• Polyglot Persistence
• Some statistics
• Summary
SQL
Commercial example: Oracle | Open-source example: (Oracle) MySQL
NoSQL
“Mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.”
“Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable.”
NoSQL systems are also sometimes called "Not only SQL".
SQL? ACID? Relations? Distributed?
Commercial example: DynamoDB | Open-source example: MongoDB
NewSQL
Modern relational database management systems that seek to provide the same scalable performance of NoSQL systems for online transaction processing (OLTP) read-write workloads while still maintaining the ACID guarantees of a traditional database system.
Open-source example: VoltDB
NoSQL vs. SQL vs. NewSQL
Wikipedia
No-sql.org
More Database classifications
On premises vs. Cloud “As a service” (Azure DocumentDB)
Memory / Disk vs. Only in memory (OrigoDB, Redis, SQL Server)
OLTP vs. OLAP
Databases vs. Not a database but a data store (Zookeeper, Kafka)
CAP classifications
In action…
Key-value stores (Redis)
Document stores (RavenDB …ok, MongoDB)
Wide column stores (Cassandra)
Graph DBMS (Neo4j)
Search engines (Elasticsearch)
Time Series DBMS (InfluxDB)
Event Stores (Event Store)
MultiModel (OrientDB)
Relational DBMS (MS SQL Server 2016)
Use cases…
• Show latest items
• Count items
• Leaderboards
• Unique items
• Pub/Sub
• Queues
• Cache
• As the main database
Key Value
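Several of the key-value use cases above (latest items, counters, leaderboards, unique items) reduce to a handful of operations on keys. The following is a minimal in-memory sketch in Python that uses plain dictionaries as a stand-in for a store such as Redis. The class and method names are hypothetical and are not the Redis API.

```python
from collections import defaultdict, deque


class TinyKeyValueStore:
    """In-memory stand-in for a key-value store (hypothetical, not the Redis API)."""

    def __init__(self):
        self.counters = defaultdict(int)   # key -> integer counter
        self.lists = defaultdict(deque)    # key -> capped list, newest first
        self.scores = defaultdict(dict)    # key -> {member: score}
        self.sets = defaultdict(set)       # key -> set of unique members

    def incr(self, key):
        """Count items: increment a counter and return its new value."""
        self.counters[key] += 1
        return self.counters[key]

    def push_latest(self, key, item, keep=3):
        """Show latest items: keep only the `keep` most recent entries."""
        items = self.lists[key]
        items.appendleft(item)
        while len(items) > keep:
            items.pop()
        return list(items)

    def add_score(self, key, member, points):
        """Leaderboards: accumulate points per member."""
        board = self.scores[key]
        board[member] = board.get(member, 0) + points

    def top(self, key, n):
        """Return the n best members, highest score first, ties by name."""
        board = self.scores[key]
        return sorted(board.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

    def add_unique(self, key, member):
        """Unique items: add to a set and return the number of distinct members."""
        self.sets[key].add(member)
        return len(self.sets[key])


store = TinyKeyValueStore()
for tweet in ["t1", "t2", "t3", "t4"]:
    latest = store.push_latest("tweets:latest", tweet)
for player, points in [("ana", 10), ("bob", 25), ("ana", 20)]:
    store.add_score("game:scores", player, points)
for visitor in ["u1", "u2", "u1"]:
    unique_visitors = store.add_unique("page:visitors", visitor)
views = [store.incr("page:views") for _ in range(3)][-1]

print(latest)                          # newest first, capped at 3 entries
print(store.top("game:scores", 2))
print(unique_visitors, views)
```

A real key-value store adds persistence, expiry and network access on top of these operations, but the data model stays this simple, which is why the same store often doubles as cache and queue.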
Use cases…
• Log data
• Product catalog
• Metadata / asset management
• CMS
• Prototyping
• As the main database
Document Store
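A product catalog is a typical document-store fit because items of different kinds carry different attributes. The sketch below, in Python, models documents as nested dictionaries and filters them by a dotted field path. The `find` helper and the sample products are hypothetical and only illustrate the idea; they are not the query API of MongoDB or RavenDB.

```python
# Documents in one collection do not have to share the same fields.
catalog = [
    {"_id": 1, "name": "Laptop", "price": 999.0, "specs": {"ram_gb": 16, "cpu": "x64"}},
    {"_id": 2, "name": "T-shirt", "price": 15.0, "sizes": ["S", "M", "L"]},
    {"_id": 3, "name": "Tablet", "price": 299.0, "specs": {"ram_gb": 4}},
]


def find(documents, path, predicate):
    """Return the documents whose value at the dotted `path` satisfies `predicate`.

    Documents that lack the field are skipped instead of raising an error.
    """
    results = []
    for doc in documents:
        value = doc
        for part in path.split("."):
            if not isinstance(value, dict) or part not in value:
                value = None
                break
            value = value[part]
        if value is not None and predicate(value):
            results.append(doc)
    return results


big_memory = [d["name"] for d in find(catalog, "specs.ram_gb", lambda v: v >= 8)]
cheap = [d["name"] for d in find(catalog, "price", lambda v: v < 300)]
print(big_memory)
print(cheap)
```

The T-shirt document has no `specs` field at all, and the query simply ignores it. That flexibility is what makes prototyping fast, and it is also why the application still has to know which fields it can rely on.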
Use cases…
• Time series analytics
• Huge number of writes
• As the main database (for big data storage!)
Wide Column
CQL vs. Internal structure (Cassandra CLI)
cqlsh:test> SELECT * FROM tweets;

 user         | time                     | lat    | long    | tweet
--------------+--------------------------+--------+---------+---------------------
 softwaredoug | 2013-07-13 08:21:54-0400 | 38.162 | -78.549 | Having chest pain.
 softwaredoug | 2013-07-21 12:15:27-0400 | 38.093 | -78.573 | Speedo self shot.
 jnbrymn      | 2013-06-29 20:53:15-0400 | 38.092 | -78.453 | I like programming.
 jnbrymn      | 2013-07-14 22:55:45-0400 | 38.073 | -78.659 | Who likes cats?
 jnbrymn      | 2013-07-24 06:23:54-0400 | 38.073 | -78.647 | My coffee is cold.

[default@test] list tweets;
-------------------
RowKey: softwaredoug
=> (column=2013-07-13 08\:21\:54-0400:, value=, timestamp=1374673155373000)
=> (column=2013-07-13 08\:21\:54-0400:lat, value=4218a5e3, timestamp=1374673155373000)
=> (column=2013-07-13 08\:21\:54-0400:long, value=c29d1917, timestamp=1374673155373000)
=> (column=2013-07-13 08\:21\:54-0400:tweet, value=486176696e67206368657374207061696e2e, timestamp=1374673155373000)
=> (column=2013-07-21 12\:15\:27-0400:, value=, timestamp=1374673155407000)
=> (column=2013-07-21 12\:15\:27-0400:lat, value=42185f3b, timestamp=1374673155407000)
=> (column=2013-07-21 12\:15\:27-0400:long, value=c29d2560, timestamp=1374673155407000)
=> (column=2013-07-21 12\:15\:27-0400:tweet, value=53706565646f2073656c662073686f742e, timestamp=1374673155407000)
-------------------
RowKey: jnbrymn
=> (column=2013-06-29 20\:53\:15-0400:, value=, timestamp=1374673155419000)
=> (column=2013-06-29 20\:53\:15-0400:lat, value=42185e35, timestamp=1374673155419000)
=> (column=2013-06-29 20\:53\:15-0400:long, value=c29ce7f0, timestamp=1374673155419000)
=> (column=2013-06-29 20\:53\:15-0400:tweet, value=49206c696b652070726f6772616d6d696e672e, timestamp=1374673155419000)
=> (column=2013-07-14 22\:55\:45-0400:, value=, timestamp=1374673155434000)
=> (column=2013-07-14 22\:55\:45-0400:lat, value=42184ac1, timestamp=1374673155434000)
=> (column=2013-07-14 22\:55\:45-0400:long, value=c29d5168, timestamp=1374673155434000)
=> (column=2013-07-14 22\:55\:45-0400:tweet, value=57686f206c696b657320636174733f, timestamp=1374673155434000)
=> (column=2013-07-24 06\:23\:54-0400:, value=, timestamp=1374673155485000)
=> (column=2013-07-24 06\:23\:54-0400:lat, value=42184ac1,
user – partition key
time – clustering key
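The two listings above show the same data twice: CQL presents one row per tweet, while the internal view stores one wide row per partition key (user), with cell names built from the clustering key (time) plus the column name. The Python sketch below mimics that flattening and decodes a few of the hex values from the listing. It assumes floats are stored as 4-byte big-endian IEEE 754 values and text as UTF-8, which is consistent with the hex shown but is an assumption here; the function names are hypothetical, and the backslash escaping of colons seen in the CLI output is left out.

```python
import struct
from collections import OrderedDict

# Two of the rows shown by cqlsh: (user, time, lat, long, tweet).
cql_rows = [
    ("softwaredoug", "2013-07-21 12:15:27-0400", 38.093, -78.573, "Speedo self shot."),
    ("softwaredoug", "2013-07-13 08:21:54-0400", 38.162, -78.549, "Having chest pain."),
]


def to_wide_rows(rows):
    """Group rows by partition key and flatten each one into ordered cells."""
    wide = OrderedDict()
    # Sorting by (user, time) mimics ordering by clustering key inside a
    # partition. Plain string comparison works here only because all
    # timestamps share one format and one UTC offset.
    for user, time, lat, long_, tweet in sorted(rows):
        cells = wide.setdefault(user, [])
        cells.append((f"{time}:", b""))                         # row marker, empty value
        cells.append((f"{time}:lat", struct.pack(">f", lat)))   # assumed 4-byte float
        cells.append((f"{time}:long", struct.pack(">f", long_)))
        cells.append((f"{time}:tweet", tweet.encode("utf-8")))
    return wide


def decode_float(hex_value):
    """Decode a hex string as a big-endian 32-bit float (assumed encoding)."""
    return struct.unpack(">f", bytes.fromhex(hex_value))[0]


def decode_text(hex_value):
    """Decode a hex string as UTF-8 text (assumed encoding)."""
    return bytes.fromhex(hex_value).decode("utf-8")


wide = to_wide_rows(cql_rows)
first_cells = [name for name, _ in wide["softwaredoug"][:4]]
lat = round(decode_float("4218a5e3"), 3)
lon = round(decode_float("c29d1917"), 3)
text = decode_text("486176696e67206368657374207061696e2e")

print(first_cells)
print(lat, lon, text)
```

Because all tweets of one user live in a single partition and are already sorted by time, reading a time range for that user is a sequential scan of neighbouring cells, which is what makes this layout attractive for time series.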
Use cases…
• General data management
• Network and IT operations
• Recommendation engines
• Fraud detection
• Social networks
Graph DBs
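Recommendation engines and social networks are graph use cases because the interesting questions are about paths between nodes. A minimal sketch in Python, using an adjacency dictionary instead of a graph database: it recommends accounts that the people you follow also follow. The sample graph and the `recommend` function are hypothetical; in Neo4j the same idea would be expressed as a two-hop pattern match.

```python
from collections import Counter

# Directed "follows" edges: user -> set of users they follow (sample data).
follows = {
    "ana": {"bob", "carl"},
    "bob": {"carl", "dina"},
    "carl": {"dina", "eve"},
    "dina": set(),
    "eve": set(),
}


def recommend(graph, user):
    """Rank friend-of-friend candidates by how many of `user`'s follows lead to them."""
    already = graph.get(user, set())
    counts = Counter()
    for friend in already:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and anyone they already follow.
            if candidate != user and candidate not in already:
                counts[candidate] += 1
    # Highest count first, ties broken alphabetically for a stable result.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


suggestions = recommend(follows, "ana")
print(suggestions)
```

In a relational schema each extra hop means another self-join on the edges table; a graph DBMS stores the adjacency directly, so deeper traversals stay cheap.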
Just a few slides remaining…
Some C# code… log4net + Elasticsearch + Kibana
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "LogEvent": {
      "properties": {
        "timeStamp": {
          "type": "date",
          "format": "dateOptionalTime"
        },
        "message": {
          "type": "string"
        },
        "messageObject": {
          "type": "object"
        },
        "exception": {
          "type": "object"
        },
        ….
2 general-purpose Elasticsearch libraries for .NET:
• NEST – High level
• Elasticsearch.Net – Low level
C# + InfluxDB + Grafana + … IoT?
InfluxDB + Grafana <> Elasticsearch + Kibana
Time series (metrics) <> Structured data, e.g. logs
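The contrast above comes down to the shape of a single record: a metric is a compact numeric point, while a log event is a structured document. The Python sketch below builds one of each. The point uses the general shape of InfluxDB's line protocol (measurement, tags, fields, timestamp) in a simplified form that handles only float fields and does no escaping; the measurement, tag and device names are hypothetical. The log document reuses the `timeStamp`, `message` and `exception` field names from the mapping shown earlier, with made-up values.

```python
import json


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as `measurement,tag=v field=v timestamp`.

    Simplified sketch: float fields only, no escaping of spaces or commas.
    """
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={float(value)}" for key, value in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


# A metric: small, numeric, identified by measurement + tags + time.
point = to_line_protocol(
    "temperature",               # hypothetical measurement name
    {"device": "sensor-1"},      # hypothetical tag
    {"value": 21.5},
    1465839830100400200,
)

# A log event: a nested document meant to be searched and filtered.
log_event = {
    "timeStamp": "2016-06-13T17:43:50",
    "message": "Sensor offline",
    "exception": {"type": "TimeoutException", "message": "No reading for 30 s"},
}
log_json = json.dumps(log_event, sort_keys=True)

print(point)
print(log_json)
```

Points like the first one are written at high frequency and aggregated over time windows, which suits a time series DBMS and a dashboard such as Grafana; documents like the second one are searched by content, which suits a search engine and Kibana.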
The Beast
• SQL and NoSQL (JSON support)
• In-Memory tables
• Row level security
• Always Encrypted
• Query Store
• PolyBase (Hadoop / Azure Blob storage)
Polyglot persistence
Any decent sized enterprise will have a variety of different data storage technologies for different kinds of data
Key Takeaways
Always think about the schema
(even with schemaless DBs)
Best DB? “It depends”
• Prototyping?
• Domain?
• How is the data going to be used?
Most of us don’t work with “big data” but with “small or medium” data
Docker images used
• spotify/cassandra
• balsamiq/docker-elasticsearch
• balsamiq/docker-kibana
• tutum/influxdb
• neo4j/neo4j
• wkruse/eventstore
• redis
Resources
Different DB images: https://www.thoughtworks.com/insights/blog/nosql-databases-overview
Polyglot persistence images: http://www.slideshare.net/mongodb/webinar-mongodb-and-polyglot-persistence-architecture
DATABASE NAME    AVAILABLE FOR WINDOWS? (IMPLEMENTATION LANGUAGE)
Redis            Yes (C)
MongoDB          Yes (C++)
Cassandra        Yes (Java)
Neo4j            Yes (Java)
Elasticsearch    Yes (Java)
InfluxDB         Yes (Go)
EventStore       Yes
OrientDB         Yes (Java)
SQL Server       Yes (C++)