IIMB presentation
Transcript of IIMB presentation
Speed @ Scale with NoSQL
Aveekshith Bushan
Regional Sales and SA Director - [email protected]
Twitter: @aveekshith
2Proprietary & Confidential | © 2015 Aerospike Inc. All rights reserved. [ ]
Then and now!
Volume, Variety and Velocity
Scale - Closer to Home
1956: IBM 350 Hard Disk, 5 MB of storage, System Cost: 160K$
1980: IBM 3380, 1 GB of storage, Cost: 50K$
2015: Multiple Options, 1 TB of storage, Cost: 0.8K$
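To put the three data points on one scale, the figures can be converted to cost per gigabyte. This is a small sketch that assumes decimal units (1 GB = 1000 MB, 1 TB = 1000 GB) and takes the slide's prices as-is, with no inflation adjustment; the function name is illustrative.

```python
# Storage data points from the slide: (year, capacity in GB, cost in USD).
# Assumption: decimal units, so 5 MB = 0.005 GB and 1 TB = 1000 GB.
DATA_POINTS = [
    (1956, 0.005, 160_000),
    (1980, 1.0, 50_000),
    (2015, 1000.0, 800),
]


def cost_per_gb(capacity_gb, cost_usd):
    """Return the cost of one gigabyte of storage in USD."""
    return cost_usd / capacity_gb


for year, capacity_gb, cost_usd in DATA_POINTS:
    print(f"{year}: {cost_per_gb(capacity_gb, cost_usd):,.2f} $/GB")
```

Under these assumptions the cost per gigabyte falls from tens of millions of dollars in 1956 to under a dollar in 2015.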
Over the Years – Scale!
Scale Changes Everything!
Source: The Black Swan by Nassim Nicholas Taleb
The Black Swan Effect
“Known” and “Unknown” Unknowns!
Known Unknowns
• Can be Planned For
• Through BCP, Risk Matrix etc.
Unknown Unknowns
• Difficult to Model and Foresee
• Impact can be reduced by Diversification Across Investments, Business, Markets and Product Types
What Does it Mean – IT Perspective
Positive Black Swans
• Explosion in Data
• Exposure to Different Types of Data
• Agility in IT Infrastructure
• Ex: Successful New Product or Market Launch
Negative Black Swans
• Globally Distributed IT Infrastructure
• No Vendor Lock-In
• Easy Deployment Models
• Ex: Natural or Man-made Disasters, Market Changes
Gaussian World
• Structured Data
• Predictable Growth in Data Volume
• Lower Cost of Overall Operation
• Ex: Traditional Applications
Positive Black Swans - Data
Positive Black Swans
• Explosion in Data
• Exposure to Different Types of Data
• Agility in IT Infrastructure
• Ex: Successful New Product or Market Launch
What the data layer needs:
• Horizontal Scalability
• Dynamic Data Model
• Performance
• Agility
• Geospatial Information
Negative Black Swans - Data
Negative Black Swans
• Globally Distributed IT Infrastructure
• No Vendor Lock-In
• Easy Deployment Models
• Ex: Natural or Man-made Disasters, Market Changes
What the data layer needs:
• Geographically Distributed Clusters
• Built on Commodity Hardware
• Cloud-Ready
• Flexible Data Model
• Low Cost Solution
Gaussian World - Data
Gaussian World
• Structured Data
• Predictable Growth in Data Volume
• Lower Cost of Overall Operation
• Ex: Traditional Applications
What the data layer needs:
• Consistency
• Query Model
• Structured Data
• Manageability
• Ecosystem
Real World ERD Diagram
Familiar World!
ORM Relational DB
Making Changes
New Table
New Table
What you don’t get with Relational Databases!
Data Types
• Unstructured Data
• Semi-structured Data
Volume
• Speed at Scale
• Petabyte Scale
Agility
• Quick Time to Market
• Agile Development
Deployment Models
• Cloud Ready
• Scale-out and Scale-up
NoSQL Types
• Key Value Stores
• Document Stores
• Columnar Stores
• Graph Stores
• Other Stores: Time-Series, NewSQL, SSD Optimized DBs, In-Memory Stores
Key Value Store
Ex: Aerospike, Redis
Storage: SSD, Memory

Key Value Store record:
F_Name:        John
L_Name:        Marsh
Dept:          E11
Location:      [45.123,47.232]
Skill_Details: { Skill_Name: ‘Java’, Version: ‘1.8’, Level:3, … },
               { Skill_Name: ‘Go’, Version: ‘1.7’, Level:2, … }

Relational tables:
Emp_ID  F_Name  L_Name  Dept  City
1       John    Marsh   E11   New York
2       Satish  Rao     E12   Bengaluru
3       Alok    Jain    E12   New Delhi
4       Raghu   G       E11   Bengaluru

Skill_ID  Skill_Name  Version
1         Java        1.8
2         Go          1.7
3         Python      3.5

ID   Emp_ID  Skill_ID  Level
100  1       1         3
101  1       2         2
102  2       2         3
103  3       1         4
104  4       3         1
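The contrast on this slide can be sketched with a plain Python dictionary standing in for a key-value store. This is an illustration only, not the Aerospike or Redis client API: the `put` and `get` helpers and the choice of Emp_ID as the key are assumptions made for the example.

```python
# A plain dict stands in for the key-value store (hypothetical helper names,
# not the Aerospike or Redis client API).
store = {}


def put(key, record):
    """Store the whole record under a single key."""
    store[key] = record


def get(key):
    """Fetch the whole record with one lookup; no joins are needed."""
    return store.get(key)


# Employee 1 from the slide, with the skills embedded in the record itself.
put(1, {
    "F_Name": "John",
    "L_Name": "Marsh",
    "Dept": "E11",
    "Location": [45.123, 47.232],
    "Skill_Details": [
        {"Skill_Name": "Java", "Version": "1.8", "Level": 3},
        {"Skill_Name": "Go", "Version": "1.7", "Level": 2},
    ],
})

record = get(1)
skill_names = [skill["Skill_Name"] for skill in record["Skill_Details"]]
print(skill_names)
```

The relational layout would need a join across three tables to answer the same question; here a single key lookup returns everything stored for the employee.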
Document Stores
Ex: MongoDB, CouchDB, OrientDB

Document DB:
{
  F_Name: ‘John’,
  L_Name: ‘Marsh’,
  city: ‘New York’,
  location: [45.123,47.232],
  skills: [
    { Skill_Name: ‘Java’, Version: ‘1.8’, Level: 3, … },
    { Skill_Name: ‘Go’, Version: ‘1.7’, Level: 2, … }
  ]
}

Relational:
Emp_ID  F_Name  L_Name  Dept  City
1       John    Marsh   E11   New York
2       Satish  Rao     E12   Bengaluru
3       Alok    Jain    E12   New Delhi
4       Raghu   G       E11   Bengaluru

Skill_ID  Skill_Name  Version
1         Java        1.8
2         Go          1.7
3         Python      3.5

ID   Emp_ID  Skill_ID  Level
100  1       1         3
101  1       2         2
102  2       2         3
103  3       1         4
104  4       3         1
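One way to see how the relational rows map onto a document is to perform the join once and embed the result. The sketch below uses only the rows shown on the slide; the `to_document` helper is hypothetical, and the `location` field is left out because the relational tables on the slide do not carry it.

```python
# Rows copied from the relational tables on the slide.
employees = [
    (1, "John", "Marsh", "E11", "New York"),
    (2, "Satish", "Rao", "E12", "Bengaluru"),
    (3, "Alok", "Jain", "E12", "New Delhi"),
    (4, "Raghu", "G", "E11", "Bengaluru"),
]
skills = {1: ("Java", "1.8"), 2: ("Go", "1.7"), 3: ("Python", "3.5")}
employee_skills = [  # (ID, Emp_ID, Skill_ID, Level)
    (100, 1, 1, 3),
    (101, 1, 2, 2),
    (102, 2, 2, 3),
    (103, 3, 1, 4),
    (104, 4, 3, 1),
]


def to_document(employee):
    """Join the three tables for one employee into a single nested document."""
    emp_id, f_name, l_name, dept, city = employee
    return {
        "F_Name": f_name,
        "L_Name": l_name,
        "dept": dept,
        "city": city,
        "skills": [
            {
                "Skill_Name": skills[skill_id][0],
                "Version": skills[skill_id][1],
                "Level": level,
            }
            for _, e_id, skill_id, level in employee_skills
            if e_id == emp_id
        ],
    }


documents = [to_document(employee) for employee in employees]
print(documents[0])
```

The join cost is paid once at write time; each later read of an employee returns the nested skills without touching other tables.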
Hadoop and NoSQL
Hadoop is a Map/Reduce Framework
• Used to partition computation on large datasets
• Used where you need to analyse most of the data
• E.g. Count all the links on all the web pages in India
• E.g. Analyse the recommendations based on yesterday's purchases
Use a connector to Push and Pull Data from Hadoop into NoSQL
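The link-counting example can be sketched as a map step and a reduce step in plain Python. This is not Hadoop code: the sample pages, the regular expression, and the function names are assumptions, and a real job would run the map step in parallel over partitions of a much larger dataset.

```python
import re
from functools import reduce
from operator import add

# Hypothetical sample pages; a real job would read them from distributed storage.
PAGES = {
    "a.example.in": '<a href="/x">x</a> <a href="/y">y</a>',
    "b.example.in": "<p>no links here</p>",
    "c.example.in": '<a href="/z">z</a>',
}

# Simplified pattern for an anchor tag that carries an href attribute.
LINK_PATTERN = re.compile(r"<a\s+[^>]*href=", re.IGNORECASE)


def map_page(html):
    """Map step: count the links on one page, independently of all others."""
    return len(LINK_PATTERN.findall(html))


# Reduce step: fold the per-page counts into a single total.
total_links = reduce(add, map(map_page, PAGES.values()), 0)
print(total_links)
```

Because each page is counted independently, the map step can be spread across machines and only the small per-page counts need to be combined.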
MONGODB
Architecture
AEROSPIKE
Architecture
1) No Hotspots – Distributed Hash Table simplifies data partitioning
2) Smart Client – 1 hop to data, no load balancers
3) Shared Nothing Architecture, every node is identical
4) Smart Cluster, Zero Touch – auto-failover, rebalancing, rolling upgrades
5) Operations and long-running tasks prioritized in real-time
6) XDR – replication across data centers keeps clusters in sync and ensures Zero Downtime
Data is Distributed Randomly
Every key is hashed into a 20 byte (fixed length) string using the RIPEMD160 hash function
This hash + additional data (fixed 64 bytes) are stored in RAM in the index
12 bits of this hash are used to compute the partition id
There are 4096 partitions
Partition id maps to node id based on cluster membership
Example: key cookie-abcdefg-12345678 hashes to 182023kh15hh3kahdjsh

Partition ID  Master node  Replica node
…             1            4
1820          2            3
1821          3            2
4096          4            1
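The hashing steps on this slide can be sketched as follows. Several details are assumptions rather than Aerospike's actual implementation: which 12 bits of the digest are used, the bytes that are hashed (only the key string here), and the round-robin partition map. SHA-1, which also yields 20 bytes, is swapped in when the local Python build does not provide RIPEMD-160, purely so the sketch still runs.

```python
import hashlib

N_PARTITIONS = 4096  # 2**12, so 12 bits of the digest select a partition


def key_digest(key: str) -> bytes:
    """Hash a key to a fixed-length 20-byte digest.

    RIPEMD-160 is used when available; otherwise SHA-1 (also 20 bytes)
    is a stand-in so that the example remains runnable.
    """
    data = key.encode("utf-8")
    try:
        return hashlib.new("ripemd160", data).digest()
    except ValueError:
        return hashlib.sha1(data).digest()


def partition_id(digest: bytes) -> int:
    """Take 12 bits of the digest as the partition id (0..4095).

    Assumption: the low 12 bits of the first two bytes, little-endian.
    """
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def build_partition_map(nodes):
    """Hypothetical round-robin assignment of (master, replica) per partition."""
    n = len(nodes)
    return {
        pid: (nodes[pid % n], nodes[(pid + 1) % n])
        for pid in range(N_PARTITIONS)
    }


partition_map = build_partition_map([1, 2, 3, 4])
pid = partition_id(key_digest("cookie-abcdefg-12345678"))
master, replica = partition_map[pid]
print(pid, master, replica)
```

A client that holds the partition map can compute the partition id locally and send the request straight to the master node, which is what makes a single hop to the data possible.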
Even record distribution
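[Figure: an Application using the Aerospike Client talks to Node A, Node B and Node C; records X, Y and Z and their replicas X’, Y’ and Z’ are spread across the three nodes]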
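A quick simulation illustrates why hashing keys into partitions spreads records evenly. The key names, the round-robin assignment of partitions to nodes, and the choice of digest bits are assumptions for this sketch, and SHA-1 is used as a stand-in when RIPEMD-160 is not available locally.

```python
import hashlib
from collections import Counter

N_PARTITIONS = 4096
NODES = ["Node A", "Node B", "Node C"]


def partition_of(key: str) -> int:
    """Map a key to one of 4096 partitions using 12 bits of its digest."""
    data = key.encode("utf-8")
    try:
        digest = hashlib.new("ripemd160", data).digest()
    except ValueError:
        # Stand-in with the same 20-byte digest size.
        digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def master_node(key: str) -> str:
    """Hypothetical round-robin assignment of partitions to nodes."""
    return NODES[partition_of(key) % len(NODES)]


# Count how many of 30,000 synthetic keys land on each node.
counts = Counter(master_node(f"user-{i}") for i in range(30_000))
for node in NODES:
    print(node, counts[node])
```

Each node ends up with roughly a third of the records, even though the keys themselves are sequential.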
Thank You!