Deploying Massive Scale Graphs for Realtime Insights
Transcript of Deploying Massive Scale Graphs for Realtime Insights
Deploying massive scale graphs for realtime insights
B Brech
CTO POWER Solutions
© 2016 International Business Machines Corporation
Drive Efficiency: Time Reduction, Cost Reduction, Consistency
Better Insights: Broader Scope, Learning Models, Speed & Accuracy
Better Business: Innovation, Customer Care, Reactivity
Business relies more on Data than ever before
[Figure: Data volume (Giga, Tera, Peta, Exa) by decade (1990s, 2000s, 2010s, 2020s). Data types shown: structured data, text, image, audio, video. Other axis labels: Computational Needs (Low, Med, High), Sophistication of Analysis, Expressiveness.]
Digital Marketing
10+% of video views
Wide Area Imagery
100s of TB per day
72 video hrs/minute
Media
Source: IBM Market Insights, based on composite sources
Safety / Security
Healthcare
Customer
1B camera phones
1B medical images/yr
10s of millions of cameras
Enterprise Video
Used by 1/3 of enterprises
Data Volume
Data Velocity
Data Authenticity
Data Complexity
Data Variability
Data Variety
While Data is Exploding
Time is Money and Insights are critical
Ingest Analyze Act Measure Learn
Optimize
Decision time is shrinking
Recommendation engines, used in a variety of industries
Network intrusion prevention
Fraud prevention
Financial Services
BioMedical - Genomics
Combination of Scale & Speed is critical in many use cases
Extreme Scale Example:
- 30TB and growing DB
- 25 GB/s ingress
- over 400K updates / sec
- 60B+ relationships
- Query response < 200ms
DB2 > DB2Blu
SAP > SAP Hana
Oracle > 12C
CICS
EnterpriseDB
Etc..
NoSQLs :
MemCached, REDIS,
NEO4J, CASSANDRA,
MARIA, MONGO,
ORIENT, COUCH,
Etc…
Traditional DBs going in-memory
Designed as in-memory repositories
Analyze
Decision
Innovation
Act
Ingest
But in-memory has some constraints and limits.
Data repositories are changing also
Built with open innovation to put your data to work across the enterprise
Designed for Big Data
Open Innovation Platform
Superior Cloud Economics
IBM POWER8 : Designed for Big Data
UNSTRUCTURED IN-MEMORY STRUCTURED
Flash for extreme performance
Massive IO bandwidth
Continuous data load
Parallel processing
Large-scale memory processing
Optimized for a broad range of big data & analytics workloads:
Processors: flexible, fast execution of analytics algorithms
Memory: large, fast workspace to maximize business insight
Cache: ensure continuous data load for fast responses
4X threads per core vs. x86 (up to 1536 threads per system)
4X memory bandwidth vs. x86 (up to 16TB of memory)
4X more cache vs. x86 (up to 800MB cache per socket)
IBM POWER8 brings performance and scale
POWER Ecosystem
Designed for Big Data
Workload Acceleration
Defined by Software
Retail, Healthcare
Banking, Government, Telecom
Open and Collaborative
Technology & Price/Perf Leadership
Watson
Linux
Hadoop
POWER8
Hypervisor
Virt I/O Server
Shared I/O
Single SMP Hardware System
Built-in Virtualization
Leading Performance
Processor Innovation
Streams
Foundations
Suzhou PowerCore Technology
Virtualization Offerings
Key solutions: Open Source Tools + Middleware + Industry Solutions + Social / Mobile / Analytics / Cloud
Hadoop
Spark
Fundamental forces are accelerating industry change
IT innovation can no longer come from just the processor
Solution Innovation and Acceleration is a key to the future
Full system stack innovation required
[Figure: Price/Performance (lower is better) from 2000 to 2020. Labels: Moore's Law; Technology and Processors; Full Stack Acceleration (Firmware / OS, Accelerators, Software, Storage, Network).]
The OpenPOWER Foundation is an open ecosystem, using the POWER Architecture to serve the evolving needs of customers.
NVLINK
GPU FPGA
Flash NIC
MRAM PCM
Solution Acceleration is a key to the future
NVLINK
GPU
Flash
Graphics – CAE - EDA
Weather
Defense
Financial Services
Bio-Sciences
General: Compression
Encryption
DataBases: Flash
Finance: Algorithms, Facial
Genomics : Algorithms
Decision Support
Data Analytics
Financial Simulations
Genomic Analysis
Network Data Forensics
Facial Recognition
Solution Acceleration is a key to the future
IBM Data Engine for NoSQL is an integrated platform for large and fast growing NoSQL data stores. It builds on the CAPI capability of POWER8 systems and provides super-fast access to large flash storage capacity. It delivers high speed access to both RAM and flash storage which can result in significantly lower cost, and higher workload density for NoSQL deployments than a standard RAM-based system. The solution offers superior performance and price-performance to scale out x86 server deployments that are either limited in available memory per server or have flash memory with limited data access latency.
Up to 56TB of extended memory with one POWER8 server + CAPI attach FLASH
Power S822L / S812L
FlashSystem 900
Power S822L / S812L / S822LC
NEW
External Flash Configuration
Integrated Flash Configuration
Up to 8TB of super-fast storage tier on one POWER8 server
IBM Data Engine for NoSQL: Cost Savings for In-Memory NoSQL Data Stores
Identical hardware with 3 different paths to data
FlashSystem
Conventional I/O (FC)
CAPI - E
IBM POWER S822L
CAPI - I
IBM's CAPI NVMe Flash Accelerator is almost 5X more efficient in performing IO vs traditional storage.
[Figure: Relative CAPI vs. NVMe Instruction Counts per IO (Kernel Instructions, User Instructions). CAPI NVMe: 21%; Traditional NVMe: 35%; Traditional Storage - Direct IO: 56%; Traditional Storage - Filesystem: 100%.]
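The "almost 5X" figure can be checked against the chart's four percentages (21%, 35%, 56%, 100%). The sketch below is only an illustration: it assumes the percentages map to the chart labels in the order they are listed, and the function name `efficiency_vs` is hypothetical.

```python
# Relative instruction counts per IO, as percentages of the filesystem path.
# Assumption: the values pair with the chart labels in the order listed.
INSTRUCTIONS_PER_IO = {
    "CAPI NVMe": 21,
    "Traditional NVMe": 35,
    "Traditional Storage - Direct IO": 56,
    "Traditional Storage - Filesystem": 100,
}


def efficiency_vs(baseline: str, path: str) -> float:
    """Return how many times fewer instructions `path` needs per IO than `baseline`."""
    return INSTRUCTIONS_PER_IO[baseline] / INSTRUCTIONS_PER_IO[path]


if __name__ == "__main__":
    baseline = "Traditional Storage - Filesystem"
    for name in INSTRUCTIONS_PER_IO:
        print(f"{name}: {efficiency_vs(baseline, name):.2f}x fewer instructions per IO")
```

Under that pairing, CAPI NVMe needs about 4.76 times fewer instructions per IO than the filesystem path, which is consistent with "almost 5X".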
CAPI Unlocks the Next Level of Performance for Flash
Efficient IO Enables True Utilization of Storage Bandwidth
Under heavy load, IOPs per thread becomes a critical metric for sustaining throughput in a storage system. As throughput increases, more CPU is required to maintain performance.
CAPI NVMe flash leverages improved path length, architectural improvements, and hardware built in to POWER8 to greatly improve the relative IOPs per CPU thread.
At high levels of IO (sustained millions of IOPs), more data can be processed more efficiently, radically changing the amount of CPU required to “feed the (IO) beast.”
[Figure: Average Relative IOPs per CPU Thread. Fibre Channel: 0.6X; NVMe: 1X; CAPI Fibre Channel: 2.6X; CAPI NVMe: 3.7X.]
CAPI-accelerated NVMe Flash can issue 3.7X more IOs per CPU thread than regular NVMe flash.
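To see what the relative IOPs-per-thread figures mean for CPU sizing, the sketch below estimates how many threads each path would need to sustain a target IO rate. The relative factors (0.6X, 1X, 2.6X, 3.7X) come from the chart, assuming they pair with the labels in the order shown. The absolute baseline of 100,000 IOPs per NVMe thread and the 1,000,000 IOPs target are hypothetical values chosen only to make the arithmetic concrete; the slide gives no absolute numbers.

```python
import math

# Relative IOPs per CPU thread from the chart (NVMe = 1X).
# Assumption: values pair with the chart labels in the order listed.
RELATIVE_IOPS_PER_THREAD = {
    "Fibre Channel": 0.6,
    "NVMe": 1.0,
    "CAPI Fibre Channel": 2.6,
    "CAPI NVMe": 3.7,
}


def threads_needed(target_iops: float, path: str, nvme_iops_per_thread: float) -> int:
    """Estimate CPU threads required to sustain `target_iops` on a given IO path.

    `nvme_iops_per_thread` is a hypothetical absolute baseline for plain NVMe.
    """
    per_thread = RELATIVE_IOPS_PER_THREAD[path] * nvme_iops_per_thread
    return math.ceil(target_iops / per_thread)


if __name__ == "__main__":
    target = 1_000_000    # hypothetical sustained IOPs target
    baseline = 100_000    # hypothetical IOPs per thread on plain NVMe
    for path in RELATIVE_IOPS_PER_THREAD:
        print(f"{path}: {threads_needed(target, path, baseline)} threads")
```

With these illustrative inputs, plain NVMe would need 10 threads and CAPI NVMe 3, which is the kind of reduction in CPU spent on IO that the slide describes.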
Neo4j + IBM POWER8: Unparalleled Scale and Performance
Neo4j on IBM POWER8
The strength and tooling of Neo4j
The performance of POWER8
The scalability of POWER8 & CAPI Flash
Unrivaled graph application scalability and performance
Real-world mixed graph transaction workload running Neo4j on POWER8 delivers 1.82X better performance than Intel Xeon E5-2650 v4 Broadwell
[Figure: Representative mixed workload throughput. IBM Power S822LC (20c/160t): 711; x86 Broadwell Server (24c/48t): 390. 82% more throughput on POWER8.]
• POWER8 delivers 1.82X more query throughput for a representative mixed sample workload than x86
– POWER8 (20 cores / 256 GB):
– x86 system with Broadwell processor (24 cores / 256 GB):
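The 1.82X and 82% figures follow directly from the two throughput values on the chart (711 and 390). The sketch below reproduces that arithmetic; the per-core comparison is a derived illustration using the stated core counts (20 and 24), not a number given on the slide, and the chart does not state the throughput units.

```python
# Throughput values read from the slide's bar chart (units are not stated there).
POWER8_THROUGHPUT = 711  # IBM Power S822LC, 20 cores / 160 threads
X86_THROUGHPUT = 390     # x86 Broadwell server, 24 cores / 48 threads
POWER8_CORES = 20
X86_CORES = 24


def speedup(new: float, old: float) -> float:
    """Ratio of the new throughput to the old throughput."""
    return new / old


def percent_more(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return (new / old - 1.0) * 100.0


ratio = speedup(POWER8_THROUGHPUT, X86_THROUGHPUT)
# Derived illustration only: throughput per core on each system.
per_core_ratio = speedup(POWER8_THROUGHPUT / POWER8_CORES, X86_THROUGHPUT / X86_CORES)

if __name__ == "__main__":
    print(f"System-level speedup: {ratio:.2f}x")
    print(f"More throughput: {percent_more(POWER8_THROUGHPUT, X86_THROUGHPUT):.0f}%")
    print(f"Per-core speedup (derived): {per_core_ratio:.2f}x")
```

The system-level ratio comes out to 1.82x, or 82% more throughput, matching the slide's headline claim.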
• Based on IBM internal testing of a single system and OS image running a real-world mixed graph transaction workload based on the LDBC benchmark. Conducted under laboratory conditions; individual results can vary based on workload size, use of storage subsystems & other conditions.
• IBM Power System S822LC; 20 cores (2 x 10c chips) / 160 threads, POWER8; 256 GB memory, Neo4j, Ubuntu 16. Competitive stack: HP ProLiant DL380 Gen9; 24 cores (2 x 12c chips) / 48 threads; Intel E5-2650 v4; 256 GB memory, Neo4j, RHEL 7.2.
Pricing is based on bundled pricing for S822LC with Integrated CAPI Flash card.
Scale up and/or out based on your application requirements
• Out-of-order, super-scalar design for exploiting instruction-level parallelism, leading to low CPI
• Larger caches and 99.94% data-cache hit rate
• SMT design to improve core efficiency and increase throughput capability
Use the paradigm shift to realize your imagination
CAPI-Flash
Performance and Scale as YOU Need
Open innovation to put data to work across the enterprise
Thanks!
© Copyright International Business Machines Corporation 2016
Printed in the United States of America September 2016
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
The following terms are trademarks or registered trademarks licensed by Power.org in the United States and/or other countries: Power ISA. Information on the list of U.S. trademarks licensed by Power.org may be found at www.power.org/about/brand-center/. Linux is a trademark of Linus Torvalds in the United States, other countries, or both. Other company, product, and service names may be trademarks or service marks of others.
All information contained in this document is subject to change without notice. The products described in this document are NOT intended for use in applications such as implantation, life support, or other hazardous uses where malfunction could result in death, bodily injury, or catastrophic property damage. The information contained in this document does not affect or change IBM product specifications or warranties. Nothing in this document shall operate as an express or implied license or indemnity under the intellectual property rights of IBM or third parties. All information contained in this document was obtained in specific environments, and is presented as an illustration. The results obtained in other operating environments may vary.
While the information contained herein is believed to be accurate, such information is preliminary, and should not be relied upon for accuracy or completeness, and no representations or warranties of accuracy or completeness are made.
Note: This document contains information on products in the design, sampling and/or initial production phases of development. This information is subject to change without notice. Verify with your IBM field applications engineer that you have the latest version of this document before finalizing a design.
You may use this documentation solely for developing technology products compatible with Power Architecture®. You may not modify or distribute this documentation. No license, express or implied, by estoppel or otherwise to any intellectual property rights is granted by this document.
THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED ON AN “AS IS” BASIS. In no event will IBM be liable for damages arising directly or indirectly from any use of the information contained in this document.
IBM Systems and Technology Group, 2070 Route 52, Bldg. 330, Hopewell Junction, NY 12533-6351
The IBM home page can be found at ibm.com®.
Version 1.1, January 2016