DataBench Benchmarks and Communities
EUROPEAN BIG DATA VALUE FORUM 2020
2020 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench’20)
Nov 15-16, 2020
Todor Ivanov
Lead Consult
Data Analytics/ML (including data processing for analysis, AI and Machine Learning)
Data Visualisation, Action/Interaction (including data presentation environment/boundary/user action and interaction)
Data Acquisition/Collection (including Data Ingestion processing, Streaming, Data Extraction, Ingestion Storage, Different data types)
Data Storage/Preparation (including Storage Retrieval/Access/Queries, Data Protection, Curation, Integration, Publication)
Streaming/ Realtime Processing
Interactive Processing
Batch Processing
Data Privacy/Security
Data Governance/Mgmt
Data Storage
Industrial Analytics (Descriptive, Diagnostic, Predictive, Prescriptive)
Machine Learning, AI, Data Science
Visual Analytics
BDVA Reference Model
DataBench Pipeline Methodology
Big Data and AI Landscape
Big Data and AI Benchmark Ecosystem
ICT Big Data PPP
ICT-13-14 Projects
▪ I-BiDaaS
▪ TheyBuyForYou (TBFY)
▪ Track&Know
▪ DataBio
▪ DeepHealth
BenchCouncil Benchmarks
Data Analytics/ML (including data processing for analysis, AI and Machine Learning)
Data Visualisation, Action/Interaction (including data presentation environment/boundary/user action and interaction)
Data Acquisition/Collection (including Data Ingestion processing, Streaming, Data Extraction, Ingestion Storage, Different data types)
Data Storage/Preparation (including Storage Retrieval/Access/Queries, Data Protection, Curation, Integration, Publication)
DataBench Pipeline Methodology
Benchmarks and projects placed on the pipeline diagram (number of occurrences on the slide in parentheses):
• AIBench (4)
• BigDataBench (4)
• HPC AI500 (3)
• Edge AIBench (2)
• AIoTBench (2)
• MLPerf (2)
• ABench (3)
• Hobbit Benchmark Platform (4)
• Linked Data Benchmark Council (LDBC)
• Graphalytics (2)
• Semantic Publishing Benchmark (SPB) (3)
• TheyBuyForYou (TBFY)
• TPC-H (2)
• HiBench (3)
• I-BiDaaS
[Figure: matrix with the pipeline steps as rows (PI-DV, PI-DA, PI-DS, PI-DI) and benchmarks as columns, grouped by year from 2015 to 2019. An "x" marks each pipeline step a benchmark covers; the individual cell marks are not reproduced here.]
Benchmarks (columns): SparkBench, TPCx-V, IoTAbench, BigFUN, TPC-DS v2, TPCx-BB, CityBench, Graphalytics, Yahoo Streaming Benchmark (YSB), ShenZhen Transportation System (SZTS), DeepBench, DeepMark, TensorFlow Benchmarks, Fathom, AdBench, RIoTBench, Hobbit Benchmark, TPCx-HS v2, BigBench V2, Sanzu, AIM Benchmark, GARDENIA, Penn machine learning benchmark (PMLB), OpenML benchmark suites, Benchip, Deep Learning Benchmarking Suite (DLBS), TPCx-IoT, Senska, DAWNBench, BlockBench, IDEBench, ABench, Stream WatDiv, TERMinator Suite, HERMIT, MLBench Services, MLBench Distributed, MLPerf, Training Benchmark for DNNs (TBD), PolyBench, NNBench-X, GDPRbench, BenchIoT, IoTBench, Visual Road, AdaBench, MiDBench, CBench-Dynamo, Edge AIBench, AIBench, HPC AI500, SparkAIBench, AI Matrix
Mapping between the DataBench Pipeline Steps and the Benchmark Ecosystem → matrix available in the DataBench ToolBox
The DataBench Toolbox includes:
Searchable
• HiBench
• SparkBench
• YCSB
• TPCx-IoT
• Yahoo Streaming Benchmark
• BigBench V2
• TPC-H
• TPC-DS
• Hadoop Workload Examples
• PigMix
• Social Network Benchmark
• WatDiv
• Sanzu
• BigDataBench
• CLASS Benchmark
• ...
Integrated & Runnable
• HiBench
• YCSB
• Yahoo! Streaming
• CLASS
• BigBench V2
• TPCx-BB (in progress)
• Classified 80+ benchmarks developed between 1999 and 2019!
• More than 80 are already searchable in the Toolbox!
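
The two Toolbox lists above can be read as a small catalogue with two levels of support. The following is a minimal sketch in Python, not the Toolbox's actual data model or API. The benchmark names are copied from the slide; the names SEARCHABLE, RUNNABLE, IN_PROGRESS and status() are hypothetical, and the assignment of the longer list to "Searchable" and the shorter list to "Integrated & Runnable" is an assumption about the slide layout. "Yahoo! Streaming" and "CLASS" are normalised to the longer names used in the first list.

```python
# Hypothetical catalogue built from the slide's two lists.
# Assumption: the longer list is "Searchable", the shorter one
# "Integrated & Runnable". The slide's "..." entry is omitted because
# its contents are not given.
SEARCHABLE = {
    "HiBench", "SparkBench", "YCSB", "TPCx-IoT",
    "Yahoo Streaming Benchmark", "BigBench V2", "TPC-H", "TPC-DS",
    "Hadoop Workload Examples", "PigMix", "Social Network Benchmark",
    "WatDiv", "Sanzu", "BigDataBench", "CLASS Benchmark",
}

RUNNABLE = {
    "HiBench", "YCSB", "Yahoo Streaming Benchmark",
    "CLASS Benchmark", "BigBench V2",
}

# Marked "(in progress)" on the slide.
IN_PROGRESS = {"TPCx-BB"}


def status(name: str) -> str:
    """Return the support level of a benchmark as listed on the slide."""
    if name in RUNNABLE:
        return "integrated and runnable"
    if name in IN_PROGRESS:
        return "integration in progress"
    if name in SEARCHABLE:
        return "searchable only"
    return "not listed on the slide"


if __name__ == "__main__":
    for benchmark in ("HiBench", "PigMix", "TPCx-BB"):
        print(f"{benchmark}: {status(benchmark)}")
```

Keeping the lists as sets makes simple questions cheap to answer, for example `SEARCHABLE - RUNNABLE` gives the benchmarks that can be found in the catalogue but not yet run from it.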