auto-tuning database kernels 2 - Harvard...
Transcript of auto-tuning database kernels 2 - Harvard...
auto-tuning database kernels 2.0prof. Stratos Idreos
HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/
class 22
Stratos Idreos
a bit more about auto-tuning db kernels
and then a couple of open research topics
Stratos Idreos
storage indexing queryschema loadX X X
be able to query the data immediately & with good performance
raw data explore data and gain knowledge “immediately”
Stratos Idreos
database cracking
Stratos Idreos
database cracking
idle timeworkload knowledge
external tools
human control
Stratos Idreos
database crackingauto-tuning database kernels
incremental, adaptive, partial indexing
idle timeworkload knowledge
external tools
human control
Stratos Idreos
database crackingauto-tuning database kernels
incremental, adaptive, partial indexing
indexing
initialization querying
idle timeworkload knowledge
external tools
human control
Stratos Idreos
database crackingauto-tuning database kernels
incremental, adaptive, partial indexing
indexing
initialization querying
idle timeworkload knowledge
external tools
human control
Stratos Idreos
database crackingauto-tuning database kernels
incremental, adaptive, partial indexing
indexing
initialization querying
idle timeworkload knowledge
external tools
human control
every query is treated as an advice on how data should be stored
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
Database Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
Database Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
Database Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
Database Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
resu
lt
Database Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
resu
lt
gain knowledge on how data is organized
Database Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
dynamically/on-the-fly within the select-operator
resu
lt
gain knowledge on how data is organized
Database Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
Q2: select R.A from R where R.A>7 and R.A<=16
dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
Q2: select R.A from R where R.A>7 and R.A<=16
dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
4 2 1 3 6 7 9 8 13 12 11 14 16 19
piece1: A<=7
piece2: 7<A<=10Q2: select R.A from R where R.A>7 and R.A<=16
dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
4 2 1 3 6 7 9 8 13 12 11 14 16 19
piece1: A<=7
piece2: 7<A<=10
piece3: 10<A<14Q2: select R.A from R where R.A>7 and R.A<=16
dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
4 2 1 3 6 7 9 8 13 12 11 14 16 19
piece1: A<=7
piece2: 7<A<=10
piece3: 10<A<14
piece4: 14<=A<=16
piece5: A>16
Q2: select R.A from R where R.A>7 and R.A<=16
dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
4 2 1 3 6 7 9 8 13 12 11 14 16 19
piece1: A<=7
piece2: 7<A<=10
piece3: 10<A<14
piece4: 14<=A<=16
piece5: A>16
Q2: select R.A from R where R.A>7 and R.A<=16
dynamically/on-the-fly within the select-operator
resu
lt
Database Cracking CIDR 2007
Stratos Idreos
13 16 4 9 2 12 7 1 19 3 14 11 8 6
column AQ1: select R.A from R where R.A>10 and R.A<14
4 9 2 7 1 3 8 6 13 12 11 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
4 2 1 3 6 7 9 8 13 12 11 14 16 19
piece1: A<=7
piece2: 7<A<=10
piece3: 10<A<14
piece4: 14<=A<=16
piece5: A>16
Q2: select R.A from R where R.A>7 and R.A<=16
dynamically/on-the-fly within the select-operator
resu
lt
the more we crack, the more we learn
Database Cracking CIDR 2007
Stratos Idreos
100K random selections random selectivity random value ranges in a 10 million integer column
set-upScan
Full Index
Crack
0.001
0.01
0.1
1
10
100
1000
1 10 100 1000 10000 100000
Resp
onse
tim
e (
secs
)
Query sequence (x1000)
continuous adaptation
Database Cracking CIDR 2007
Stratos Idreos
100K random selections random selectivity random value ranges in a 10 million integer column
almost no initialization overhead
set-upScan
Full Index
Crack
0.001
0.01
0.1
1
10
100
1000
1 10 100 1000 10000 100000
Resp
onse
tim
e (
secs
)
Query sequence (x1000)
continuous adaptation
Database Cracking CIDR 2007
Stratos Idreos
100K random selections random selectivity random value ranges in a 10 million integer column
almost no initialization overhead
continuous improvement
set-upScan
Full Index
Crack
0.001
0.01
0.1
1
10
100
1000
1 10 100 1000 10000 100000
Resp
onse
tim
e (
secs
)
Query sequence (x1000)
continuous adaptation
Database Cracking CIDR 2007
Stratos Idreos
100K random selections random selectivity random value ranges in a 10 million integer column
almost no initialization overhead
continuous improvement
set-upScan
Full Index
Crack
0.001
0.01
0.1
1
10
100
1000
1 10 100 1000 10000 100000
Resp
onse
tim
e (
secs
)
Query sequence (x1000)
continuous adaptation
Database Cracking CIDR 2007
cracking databases
updates (SIGMOD07)
storage-restrictions (SIGMOD09)
benchmarking (TPCTC10)
concurrency control
(PVLDB12)
algorithms (PVLDB11)
basics (CIDR07)
multi-cores (SIGMOD15)
hadoop (Yale/Saarland)
b-trees (HP Labs)
>1 columns (SIGMOD09)
robustness (PVLDB12)
adaptive storage
(SIGMOD14)
time-series (SIGMOD14)
encryption (SIGMOD16)
Stratos Idreos
concurrency control
Concurrency Control, PVLDB 12
problem: read queries become write queries! (?)
goal: be able to crack for multiple queries in parallel
Stratos Idreos
traditional indexing adaptive indexing
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexingwrite queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
change index contents and structure
only index structure changes
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
change index contents and structure
only index structure changes
no need for traditional locks = too heavy
short term latches = fast and release quickly
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
change index contents and structure
only index structure changes
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
change index contents and structure
only index structure changes
v1 v2 v3
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
change index contents and structure
only index structure changes
v1 v2 v3
a b
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
change index contents and structure
only index structure changes
v1 v2 v3 v1 v2 v3
a b
a b
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
change index contents and structure
only index structure changes
v1 v2 v3 v1 v2 v3
a b
a b
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
change index contents and structure
only index structure changes
v1 v2 v3 v1 v2 v3
a b
a b
v1 v2 v3a
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
change index contents and structure
only index structure changes
v1 v2 v3 v1 v2 v3
a b
a b
v1 v2 v3a
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
impact stable storage stable storage optional
change index contents and structure
only index structure changes
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
impact stable storage stable storage optional
change index contents and structure
only index structure changes
v1 v2 v3 v1 v2 v3a b
write queries read queries
a b
Concurrency Control, PVLDB 12
Stratos Idreos
traditional indexing adaptive indexing
all or nothing incremental and optional
impact stable storage stable storage optional
change index contents and structure
only index structure changes
need to serialize can execute in any order
write queries read queries
Concurrency Control, PVLDB 12
Stratos Idreos
select min(C) from R where A<10 & B<20
diskA B C D
Concurrency Control, PVLDB 12
Stratos Idreos
select min(C) from R where A<10 & B<20
A<10disk memory
A B C D
Concurrency Control, PVLDB 12
Stratos Idreos
select min(C) from R where A<10 & B<20
A<10disk memory
A B C D IDs
Concurrency Control, PVLDB 12
Stratos Idreos
select min(C) from R where A<10 & B<20
A<10disk memory
A B C D IDs B
Concurrency Control, PVLDB 12
Stratos Idreos
select min(C) from R where A<10 & B<20
B<20A<10disk memory
A B C D IDs B
Concurrency Control, PVLDB 12
Stratos Idreos
select min(C) from R where A<10 & B<20
B<20A<10disk memory
A B C D IDs B CIDs
Concurrency Control, PVLDB 12
Stratos Idreos
select min(C) from R where A<10 & B<20
B<20 minCA<10disk memory
A B C D IDs B CIDs
Concurrency Control, PVLDB 12
Stratos Idreos
select min(C) from R where A<10 & B<20
B<20 minCA<10disk memory
A B C D IDs B CIDs
column lock and release as soon as an operator completes
Concurrency Control, PVLDB 12
Stratos Idreos
need to latch only to be cracked pieces (max 2 per select)
Concurrency Control, PVLDB 12
select [a,b]
Stratos Idreos
piece locking
avl-tree crack column
Concurrency Control, PVLDB 12
Stratos Idreos
piece locking
avl-tree crack columncrack select
Concurrency Control, PVLDB 12
Stratos Idreos
piece locking
avl-tree crack columncrack selectwlock
Concurrency Control, PVLDB 12
Stratos Idreos
piece locking
avl-tree crack columncrack select
wlock
Concurrency Control, PVLDB 12
Stratos Idreos
piece locking
avl-tree crack column
Concurrency Control, PVLDB 12
Stratos Idreos
piece locking
avl-tree crack columnmax
Concurrency Control, PVLDB 12
Stratos Idreos
piece locking
avl-tree crack columnmaxrlock
Concurrency Control, PVLDB 12
Stratos Idreos
piece locking
avl-tree crack columnmax
rlock
Concurrency Control, PVLDB 12
Stratos Idreos
piece locking
avl-tree crack columnmax
rlock
Concurrency Control, PVLDB 12
Stratos Idreos
piece locking
avl-tree crack columnmax
rlock
Concurrency Control, PVLDB 12
Stratos Idreos
avl-tree crack column1090
140
200
300
piece locking
Stratos Idreos
avl-tree crack column crack select1090
140
200
300
65
230
piece locking
Stratos Idreos
avl-tree crack column crack selectwlock
wlock
1090
140
200
300
65
230
piece locking
Stratos Idreos
avl-tree crack column crack selectwlock
wlock
r/wlock
1090
140
200
300
65
230
piece locking
Stratos Idreos
avl-tree crack column crack selectwlock
wlock
q1,q2,q3...qn1090
140
200
300
65
230
10
90
piece locking
Stratos Idreos
avl-tree crack column crack selectwlock
wlock
q1,q2,q3...qn1090
140
200
300
65
230
10
90
2030
7080
piece locking
Stratos Idreos
avl-tree crack column crack selectwlock
wlock
q1,q2,q3...qn1090
140
200
300
65
230
10
90
2030
7080
wlock
piece locking
Stratos Idreos
0
10
20
30
40
50
60
70
80
1 2 4 8 16 32
Tota
l Tim
e for
1024 Q
ueries
(secs
)
Number of Concurrent Clients
scansort
crack
300
0
50
100
150
200
250
350
1 2 4 8 16 32
Thro
ughput ove
r 1024 Q
ueries
(Queries
/ se
c)
Number of Concurrent Clients
scansort
crack
4.5
4.55
4.6
4.65
4.7
4.75
4.8
Tota
l Tim
e for
1024 Q
ueries
(secs
)
(Sequential Execution)
Concurrency Control
Enabled Disabled
Rows: 100M Query: select sum(A) from R where v1 < A1 < v2 Selectivity: 0.01%, Random, # of queries: 1024 Clients: 1-32, Machine: 4 cores
adaptive indexing maintains its performance advantage
Concurrency Control, PVLDB 12
Stratos Idreos
crack time and wait time
10-6
10-5
10-4
10-3
10-2
10-1
100
1 10 100 1000
Tim
e p
er
query
(se
cs)
Query sequence
index refinementwait time
Query: select sum(A) from R where v1 < A1 < v2 Selectivity: 0.01%, Random, # of queries: 1024 Clients: 8, Machine: 4 cores
adaptive indexing maintains
the adaptive behavior
Concurrency Control, PVLDB 12
Stratos Idreos
stochastic cracking robustness
Stochastic Cracking, PVLDB 12
Stratos Idreos
adaptive indexing
v4v5 v3 v2 v1
Stratos Idreos
Response time: X
adaptive indexing
v4v5 v3 v2 v1
Stratos Idreos
Response time: X
adaptive indexing
v4v5 v3v2 v1
Stratos Idreos
Response time: X
adaptive indexing
Response time: 1000X
v4v5 v3v2 v1
Stratos Idreos
v4v5 v3 v2 v1
Stochastic Cracking, PVLDB 12
Stratos Idreos
v4v5 v3 v2 v1 v1 v2 v3 v4 v5
Stochastic Cracking, PVLDB 12
Stratos Idreos
v4v5 v3 v2 v1
v1 v2 v3 v4 v5
Stochastic Cracking, PVLDB 12
Stratos Idreos
v4v5 v3v2 v1
v1 v2 v3 v4 v5
Stochastic Cracking, PVLDB 12
Stratos Idreos
v4v5 v3v2 v1
v1 v2 v3 v4 v5
v1 v2 v3 v4 v5
Stochastic Cracking, PVLDB 12
Stratos Idreos Stochastic Cracking, PVLDB 12
Stratos Idreos
select [15,55]
Stochastic Cracking, PVLDB 12
Stratos Idreos
select [15,55]
Stochastic Cracking, PVLDB 12
Stratos Idreos
10 20 30 40 50 60
select [15,55]
Stochastic Cracking, PVLDB 12
Stratos Idreos
10 20 30 40 50 60
select [15,55]
select [15,55]
Stochastic Cracking, PVLDB 12
Stratos Idreos
10 20 30 40 50 60
select [15,55]
select [15,55]
Stochastic Cracking, PVLDB 12
Stratos Idreos
10 20 30 40 50 60
select [15,55]
select [15,55]
Stochastic Cracking, PVLDB 12
Stratos Idreos
10 20 30 40 50 60
select [15,55]
select [15,55]
the amount of work for each query depends on the index state
the state of the index depends on past queries patterns
Stochastic Cracking, PVLDB 12
Stratos Idreos
good sequence
column with 100 unique integers [1,100]
Stochastic Cracking, PVLDB 12
Stratos Idreos
q1
good sequence
column with 100 unique integers [1,100]<50
Stochastic Cracking, PVLDB 12
Stratos Idreos
Nq1
good sequence
column with 100 unique integers [1,100]<50
Stochastic Cracking, PVLDB 12
Stratos Idreos
Nq1
good sequence
column with 100 unique integers [1,100]<50
Stochastic Cracking, PVLDB 12
Stratos Idreos
Nq1q2
good sequence
column with 100 unique integers [1,100]<50<25
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2
q1q2
good sequence
column with 100 unique integers [1,100]<50<25
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2
q1q2
good sequence
column with 100 unique integers [1,100]<50<25
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2
q1q2 q3
good sequence
column with 100 unique integers [1,100]<50<25 <75
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
good sequence
column with 100 unique integers [1,100]<50<25 <75
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
good sequence
column with 100 unique integers [1,100]<50<25 <75
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
good sequence
column with 100 unique integers [1,100]<50<25 <75
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
q1
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
<2
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
q1N
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
<2
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
q1N
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
<2
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
q1 q2N
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
<2 <3
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
q1 q2NN-1
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
<2 <3
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
q1 q2NN-1
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
<2 <3
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
q1 q2 q3NN-1
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
<2 <3 <4
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
q1 q2 q3NN-1N-2
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
<2 <3 <4
Stochastic Cracking, PVLDB 12
Stratos Idreos
NN/2N/2
q1q2 q3
q1 q2 q3NN-1N-2
good sequence
bad sequence
column with 100 unique integers [1,100]<50<25 <75
<2 <3 <4
blindly adapting to queries is not always a good idea
Stochastic Cracking, PVLDB 12
Stratos Idreos Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
progressive cracking
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
q1: <v1progressive cracking
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
q1: <v1 crack + filter <v1progressive cracking
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
random
swap
q1: <v1 crack + filter <v1progressive cracking
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
random
swap
q1: <v1 crack + filter <v1
scan + filter <v1
progressive cracking
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
randomq1: <v1progressive cracking
q2: <v2
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
randomq1: <v1progressive cracking
scan + filter <v2q2: <v2
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
randomq1: <v1progressive cracking
scan + filter <v2
crack + filter <v2
q2: <v2
swap
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
to be cracked
q random
randomq1: <v1progressive cracking
scan + filter <v2
crack + filter <v2
q2: <v2
swap
Stochastic Cracking, PVLDB 12
query driven
Stratos Idreos
cracking on Skyserver (4TB) (Sloan Digital Sky Survey, www.sdss.org)
cracking answers 160.000 queries while full indexing is still half way creating one index
Stochastic Cracking, PVLDB 12
Stratos Idreos
multi-core utilization
Multi-cores, SIGMOD 15
Stratos Idreos Multi-cores, SIGMOD 15
select [a,b]
core 1 core 2
Stratos Idreos
select [a,b]
Multi-cores, SIGMOD 15
Stratos Idreos
select [a,b]
Multi-cores, SIGMOD 15
Stratos Idreos
select [a,b]
<a >=a
Multi-cores, SIGMOD 15
Stratos Idreos
select [a,b]
<a >=a
core 1 core 2 core 3 … core N
Multi-cores, SIGMOD 15
Stratos Idreos Multi-cores, SIGMOD 15
Stratos Idreos
…
Multi-cores, SIGMOD 15
Stratos Idreos
core 1 core 2 core 3
…
core N…
Multi-cores, SIGMOD 15
Stratos Idreos
core 1 core 2 core 3
…
core N…
…
Multi-cores, SIGMOD 15
Stratos Idreos
<a
>=a
core 1 core 2 core 3
…
core N…
…
Multi-cores, SIGMOD 15
Stratos Idreos
<a
>=a
core 1 core 2 core 3
…
core N…
…
Multi-cores, SIGMOD 15
Stratos Idreos
<a
>=a
core 1 core 2 core 3
…
core N…
…
<a >=a
merge
Multi-cores, SIGMOD 15
Stratos Idreos
problem: cores may be under utilized
goal: either fully utilize a core or shut it down
Holistic, SIGMOD 15
Stratos Idreos
scheduler
monitoring CPU utilization
worker 1worker 2
worker 3worker 4
worker 5
worker 6worker 7
worker 8
thread pool
when there is an underutilized CPU, pin a thread to it and to do a cracking task
Stratos Idreos
index space
crack
partitions size - access frequency - hit ratio
random works best
Stratos Idreos
108 tuples - 10 attributes, random queries
1
10
100
1000
10000
1 10 100 1000
Cu
mu
lativ
e R
esp
on
se T
ime
(se
c)
Query sequence
no indexing
offline indexing
online indexing
adaptive indexing
holistic indexing
(a) Performance
0
50
100
150
200
AdaptiveIndexing
HolisticIndexing
To
tal R
esp
on
se T
ime
(se
c)
1
9
90
900
(b) Performance Breakdown
0
500
1000
1500
2000
2500
3000
3500
0 2 4 6 8 10
Cu
mu
lativ
e #
ind
ex
pa
rtiti
on
s
Query sequence (x100)
adaptive indexing
holistic indexing
(c) Index Partitions
0
5
10
15
20
25
30
2 4 6 8 10 12 14 0
8
16
24
32
To
tal R
esp
on
se T
ime
(se
c) o
f #
wo
rke
rs
#W
ork
ers
Holistic Worker Activation
time
#workers
(d) Idle CPU UtilizationFigure 6: Improving performance with holistic indexing.
center slice is contiguous, while the remaining n� 1 slices con-sist of two disjoint halves, each, that are arranged concentricallyaround the center slice (xi and yi indicate the first and the last el-ement of piece i respectively). n threads crack the n slices inde-pendently applying a vectorized, out-of-place cracking algorithm(Figure 5), which was proven in [44] to be the most CPU efficientsingle-threaded cracking implementation reported so far. Finally,the local data are merged into a big cracked piece (Figure 4(b)).We found that devoting all resources to perform adaptive indexingfor user queries in parallel does not lead to the absolute best per-formance. Specifically, we found that we can improve performanceeven more by assigning part of the resources to holistic indexing. Inthis way, some of the CPU resources are assigned to parallel crack-ing for user queries but the rest of the CPU resources are distributedacross several holistic workers for additional index refinements. Inthe experimental section we show why this approach is better thanassigning all available resources to user queries.
5. EXPERIMENTAL ANALYSISIn this section, we demonstrate that holistic indexing leads to a
self-organizing always-on DBMS with substantial benefits in termsof response time; with zero administration or set-up effort holisticindexing improves performance adaptively by exploiting all avail-able CPU resources to the maximum. We present a detailed exper-imental analysis using both standard benchmarks such as TPC-Hand real-life workloads such as SkyServer as well as synthetic mi-crobenchmarks for a fine-grained analysis.
We use a dual-socket machine equipped with two 2.00 GHz In-tel(R) Xeon(R) CPU E5-2650 processors and with 256 GB RAM.Each processor has 8 hyper-threading cores resulting in 32 hard-ware threads in total. The operating system is Fedora 20 (kernelversion 3.12.10). All experiments we report are based on an im-plementation of holistic indexing in MonetDB and assume a main-memory environment.
5.1 Improving over State-of-the-Art IndexingIn our first experiment we demonstrate that holistic indexing
has the potential to bring substantial performance improvementsover existing state-of-the-art indexing approaches. We test holisticindexing against parallel versions of adaptive indexing (databasecracking), offline indexing, online indexing and plain scans.
For plain scans (no indexing), we use a parallel select opera-tor implemented in MonetDB. For offline and online indexing wesort the columns using a highly parallel NUMA-aware sorting algo-rithm that was introduced in [9] (m-way, 16-byte keys) and is pub-licly available in [1]. Specifically, in offline indexing we pre-sortall the columns before query processing, while in online indexing
we assume that after processing a few queries we understand theworkload patterns and then we sort the relevant columns. MonetDBautomatically detects that a column is sorted and can use efficientbinary search actions during select operations. For adaptive index-ing we use the parallel vectorized database cracking algorithm thatwas introduced in [44] (see Section 4.2).
Here we use a synthetic benchmark. The query workload con-sists of 103 range select queries over a table of 10 attributes; eachquery touches a single attribute (we will see more complex querieslater on). Each attribute consists of 230 uniformly distributed inte-gers, while the value range requested by each query (and thus theselectivity) is random. All queries are of the following form.
select A from R where A < v
The tested scenario assumes a dynamic and ad-hoc environmentwith zero workload knowledge and zero idle time to pre-index thedata. Figure 6(a) shows the results. On the x-axis queries are rankedin execution order. The y-axis represents the cumulative responsetime as the query sequence evolves, i.e., each point (x,y) representsthe sum of the execution time y for the first x queries. In this way,the graph shows how the response times evolve as we process moreand more queries.
If there is no indexing support (plain scans), the entire column isscanned in parallel by 32 threads for every query. Because of thisstable access pattern, the cumulative response time of the query se-quence grows linearly as every query has similar cost. With offlineindexing, on the other hand, it takes 12 seconds to completely sorteach column, assuming a priori workload knowledge. This leads toa 120 seconds initialization overhead to sort all 10 columns. Sincethere is no idle time before the first query, the sorting cost is addedto the execution time of the first query in Figure 6(a). After thefirst query, all queries are answered with fast binary search actionswhich results in a rather flat cumulative curve. With online index-ing, the first 100 queries are answered without any index supportand thus the cost grows linearly. After the 100th query, assumingenough workload knowledge has been obtained via monitoring, weproceed to sort all 10 columns. The sorting cost is added to theexecution cost of the 101st query, since there is no idle time be-tween the 100th and the 101st query. As with offline indexing, allsubsequent queries are answered very fast with binary search overthe sorted column. Thus, both in offline and in online indexing,the whole query sequence is affected by the sorting costs. On theother hand, adaptive indexing continuously improves performancerequiring no workload knowledge and without penalizing individ-ual queries. This improvement is due to the fact that adaptive in-dexing builds only partial indices which it incrementally refines asqueries arrive. However, there is still room for big improvements.
Stratos Idreos
NORMALIZED DATA DENORMALIZED DATA
Stratos Idreos
NORMALIZED DATA DENORMALIZED DATA
good for updates, storage but we need joins
Stratos Idreos
NORMALIZED DATA DENORMALIZED DATA
good for updates, storage but we need joins
only fast scans but expensive to create,
storage & updates
Stratos Idreos
adaptive denormalization
normalized data
first award in ACM SIGMOD undergrad
research competition
Stratos Idreos
adaptive denormalization
normalized data possible denormalized space
first award in ACM SIGMOD undergrad
research competition
Stratos Idreos
adaptive denormalization
continuously physically reorganize data based on incoming query patterns (joins)
normalized data possible denormalized space
first award in ACM SIGMOD undergrad
research competition
Stratos Idreos
adaptive denormalization
continuously physically reorganize data based on incoming query patterns (joins)
normalized data
denormalized fragments queries only need to fast scan
possible denormalized space
first award in ACM SIGMOD undergrad
research competition
Stratos Idreos
adaptive denormalization
continuously physically reorganize data based on incoming query patterns (joins)
normalized data
denormalized fragments queries only need to fast scan
possible denormalized space
first award in ACM SIGMOD undergrad
research competition
Stratos Idreos
adaptive denormalization
continuously physically reorganize data based on incoming query patterns (joins)
normalized data
denormalized fragments queries only need to fast scan
possible denormalized space
first award in ACM SIGMOD undergrad
research competition
Stratos Idreos
adaptive denormalization
continuously physically reorganize data based on incoming query patterns (joins)
normalized data
denormalized fragments queries only need to fast scan
possible denormalized space
first award in ACM SIGMOD undergrad
research competition
Stratos Idreos
years
daily
dat
a
[IBMbigdata]years
daily
dat
a
[StratosGuess]
data
* ski
llsdata system design,
set-up, tune, use
Stratos Idreos
data systems that are easy to design(storage, data flow, algorithms, tuning, etc)
Stratos Idreos
SystemDesign
FullImplemention
PrototypeImplementation
workloads, h/wexpected properties
(performance, throughput,
energy, budget, …)
1
2
3
application specific workload, h/w & expected properties
Set-up &
Tune
4 Auto-tuningduring query processing
5
Design & DevelopmentRoles: Architects & Developers
Set-up and TuningRoles: Database Administrators
Stratos Idreos
say we need a system for workload X (data/access patterns): should we strip down a relational system?
should we build up a key-value store or main-memory system? should we build something from scratch?
Stratos Idreos
say we need a system for workload X (data/access patterns): should we strip down a relational system?
should we build up a key-value store or main-memory system? should we build something from scratch?
say the workload (read/write ratio) shifts (e.g., due to app features):should we use a different data layout for base data - diff updates?
should we use different indexing or no indexing?
Stratos Idreos
say we buy new hardware X (flash/memory): should we change the size of b-tree nodes?
should we change the merging strategy in our LSM-tree?
say we need a system for workload X (data/access patterns): should we strip down a relational system?
should we build up a key-value store or main-memory system? should we build something from scratch?
say the workload (read/write ratio) shifts (e.g., due to app features):should we use a different data layout for base data - diff updates?
should we use different indexing or no indexing?
Stratos Idreos
say we buy new hardware X (flash/memory): should we change the size of b-tree nodes?
should we change the merging strategy in our LSM-tree?
say we want to improve response time:would it be beneficial if we would buy faster flash disks?
would it be beneficial if we buy more memory? or just spend the same budget on improving software?
say we need a system for workload X (data/access patterns): should we strip down a relational system?
should we build up a key-value store or main-memory system? should we build something from scratch?
say the workload (read/write ratio) shifts (e.g., due to app features):should we use a different data layout for base data - diff updates?
should we use different indexing or no indexing?
Stratos Idreos
# of design options?
Stratos Idreos
# of design options?
> 1 QUINTILLION(and counting…)
Stratos Idreos
data systems design (and research) is kind of an art(both in the good & in the bad sense)
Stratos Idreos
data systems design (and research) is kind of an art
but can we keep up?
(both in the good & in the bad sense)
Stratos Idreos
disk memory flash …
1 easily utilize past concepts/designs
Stratos Idreos
do not miss out on cool ideas and concepts
# of
cita
tions
0
7
14
21
28
35
1996 1999 2002 2005 2008 2011 2014
P. O’Neil, E. Cheng, D. Gawlick, E, O'Neil The log-structured merge-tree (LSM-tree) Acta Informatica 33 (4): 351–385, 1996
2
Stratos Idreos
do not miss out on cool ideas and concepts
# of
cita
tions
0
7
14
21
28
35
1996 1999 2002 2005 2008 2011 2014
P. O’Neil, E. Cheng, D. Gawlick, E, O'Neil The log-structured merge-tree (LSM-tree) Acta Informatica 33 (4): 351–385, 1996
Google publishes BigTable
2
Stratos Idreos
data+queries+hardware
data system
self-designing data systems
can we design new systems in weeks instead of years?
Stratos Idreos
data+queries+hardware
data system
self-designing data systems
easy to design
adapt to environment
can we design new systems in weeks instead of years?
Stratos Idreos
INTERACTIVE DATA SYSTEM DESIGN/TUNING/TESTING
datasystem
designer
Stratos Idreos
How to model/abstract design decisions?
How much can happen automatically (auto-design)?
Stratos Idreos
How to model/abstract design decisions?
How much can happen automatically (auto-design)?
move from design based on intuition & experience only to a more formal and systematic way to design systems
Stratos Idreos
row-store
column-store
key-value store
hybrid store
ACM SIGMOD Blog, June’15
1. write/extend modules in a high level language
(optimizations)2. modules=
storage/execution/data flow
3. try out >>1 designs (sets of modules)
Stratos Idreos
data systems that are easy to use
dbTouch
show me something interesting
Queriosity
DATA
cidr2013/icde2014
bigdata2015
can we design systems where no SQL is needed?
cracking example
dbTouch CIDR 2013
Stratos Idreos
data systems todayallow us to answer queries fast
data systems tomorrow should allow us to find fast which queries to ask
db
db explore
Stratos Idreos
a semester of quizzes and brainstorming
option between systems project (key-value store) & research with DASlab
(only for CS165 students or otherwise advanced students)
the end
\*project deadline Dec 21 OH and labs continue until then*/
DASlab productions 2016
learn the concepts, not just the techniques learn to adapt
it all starts with how we store the data yes you can do research