[BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and...
![Page 1: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/1.jpg)
Data Structures of the Future: Concurrent, Optimistic, and Relaxed
Dan Alistarh, ETH Zurich / IST Austria
![Page 2: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/2.jpg)
Why Concurrent?
Simple: To get speedup on newer hardware.
Scaling: more threads should imply more useful work.
![Page 3: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/3.jpg)
The Problem with Concurrency
Concurrency can be very bad value for money.
Is this problem inherent?
[Figure: Throughput of Lock-Free Queue (Packet Processing). x-axis: Number of Threads (0 to 70); y-axis: Throughput (Events/Second), 0.00E+00 to 6.00E+06. Annotations: <$1000/machine and >$10000/machine.]
![Page 4: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/4.jpg)
Inherent Sequential Bottlenecks
Data structures with strong ordering semantics
• Stacks, Queues, Priority Queues, Exact Counters
This is bad news because of Amdahl's Law
• Programs whose critical path contains contended data structures won't parallelize well
Theorem: Given n threads, any deterministic, strongly ordered data structure has an execution in which a processor takes time linear in n to return.
[Ellen, Hendler, Shavit, SICOMP 2013]
[Alistarh, Aspnes, Gilbert, Guerraoui, JACM 2014]
To get performance, it is critical to speed up shared data structures.
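The Amdahl's Law point can be made concrete with a small calculation. The sketch below is only an illustration: the function name `amdahl_speedup` and the 10% serial fraction are assumptions chosen for the example, not figures from the talk.

```python
def amdahl_speedup(serial_fraction, threads):
    """Upper bound on speedup when `serial_fraction` of the work is sequential."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # Hypothetical workload: 10% of the running time sits in a contended queue.
    for threads in (1, 8, 64):
        print(threads, round(amdahl_speedup(0.10, threads), 2))
```

Under this assumed 10% serial fraction the bound stays below 10x no matter how many threads are added, which is why the contended structure on the critical path dominates.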
![Page 5: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/5.jpg)
Today's Talk
Theorem: Given n threads, any deterministic, strongly ordered data structure has an execution in which a processor takes time linear in n to return.
[Ellen, Hendler, Shavit, SICOMP 2013]
[Alistarh, Aspnes, Gilbert, Guerraoui, JACM 2014]
How can we scale such data structures?
Theory ↔ Software ↔ Hardware
New Hardware Instructions!
New Data Structure Designs!
![Page 6: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/6.jpg)
Lock-Free Data Structures 101
• Optimistic programming patterns
• Do not use locks, but atomic instructions (Compare&Swap)
• Blocking of one thread shouldn't stop the whole system
• Lots of implementations: Hash Tables, Lists, Trees, Queues, Stacks, etc.
Example: Lock-free counter.
Memory location R;
int fetch-and-increment() {
    int val, new_val;
    do {
        val = Read(R);
        new_val = val + 1;
    } while (!Compare&Swap(&R, val, new_val));
    return val;
}
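The retry loop of the counter can be tried out in Python, with one caveat: Python has no user-level Compare&Swap, so the sketch below emulates it. The class `CasRegister` and the helper `run_workers` are hypothetical names introduced for this illustration, and the internal lock only stands in for the atomicity that the hardware instruction would provide.

```python
import threading


class CasRegister:
    """Integer register with an emulated compare-and-swap.

    Assumption: the guard lock models the atomicity of a single hardware
    instruction; it is never held across a whole fetch-and-increment.
    """

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def read(self):
        with self._guard:
            return self._value

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value != expected:
                return False
            self._value = new
            return True


def fetch_and_increment(register):
    """Optimistic loop: read, compute, and retry if the CAS lost a race."""
    while True:
        val = register.read()
        if register.compare_and_swap(val, val + 1):
            return val


def run_workers(threads, increments_each):
    """Run several incrementing threads; return the final value and all results."""
    register = CasRegister()
    seen = []
    seen_guard = threading.Lock()

    def worker():
        local = [fetch_and_increment(register) for _ in range(increments_each)]
        with seen_guard:
            seen.extend(local)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return register.read(), sorted(seen)
```

Every value from 0 up to the final count minus one is returned exactly once, which is the property the counter is meant to provide: no increment is lost and none is duplicated.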
![Page 7: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/7.jpg)
The Lock-Free Paradox
In theory, threads could starve in optimistic lock-free implementations.
The textbook remedy: use more complex wait-free algorithms.
In practice, this doesn't always happen. Threads rarely starve.
Why?
[Figure: animation of the lock-free counter example from the previous slide, with Thread 0 and Thread 1 holding local val = 0 and then val = 1 while Counter Value R goes 0, 1, 2.]
![Page 8: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/8.jpg)
Analyzing Lock-Free Patterns
• Stochastic Scheduler [STOC14, Transact15]:
  • At each scheduling step, the next scheduled thread is picked from a distribution $p = (p_1, p_2, \ldots, p_n)$ with $p_i > 0$ for all i
Theorem 1: Under any stochastic scheduler, any lock-free algorithm is wait-free with probability 1. [Alistarh, Censor-Hillel, Shavit, STOC14/JACM16]
Theorem 2: Under high contention, roughly one in $\Theta(1/\|p\|_2)$ ops succeeds. [Alistarh, Sauerwald, Vojnovic, PODC15]
[Diagram: Lock-Free Algorithm, Stochastic Scheduler, Stochastic Contention Game.]
![Page 9: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/9.jpg)
The Contention Game
[Figure: each thread repeatedly executes READ(R), then CAS(R, old, old + 1), until success. The scheduler picks the next thread from Distribution (p1, p2, …, pn). Location R goes from Value = 0 to Value = 1 to Value = 2 as CAS operations succeed.]
Given arbitrary p, what is the stationary behaviour of this system?
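One way to get a feel for the stationary behaviour is to simulate the game directly. The sketch below is an illustrative model rather than the authors' analysis: the function name `simulate_contention_game` is hypothetical, each operation is assumed to be exactly one READ followed by one CAS, and a failed CAS is assumed to send the thread back to its READ.

```python
import random


def simulate_contention_game(p, steps, seed=0):
    """Average number of scheduler steps per successful CAS.

    Thread i is scheduled with probability p[i] at every step and
    alternates between READ(R) and CAS(R, old, old + 1).
    """
    if steps <= len(p):
        raise ValueError("steps must exceed the number of threads")
    rng = random.Random(seed)
    register = 0
    old = [None] * len(p)            # None means "the next step is a READ"
    successes = 0
    for i in rng.choices(range(len(p)), weights=p, k=steps):
        if old[i] is None:
            old[i] = register        # READ(R)
        else:
            if register == old[i]:   # CAS succeeds only if R is unchanged
                register += 1
                successes += 1
            old[i] = None            # success or failure: start over with READ
    return steps / successes


if __name__ == "__main__":
    for n in (1, 16, 64, 256):
        uniform = [1.0 / n] * n
        print(n, round(simulate_contention_game(uniform, 200_000), 2))
```

With a single thread every second step succeeds, so the ratio is exactly 2. With more threads the ratio grows because a thread's stored `old` value is often stale by the time it is scheduled again.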
![Page 10: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/10.jpg)
The Contention Game, Balls & Bins view
[Figure: four bins labelled Thread 1, Thread 2, Thread 3, Thread 4; balls are thrown according to Distribution (p1, p2, …, pn). Each operation is READ(R), then CAS(R, old, old + 1), then success.]
Rules for the Counter
• Bins = threads
• Balls = steps
• Placement according to p
• To Complete the Operation
  • 3 balls before others
  • Resets all bins with 2 Balls
  • Winner keeps one ball
Step Complexity: How many balls does a bin receive on average between two wins?
System Latency: How many total balls are distributed between two wins, on average?
![Page 11: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/11.jpg)
The Result
Theorem. Given arbitrary distribution p and constant-length lock-free algorithm, the following hold:
• System latency is $\Theta(1/\|p\|_2)$
• Individual latency is $\Theta(\|p\|_2 / p_i^2)$
Examples:
1. Uniform p = (1/n, 1/n, …, 1/n):
  • System latency is $\Theta(\sqrt{n})$ [ACHS, JACM16]
  • Individual latency is $\Theta(n\sqrt{n})$
2. Non-uniform p = (↗1, ↘0, …, ↘0)
  • System latency is (close to) constant
  • Individual latency is either constant (the favored thread), or grows without bound as $p_i$ ↘ 0
3. Given threads i and j, relative throughputs are $(p_i / p_j)^2$
Fairness-Throughput Trade-off
Other game types covered, e.g. obstruction-free algorithms.
Moral: Under high contention, roughly one in sqrt(n) ops succeeds.
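The two expressions in the theorem are easy to evaluate for a concrete distribution. The sketch below simply plugs numbers into them with all hidden constants set to 1, which is an assumption made for illustration. The function name `predicted_latencies` is hypothetical.

```python
import math


def predicted_latencies(p):
    """Evaluate 1/||p||_2 and ||p||_2 / p_i^2, ignoring Theta constants."""
    if not math.isclose(sum(p), 1.0) or min(p) <= 0:
        raise ValueError("p must be a distribution with every p_i > 0")
    norm2 = math.sqrt(sum(pi * pi for pi in p))
    system = 1.0 / norm2
    individual = [norm2 / (pi * pi) for pi in p]
    return system, individual


if __name__ == "__main__":
    n = 16
    system, individual = predicted_latencies([1.0 / n] * n)
    print(system, individual[0])       # sqrt(n) and n * sqrt(n) for uniform p

    skewed = [0.97, 0.01, 0.01, 0.01]  # hypothetical skewed scheduler
    print(predicted_latencies(skewed))
```

For uniform p with n = 16 this gives a system latency of sqrt(16) = 4 and an individual latency of 16 * 4 = 64. For the skewed example the favored thread's value stays near 1 while the others are in the thousands, which is the fairness-throughput trade-off in numbers.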
![Page 12: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/12.jpg)
Why does this graph look so bad?
[Figure: Throughput of Parallel Event Processing Queue. x-axis: Number of Threads (0 to 70); y-axis: Throughput (Events/Second), 0.00E+00 to 6.00E+06.]
![Page 13: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/13.jpg)
What Happens at the Hardware Level?
Directory-based cache coherence (Intel, AMD)
[Figure: Thread 0 and Thread 1 each execute Read(R) followed by CAS(R, old, new); the directory answers with Resp(R), and one CAS ends in Failure.]
We waste time because ownership of R circulates without useful work!
Example: At 64 threads, only one in 8 message exchanges is useful.
![Page 14: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/14.jpg)
Fixing it: Lease/Release [Alistarh, Haider, Hasenplaugh, PPoPP 2016]
Directory-based cache coherence (Intel, AMD)
[Figure: Core 0 and Core 1 each execute Read(R) followed by CAS(R, old, new). Each core holds R for a Lease Interval T; its CAS ends in success while the other core's request is Delayed.]
Each transfer results in at least one useful operation!
Doubling down on optimism!
![Page 15: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/15.jpg)
Lease/Release, More Precisely
• Programmer optimistically leases variables for bounded time
  • void ReqLease(void* address, int data_size, time T);
  • void ReqRelease(void* address, int data_size, time T);
  • Lease time in the order of 1000 cycles
• Performance penalty if leases expire before operation completion
  • Usually occurs <5% of the time
• Prototype in the MIT Graphite Processor Simulator
  • Directory-based MESI Cache Coherence Protocol
  • Protocol remains provably correct
  • Minimal changes to the architecture
Does it work?
![Page 16: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/16.jpg)
Packet Processing Queue with Lease-Release (Simulated in Graphite)
[Figure: Queue Throughput. x-axis: Number of Threads (0 to 70); y-axis: Throughput, 0.00E+00 to 7.00E+06. Series: NO_LEASE, SINGLE_LEASE. Annotation: 4.5X.]
• Dequeue Operation
1. Top_Node = Lease&Read(Head)
2. Next_Node = Read(Top_Node.ptr)
3. ATOMIC {
       if (Read(Head) == Top_Node) then Write&Release(Head, Next_Node)
       else Release and goto 1
   }
[Figure: Energy for the Queue (nJ/operation). x-axis: # Threads (0 to 70); y-axis: 0.00E+00 to 2.50E+04. Series: NO_LEASE, SINGLE_LEASE.]
![Page 17: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/17.jpg)
What Else? Locks
Directory-based cache (Intel, AMD)
[Figure: Core 0 and Core 1 contend for lock L. One core runs Acquire(L) via CAS(L), then Write(L) and Unlock(L); the other core's Acquire(L) keeps Spinning on CAS(L). Each attempt triggers a Req(R, EX) / Resp(R) exchange with the directory. With a Lease Interval T, the competing CAS(L) is Delayed.]
Can we avoid the wasted coherence messages?
Simply Lease the lock on acquire!
![Page 18: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/18.jpg)
PageRank with L/R
• Works with lock-based programs as well
  • Lease the lock before acquiring it
  • Release before giving it up
[Figure: Parallel PageRank Running Time (lower is better). x-axis ticks: 2, 4, 8, 16, 32; y-axis: Completion Time (ns), 0 to 2.5E+10. Series: NO_LEASE, WITH_LEASE. Annotation: 9.5X.]
![Page 19: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/19.jpg)
Lease/Release
• Hardware Lock Queues [iQOLB: Rajwar, Kaegi, Goodman; HPCA 2000]
  • Locks using Load-Linked/Store-Conditional
  • Load-Linked takes a "lease" on the lock, Store-Conditional "releases"
  • Applied automatically by the processor speculation mechanism
• Transient Blocking Synchronization [Shalev, Shavit; Sun Tech Report 2004]
  • Propose Load&Lease / Store&Release instructions for non-coherent DSM machines
  • Different semantics, never implemented
• The paper also contains:
  • Hardware implementation details (no directory modifications!)
  • Blueprint for implementing multiple concurrent leases (transactions)
  • Lots of experiments
![Page 20: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/20.jpg)
The High-Level View
• The Problem with Concurrency
  • Inherent bottlenecks lead to meltdowns
• Why?
  • Contention hurts optimistic patterns, quantifiably so
• Lease/Release:
  • We can now scale bottlenecks, within reason
  • Optimism enforced at the hardware level
Can we scale beyond bottlenecks?
Let's Relax!
![Page 21: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/21.jpg)
Concurrent Priority Queues
[Figure: a PriorityQueue<key, value> holding tasks with keys 1, 3, 4, 5, 7, 8, 11, 15, 18. Operations shown: Search(key), Insert/Delete(k, v), DeleteMin().]
Methods:
• Get Top Task
• Insert a Task
• Search for Task
Extremely useful:
• Graph Operations (Shortest Paths)
• Operating System Kernel
• Time-Based Simulations
We are looking for a fast concurrent Priority Queue.
![Page 22: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/22.jpg)
The Problem
Target: fast, concurrent Priority Queue.
Lots of work on the topic: [Sanders97], [Lotan&Shavit00], [Sundell&Tsigas07], [Linden&Jonsson13], [Lenhart et al. 14], [Wimmer et al. 14]
Current solutions are hard to scale: DeleteMin is highly contended.
Everyone wants the same element!
![Page 23: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/23.jpg)
Concurrent Solution
● Linked list, sorted by priority
● Each node has random "height" (geometrically distributed with parameter ½)
● Elements at the same height form their own lists
[Figure: list from head (H) through 1, 3, 4, 5, 9, … to T.]
![Page 24: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/24.jpg)
Concurrent Solution: the SkipList [Pugh90]
● Linked list, sorted by priority
● Each node has random "height" (geometrically distributed with parameter ½)
● Elements at the same height form their own lists
● Average time for Search, Insert, Delete is logarithmic; they work concurrently [Pugh98, Fraser04]
[Figure: skip list from head (H) through 1, 3, 4, 5, 9, … to tail (T). Search(5) narrows the interval [H, 9], [H, 9], [1, 9], [5, 9], then stops.]
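The bullets above describe the structure closely enough to sketch it. What follows is a minimal sequential skip list for illustration only: it leaves out all the synchronization that the concurrent versions cited on the slide need, and the class names, `max_height` and `seed` parameters are assumptions of this sketch.

```python
import random


class SkipListNode:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height   # next[level] is the successor at that level


class SkipList:
    """Sequential skip list; concurrency control is deliberately left out."""

    def __init__(self, max_height=16, seed=0):
        self.max_height = max_height
        self.head = SkipListNode(None, max_height)   # sentinel H
        self.rng = random.Random(seed)

    def _random_height(self):
        # Geometric with parameter 1/2: grow while the coin keeps coming up heads.
        height = 1
        while height < self.max_height and self.rng.random() < 0.5:
            height += 1
        return height

    def _predecessors(self, key):
        # Walk from the top list down, recording the last node < key per level.
        preds = [self.head] * self.max_height
        node = self.head
        for level in reversed(range(self.max_height)):
            while node.next[level] is not None and node.next[level].key < key:
                node = node.next[level]
            preds[level] = node
        return preds

    def search(self, key):
        candidate = self._predecessors(key)[0].next[0]
        return candidate is not None and candidate.key == key

    def insert(self, key):
        preds = self._predecessors(key)
        node = SkipListNode(key, self._random_height())
        for level in range(len(node.next)):
            node.next[level] = preds[level].next[level]
            preds[level].next[level] = node

    def delete_min(self):
        # The smallest node is first in every list it belongs to.
        first = self.head.next[0]
        if first is None:
            return None
        for level in range(len(first.next)):
            self.head.next[level] = first.next[level]
        return first.key
```

Note that `delete_min` always touches the node right after the head. In a concurrent setting that is the single spot every thread fights over.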
![Page 25: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/25.jpg)
The SkipList as a PQ
I. Lotan and N. Shavit. Skiplist-Based Concurrent Priority Queues. 2000.
● DeleteMin: simply remove the smallest element from the bottom list
● All processors compete for smallest element
● Does not scale!
[Figure: skip list from head to tail.]
![Page 26: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/26.jpg)
The Idea: Relax!
● We want to choose an item at random with 'good' guarantees
● Minimize loss of exactness by only choosing items near the front of the list
● Minimize contention by keeping collision probability low
![Page 27: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/27.jpg)
DeleteMin: The Spray [Alistarh, Kopinsky, Li, Shavit, PPoPP 2015]
procedure Spray()
● At each skip list level, flip coin to stay or jump forward
● Repeat for each level from log n down to 1 (the bottom)
● As if removing a random priority element near the head
[Figure: Two examples for starting height 4, with per-level moves such as jump, stay, jump, jump.]
Spray and pray?
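The coin-flip walk can be illustrated on an idealized skip list. The sketch below is not the SprayList algorithm from the paper, whose jump lengths and starting height are tuned differently. It only models the slide's picture under a loudly stated assumption: the list is perfectly balanced, so one forward jump at level l skips exactly 2^l bottom-level elements. The function names are hypothetical.

```python
import random


def spray_index(start_height, rng):
    """Index (0 = current minimum) reached by one coin-flip spray.

    Assumption: idealized, perfectly balanced skip list in which a forward
    jump at level l skips exactly 2**l bottom-level elements.
    """
    index = 0
    for level in reversed(range(start_height)):   # top level down to the bottom
        if rng.random() < 0.5:                    # heads: jump forward
            index += 2 ** level
        # tails: stay on the current node and descend one level
    return index


def spray_histogram(start_height, trials, seed=0):
    """Count how often each index near the head is hit."""
    rng = random.Random(seed)
    counts = [0] * (2 ** start_height)
    for _ in range(trials):
        counts[spray_index(start_height, rng)] += 1
    return counts


if __name__ == "__main__":
    print(spray_histogram(start_height=4, trials=16_000))
```

In this idealized model every one of the first 2^h elements is hit about equally often, so concurrent sprays rarely collide yet never stray far from the head.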
![Page 28: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/28.jpg)
SprayList Probabilistic Guarantees
✓ Maximum value returned by Spray has rank $O(n \log^3 n)$: Sprays aren't too wide
✓ For all x, $p(x) = \tilde{O}(1/n)$: Sprays don't cluster too much
✓ If x > y is returned by some Spray, then $p(y) = \tilde{\Omega}(1/n)$: Elements do not starve in the list
p(x) = probability that a spray returns value at index x
[Figure: spray range near the head, labelled $\tilde{O}(n)$.]
![Page 29: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/29.jpg)
One Benchmark
• Discrete Event Simulation
• Exact algorithms have negative scaling after 8 threads
• SprayList competitive with the random remover (no guarantees, incorrect execution)
In many practical settings (D.E.S., shortest paths), priority inversions are not expensive.
![Page 30: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/30.jpg)
The MultiQueue [Rihani, Dementiev, Sanders, SPAA15]
• n lock-free or lock-based queues
• Insert: pick a random queue, lock, and insert into it
• Remove: pick two queues at random, lock and remove the better element
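The insert and remove rules can be sketched in a few lines. This is a minimal lock-based illustration, not the implementation from the cited paper: the class name `MultiQueue`, the use of `heapq` for the internal queues, the shared random generator, and the choice to take both locks in index order are all assumptions of the sketch.

```python
import heapq
import random
import threading


class MultiQueue:
    """Lock-based sketch: one binary heap plus one lock per internal queue."""

    def __init__(self, num_queues, seed=0):
        if num_queues < 2:
            raise ValueError("need at least two queues for two-choice removal")
        self.heaps = [[] for _ in range(num_queues)]
        self.locks = [threading.Lock() for _ in range(num_queues)]
        self.rng = random.Random(seed)   # shared RNG; adequate for a sketch

    def insert(self, key):
        i = self.rng.randrange(len(self.heaps))
        with self.locks[i]:
            heapq.heappush(self.heaps[i], key)

    def delete_min(self):
        """Remove the smaller of the tops of two random queues (relaxed minimum)."""
        i, j = self.rng.sample(range(len(self.heaps)), 2)
        first, second = sorted((i, j))          # fixed lock order avoids deadlock
        with self.locks[first], self.locks[second]:
            candidates = [q for q in (i, j) if self.heaps[q]]
            if not candidates:
                return None                      # both sampled queues were empty
            best = min(candidates, key=lambda q: self.heaps[q][0])
            return heapq.heappop(self.heaps[best])
```

A `None` result only means the two sampled queues were empty, so a caller that knows elements remain should simply retry. The element returned is the better of two random tops, not necessarily the global minimum, which is the relaxation the following slides analyze.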
[Figure 3: Throughput of 50% insert 50% deleteMin operations of uniformly distributed keys. x-axis: Threads (0 to 56); y-axis: Throughput (MOps/s), 0 to 80. Series: MultiQ c=2, MultiQ HT c=2, MultiQ c=4, Spraylist, Linden, Lotan.]
[Figure 4: Throughput of 50% insert 50% deleteMin operations of monotonic keys. x-axis: Threads (0 to 56); y-axis: Throughput (MOps/s), 0 to 80. Series: MultiQ c=2, MultiQ HT c=2, MultiQ c=4, Spray.]
Looks good, but does it actually guarantee anything?
![Page 31: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/31.jpg)
The Random Process
WLOG, elements are consecutive labels.
1. Insert Elements u.a.r.
2. Remove using two choices
• Cost = rank of element removed among remaining elements
Intuitively, the distance from optimal.
[Figure: four queues. Q1: 1, 6, 10, 13. Q2: 4, 7, 12, 16. Q3: 2, 3, 8, 15. Q4: 5, 9, 11, 14. Successive removals: Cost(2) = 2, Cost(4) = 3, Cost(1) = 1.]
We are interested in the average rank removed at each step.
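The cost definition can be checked mechanically. The sketch below replays a removal sequence and reports each cost, then simulates the two-choice process to estimate the average rank. The function names are hypothetical, and the queue contents in the usage example assume the slide's 16 labels are grouped into four queues in the order they appear.

```python
import bisect
import random


def rank_cost(remaining_sorted, label):
    """1-based rank of `label` among the labels still present."""
    return bisect.bisect_left(remaining_sorted, label) + 1


def replay(queues, removal_order):
    """Remove labels in the given order and report the cost of each removal."""
    remaining = sorted(x for q in queues for x in q)
    costs = []
    for label in removal_order:
        costs.append(rank_cost(remaining, label))
        remaining.remove(label)
    return costs


def average_two_choice_cost(n, per_queue, removals, seed=0):
    """Insert labels 1..n*per_queue u.a.r. into n queues, remove with two choices."""
    if removals >= n * per_queue:
        raise ValueError("removals must be fewer than the inserted elements")
    rng = random.Random(seed)
    queues = [[] for _ in range(n)]
    for label in range(1, n * per_queue + 1):   # increasing labels keep queues sorted
        queues[rng.randrange(n)].append(label)
    remaining = sorted(x for q in queues for x in q)
    total, done = 0, 0
    while done < removals:
        i, j = rng.sample(range(n), 2)
        tops = [q for q in (i, j) if queues[q]]
        if not tops:
            continue                             # both choices empty, sample again
        best = min(tops, key=lambda q: queues[q][0])
        label = queues[best].pop(0)
        total += rank_cost(remaining, label)
        remaining.remove(label)
        done += 1
    return total / removals


if __name__ == "__main__":
    slide_queues = [[1, 6, 10, 13], [4, 7, 12, 16], [2, 3, 8, 15], [5, 9, 11, 14]]
    print(replay(slide_queues, [2, 4, 1]))
    print(average_two_choice_cost(n=8, per_queue=200, removals=400))
```

Replaying the removals 2, 4, 1 reproduces the three costs listed for the figure: 2 is removed while 1 is still present, 4 is removed while 1 and 3 are present, and 1 is then the true minimum.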
![Page 32: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/32.jpg)
The Result
Theorem: Given n queues, for any t > 0, the cost at t is O(n) in expectation, and O(n log n) w.h.p.
• Strategy 1: reduction to power of two choices analysis? [Azar et al., SICOMP 99]
  • Would apply if we could equate queue size with top label (round-robin insert)
[Figure: four queues, each labelled 1, 2, 3, 4.]
The reduction does not hold in general, and in fact experimentally height and top priority appear to be uncorrelated.
![Page 33: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/33.jpg)
The Result
Theorem: For any t > 0, the cost at t is O(n) in expectation, and O(n log n) w.h.p.
• Strategy 2: some simple sort of induction
  • The initial cost distribution is nice; can we prove it always stays nice?
[Figure: queues labelled 1, 2, …, K, K+1, K+2, K+3.]
Hard case: over time, we'll eventually get arbitrary distributions.
We have to prove that the algorithm gets out of those reasonably fast.
![Page 34: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/34.jpg)
The Result
Theorem 1: For any t > 0, the cost at t is O(n) in expectation, and O(n log n) w.h.p.
• Strategy 3: some complicated sort of induction / potential argument
  • Idea: characterize what's going on step-by-step
[Figure: queues in which some labels are known (1, 7, 3, 11, 4, 2, 5, 9) and others are unknown (?).]
In expectation, increment is n.
Problem: the behavior at a step is highly correlated with what happened in previous steps.
![Page 35: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/35.jpg)
Proof Strategy
Theorem 1: For any t > 0, the cost at t is O(n) in expectation, and O(n log n) w.h.p.
• Step 1: reduce to an uncorrelated exponential process
  • Prove that the rank distribution is preserved
• Step 2: characterize the exponential process
  • Characterize average weight on top of queues via potential argument
• Step 3: characterize rank distribution of exponential process
  • Prove that average rank is O(n)
![Page 36: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/36.jpg)
Step 1: The exponential process
• Insert: pick a random queue
  • Insert exponentially distributed increment with mean n into it
• Remove: pick two queues at random, remove the lower label
• Cost: the rank of the element removed (still)
[Figure: four queues with real-valued labels: (1.8, 5.9, 10.2, 13.2), (4.7, 7.3, 12.5, 16.8), (2.2, 3.2, 8.3, 15.2), (5.1, 9.5, 11.7, 14.2). The gap between consecutive labels in a queue has Expected value n.]
Theorem: The distribution of removed ranks is the same in the discrete process and in the exponential process.
Pr[rank k is in queue j] = 1/n. Holds since the exponential is memoryless.
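The claim that rank k is equally likely to sit in any queue can be checked empirically. The sketch below builds the exponential process under one modelling assumption: each queue's labels are generated as running sums of independent exponential increments with mean n. The function names are hypothetical.

```python
import random


def exponential_queues(n, per_queue, rng):
    """n queues whose consecutive labels differ by Exp increments of mean n."""
    queues = []
    for _ in range(n):
        label, labels = 0.0, []
        for _ in range(per_queue):
            label += rng.expovariate(1.0 / n)   # rate 1/n gives mean n
            labels.append(label)
        queues.append(labels)
    return queues


def queue_holding_rank(queues, k):
    """Index of the queue holding the k-th smallest label overall (k is 1-based)."""
    tagged = sorted((label, q) for q, labels in enumerate(queues) for label in labels)
    return tagged[k - 1][1]


def rank_owner_frequencies(n, k, trials, seed=0):
    """Empirical distribution of which queue holds rank k."""
    rng = random.Random(seed)
    counts = [0] * n
    for _ in range(trials):
        # k labels per queue suffice: rank k is among the first k of its queue.
        counts[queue_holding_rank(exponential_queues(n, k, rng), k)] += 1
    return [c / trials for c in counts]


if __name__ == "__main__":
    print(rank_owner_frequencies(n=4, k=3, trials=8000))
```

With n = 4 the printed frequencies should all sit close to 1/4, in line with the statement on the slide.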
![Page 37: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/37.jpg)
Step 2: Analyzing the exponential process
Uses tools from [Peres, Talwar, Wieder, R.S.A. 14]
• Fix a removal step t. Let $w_i(t)$ be the label (real value) on top of bin i.
• Let $x_i(t) = w_i(t)/n$ (normalized weights), and $\mu(t) = \sum_{i=1}^{n} x_i(t)/n$.
• Let $\Phi(t) = \sum_{i=1}^{n} \exp(x_i(t) - \mu(t))$ and $\Psi(t) = \sum_{i=1}^{n} \exp(-(x_i(t) - \mu(t)))$.
• No more correlations: since weight increments are independent of previous steps, we can bound the expected increase in potential at each step.
• Bad configurations: $\Phi(t)$ and $\Psi(t)$ cannot both be large at the same time. If their sum breaks the O(n) barrier, then the large potential will decrease very fast.
• $\Phi(t) + \Psi(t)$ is then a super-martingale, which implies the bound.
Theorem: For any t > 0, $\mathbb{E}[\Phi(t) + \Psi(t)] = O(n)$.
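The two potentials are straightforward to compute for a given configuration of top labels. The sketch below follows the definitions on this slide as written. The function name `potentials` is hypothetical, and the second usage example takes the smallest value in each of the four groups of labels shown on the previous slide as the queue tops, which is an assumption about how that figure is laid out.

```python
import math


def potentials(top_labels):
    """Return (Phi, Psi) for the labels currently on top of the n bins."""
    n = len(top_labels)
    x = [w / n for w in top_labels]              # normalized weights x_i = w_i / n
    mu = sum(x) / n                              # mean normalized weight
    phi = sum(math.exp(xi - mu) for xi in x)     # grows when bins sit far above the mean
    psi = sum(math.exp(-(xi - mu)) for xi in x)  # grows when bins sit far below the mean
    return phi, psi


if __name__ == "__main__":
    print(potentials([12.0, 12.0, 12.0, 12.0]))  # perfectly balanced tops
    print(potentials([1.8, 4.7, 2.2, 5.1]))      # assumed tops from the figure
```

When all tops are equal, every term is exp(0) = 1 and both potentials equal n, the smallest value they can take. Any imbalance pushes at least one of them above n.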
![Page 38: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/38.jpg)
Step 3: What does all this have to do with ranks?
• Let $B_{>s}(t)$ be the number of bins with weight > µ + s at time t.
• Let $B_{<-s}(t)$ be the number of bins with weight < µ − s at time t.
Theorem: For any t > 0, $\mathbb{E}[B_{>s}(t)] = O(\ldots)$ and $\mathbb{E}[B_{<-s}(t)] = O(\ldots)$.
Weights become "rarefied" at ranks s-higher and s-lower than the mean value.
• But on average, we'll choose something close to the mean value! So, we conclude:
Theorem: For any t > 0, the rank cost at t is O(n) in expectation.
Worst-case bound follows in a similar way.
![Page 39: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/39.jpg)
Applications
What if we do two choices only β% of the time? (one choice otherwise)
Theorem: For any t > 0, the cost at t is O(n/β*) in expectation, and O(n log n/β*) w.h.p.
What if the input distribution is biased? Still works (within reason).
Works well in practice.
We can use this for approximate queues, stacks, counters, timestamps.
![Page 40: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/40.jpg)
Concurrent Data Structures
"The data structures of our childhood are changing." Nir Shavit
A relaxation renaissance: [KarpZhang93], [DeoP92], [Sanders98], [HenzingerKPSS13], [NguyenLP13], [WimmerCVTT14], [LenhartNP15], [RihaniSD15], [JeffreySYES16]
Data structures such as the SprayList and the MultiQueue merge both relaxed semantics and optimistic progress to achieve scalability.
![Page 41: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/41.jpg)
The Last Slide
Theorem: Strongly ordered data structures won't scale.
[Ellen, Hendler, Shavit, SICOMP 2013]
[Alistarh, Aspnes, Gilbert, Guerraoui, JACM 2014]
How can we scale them?
Theory ↔ Software ↔ Hardware
How do we specify and prove relaxed data structures correct?
What new data structures are out there?
How do these data structures interact with existing applications?
Can we prove stronger lower bounds?
![Page 42: [BIRS]Data Structures of the Future · Data Structures of the Future: Concurrent, Optimistic, and Relaxed Dan Alistarh ETH Zurich / IST Austria. Why Concurrent? Simple: To get speedup](https://reader033.fdocuments.net/reader033/viewer/2022050601/5fa89ba8a7f7d37e33715955/html5/thumbnails/42.jpg)
Workshop Announcement
• Theory & Practice in Concurrent Data Structures
  • Co-located with DISC 2017 (Vienna)
• Overall goals
  • Fostering collaboration between practically-minded conferences (PPoPP, SOSP, etc.) and the PODC/DISC community
  • New challenges in concurrent data structure design
• Precise goals
  • Better benchmarks for concurrent data structures
  • Real applications and practical issues (e.g. memory management)
  • Usefulness of relaxed designs