CS848 Paper Presentation
Sinfonia: a new paradigm for building scalable distributed systems
Aguilera, Merchant, Shah, Veitch, Karamanolis
SOSP 2007
Presented by Somayyeh Zangooei
David R. Cheriton School of Computer Science
University of Waterloo
22 February 2010
February 22, 2010 Sinfonia 2
Motivation
● Increasing need for scalable distributed systems/applications
  ● Large data centers (1000s of servers)
  ● Serve billions of users around the world
● Sharing data
  ● Current solution: use message-passing
    ● Complex protocols
    ● Error prone
    ● Hard to use
Outline
● Sinfonia Structure
● Minitransactions
● Design Choices
● Two Applications
● Evaluation
● Conclusion
● Questions & Discussions
Focus of Sinfonia
● Data Center Environment
  ● Small and predictable network latencies
  ● Trustworthy applications
  ● Nodes may crash
● Target: Infrastructure applications
  ● Applications that support other applications
  ● Examples: lock managers, cluster file systems, and group communication services
  ● Need to provide reliability, consistency, and scalability
Sinfonia
[Figure: application nodes, each linked with the Sinfonia user library, access a set of memory nodes through minitransactions.]
Minitransactions
[Figure: layout of a minitransaction. Compare items: (mem-id, addr, len, data). Read items: (mem-id, addr, len). Write items: (mem-id, addr, len, data).]
● Minitransactions:
  ● Atomically update data at multiple memory nodes
  ● Consist of: a set of compare items, a set of read items, a set of write items
● Semantics:
  ● Check data in compare items (equality comparison)
  ● If all match, then apply read and write items
Minitransactions (example)
● API:
class Minitransaction {
    void cmp(memid, addr, len, data);
    void read(memid, addr, len, buf);
    void write(memid, addr, len, data);
    int exec_and_commit();
}
● Example:
t = new Minitransaction();
t.cmp(2, 3, 1, 70);
t.write(1, 2, 1, 45);
t.write(3, 4, 2, 37, 848);
status = t.exec_and_commit();
[Figure: memory contents of Memnode 1, Memnode 2, and Memnode 3, animated across several slides. The compare item matches the 70 stored on Memnode 2; the value 234 on Memnode 1 is then replaced by 45, and the values 123 and 37 on Memnode 3 are replaced by 37 and 848.]
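The example above can be replayed with a small in-process model. This is only a sketch of the semantics, not Sinfonia's implementation: the `Memnode` class, the word-addressed lists, the memory size of 8 words, and the boolean return value (the slide declares `int`) are all assumptions made for illustration. The initial values 234, 70, 123, and 37 are those that can be read from the slide's figure.

```python
class Memnode:
    """A memory node modeled as a flat list of integer words (an assumption)."""

    def __init__(self, size):
        self.words = [0] * size


class Minitransaction:
    """Collects compare, read, and write items, then runs them as one unit."""

    def __init__(self, memnodes):
        self.memnodes = memnodes  # dict: mem-id -> Memnode
        self.compares, self.reads, self.writes = [], [], []

    def cmp(self, memid, addr, length, *data):
        if len(data) != length:
            raise ValueError("data must hold exactly `length` words")
        self.compares.append((memid, addr, length, list(data)))

    def read(self, memid, addr, length, buf):
        self.reads.append((memid, addr, length, buf))

    def write(self, memid, addr, length, *data):
        if len(data) != length:
            raise ValueError("data must hold exactly `length` words")
        self.writes.append((memid, addr, length, list(data)))

    def exec_and_commit(self):
        # Abort with no side effects if any compare item mismatches.
        for memid, addr, length, data in self.compares:
            if self.memnodes[memid].words[addr:addr + length] != data:
                return False
        # All compare items matched: fill read buffers, then apply writes.
        for memid, addr, length, buf in self.reads:
            buf[:] = self.memnodes[memid].words[addr:addr + length]
        for memid, addr, length, data in self.writes:
            self.memnodes[memid].words[addr:addr + length] = data
        return True


# Initial state, limited to the locations the example touches.
nodes = {i: Memnode(8) for i in (1, 2, 3)}
nodes[1].words[2] = 234
nodes[2].words[3] = 70
nodes[3].words[4:6] = [123, 37]

t = Minitransaction(nodes)
t.cmp(2, 3, 1, 70)
t.write(1, 2, 1, 45)
t.write(3, 4, 2, 37, 848)
status = t.exec_and_commit()
print(status, nodes[1].words[2], nodes[3].words[4:6])  # True 45 [37, 848]
```

If the compare item had expected any value other than 70, `exec_and_commit()` would return `False` and none of the writes would be applied.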
Minitransactions
● Balance between:
  ● Functionality (power): powerful enough, general-purpose, easy to use
  ● Efficiency: can be executed and committed efficiently, with a small number of network round-trips
Minitransaction Efficiency
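[Figure: two message timelines. Traditional transactions: a coordinator and participants p1, p2, p3 run an execute phase followed by two-phase commit. Sinfonia minitransactions: an application node and memory nodes m1, m2, m3 run two-phase commit with execution piggybacked on it.]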
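One way to picture the piggybacking is the sketch below, in which the prepare message of two-phase commit already carries the compare and write items, so no separate execute phase is needed. It is a simplified model under stated assumptions: the `Memnode` class, the `prepare`/`finish` method names, and the dictionary-based item format are hypothetical, read items are left out, and locking, logging, and real networking are omitted.

```python
class Memnode:
    """Participant holding words as an addr -> value mapping (a simplification)."""

    def __init__(self, words):
        self.words = dict(words)
        self.staged = None

    def prepare(self, compares, writes):
        # Phase 1 carries the items, so execution happens while voting.
        ok = all(self.words.get(addr) == value for addr, value in compares)
        self.staged = dict(writes) if ok else None
        return ok

    def finish(self, commit):
        # Phase 2 applies the staged writes on commit and drops them on abort.
        if commit and self.staged is not None:
            self.words.update(self.staged)
        self.staged = None


def run_minitransaction(memnodes, compares, writes):
    """Coordinator logic at the application node; returns (committed, round_trips)."""
    involved = sorted(set(compares) | set(writes))
    round_trips = 0

    # Round trip 1: prepare, with the items piggybacked on the message.
    votes = [memnodes[m].prepare(compares.get(m, []), writes.get(m, []))
             for m in involved]
    round_trips += 1

    # Round trip 2: commit only if every participant voted yes.
    commit = all(votes)
    for m in involved:
        memnodes[m].finish(commit)
    round_trips += 1

    return commit, round_trips


nodes = {1: Memnode({2: 234}), 2: Memnode({3: 70})}
result = run_minitransaction(nodes, compares={2: [(3, 70)]}, writes={1: [(2, 45)]})
print(result, nodes[1].words)  # (True, 2) {2: 45}
```

In this model a traditional transaction would pay the same two round trips plus at least one more for each execute step that precedes the commit protocol.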
Caching and Load Balancing
● Caching
  ● Sinfonia does not cache data at application nodes
  ● Caching is left to application nodes
● Load balancing
  ● Sinfonia does not balance data across memory nodes
  ● Load balancing is left to application nodes
  ● Sinfonia provides per-memory-node load information
Fault Tolerance
● Mechanisms for fault tolerance:
  ● Disk image
  ● Logging
  ● Replication
  ● Backup
● Trade-off between fault tolerance and amount of resources
Sinfonia Modes
Application: Cluster File System
● SinfoniaFS
  ● Fault tolerant
  ● Scalable
● Exports NFS v2
  ● Each NFS function: a single minitransaction
● For each function:
  ● Validate cache
  ● Modify data
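The "validate cache, modify data" pattern can be sketched as a single minitransaction that compares the cached copy of an inode against the memory node and writes the update only if the two still agree. Everything concrete here is an assumption for illustration: the `(version, size)` inode tuple, the `setattr_size` helper, the dictionary standing in for memory nodes, and the retry loop are hypothetical and are not taken from the SinfoniaFS source.

```python
def exec_minitransaction(memory, compares, writes):
    """Apply writes only if every compared location holds the expected value."""
    if any(memory.get(loc) != expected for loc, expected in compares):
        return False
    memory.update(writes)
    return True


def setattr_size(memory, cache, inode_loc, new_size):
    """Hypothetical NFS-style setattr: validate the cached inode, then modify it."""
    while True:
        version, _old_size = cache[inode_loc]
        ok = exec_minitransaction(
            memory,
            compares=[(inode_loc, cache[inode_loc])],     # validate cache
            writes={inode_loc: (version + 1, new_size)},  # modify data
        )
        if ok:
            cache[inode_loc] = (version + 1, new_size)
            return version + 1
        # Stale cache: refresh it from the memory node and retry.
        cache[inode_loc] = memory[inode_loc]


loc = ("memnode1", 0)
memory = {loc: (7, 100)}   # authoritative copy on the memory node
cache = {loc: (6, 90)}     # stale copy held by the application node
new_version = setattr_size(memory, cache, loc, 4096)
print(new_version, memory[loc])  # 8 (8, 4096)
```

The first attempt fails because the cached inode is stale; after the refresh the second attempt commits, so the update costs one minitransaction in the common case where the cache is valid.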
Application: Group Communication Service
● GCS: chat room
  ● Join and leave
  ● Broadcast messages
● SinfoniaGCS
  ● Messages stored in memory nodes
  ● Private queue for each member
  ● Global list
Evaluation: Ease of Use
                 SinfoniaFS     LinuxNFS     SinfoniaGCS    Spread Toolkit
lines of code    3,855 (C++)    5,900 (C)    2,492 (C++)    22,148 (C)
develop time     1 month        unknown      2 months       years
major versions   1              2            1              4
Evaluation: Scalability
● spread = 2: scalable
● spread = number of memory nodes: not scalable
(spread: the number of memory nodes each minitransaction touches)
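A back-of-the-envelope model suggests why the two cases differ. It is an idealized assumption, not the paper's measurement: suppose each memory node can serve a fixed number of minitransaction participations per second (`per_node_capacity`, a made-up figure) and load is spread evenly. Every minitransaction then consumes `spread` participations, so the system-wide ceiling is `num_memnodes * per_node_capacity / spread`.

```python
def max_throughput(num_memnodes, spread, per_node_capacity=1000.0):
    """Idealized ceiling on minitransactions per second (hypothetical model)."""
    return num_memnodes * per_node_capacity / spread


for n in (2, 4, 8, 16):
    fixed = max_throughput(n, spread=2)      # each minitransaction touches 2 nodes
    everyone = max_throughput(n, spread=n)   # each minitransaction touches all nodes
    print(n, fixed, everyone)
```

With spread fixed at 2 the ceiling grows linearly with the number of memory nodes, while with spread equal to the number of memory nodes it stays flat, because every added node also has to take part in every minitransaction.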
Evaluation: SinfoniaFS
Evaluation: SinfoniaGCS
Conclusion
● Sinfonia: a service for building scalable distributed systems
● Protocol design becomes data structure design
● A sequence of minitransactions over unstructured data
● Effective in building infrastructure applications
● Extensions
Thanks
Coordinator Crash
● Traditional 2PC blocks on coordinator crash
  ● Not desirable in Sinfonia: Sinfonia does not have control over coordinators
● Traditional solution: 3PC
● Sinfonia solution: modified 2PC + recovery coordinator
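A recovery coordinator can be sketched as follows, assuming that each memory node logs its vote per transaction id and that a vote cannot change once it is recorded. The `Participant` class, the `vote` and `recover` names, and the string votes are hypothetical; the slide states only that Sinfonia uses a modified 2PC with a recovery coordinator, so this is one reading of that idea and not the actual protocol code.

```python
class Participant:
    """A memory node that records its vote per transaction id in a log."""

    def __init__(self):
        self.votes = {}    # tid -> "commit" or "abort"
        self.outcome = {}  # tid -> final decision

    def vote(self, tid, wanted):
        # The first recorded vote wins; later requests cannot change it.
        return self.votes.setdefault(tid, wanted)


def recover(tid, participants):
    """Force a decision for a transaction whose coordinator has crashed."""
    # Ask every participant to vote abort; those that already voted keep their vote.
    votes = [p.vote(tid, "abort") for p in participants]
    decision = "commit" if all(v == "commit" for v in votes) else "abort"
    for p in participants:
        p.outcome[tid] = decision
    return decision


p1, p2, p3 = Participant(), Participant(), Participant()

# Transaction 7: every participant voted commit before the coordinator crashed.
for p in (p1, p2, p3):
    p.vote(7, "commit")

# Transaction 8: the coordinator crashed before p3 received the prepare message.
p1.vote(8, "commit")
p2.vote(8, "commit")

print(recover(7, [p1, p2, p3]), recover(8, [p1, p2, p3]))  # commit abort
```

Because the decision is derived from the participants' logged votes, no participant has to wait for the crashed coordinator to return, which is the blocking behavior of traditional 2PC noted above.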
Coordinator Crash
[Figure: Traditional 2PC (coordinator with participants p1, p2, p3) beside Sinfonia 2PC (application node with memory nodes m1, m2, m3), marking which nodes keep a log.]
Coordinator Crash