Haoyuan Li, Tachyon [email protected]
September 22, 2015 @ SDC 2015
A Reliable Memory-Centric Distributed Storage System
• Team consists of Tachyon creators, top contributors, people from UC Berkeley, Google, CMU, VMware, Stanford, Facebook, etc.
• $7.5 million Series A from Andreessen Horowitz
• Committed to Tachyon Open Source
Outline
• Overview
  – Motivation
  – Tachyon Architecture
  – Using Tachyon
• Open Source
  – Status
  – Production Use Cases
• Roadmap
Tachyon: Born in UC Berkeley AMPLab
• Cluster manager
• Parallel computation framework
• Reliable, distributed memory-centric storage system
Why Tachyon?
Memory is Fast
• RAM throughput increasing exponentially
• Disk throughput increasing slowly
Memory locality is key to interactive response times
Memory is Cheaper
Source: jcmit.com
Realized by many…
Is the Problem Solved?
Missing a Solution for the Storage Layer
An Example: Spark
• Fast, in-memory data processing framework
  – Keep one in-memory copy inside the JVM
  – Track lineage of operations used to derive data
  – Upon failure, use lineage to recompute data
[Figure: Lineage Tracking. A graph of map, filter, join, and reduce operations.]
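The lineage idea on this slide can be illustrated with a small sketch. This is not Spark's API: the `Dataset` class, its `get` and `lose` methods, and the sample operations are hypothetical names chosen for illustration. Each dataset keeps at most one in-memory copy and records the operation and parents used to derive it, so a lost copy can be recomputed instead of restored from a replica.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Dataset:
    """Hypothetical dataset that remembers how it was derived (its lineage)."""
    name: str
    compute: Callable[..., list]               # operation that derives the data
    parents: List["Dataset"] = field(default_factory=list)
    cached: Optional[list] = None              # the single in-memory copy

    def get(self) -> list:
        # Serve from memory when possible; otherwise replay the lineage.
        if self.cached is None:
            inputs = [parent.get() for parent in self.parents]
            self.cached = self.compute(*inputs)
        return self.cached

    def lose(self) -> None:
        # Simulate a failure that wipes the in-memory copy.
        self.cached = None


# Build a small lineage chain: source -> map -> filter -> reduce.
source = Dataset("source", lambda: [1, 2, 3, 4, 5, 6])
doubled = Dataset("doubled", lambda xs: [x * 2 for x in xs], [source])
filtered = Dataset("filtered", lambda xs: [x for x in xs if x % 4 == 0], [doubled])
total = Dataset("total", lambda xs: [sum(xs)], [filtered])

before = total.get()

# Lose two in-memory copies, then ask for the result again.
filtered.lose()
total.lose()
after = total.get()   # recomputed from the surviving parent, not from a replica
```

In this sketch only the lost datasets are recomputed; `doubled` still holds its copy, so the replay stops there.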
Issue 1
Data sharing is the bottleneck in the analytics pipeline: slow writes to disk.
[Figure: Spark Job1 and Spark Job2 each keep blocks 1 and 3 in their own Spark memory block manager and share data through HDFS / Amazon S3 (blocks 1 to 4).]
[Figure: a Spark job (Spark memory block manager holding blocks 1 and 3) and a Hadoop MR job on YARN share data through HDFS / Amazon S3 (blocks 1 to 4).]
Storage engine and execution engine run in the same process (slow writes).
Issue 2
Cache loss when the process crashes.
[Figure, three frames: a Spark task and its Spark memory block manager (blocks 1 and 3) run above HDFS / Amazon S3 (blocks 1 to 4); the task crashes; the block manager and its cached blocks are gone, and only the copies in HDFS / Amazon S3 remain.]
Execution engine and storage engine run in the same process.
Issue 3
In-memory data duplication and Java garbage collection.
[Figure: Spark Task1 and Spark Task2 each hold their own copies of blocks 1 and 3 in separate Spark memory block managers, above HDFS / Amazon S3 (blocks 1 to 4).]
Execution engine and storage engine run in the same process (duplication and GC).
Tachyon
Reliable data sharing at memory speed within and across cluster frameworks/jobs
Technical Overview
Ideas
• A memory-centric storage architecture
• Push lineage down to storage layer
• Manage tiered storage
Facts
• One data copy in memory
• Re-computation for fault-tolerance
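One way to picture "manage tiered storage" together with "one data copy in memory" is the toy store below. It is a sketch under stated assumptions, not Tachyon's implementation or API: the `TieredStore` class, its method names, the two-tier layout, and the least-recently-used eviction policy are all hypothetical choices made for illustration. A block lives in exactly one tier at a time; when the memory tier fills up, the least recently used block is demoted to the slower tier rather than dropped, and it is promoted back on access.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier block store: a small fast tier over a larger slow tier."""

    def __init__(self, mem_capacity: int):
        self.mem_capacity = mem_capacity
        self.mem = OrderedDict()   # fast tier, ordered from least to most recently used
        self.disk = {}             # slow tier (stands in for SSD or HDD)

    def put(self, block_id: str, data: bytes) -> None:
        self.disk.pop(block_id, None)        # keep a single copy across tiers
        self.mem[block_id] = data
        self.mem.move_to_end(block_id)       # mark as most recently used
        while len(self.mem) > self.mem_capacity:
            victim, victim_data = self.mem.popitem(last=False)
            self.disk[victim] = victim_data  # demote instead of dropping

    def get(self, block_id: str) -> bytes:
        if block_id in self.mem:
            self.mem.move_to_end(block_id)
            return self.mem[block_id]
        data = self.disk[block_id]           # raises KeyError for unknown blocks
        self.put(block_id, data)             # promote on access
        return data

    def tier_of(self, block_id: str) -> str:
        if block_id in self.mem:
            return "MEM"
        if block_id in self.disk:
            return "DISK"
        raise KeyError(block_id)


store = TieredStore(mem_capacity=2)
store.put("block 1", b"a")
store.put("block 2", b"b")
store.put("block 3", b"c")                   # memory is full, so block 1 is demoted
tier_after_eviction = store.tier_of("block 1")

value = store.get("block 1")                 # promoted back; block 2 is demoted
tier_after_read = store.tier_of("block 1")
```

A real system would add more tiers (for example MEM, SSD, and HDD, as in the use cases later in the deck) and a pluggable eviction policy; the sketch only shows the demote-and-promote movement between tiers.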
Eco-System
Tachyon Memory-Centric Architecture
Lineage in Tachyon
Issue 1 revisited
Memory-speed data sharing among jobs in different frameworks.
[Figure: a Spark job (Spark mem) and a Hadoop MR job on YARN share blocks 1, 3, and 4 held in Tachyon in-memory storage, which sits above HDFS / Amazon S3 on disk (blocks 1 to 4). Diagram label: "execution engine & storage engine same process (fast writes)".]
Issue 2 revisited
Keep in-memory data safe, even when a job crashes.
[Figure, two frames: a Spark task with its Spark memory block manager runs above Tachyon in-memory storage (blocks 1, 3, and 4) and HDFS / Amazon S3 on disk (blocks 1 to 4); after the task crashes, the blocks are still held in Tachyon. Diagram label: "execution engine & storage engine same process".]
Issue 3 revisited
No in-memory data duplication, much less GC.
[Figure: two Spark tasks (each with Spark mem) share one copy of blocks 1, 3, and 4 in Tachyon in-memory storage, above HDFS / Amazon S3 on disk (blocks 1 to 4). Diagram label: "execution engine & storage engine same process (no duplication & GC)".]
Comparison with In-Memory HDFS
Open Source Status
• Started at UC Berkeley AMPLab in Summer 2012
• Apache License 2.0, Version 0.7.1 (August 2015)
• Deployed at > 50 companies (July 2014)
• 30+ Companies Contributing
Contributors Growth
• v0.1 (Dec 2012): 1 contributor
• v0.2 (Apr 2013): 3 contributors
• v0.3 (Oct 2013): 15 contributors
• v0.4 (Feb 2014): 30 contributors
• v0.5 (Jul 2014): 46 contributors
• v0.6 (Mar 2015): 70 contributors
• v0.7 (Jul 2015): 111 contributors
Codebase Growth
• v0.2 (Apr 2013): 465 commits
• v0.3 (Oct 2013): 696 commits
• v0.4 (Feb 2014): 1080 commits
• v0.5 (Jul 2014): 1610 commits
• v0.6 (Mar 2015): 2884 commits
• v0.7 (Jul 2015): 5021 commits
Thanks to Our Contributors!
Reported Tachyon Usage
Under Filesystem Choices (Big Data, Cloud, HPC, Enterprise)
Use Case: Baidu
• Framework: SparkSQL
• Under Storage: Baidu’s File System
• Storage Media: MEM + HDD
• 100+ nodes deployment
• 1PB+ managed space
• 30x Performance Improvement
Use Case: a SaaS Company
• Framework: Impala
• Under Storage: S3
• Storage Media: MEM + SSD
• 15x Performance Improvement
Use Case: an Oil Company
• Framework: Spark
• Under Storage: GlusterFS
• Storage Media: MEM only
• Analyzing data in traditional storage
Use Case: a SaaS Company
• Framework: Spark
• Under Storage: S3
• Storage Media: SSD only
• Elastic Tachyon deployment
New Features
• Lineage in Storage (alpha)
• Tiered Storage (alpha)
• Data Serving
• Support for New Hardware
• …
• Your New Feature!
Tachyon’s Goal?
Distributed Memory-Centric Storage: Better Assist Other Components
Welcome Collaboration!
JIRA New Contributor Tasks
• Website: http://tachyon-project.org
• Github: https://github.com/amplab/tachyon
• Meetup: http://www.meetup.com/Tachyon
• Newsletter Subscription: http://goo.gl/mwB2sX
• Email: [email protected]