DDSS: A Low-Overhead Distributed Data Sharing Substrate...
![Page 1: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/1.jpg)
DDSS: A Low-Overhead Distributed Data Sharing Substrate for Cluster-Based Data-Centers over Modern Interconnects
K. Vaidyanathan, S. Narravula and D. K. Panda
Network Based Computing Laboratory (NBCL)
The Ohio State University
![Page 2: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/2.jpg)
Presentation Outline
• Introduction and Motivation
• Proposed DDSS Framework
• Experimental Results
• Conclusions and Future Work
![Page 3: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/3.jpg)
Introduction and Motivation
• Internet growth
  – Number of Users, Type of Service, Amount of data
  – E-Commerce, online-banking, stocks, airline reservations
• Data-centers enable such services
  – Process data and reply to queries
  – Need for services like caching, resource adaptation for performance, scalability
[Figure: Multi-Tier Data-Centers. Clients connect over a WAN to a proxy server (caching, load balancing), web server (Apache), application server (CGI, PHP), database server (MySQL), and storage.]
![Page 4: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/4.jpg)
High-Performance Networks
• InfiniBand, 10 GigE
  – High Bandwidth
  – Low Latency
• Provides rich features
  – RDMA semantics, Atomic operations, Protocol offload
• OpenFabrics stack
  – Single interface for InfiniBand, iWARP/10 GigE, etc.
• Targeted for Multi-Tier Data-Centers
• Can the data-center processes coordinate better?
![Page 5: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/5.jpg)
Information-Sharing is common
• Applications typically employ their own
  – Data placement and management protocols
  – Synchronization protocols
• Data-Center services
  – Active Resource Adaptation
    • Maintain Server state information
    • Locking requirements
  – Caching
    • Coherency & Consistency requirements
  – Resource Monitoring (IBM Websphere)
    • Load information shared across several servers
  – Critical decisions based on shared information
[Figure: Resource Monitoring Service. Proxy modules M1, M2, and M3 each use the load of servers S1, S2, S3, and S4.]
![Page 6: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/6.jpg)
Problems with Existing approaches
• Ad-hoc messaging protocols for exchanging data
• May have high overheads
• Performance may depend on the system load
• May not use the advanced features
• May not be scalable
![Page 7: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/7.jpg)
Objective
• Can we design a load-resilient substrate (DDSS) for data-center applications and services that utilizes advanced features such as RDMA and remote atomic operations?
![Page 8: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/8.jpg)
Presentation Outline
• Introduction and Motivation
• Proposed DDSS Framework
• Experimental Results
• Conclusions and Future Work
![Page 9: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/9.jpg)
Distributed Data Sharing Mechanism
[Figure: Data-center applications, resource adaptation services, load balancing services, and resource monitoring services access shared data through Get, Put, and Lock operations (for example, "Get load" and "Put load").]
Provide an effective mechanism to share data across the data-center
![Page 10: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/10.jpg)
Proposed DDSS Framework
[Figure: Layers of the proposed framework, top to bottom.]
• Data-Center Applications and Data-Center Services
• Distributed Data-Sharing Substrate components: IPC Management, Connection Mgmt, Memory Mgmt, Data Mgmt, Basic Locks, Coherency and Consistency Maintenance
• Advanced Network Features: Protocol Offload, RDMA, Atomic, Multicast
• High-Speed Networks: InfiniBand, 10 GigE High-Speed Interconnects
![Page 11: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/11.jpg)
Proposed Framework Contd…
• Data Management
  – Local vs Remote, for load balancing
• Basic Locking
  – Through atomic operations (IBA)
• Coherency and Consistency Maintenance
  – Strict, Write/Read, Null, Delta, Version
  – Use of RDMA and atomic operations
[Figure: Substrate components (IPC Management, Connection Mgmt, Memory Mgmt, Data Mgmt, Basic Locks, Coherency and Consistency Maintenance) over Protocol Offload, RDMA, and Atomic.]
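The "Basic Locking through atomic operations" bullet can be illustrated with a small sketch. It uses C11 atomics on local memory as a stand-in for the remote InfiniBand atomic; the type `ss_lock_t` and the functions `ss_lock_init`, `ss_try_lock`, and `ss_unlock` are hypothetical names chosen for this example and do not come from the slides.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SS_LOCK_FREE ((uint64_t)0)   /* assumed encoding: 0 means "unlocked" */

/* Hypothetical lock word: SS_LOCK_FREE, or the id of the current holder. */
typedef struct {
    _Atomic uint64_t owner;
} ss_lock_t;

static void ss_lock_init(ss_lock_t *lock) {
    atomic_init(&lock->owner, SS_LOCK_FREE);
}

/* One compare-and-swap attempt (free -> id). Returns true if the caller
 * now holds the lock. The id must be nonzero. In a networked setting this
 * single step would be a remote atomic issued against the lock word. */
static bool ss_try_lock(ss_lock_t *lock, uint64_t id) {
    uint64_t expected = SS_LOCK_FREE;
    return id != SS_LOCK_FREE &&
           atomic_compare_exchange_strong(&lock->owner, &expected, id);
}

/* Release succeeds only for the current holder (id -> free). */
static bool ss_unlock(ss_lock_t *lock, uint64_t id) {
    uint64_t expected = id;
    return id != SS_LOCK_FREE &&
           atomic_compare_exchange_strong(&lock->owner, &expected, SS_LOCK_FREE);
}
```

A caller that fails `ss_try_lock` would typically retry or back off; the sketch leaves that policy out because the slides do not describe it.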
![Page 12: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/12.jpg)
Proposed Framework Contd…
• Connection Management
  – Takes care of connection setup and teardown for nodes participating in DDSS
• Memory Management
  – Allocates a pool of memory for DDSS on each node
  – Manages allocation and release operations
• IPC Management
  – Access for multiple threads
  – Message Queues
[Figure labels: Data-Center Applications & Services, DDSS Module, IPC, OpenFabrics Stack, Other Applications, Other Modules, TCP/IP Stack.]
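The memory-management bullets (a per-node pool with allocation and release operations) can be sketched as a fixed-block pool. This is only an illustration under assumed parameters: the pool geometry (`POOL_BLOCKS`, `BLOCK_SIZE`) and the names `ss_pool_t`, `pool_init`, `pool_alloc`, `pool_release`, and `pool_addr` are hypothetical, and the slides do not say how DDSS organizes its pool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCKS 8      /* assumed number of blocks in the pool */
#define BLOCK_SIZE  1024   /* assumed size of each block in bytes */

/* Hypothetical per-node pool: fixed-size blocks plus an in-use flag each. */
typedef struct {
    uint8_t storage[POOL_BLOCKS][BLOCK_SIZE];
    bool    in_use[POOL_BLOCKS];
} ss_pool_t;

static void pool_init(ss_pool_t *pool) {
    for (int i = 0; i < POOL_BLOCKS; i++) {
        pool->in_use[i] = false;
    }
}

/* Claims the lowest-numbered free block; returns its index or -1 if full. */
static int pool_alloc(ss_pool_t *pool) {
    for (int i = 0; i < POOL_BLOCKS; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Returns 0 on success, -1 for an out-of-range or already-free block. */
static int pool_release(ss_pool_t *pool, int block) {
    if (block < 0 || block >= POOL_BLOCKS || !pool->in_use[block]) {
        return -1;
    }
    pool->in_use[block] = false;
    return 0;
}

/* Address of an allocated block, or NULL if the block is not allocated. */
static void *pool_addr(ss_pool_t *pool, int block) {
    if (block < 0 || block >= POOL_BLOCKS || !pool->in_use[block]) {
        return NULL;
    }
    return pool->storage[block];
}
```

Allocating the pool once up front fits RDMA-style networks, where memory generally has to be registered with the adapter before remote access; that registration step is not shown here.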
![Page 13: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/13.jpg)
DDSS Interface
• allocate_ss(…)
• release_ss(…)
• get(…)
• put(…)
• acquire_lock_ss(…)
• release_lock_ss(…)
• …
Example with a non-coherent segment:
key = allocate_ss(1024, NONCOHERENT_SS, 5000);
put(key, data, 10);
compute();
get(key, data, 10);
release_ss(key);
Example with a write-coherent segment, holding the lock around the update:
key = allocate_ss(1024, WRITE_COHERENT_SS, 5000);
acquire_lock_ss(key);
put(key, data, 10);
release_lock_ss(key);
compute();
get(key, data, 10);
release_ss(key);
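To make the call sequence on this slide concrete, here is a single-process mock of the interface. It is a sketch, not the DDSS implementation: the function names and the constants `NONCOHERENT_SS` and `WRITE_COHERENT_SS` come from the slide, but every signature, return value, and the table size `MAX_SEGMENTS` are assumptions, and the slide's third `allocate_ss` argument (5000) is unexplained, so the mock accepts and ignores it.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Coherence-model names as written on the slide; numeric values are assumed. */
enum { NONCOHERENT_SS = 0, WRITE_COHERENT_SS = 1 };

#define MAX_SEGMENTS 16   /* assumed size of the segment table */

typedef struct {
    unsigned char *buf;   /* NULL marks a free slot */
    size_t size;
    int coherence;
    int locked;
} segment_t;

static segment_t segments[MAX_SEGMENTS];

static int valid_key(int key) {
    return key >= 0 && key < MAX_SEGMENTS && segments[key].buf != NULL;
}

/* Returns a key >= 0, or -1 on failure. The third argument is not
 * explained on the slide, so it is accepted and ignored. */
static int allocate_ss(size_t size, int coherence, int unexplained) {
    (void)unexplained;
    if (size == 0) return -1;
    for (int key = 0; key < MAX_SEGMENTS; key++) {
        if (segments[key].buf == NULL) {
            unsigned char *buf = calloc(1, size);
            if (buf == NULL) return -1;
            segments[key] = (segment_t){ buf, size, coherence, 0 };
            return key;
        }
    }
    return -1;
}

/* Copies len bytes into the segment; -1 if the key or length is invalid. */
static int put(int key, const void *data, size_t len) {
    if (!valid_key(key) || len > segments[key].size) return -1;
    memcpy(segments[key].buf, data, len);
    return 0;
}

/* Copies len bytes out of the segment; -1 if the key or length is invalid. */
static int get(int key, void *data, size_t len) {
    if (!valid_key(key) || len > segments[key].size) return -1;
    memcpy(data, segments[key].buf, len);
    return 0;
}

/* Non-blocking in this mock: fails with -1 if the lock is already held. */
static int acquire_lock_ss(int key) {
    if (!valid_key(key) || segments[key].locked) return -1;
    segments[key].locked = 1;
    return 0;
}

static int release_lock_ss(int key) {
    if (!valid_key(key) || !segments[key].locked) return -1;
    segments[key].locked = 0;
    return 0;
}

/* Frees the segment and returns its slot to the table. */
static int release_ss(int key) {
    if (!valid_key(key)) return -1;
    free(segments[key].buf);
    segments[key] = (segment_t){ NULL, 0, 0, 0 };
    return 0;
}
```

With this mock, the write-coherent sequence from the slide runs unchanged: allocate, lock, put 10 bytes, unlock, get the same 10 bytes back, and release the segment.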
![Page 14: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/14.jpg)
Presentation Outline
• Introduction and Motivation
• Proposed Framework
• Experimental Results
• Conclusions and Future Work
![Page 15: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/15.jpg)
Experimental Testbed
• InfiniBand
  – Cluster with dual Intel Xeon 3.4 GHz, 1 GB memory
  – MT25128 Mellanox HCA
• iWARP/GigE
  – Cluster with dual Intel Xeon 3.0 GHz, 512 MB memory
  – Ammasso 1100 Gigabit Ethernet NIC
• OpenFabrics stack
  – IB, Ammasso (iWARP)
![Page 16: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/16.jpg)
Experimental Results Outline
• Microbenchmarks
  – Performance of put() and get() operations
• Distributed Applications
  – Distributed STORM
  – Checkpointing Application
• Data-Center Services
  – Active Resource Adaptation
  – Active Caching
![Page 17: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/17.jpg)
Microbenchmarks
• Latency of put() and get() operations for small messages is less than 65 usecs for all coherence models
[Figures: put() performance and get() performance. Latency (usecs) vs. message size (bytes, 1 to 65536) for the Null, Read, Write, Strict, Version, and Delta coherence models.]
![Page 18: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/18.jpg)
Distributed STORM
• Select data of interest and transfer from storage to compute nodes
• Same dataset is processed by multiple STORM applications
  – This shared dataset is placed in DDSS
• STORM using DDSS shows close to 19% improvement in query execution time
[Figure: Query execution time (usecs) vs. number of records (1K, 5K, 10K, 100K) for STORM and STORM-DDSS.]
![Page 19: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/19.jpg)
CR Coordination
• Checkpoint at random time
• Simulates restart from a consistent checkpoint
• Checkpointing uses DDSS to maintain checkpoint information, locks, versions, etc.
• Checkpointing applications using DDSS are highly scalable
[Figure: Average sync time and average total time (usecs) vs. number of clients (2 to 12).]
![Page 20: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/20.jpg)
Active Resource Adaptation
• Monitors the load of different websites
• If a website is loaded, shift under-utilized servers to loaded websites
• Software Overhead of DDSS is < 2%
[Figure: Reconfiguration time, software overhead, and number of reconfigurations vs. load (%), with time in usecs.]
![Page 21: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/21.jpg)
Active Caching
• Supports Strong Coherency for cached dynamic data
• Checks the back-end for current version using RDMA
• Active cache using DDSS is load-resilient
[Figure: Time (usecs) vs. number of compute/communication threads (1 to 32) for Version Check - DDSS and Version Check - TCP.]
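The version check described on this slide can be sketched as follows. The back-end keeps a version counter per document, and the front-end compares its cached version against it before serving a hit. A local C11 atomic load stands in for the RDMA read of the back-end version; all names here (`backend_doc_t`, `cache_entry_t`, `read_backend_version`, and so on) are hypothetical and not taken from the slides.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical back-end record: a version that is bumped on every update. */
typedef struct {
    _Atomic uint64_t version;
} backend_doc_t;

/* Hypothetical front-end cache entry: remembers the version it was filled at. */
typedef struct {
    uint64_t cached_version;
    bool     valid;
} cache_entry_t;

static void backend_init(backend_doc_t *doc) {
    atomic_init(&doc->version, (uint64_t)1);
}

/* The back-end changes the document and increments its version. */
static void backend_update(backend_doc_t *doc) {
    atomic_fetch_add(&doc->version, (uint64_t)1);
}

/* Stand-in for a one-sided RDMA read of the version word: the back-end
 * CPU does not take part in answering it. */
static uint64_t read_backend_version(backend_doc_t *doc) {
    return atomic_load(&doc->version);
}

/* A cache hit may be served only if the cached version is still current. */
static bool cache_is_fresh(const cache_entry_t *entry, backend_doc_t *doc) {
    return entry->valid && entry->cached_version == read_backend_version(doc);
}

/* On a miss or a stale hit, refetch and record the version observed. */
static void cache_fill(cache_entry_t *entry, backend_doc_t *doc) {
    entry->cached_version = read_backend_version(doc);
    entry->valid = true;
}
```

Because the check is a memory read rather than a request handled by a back-end process, its cost need not grow with back-end load, which is consistent with the load-resilience claim on the slide.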
![Page 22: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/22.jpg)
Conclusions & Future Work
• Proposed a distributed data sharing substrate
• Using DDSS, data-center applications and services can, with very little modification, get significant benefits in performance and scalability
• Implemented over OpenFabrics, so it is applicable across InfiniBand and iWARP-capable adapters
• Future work: fault tolerance, support for large file sizes, and advanced resource management schemes
![Page 23: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/23.jpg)
Acknowledgements
Our research is supported by the following organizations
• Current Funding support by
• Current Equipment support by
![Page 24: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/24.jpg)
Web Pointers
Group Homepage: http://nowlab.cse.ohio-state.edu
Emails: {vaidyana, narravul, panda}@cse.ohio-state.edu
NBC-LAB
![Page 25: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/25.jpg)
Backup Slides
![Page 26: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/26.jpg)
High-Performance Networks in Data-Centers
• InfiniBand, iWARP-capable adapters
  – Offer several features like RDMA, atomic operations (IB), iWARP (Ammasso, 10 GigE)
[Figure: A cluster-based data-center environment (InfiniBand, iWARP-capable Ammasso, 10 GigE) and a distributed data-center environment with iWARP clusters connected over a wide area network.]
![Page 27: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/27.jpg)
Active Resource Adaptation Design
[Figure: The load balancer issues load queries over RDMA against the shared load of the server for website A (not loaded) and the server for website B (loaded). It then performs a successful atomic (lock), a successful atomic (update counter), reconfigures the node, and performs a successful atomic (unlock).]
P. Balaji, K. Vaidyanathan, S. Narravula and D.K. Panda, “Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand”, presented at RAIT 2004
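The sequence in this design (load query, atomic lock, atomic counter update, reconfigure, atomic unlock) can be sketched with C11 atomics standing in for the RDMA reads and remote atomics. The structure `shared_state_t`, the function `try_shift_node`, the load threshold, and the rule that website A keeps at least one node are all assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state that the load balancer reads and updates. */
typedef struct {
    atomic_int load_a;          /* load (%) published by website A's server */
    atomic_int load_b;          /* load (%) published by website B's server */
    atomic_int lock;            /* 0 = free, 1 = held */
    atomic_int reconfig_count;  /* number of reconfigurations so far */
    atomic_int nodes_a;         /* nodes currently serving website A */
    atomic_int nodes_b;         /* nodes currently serving website B */
} shared_state_t;

static void shared_init(shared_state_t *s, int nodes_a, int nodes_b) {
    atomic_init(&s->load_a, 0);
    atomic_init(&s->load_b, 0);
    atomic_init(&s->lock, 0);
    atomic_init(&s->reconfig_count, 0);
    atomic_init(&s->nodes_a, nodes_a);
    atomic_init(&s->nodes_b, nodes_b);
}

/* Moves one node from A to B when B is loaded and A is not.
 * Returns true if a node was moved. */
static bool try_shift_node(shared_state_t *s, int threshold) {
    /* Load query: in the design these are reads of shared load values. */
    if (atomic_load(&s->load_b) < threshold ||
        atomic_load(&s->load_a) >= threshold) {
        return false;
    }
    /* Atomic (lock): compare-and-swap 0 -> 1; give up if someone holds it. */
    int expected = 0;
    if (!atomic_compare_exchange_strong(&s->lock, &expected, 1)) {
        return false;
    }
    bool moved = false;
    if (atomic_load(&s->nodes_a) > 1) {          /* assumed: A keeps one node */
        atomic_fetch_add(&s->reconfig_count, 1); /* atomic (update counter) */
        atomic_fetch_sub(&s->nodes_a, 1);        /* reconfigure the node */
        atomic_fetch_add(&s->nodes_b, 1);
        moved = true;
    }
    atomic_store(&s->lock, 0);                   /* atomic (unlock) */
    return moved;
}
```

Holding the lock across the counter update and the node move keeps two load balancers from shifting the same node at once, which is the purpose the lock serves in the figure.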
![Page 28: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/28.jpg)
RDMA based Client Polling Design
[Figure: Message flow between the front-end and the back-end: request, cache hit with a version read, cache miss, and response.]
S. Narravula, P. Balaji, K. Vaidyanathan, S. Krishnamoorthy, J. Wu and D.K. Panda, “Supporting Strong Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand”, presented at SAN 2004
![Page 29: DDSS: A Low-Overhead Distributed Data Sharing Substrate …mvapich.cse.ohio-state.edu/static/media/publications/slide/... · DDSS: A Low-Overhead Distributed Data Sharing Substrate](https://reader031.fdocuments.net/reader031/viewer/2022021808/5c04c20509d3f291388c4fd9/html5/thumbnails/29.jpg)
Microbenchmarks
• Latency of put() and get() operations is less than 50 usecs
[Figures: Latency (usecs) vs. number of clients, and latency (usecs) vs. lock contention (%), for the Null, Read, Write, Strict, Version, and Delta coherence models.]