Distributed Metadata with the AMGA Metadata Catalog
Transcript of Distributed Metadata with the AMGA Metadata Catalog
INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org CERN
Distributed Metadata with the AMGA Metadata Catalog
Nuno Santos, Birger Koblitz
Workshop on Next-Generation Distributed Data Management, 20 June 2006
Abstract
• Metadata Catalogs on Data Grids: the case for replication
• The AMGA Metadata Catalog
• Metadata Replication with AMGA
• Benchmark Results
• Future Work/Open Challenges
Metadata Catalogs
• Metadata on the Grid
– File metadata: describes files with application-specific information. Purpose: file discovery based on their contents.
– Simplified database service: stores generic structured data on the Grid. Not as powerful as a DB, but easier to use and with better Grid integration (security, hides DB heterogeneity).
• Metadata services are essential for many Grid applications
• Must be accessible Grid-wide
• But Data Grids can be large…
An Example - The LCG Sites
• LCG: the LHC Computing Grid
– Distributes and processes the data generated by the LHC (Large Hadron Collider) at CERN
– ~200 sites and ~5,000 users worldwide
Taken from: http://goc03.grid-support.ac.uk/googlemaps/lcg.html
Challenges for Catalog Services
• Scalability
– Hundreds of grid sites
– Thousands of users
• Geographical distribution
– Network latency
• Dependability
– In a large and heterogeneous system, failures will be common
• A centralized system does not meet these requirements
– Distribution and replication are required
Off-the-shelf DB Replication?
• Most DB systems have their own replication mechanisms
– Oracle Streams, Slony for PostgreSQL, MySQL replication
• Example: the 3D Project at CERN (Distributed Deployment of Databases)
– Uses Oracle Streams for replication
– Being deployed only at a few LCG sites (~10 sites: Tier-0 and Tier-1s)
– Requires Oracle ($$$) and expert on-site DBAs ($$$); most sites don't have these resources
• Off-the-shelf replication is vendor-specific
– But Grids are heterogeneous by nature
– Sites have different DB systems available
• Only a partial solution to the problem of metadata replication
Replication in the Catalog
• Alternative we are exploring: replication in the Metadata Catalog itself
• Advantages
– Database independent
– Metadata-aware replication
 More efficient: replicates metadata commands
 Better functionality: partial replication, federation
– Ease of deployment and administration
 Built into the Metadata Catalog
 No need for a dedicated DB admin
• The AMGA Metadata Catalogue is the basis for our work on replication
The AMGA Metadata Catalog
• Metadata Catalog of the gLite Middleware (EGEE)
• Several groups of users among the EGEE community:
– High Energy Physics
– Biomed
• Main features
– Dynamic schemas
– Hierarchical organization
– Security:
 Authentication: user/pass, X.509 certificates, GSI
 Authorization: VOMS, ACLs
[Diagram: clients send metadata commands to the AMGA server, which stores them in metadata tables]
AMGA Implementation
• C++ implementation
• Back-ends
– Oracle, MySQL, PostgreSQL, SQLite
• Front-end: TCP streaming
– Text-based protocol like TELNET, SMTP, POP…
[Diagram: clients connect to the MD Server via TCP streaming; the server uses a PostgreSQL, Oracle, SQLite or MySQL back-end]
• Examples:
Adding data:
addentry /DLAudio/song.mp3 /DLAudio:Author 'John Smith' /DLAudio:Album 'Latest Hits'
Retrieving data:
selectattr /DLAudio:FILE /DLAudio:Author /DLAudio:Album 'like(/DLAudio:FILE, "%.mp3")'
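The commands above travel over AMGA's plain-text TCP front-end. As a rough illustration (this is not AMGA's actual client API; the helper names and the newline-terminated framing are assumptions), a minimal client could look like:

```python
import socket

def build_addentry(entry, attrs):
    """Build an 'addentry' command line: the entry path followed by
    attribute/value pairs, with values quoted as on the slide."""
    parts = ["addentry", entry]
    for name, value in attrs.items():
        parts.append(name)
        parts.append(f"'{value}'")
    return " ".join(parts)

def send_command(host, port, command):
    """Send one command over a plain TCP connection, TELNET-style, and
    return the raw reply (framing details are assumed)."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(command.encode() + b"\n")
        return sock.recv(4096).decode()

# Reproduces the 'Adding data' command from the slide.
cmd = build_addentry("/DLAudio/song.mp3",
                     {"/DLAudio:Author": "John Smith",
                      "/DLAudio:Album": "Latest Hits"})
```

Because the protocol is plain text, the same commands can also be issued interactively, which is what makes the TELNET comparison apt.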
Standalone Performance
• A single server scales well up to 100 concurrent clients
• Could not go past 100; limited by the database
• WAN access is one to two orders of magnitude slower than LAN
• Replication can solve both bottlenecks
Metadata Replication with AMGA
Requirements of EGEE Communities
• Motivation: requirements of EGEE's user communities, mainly HEP and Biomed
• High Energy Physics (HEP)
– Millions of files; 5,000+ users distributed across 200+ computing centres
– Mainly (read-only) file metadata
– Main concerns: scalability, performance and fault-tolerance
• Biomed
– Manage medical images on the Grid
 Data produced in a distributed fashion by laboratories and hospitals
 Highly sensitive data: patient details
– Smaller scale than HEP
– Main concern: security
Metadata Replication
Some replication models: full replication, partial replication, federation, proxy
[Diagram: metadata commands sent to a catalog are replicated to other catalogs; in the proxy model, commands are redirected]
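In the partial-replication model, a replica mirrors only the metadata collections it subscribes to. A toy sketch of the idea (the prefix-matching rule and the names are illustrative, not AMGA's implementation):

```python
def relevant(update_path, subscriptions):
    """True if the updated entry lies inside any subscribed collection."""
    return any(update_path.startswith(prefix) for prefix in subscriptions)

# A replica that only mirrors the /DLAudio collection ships just the
# updates falling under it; paths are illustrative.
subs = ["/DLAudio"]
updates = ["/DLAudio/song.mp3", "/Biomed/scan042"]
shipped = [p for p in updates if relevant(p, subs)]
```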
Architecture
• Main design decisions
– Asynchronous replication: to tolerate high latencies and provide fault-tolerance
– Partial replication: replicate only what is interesting for the remote users
– Master-slave: writes are only allowed on the master, but mastership is granted per metadata collection, not per node
[Diagram: clients send metadata commands to the AMGA server, which applies local updates to the metadata tables and writes update logs; a replication daemon ships the remote updates to the slaves]
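The update-log design above can be illustrated with a toy model: writes are applied locally on the master and appended to an update log, which a replication daemon later ships to the slaves. All class and method names here are illustrative, not AMGA's actual code:

```python
from collections import deque

class Master:
    """Holds the authoritative metadata tables and an update log."""
    def __init__(self):
        self.tables = {}
        self.log = deque()            # update log, one entry per write

    def write(self, key, value):
        self.tables[key] = value      # local update (writes only on master)
        self.log.append((key, value)) # recorded for later shipping

class Slave:
    """Applies remote updates shipped from the master."""
    def __init__(self):
        self.tables = {}

    def apply(self, update):
        key, value = update
        self.tables[key] = value

def replication_daemon(master, slaves):
    """Ship pending log entries to every slave, asynchronously with
    respect to the master's own writes."""
    while master.log:
        update = master.log.popleft()
        for s in slaves:
            s.apply(update)

master = Master()
slaves = [Slave(), Slave()]
master.write("/DLAudio/song.mp3:Author", "John Smith")
replication_daemon(master, slaves)    # slaves now mirror the master
```

Decoupling the write from the shipping is what lets the master keep accepting updates while a slow or disconnected slave lags behind.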
Status
• Initial implementation completed. Available functionality:
– Full and partial replication
– Chained replication (master → slave1 → slave2)
– Federation: basic support (data is always copied to the slave)
– Cross-DB replication: PostgreSQL → MySQL tested; other combinations should work (give or take some debugging)
• Available as part of AMGA
Benchmark Results
Benchmark Study
• Investigate the following:
1) Overhead of replication and scalability of the master
2) Behaviour of the system under faults
Scalability
• Setup
– Insertion rate at master: 90 entries/s
– Total: 10,000 entries
– 0 slaves: saving replication updates but not shipping them (slaves disconnected)
• Small increase in CPU usage as the number of slaves increases
– With 10 slaves, a 20% increase over standalone operation
• Number of update logs sent scales almost linearly
Fault Tolerance
• The next test illustrates the fault-tolerance mechanisms
• Setup:
– Insertion rate at master: 50 entries/s
– Total: 20,000 entries
– Two slaves, both start connected
– Slave1 disconnects temporarily
• Slave fails
– The master keeps the updates for the slave
– The replication log grows
• Slave reconnects
– The master sends the pending updates
– Eventually the system recovers to a steady state with the slave up-to-date
Fault Tolerance and Recovery
• While slave1 is disconnected, the replication log grows in size
– The log is limited in size; a slave is unsubscribed if it does not reconnect in time
• After the slave reconnects, the system recovers in around 60 seconds
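The bounded log and unsubscription policy can be sketched as follows (the size limit and all names are illustrative; AMGA's actual limits and behaviour may differ):

```python
MAX_PENDING = 3   # illustrative log-size limit

class ReplicationLog:
    """Per-slave queues of unshipped updates kept on the master."""
    def __init__(self):
        self.pending = {}

    def subscribe(self, slave):
        self.pending[slave] = []

    def record(self, update):
        """Queue an update for every subscribed slave; unsubscribe a
        slave whose backlog exceeds the limit (it did not reconnect
        in time)."""
        for slave, queue in list(self.pending.items()):
            queue.append(update)
            if len(queue) > MAX_PENDING:
                del self.pending[slave]

    def reconnect(self, slave):
        """Return (and clear) the backlog for a reconnecting slave."""
        backlog, self.pending[slave] = self.pending[slave], []
        return backlog

log = ReplicationLog()
log.subscribe("slave1")
log.subscribe("slave2")
for i in range(3):
    log.record(f"update-{i}")        # slave1 'disconnected': backlog grows
recovered = log.reconnect("slave2")  # slave2 catches up by replaying its backlog
log.record("update-3")               # pushes slave1 past the limit: unsubscribed
```

Capping the per-slave backlog is what keeps a permanently failed slave from growing the master's log without bound.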
Future Work/Open Challenges
Scalability
• Support hundreds of replicas
– HEP use case. Extreme case: one replica catalog per site
• Challenges
– Scalability
– Fault-tolerance: tolerate failures of slaves and of the master
• The current method of shipping updates (direct streaming) might not scale
– Chained replication (divide and conquer): already possible with AMGA, but performance needs to be studied
– Group communication
Federation
• Federation of independent catalogs
– Biomed use case
• Challenges
– Provide a consistent view over the federated catalogs
– Shared namespace
– Security: trust management, access control and user management
• Ideas
Conclusion
• Replication of Metadata Catalogues is necessary for Data Grids
• We are exploring replication at the Catalogue level using AMGA
• Initial implementation completed
– First results are promising
• Currently working on improving scalability and on federation
• More information about our current work at: http://project-arda-dev.web.cern.ch/project-arda-dev/metadata/