DATABASE MANAGEMENT SYSTEMS IN DATA INTENSIVE ENVIRONMENTS
Leon Guzenda, Chief Technology Officer
DMW2004 3/16/04 Copyright Objectivity, Inc. 2004
AGENDA
• Introduction
• Issues and Approaches
• Summary & Resources
Objectivity Corporate Information
Object Database Management for:
• Data intensive applications that manipulate complex data
• High throughput systems
• Very large volumes of data
Main Markets
• Government
• Scientific
• Telecommunications
• Engineering
• Manufacturing
• Complex IT
Product Highlights
• High Performance with complex data
• Scalability and High Availability
• Fully Distributed
• Interoperability
- C++, Java, Smalltalk, SQL and XML
- Linux, LynxOS, Unix and Windows
• Productivity
- Eclipse IDE
- Eliminates the object to DB mapping layer
SCALABILITY
• Data Volume - 890 Terabytes [BaBar]
• Throughput – Ingested 32 Terabytes per Day [Benchmark]
In a recent benchmark with Objectivity/DB running on 64 Irix processors (600 MHz), CXFS and a 100 Terabyte SAN, we achieved:
- An ingest rate of 32 Terabytes per day (input, correlate and commit)
- Simultaneous queries from 32 processors running at close to 100% CPU capacity
- Simultaneous movement and deletion of aged data to a long term repository
• Simultaneous Users – 100s of Thousands [SprintPCS]
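To put the benchmark ingest figure in perspective, the short calculation below converts 32 Terabytes per day into a sustained per-second rate. It is only a sketch: it assumes decimal units (1 Terabyte = 10^12 bytes), and the per-processor figure assumes the load was spread evenly over all 64 processors, which the benchmark description does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def sustained_rate_mb_per_s(terabytes_per_day):
    """Convert a daily ingest volume into a sustained rate in MB/s.

    Assumes decimal units: 1 TB = 10**12 bytes and 1 MB = 10**6 bytes.
    """
    bytes_per_day = terabytes_per_day * 10**12
    return bytes_per_day / SECONDS_PER_DAY / 10**6


rate = sustained_rate_mb_per_s(32)

# Assumption for illustration only: an even split over all 64 processors.
per_processor = rate / 64

print(f"{rate:.0f} MB/s sustained")
print(f"{per_processor:.2f} MB/s per processor (if split evenly)")
```

Under these assumptions the aggregate works out to roughly 370 MB/s, sustained around the clock.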
ISSUES
• Describing complex data
• Exponentially increasing data volumes
• Sharing data across sites
• Querying huge datasets
• Cost of Ownership
DESCRIBING COMPLEX DATA
Approaches:
• Old Way
- Definitions buried in header files
- Language-specific schema language (DDL/SQL)
• Current Approaches
- Unified Modeling Language [UML]
- XML
• Trends
- Java Data Objects [JDO]
- Grid Database Access and Integration Services
- Higher level schemas and ONTOLOGIES
DATA VOLUMES
Approaches:
• Old Way
- Keep data in compressed files and index them in a DBMS
- Proprietary tape archives
• Current Approaches
- Store everything in an ODBMS (lower overheads than an RDBMS)
- Hierarchical storage systems (HPSS etc.)
• Trends
- Solid State Disks at the front end, commodity disks at the back end
- Heterogeneous Storage Area Networks [SAN], e.g. CXFS
- Fiber Optic processor-to-SAN switches
- Grid enablement (totally distributed archives)
SHARING DATA ACROSS SITES
Approaches:
• Old Way
- Transfer files/disks/tapes
- Filesystem or no security
• Current Approaches
- Distributed databases and the World Wide Web
- High bandwidth networks
- Authentication and secure transport layers
• Trends
- Grid enablement
- Federated databases
- Ultra-high bandwidth networks and remote replication
- Flexible, localized security mechanisms
Distributed Federations
[Figure: Distributed Federations. Labels: database A with replicas A2 and A3; Organization X (User X1, User X2, User X3); Organization Y (User Y1).]
Distributed Federations
[Figure: Distributed Federations with a detached user. Labels: database A with replicas A2 and A3; Organization X (User X1, mobile and detached; User X2; User X3); Organization Y (User Y1).]
QUERYING HUGE DATASETS
Approaches:
• Old Way
- Hold metadata (indexes and relationships) in a searchable file
• Current Approaches
- Hold metadata in an RDBMS and data in files
- Hold metadata and data in an ODBMS
• Trends
- Adaptations of text search engines
- Distributed Parallel Query Engines
- Specialized search accelerators
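The "metadata in an RDBMS, data in files" approach listed above can be illustrated with a minimal sketch. It uses SQLite from the Python standard library as a stand-in for the RDBMS; the `catalog` table, the event ids and the `energy` attribute are all hypothetical names chosen for illustration and do not come from the presentation.

```python
import sqlite3
import tempfile
from pathlib import Path


def build_catalog(data_dir, events):
    """Write each payload to its own file and record its metadata in SQLite.

    `events` maps a hypothetical event id to (energy, payload bytes).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog (event_id TEXT PRIMARY KEY, energy REAL, path TEXT)"
    )
    for event_id, (energy, payload) in events.items():
        path = Path(data_dir) / f"{event_id}.bin"
        path.write_bytes(payload)
        conn.execute(
            "INSERT INTO catalog VALUES (?, ?, ?)", (event_id, energy, str(path))
        )
    conn.commit()
    return conn


def fetch_payloads(conn, min_energy):
    """Query the metadata first, then open only the files that matched."""
    rows = conn.execute(
        "SELECT event_id, path FROM catalog WHERE energy >= ? ORDER BY event_id",
        (min_energy,),
    ).fetchall()
    return {event_id: Path(path).read_bytes() for event_id, path in rows}


events = {
    "evt-001": (1.5, b"low energy payload"),
    "evt-002": (7.2, b"high energy payload"),
    "evt-003": (9.9, b"very high energy payload"),
}

with tempfile.TemporaryDirectory() as tmp:
    conn = build_catalog(tmp, events)
    result = fetch_payloads(conn, min_energy=5.0)
    conn.close()

print(sorted(result))  # ['evt-002', 'evt-003']
```

The query touches only the small metadata table; bulk files are opened only for rows that match. The trade-off is that the database and the files must be kept consistent by the application, which is one reason for holding metadata and data together in a single DBMS.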
Current Architecture
Queries run synchronously within the client
[Figure: architecture diagram. Labels: APPLICATION; DBA Tools; Language Interfaces; Object & Schema Managers; Query & Index Managers; Storage & Transaction Managers; Networking & Event Managers; Lock Server; Data “Page” Server; Mass Storage.]
Parallel Query Engine [PQE]
Queries run asynchronously and in parallel, either locally or distributed
[Figure: architecture diagram. Labels: APPLICATION; DBA Tools; Language Interfaces; Object & Schema Managers; Query & Index Managers; Storage & Transaction Managers; Networking & Event Managers; Lock Server; Data “Page” Servers; PQE.]
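The difference between a query that runs synchronously within the client and one that runs asynchronously and in parallel can be sketched in a few lines. This is not Objectivity's PQE interface: the function names are hypothetical, the in-memory lists stand in for database partitions, and a local thread pool stands in for the local or distributed workers that a real engine would use.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def scan_partition(partition, predicate):
    """Scan one partition and return the objects that satisfy the predicate."""
    return [obj for obj in partition if predicate(obj)]


def query_synchronous(partitions, predicate):
    """Baseline: the client scans each partition in turn."""
    matches = []
    for partition in partitions:
        matches.extend(scan_partition(partition, predicate))
    return matches


def query_parallel(partitions, predicate, max_workers=4):
    """Submit one scan task per partition and gather results as they finish."""
    matches = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(scan_partition, p, predicate) for p in partitions]
        for future in as_completed(futures):
            matches.extend(future.result())
    return matches


def is_multiple_of_50(n):
    return n % 50 == 0


# Three stand-in partitions holding the integers 0..299.
partitions = [list(range(0, 100)), list(range(100, 200)), list(range(200, 300))]

sequential = query_synchronous(partitions, is_multiple_of_50)
parallel = query_parallel(partitions, is_multiple_of_50)

# Parallel results arrive in completion order, so compare after sorting.
print(sorted(parallel) == sequential)
```

Both functions return the same set of matches; the parallel version gives up result ordering in exchange for overlapping the scans.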
PQE and Search Accelerator
Queries run asynchronously and in parallel, but with Predicate Management within the Search Accelerator
[Figure: architecture diagram. Labels: APPLICATION; DBA Tools; Language Interfaces; Object & Schema Managers; Query Manager; Storage & Transaction Managers; Networking & Event Managers; Lock Server; Data Servers; PQE; Search Accelerator (FPGA & RAM).]
COST OF OWNERSHIP
Approaches:
• Old Way
- Build It Yourself (many hidden costs)
- Run It Yourself
• Current Approaches
- Use Commercial Off The Shelf [COTS] software
- Open Source
- Commodity hardware & tiered storage
• Trends
- Heterogeneous storage
- Grid Enablement
- Resource and Skill Brokers (Future)
SUMMARY
• Database languages are still evolving
• Data throughput is increasing and system latency times are decreasing
• Sharing data across sites still presents many challenges
• Querying vast datasets will become faster and cheaper
• Software vendors are wrestling with Open Source issues
• Startup costs are still high, but the trends are downward
• Grid enablement will help
• Keep working on the Standards!
RESOURCES
• http://www.objectivity.com
- Technical Overview
- Data Sheets and White Papers
- Free downloadable Java and C++ evaluation software and tutorials
• Global Grid Forum
- http://www.ggf.org
• Email: [email protected]
ANY QUESTIONS?