
© Copyright 2000 M. Rodriguez-Martinez, All Rights Reserved

Automatic Deployment of Application-Specific Metadata and Code in MOCHA

Manuel Rodriguez-Martinez

Nick Roussopoulos


EDBT 2000 M. Rodriguez-Martinez – N. Roussopoulos


Introduction

Database Middleware Systems:
• Used to integrate data from multiple sources
• Help to keep clients simple
  – thin clients
  – economic ($$$) to deploy
  – Web-based GUI
• Re-use existing servers
  – replacing them can be expensive and dangerous
• Examples
  – TSIMMIS, Garlic, DISCO, Oracle, Sybase, ...

[Figure: a Client, an Integration Server with a Catalog, and three Translators in front of the Oracle, Images, and XML sources.]


Limitations of this Solution

Code Deployment Problem
– Code for data types and operators is user-defined
  • Polygon
  • Perimeter()
– Need to manually install the code to:
  • clients
  • integration servers
  • translators
– Must be ported (C/C++ code)
– Security (do not crash system)
Does not scale well as the number of sites increases
  • hard to deploy, upgrade and maintain the code

[Figure: a Client, an Integration Server with a Catalog, and three Translators in front of the Oracle, Images, and XML sources.]


Limitations of this Solution

Query Processing Problem
– Availability of code limits operator placement options.
  • not all sites can evaluate the operators in a query
– Integration server ends up doing most of the processing.
  • data must be shipped to it
– Too much data movement!
Does not scale well
  • network becomes a major performance bottleneck
  • limited bandwidth increases query execution time

[Figure: a Client, an Integration Server with a Catalog, and three Translators in front of the Oracle, Images, and XML sources, with three data transfers labeled 100MB each.]


The MOCHA Solution

Middleware system automatically deploys the code
– ship Java classes for data types and operators
– done at run time in dynamic fashion

Provide information on how to use the code
– metadata and control in XML and RDF

Exploit these features in query operator placement
– place operators at sites that minimize data movement
  • remote data sources get operators that filter the data
  • integration server gets operators that expand the data
– more on this: SIGMOD 2000 paper
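
The placement rule on this slide can be sketched in a few lines. The sketch below is illustrative only and is not MOCHA's optimizer: the `Operator` class, the `place` function, the `Enlarge` operator, all byte counts, and the use of an output-versus-input size comparison as the criterion are assumptions made for this example. The slide states only that filtering operators go to the remote data sources and expanding operators go to the integration server.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    """Hypothetical description of a query operator (not a MOCHA class)."""
    name: str
    input_bytes: int    # estimated bytes the operator reads
    output_bytes: int   # estimated bytes the operator produces


def place(op: Operator) -> str:
    """Choose a site so that the smaller side of the operator crosses the network.

    Assumed rule: an operator whose output is smaller than its input filters
    the data and runs at the remote data source; otherwise it expands the
    data and runs at the integration server.
    """
    if op.output_bytes < op.input_bytes:
        return "data source"
    return "integration server"


# Sizes are invented for illustration; "Enlarge" is a hypothetical operator.
composite = Operator("Composite", input_bytes=200_000_000, output_bytes=200_000)
enlarge = Operator("Enlarge", input_bytes=1_000_000, output_bytes=4_000_000)

print(place(composite))  # data source
print(place(enlarge))    # integration server
```

A real optimizer would also weigh CPU cost and selectivity estimates; the size comparison here only captures the data-movement argument made on the slide.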


MOCHA Architecture

[Figure: two Clients and the QPC, with its Catalog and Code Repository, connected over the Network to four DAPs, one in front of each data source: Oracle 8i, Informix, XML Repository, and Text Files.]


Automatic Code Deployment

Query:
Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

[Figure: the Client, the QPC with its Catalog and Code Repository, and two DAPs (one in front of Informix, one in front of Oracle) connected over the Internet. Site labels: Texas, Virginia, Maryland.]


Answering the Query

Query:
Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

[Figure: the Client, the QPC with its Catalog and Code Repository, and two DAPs (one in front of Informix, one in front of Oracle) connected over the Internet. Site labels: Texas, Virginia, Maryland. Data-flow labels: tuples of 200MB and 100MB; results of 200KB, 150KB, and 350KB.]
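
As a rough check on the figure, the volumes can be compared directly. The sketch below assumes one reading of the labels: the two DAPs scan 200MB and 100MB of tuples locally and send 200KB and 150KB of results over the network, which together make up the 350KB of results. It also assumes decimal units (1MB = 1000KB). The comparison against shipping all tuples is made here for illustration; the variable names are not from the source.

```python
# Volumes in KB, read off the figure labels (decimal units assumed).
tuples_kb = [200_000, 100_000]   # 200MB and 100MB of tuples at the two DAPs
results_kb = [200, 150]          # 200KB and 150KB of results sent by the DAPs

shipped_without_code = sum(tuples_kb)   # all tuples cross the network
shipped_with_code = sum(results_kb)     # only aggregated results cross it

reduction = 1 - shipped_with_code / shipped_without_code

print(shipped_with_code)     # 350
print(f"{reduction:.2%}")    # 99.88%
```

Under these assumptions the two result sets add up to the 350KB label, and the reduction is in line with the 99% cut in data movement reported on the performance slide.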


Components of MOCHA

• Client Application
• QPC (Query Processing Coordinator)
  – parsing (SQL)
  – optimizing
  – catalog management
  – code deployment
  – query execution
• DAP (Data Access Provider)
  – data translation
  – query execution
• Data Server
  – storage server

[Figure: the Client, the QPC with its Catalog and Code Repository, and three DAPs (in front of Oracle, XML, and Text sources) connected over the Internet.]


Catalog Organization

• Holds information describing the structure and proper use of tables, data types and query operators.
  – Generically referred to as “resources”
• Each resource is uniquely identified by a URI:
  – mocha://cs1.umd.edu/EarthSci/Polygon
• Metadata is encoded using RDF (serialized as XML), which makes it easy to understand, use and exchange metadata
• Each resource has a catalog entry in the form:

(URI, RDF File)
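
A catalog of this shape behaves like a map from URI to RDF file. The sketch below is a minimal in-memory illustration, not MOCHA's catalog implementation: the `Catalog` class, its method names, and the RDF file name `Polygon.rdf` are hypothetical. Only the `mocha://` URI and the (URI, RDF File) entry form come from the slide.

```python
from urllib.parse import urlparse


class Catalog:
    """Hypothetical in-memory catalog of (URI, RDF file) entries."""

    def __init__(self):
        self._entries = {}

    def register(self, uri: str, rdf_file: str) -> None:
        # A URI identifies exactly one resource, so duplicates are rejected.
        if uri in self._entries:
            raise ValueError(f"resource already registered: {uri}")
        self._entries[uri] = rdf_file

    def lookup(self, uri: str) -> str:
        """Return the RDF file that describes the resource."""
        return self._entries[uri]


uri = "mocha://cs1.umd.edu/EarthSci/Polygon"

catalog = Catalog()
catalog.register(uri, "Polygon.rdf")  # file name is an assumption

# The URI splits into scheme, host, and resource path like any other URI.
parts = urlparse(uri)
print(parts.scheme, parts.netloc, parts.path)  # mocha cs1.umd.edu /EarthSci/Polygon
print(catalog.lookup(uri))                     # Polygon.rdf
```

Keying the map on the full URI keeps resources with the same short name at different sites distinct.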


Metadata Requirements

Query:
Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

Table Rasters: location, image, week, band

1. What kind of metadata is needed?
2. How should it be specified?


RDF Model: Data Types

[Figure: RDF graph for the resource mocha://cs1.umd.edu/EarthSci/Raster, with these properties:]

  mocha:Type        Raster
  mocha:Class       Raster.class
  mocha:Repository  cs1.umd.edu/EarthSci
  mocha:Size        1 megabyte
  mocha:Creator     [email protected]


RDF Model: Query Operators

[Figure: RDF graph for the resource mocha://cs1.umd.edu/EarthSci/Composite. Recoverable labels:]

  mocha:Aggregate   Composite
  mocha:Class       Composite.class
  mocha:Repository  cs1.umd.edu/EarthSci
  mocha:Creator     [email protected]
  mocha:Arguments   rdf:Seq (rdf:type) of entries with mocha:Type (Raster) and mocha:URI (. . .)
  mocha:Result      entry with mocha:Type (Raster) and mocha:URI (. . .)


RDF Model: Tables

[Figure: RDF graph for the resource mocha://cs1.umd.edu/EarthSciDB/Rasters. Recoverable labels:]

  mocha:Table     Rasters
  mocha:Database  cs1.umd.edu/EarthSciDB
  mocha:Owner     [email protected]
  mocha:Columns   rdf:Seq (rdf:type) of entries, each with mocha:Column, mocha:Type, and mocha:URI:
                    location  Polygon  . . .
                    image     Raster   . . .
                    . . .


Metadata and Control Exchange

• QPC sends to each DAP:
  – metadata for the data types and operators they will receive
  – query plan specifying task to do
• Metadata is serialized as XML
  – RDF serialization syntax
• Plans
  – XML documents
  – easy to use and understand
  – can be mapped to suitable form
    • tree, DAG, graph, etc.
  – prevents version inconsistencies
    • changes in Java classes

<rdf:Description about="mocha://cs1.umd.edu/EarthSci/Raster">
  <mocha:Type>Raster</mocha:Type>
  <mocha:Class>Raster.class</mocha:Class>
  <mocha:Repository>cs1.umd.edu/EarthSci</mocha:Repository>
  <mocha:Size>1MB</mocha:Size>
  <mocha:Creator>[email protected]</mocha:Creator>
</rdf:Description>
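
A reader for such a description could look like the sketch below, which parses the RDF/XML with Python's standard `xml.etree.ElementTree`. It is an illustration under stated assumptions, not MOCHA code (MOCHA itself ships Java classes): the `mocha` namespace URI is a placeholder because the slide does not give one, the enclosing `rdf:RDF` element is added so the namespaces can be declared, `parse_description` is a hypothetical helper, and the `mocha:Creator` element is left out because its value is redacted in the source. The `rdf` namespace URI is the standard W3C one.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MOCHA_NS = "http://example.org/mocha#"  # placeholder; not given on the slide

DOC = f"""\
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:mocha="{MOCHA_NS}">
  <rdf:Description about="mocha://cs1.umd.edu/EarthSci/Raster">
    <mocha:Type>Raster</mocha:Type>
    <mocha:Class>Raster.class</mocha:Class>
    <mocha:Repository>cs1.umd.edu/EarthSci</mocha:Repository>
    <mocha:Size>1MB</mocha:Size>
  </rdf:Description>
</rdf:RDF>
"""


def parse_description(xml_text: str) -> dict:
    """Return the resource URI and its mocha:* properties as a dict."""
    root = ET.fromstring(xml_text)
    desc = root.find(f"{{{RDF_NS}}}Description")
    props = {"uri": desc.get("about")}
    for child in desc:
        # Tags arrive as "{namespace}local"; keep only the local name.
        local = child.tag.split("}", 1)[1]
        props[local] = child.text.strip()
    return props


meta = parse_description(DOC)
print(meta["uri"])    # mocha://cs1.umd.edu/EarthSci/Raster
print(meta["Class"])  # Raster.class
```

With the class name and repository in hand, a receiving site has what it needs to decide which code to request before executing its part of the plan.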


Processing a Query in MOCHA

1. Query Parsing
2. Resource Discovery
3. Query Optimization
4. Metadata and Control Exchange
5. Code Deployment Phase
6. Query Execution

Query:
Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

Table Rasters: location, image, week, band


Performance of MOCHA

Query:
Select location, Composite(image)
From Rasters
Where week BETWEEN t1 and t2
Group By location

[Figure: bar chart of Running Time (secs, axis from 0 to 600) by Middleware Type (Non-MOCHA vs. MOCHA), broken down into DB, CPU, NET, and MISC.]

Shipping Composite() code to the DAP cuts data movement by 99%, a 4-to-1 performance improvement.


Benefits of MOCHA

• Middle-tier solution
• Extensible
• Java Code Re-usability
  – across platforms
• Automatic Code Deployment
  – “Plug-n-Play”
• Easier to Administer
• XML-based Metadata
• XML-based Control
• Efficient Query Processing
  – data movement reduction
  – moving code vs. data


Conclusions

• Identified limitations in existing middleware systems
  – Code Deployment Problem
  – Query Processing Problem
• Proposed a new framework to automate the deployment of new functionality:
  – automatic code deployment
  – efficient query processing
• Described its implementation in MOCHA, based on well-accepted technologies: Java, XML, RDF.

http://www.cs.umd.edu/projects/mocha/