OODBS
8/2/2019 OODBS
Chapter III
Object-Oriented Database Systems
Nguyen Kim Anh, Dept. of Information Systems, SoICT, HUT
Outline
- Object-Oriented Data Model
- Object-Oriented Database System (OODBS)
- Object-Oriented Data Definition Language
- Object-Oriented Query Language
- Index organizations for OODBS
- Query Optimization in OODBS
Object-Oriented Data Model
- Definition of an object
- Object Identity
- Object Structure
- Object-Oriented Concepts
- Graphical representation of a complex object
- Comparisons of the states of two objects for equality
- Class Schema
Definition of an object
Objects are user-defined complex data types.
- An object has structure or state (variables) and methods (behavior/operations).
An object is described by four characteristics:
- Identifier: a system-wide unique id for an object
- Name: an object may also have a unique name in the DB (optional)
- Lifetime: determines whether the object is persistent or transient
- Structure: construction of objects using type constructors
Object Identity
- Each independent object stored in the database has a unique identity.
- The identity is provided by a unique, system-generated object identifier, or OID.
Properties of an OID
- Immutable: the OID value of a particular object should not change.
- Each OID is used only once.
- An OID should not depend on any physical address or attribute values of the object.
Most OO database systems allow for the representation of both objects and values (which have no OIDs).
Object Structure
The state (current value) of a complex object may be constructed from other objects (or other values) by using certain type constructors.
An object can be represented by a triple (i, c, v):
- i is a unique id (the OID)
- c is a type constructor
- v is the object state
Type constructors
- Basic types: atom (integer, real, string, Boolean, ...)
- Structured type: tuple
- Collection types: array vs. list (ordered), set vs. bag (unordered)
Object states
- c = atom: v is an atomic value from the domain
- c = set: v is a set of object identifiers {i1, i2, ..., in}
- c = tuple: v is a tuple of attribute name and object identifier pairs
- c = list: v is an ordered list [i1, i2, ..., in]
- c = array: v is a single-dimensional array of object identifiers
- c = bag: v is a bag (an unordered collection) of object identifiers
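The triple (i, c, v) and the OID properties can be illustrated with a minimal Python sketch. The class name `DBObject`, the counter-based OID generator, and the `i<number>` format are assumptions made for illustration only; a real OODBS generates and manages OIDs internally.

```python
from itertools import count
from typing import Any

_oid_counter = count(1)  # hypothetical OID source: each value is handed out once


class DBObject:
    """An object represented as a triple (i, c, v)."""

    def __init__(self, constructor: str, state: Any) -> None:
        self._oid = f"i{next(_oid_counter)}"  # i: system-generated, never reused
        self.constructor = constructor        # c: "atom", "tuple", "set", ...
        self.state = state                    # v: atomic value or component OIDs

    @property
    def oid(self) -> str:
        # Read-only property: the OID cannot be reassigned after creation.
        return self._oid


houston = DBObject("atom", "Houston")
bellaire = DBObject("atom", "Bellaire")
# A set object whose state is a set of OIDs of other objects.
locations = DBObject("set", {houston.oid, bellaire.oid})
```

The state of `houston` may be changed freely, but an attempt to assign to `houston.oid` raises `AttributeError`, which mirrors the immutability requirement.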
Object-Oriented Concepts
- Abstract data types: a class definition provides an extension to complex attribute types
- Encapsulation: the implementation of operations and the object structure are hidden
- Inheritance: sharing of data within the scope of a hierarchy; supports code reusability
- Polymorphism: operator overloading
Example 1: Complex Object
o8 = (i8, tuple, <NAME: i5, NUMBER: i4, MANAGER: i9, LOCATIONS: i7, MEMBERS: i10, CONTROL: i11>) ::: department 5
o9 = (i9, tuple, <MANAGER: ..., MANAGERSTARTDATE: i6>)
o10 = (i10, set, {i12, i13, i14})
o11 = (i11, set, {i15, i16, i17})
Graphical representation of a complex object
[Figure: object graph of o8, an object instance of the department type. The legend distinguishes object, tuple, and set nodes. The tuple i8 has the components NAME (i5, atom "Research"), NUMBER (i4, atom 5), MANAGER (i9, tuple), LOCATIONS (i7, set), MEMBERS (i10, set), and CONTROL (i11, set). The tuple i9 has the components MANAGER and MANAGERSTARTDATE (i6, atom 1988-05-22). The set i7 contains the atoms Houston, Bellaire, and Sugarland (o1, o2, o3). The sets i10 and i11 contain the tuples i12 to i14 and i15 to i17.]
Comparisons of the states (current values) of two objects for equality
- Identical states (deep equality): the graphs representing the two states are identical in every respect, including the OIDs at every level.
- Equal states (shallow equality):
  - the graph structures must be the same
  - all the corresponding atomic values in the graphs must be the same
  - some corresponding internal nodes in the two graphs may be objects with different OIDs
Example 2: Identical vs. Equal Object States
o1 = (i1, tuple, <...>)
o2 = (i2, tuple, <...>)
o3 = (i3, tuple, <...>)
o4 = (i4, atom, 10)
o5 = (i5, atom, 10)
o6 = (i6, atom, 20)
- o1 and o2 have equal states
- o1 and o3 have identical states
- o4 and o5 have identical states
- the objects o4 and o5 themselves are equal but not identical (they have distinct OIDs)
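The two notions of equality can be checked mechanically. The sketch below is a minimal Python illustration, not part of the original slides. The tuple contents of o1, o2, and o3 are not preserved in this transcript, so the attribute names `a1` and `a2` and the OIDs each tuple references are assumptions, chosen only so that they agree with the four conclusions stated above. The functions handle atoms and tuples only and assume an acyclic object graph.

```python
# Object store: OID -> (constructor, state).
# Tuple states map attribute names to OIDs (attribute names a1/a2 are assumed).
db = {
    "i1": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i2": ("tuple", {"a1": "i5", "a2": "i6"}),
    "i3": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i4": ("atom", 10),
    "i5": ("atom", 10),
    "i6": ("atom", 20),
}


def identical_states(x: str, y: str) -> bool:
    """States match in every respect, including the OIDs they reference."""
    return db[x] == db[y]


def equal_states(x: str, y: str) -> bool:
    """Same structure and same atomic values; referenced OIDs may differ."""
    cx, vx = db[x]
    cy, vy = db[y]
    if cx != cy:
        return False
    if cx == "atom":
        return vx == vy
    # Tuple: same attribute names, and component states equal recursively.
    return vx.keys() == vy.keys() and all(
        equal_states(vx[attr], vy[attr]) for attr in vx
    )
```

With this data, `equal_states("i1", "i2")` holds while `identical_states("i1", "i2")` does not, because o1 and o2 reach the value 10 through the distinct objects o4 and o5.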
Class Schema
Object-Oriented Database System (OODBS)
A database system that incorporates all the important object-oriented concepts.
Some additional features:
- Unique object identifiers
- Persistent object handling
Advantages of OODBS
- The designer can specify the structure of objects and their behavior (methods)
- Better interaction with object-oriented languages such as Java and C++
- Definition of complex and user-defined types
- Encapsulation of operations and user-defined methods
Object-Oriented Data Definition Language (OODDL)
Using OODDL to define Employee, Date, and Department types

    define type Employee:
        tuple ( name:       string;
                birthday:   Date;
                address:    string;
                sex:        string;
                salary:     int;
                workfor:    Department;
                supervisor: Employee;
                supervisee: set(Employee);
                manage:     Department;
                workon:     set(Project); );

- Attributes that refer to Employee, Department, and Project objects represent relationships among objects.

Using OODDL to define Employee, Date, and Department types (Cont.)

    define type Date:
        tuple ( year:  integer;
                month: integer;
                day:   integer; );

    define type Department:
        tuple ( name:      string;
                number:    integer;
                manager:   tuple ( manager:   Employee;
                                   startdate: Date; );
                locations: set(string);
                members:   set(Employee);
                control:   set(Project); );

- Inverse references: the department of an employee (workfor) and the employees of a department (members).
- A multivalued reference is represented as a set of references, e.g. set(Employee).
Specifying Object Behavior via Class Operations
Define the behavior of a type of object based on the operations that can be externally applied to objects of that type:
- create (insert) or destroy (delete) objects
- update the object state
- retrieve parts of the object state
- apply some calculations
- a combination of retrieval, calculation, and update
In the relational model, by contrast, selecting, inserting, deleting, and modifying tuples are generic operations.
Specifying Object Behavior via Class Operations (Continued)
- Interface: defines the name and arguments (parameters) of each operation
  - signature (included in the class definition)
- Implementation
  - method (defined using a programming language)
  - a method is invoked by sending a message to the object to execute the corresponding method
Operations
1. object constructor
2. object destructor
3. object modifier
4. retrieval
Using OODDL to define Employee and Department classes

    define class Employee:
        (* type definition *)
        type tuple ( name:       string;
                     birthday:   Date;
                     address:    string;
                     sex:        string;
                     salary:     int;
                     workfor:    Department;
                     supervisor: Employee;
                     supervisee: set(Employee);
                     manage:     Department;
                     workon:     set(Project); );
        (* definition of operations *)
        operations  age:         integer;
                    create_emp:  Employee;
                    destroy_emp: boolean;
    end Employee;
Using OODDL to define Employee and Department classes (Continued)

    define class Department:
        (* type definition *)
        type tuple ( name:      string;
                     number:    integer;
                     manager:   tuple ( manager:   Employee;
                                        startdate: Date; );
                     locations: set(string);
                     members:   set(Employee);
                     control:   set(Project); );
        (* definition of operations *)
        operations  number_of_emps: integer;
                    create_dept:    Department;
                    destroy_dept:   boolean;
                    assign_emp(e: Employee): boolean;
                    (* adds a new employee to the department *)
                    remove_emp(e: Employee): boolean;
                    (* removes an employee from the department *)
    end Department;
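For readers more familiar with programming-language classes, the Department class can be approximated in Python. This is only a rough analogue under stated assumptions: the boolean return convention of `assign_emp` and `remove_emp` (False when nothing changes) is not specified in the slides, and only a subset of the attributes is modelled.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based equality, similar to OIDs
class Employee:
    name: str


@dataclass(eq=False)
class Department:
    name: str
    number: int
    members: set = field(default_factory=set)  # set(Employee)

    def number_of_emps(self) -> int:
        """Retrieval operation: size of the members set."""
        return len(self.members)

    def assign_emp(self, e: Employee) -> bool:
        """Modifier: adds a new employee to the department."""
        if e in self.members:
            return False  # assumed convention: already a member
        self.members.add(e)
        return True

    def remove_emp(self, e: Employee) -> bool:
        """Modifier: removes an employee from the department."""
        if e not in self.members:
            return False  # assumed convention: not a member
        self.members.remove(e)
        return True


d = Department("Research", 5)   # constructor call plays the role of create_dept
alice = Employee("Alice")       # hypothetical employee
d.assign_emp(alice)
```

Operations and attributes are both reached with dot notation, for example `d.number_of_emps()` and `d.number`.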
Class Operations
- Object constructor: creates a new object
- Destructor: destroys an object
- Object modifier: modifies various attributes of an object
- Dot notation
  - applies an operation to an object: d.number_of_emps, where d is a reference to a department object and number_of_emps is an operation
  - refers to attributes of an object: d.number, d.manager.startdate
Specifying Object Persistence via Naming and Reachability
- Transient objects exist in the executing program and disappear once the program terminates.
- Persistent objects are stored in the database and persist after program termination.
- Naming mechanism: give an object a unique persistent name through which it can be retrieved by this and other programs.
Reachability
- Reachability mechanism: make the object reachable from some persistent object.
  - An object B is said to be reachable from an object A if a sequence of references in the object graph leads from object A to object B.
  - e.g., if o8 is persistent, then all other objects in its object graph also become persistent (see the graphical representation of the complex object o8).
- To define a persistent collection of objects of class C:
  - create a named persistent object N, whose state is a set or list of objects of some class C
  - add objects of C to the set or list, which makes them reachable from N
  - N then defines a persistent collection of objects of class C
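Persistence by reachability amounts to a graph traversal from the named persistent objects. The following Python sketch shows the idea; the function name and the dictionary encoding of references are assumptions. The reference table follows the department object graph of Example 1, except that the MANAGER reference of i9 is left out because its target is not preserved in this transcript, and `i99` is a hypothetical transient object added for contrast.

```python
def persistent_oids(named_roots, refs):
    """Return every OID reachable from the named persistent roots."""
    seen = set()
    stack = list(named_roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(refs.get(oid, ()))  # follow outgoing references
    return seen


# OID -> OIDs referenced by that object's state.
refs = {
    "i8": ["i5", "i4", "i9", "i7", "i10", "i11"],
    "i9": ["i6"],
    "i7": ["i1", "i2", "i3"],
    "i10": ["i12", "i13", "i14"],
    "i11": ["i15", "i16", "i17"],
    "i99": ["i5"],  # hypothetical transient object: it references i5,
                    # but nothing persistent references it
}

persistent = persistent_oids(["i8"], refs)
```

Making o8 persistent makes all seventeen objects i1 to i17 persistent, while `i99` stays transient because no reference path leads to it from a persistent object.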
Graphical representation of a complex object
[Figure repeated from Example 1: object graph of o8, an object instance of the department type. Every other object in the graph is reachable from o8.]
Creating persistent objects by naming and reachability

    define class DepartmentSet:
        type set (Department);
        operations  add_dept(d: Department): boolean;
                    remove_dept(d: Department): boolean;
                    create_dept_set: DepartmentSet;
                    destroy_dept_set: boolean;
    end DepartmentSet;
    .....
    persistent name AllDepartments: DepartmentSet;
    (* AllDepartments is a persistent named object of type DepartmentSet *)
    .....
    d := create_dept;
    (* creates a new department object in the variable d *)
    .....
    b := AllDepartments.add_dept(d);
    (* makes d persistent by adding it to the persistent named object AllDepartments *)

The AllDepartments object holds the extent of the class Department.
Differences between traditional databases and OO databases
- Traditional database models: when an entity type or class is defined in EER, it represents both a type declaration and a persistent set.
- OO approaches:
  - a class declaration specifies only the type and operations for a class of objects
  - the user must define a persistent object whose value is the collection of references to all persistent objects of that class
Type Hierarchies and Inheritance
- Type (or class) hierarchy: define new types based on other predefined types (or classes)
- A type is defined by:
  - a type name
  - a number of attributes (instance variables)
  - operations (methods)
- General form: TYPE_NAME: function, function, ..., function
  - PERSON: Name, Address, Birthdate, Age, SSN
  - EMPLOYEE subtype-of PERSON: Salary, HireDate, Seniority
  - STUDENT subtype-of PERSON: Major, GPA
- Attributes and operations are both treated as functions; an attribute is a function with zero arguments.
Inheritance
- Multiple inheritance: when T is a subtype of two (or more) types, T inherits the functions (attributes and methods) of both supertypes
  - the result is a type lattice instead of a type hierarchy
  - if a function is inherited from some common supertype, it is inherited only once
- Ambiguity resolution
  - alert the user
  - apply a system default
  - disallow multiple inheritance
Inheritance (Continued)
- Selective inheritance: a subtype inherits only some of the functions of a supertype
  - an EXCEPT clause may be used to list the functions in a supertype that are not to be inherited by the subtype
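The inheritance rules can be traced with a small Python sketch that models each type as a list of function names, in the TYPE_NAME: function, ..., function style used for PERSON, EMPLOYEE, and STUDENT. The type STUDENT_EMPLOYEE, the dictionary layout, and the `excluded` parameter (a simplified stand-in for an EXCEPT clause) are assumptions introduced for illustration.

```python
TYPES = {
    "PERSON": {"supertypes": [],
               "functions": ["Name", "Address", "Birthdate", "Age", "SSN"]},
    "EMPLOYEE": {"supertypes": ["PERSON"],
                 "functions": ["Salary", "HireDate", "Seniority"]},
    "STUDENT": {"supertypes": ["PERSON"],
                "functions": ["Major", "GPA"]},
    # Hypothetical type with two supertypes (a lattice, not a hierarchy).
    "STUDENT_EMPLOYEE": {"supertypes": ["EMPLOYEE", "STUDENT"],
                         "functions": []},
}


def all_functions(type_name, excluded=frozenset(), types=TYPES):
    """Functions visible on type_name: inherited ones first, each only once."""
    result = []

    def visit(name):
        for sup in types[name]["supertypes"]:
            visit(sup)
        for fn in types[name]["functions"]:
            # A function reached through two paths is kept a single time.
            if fn not in result and fn not in excluded:
                result.append(fn)

    visit(type_name)
    return result


lattice_functions = all_functions("STUDENT_EMPLOYEE")
selective = all_functions("EMPLOYEE", excluded={"Age"})
```

PERSON is reachable from STUDENT_EMPLOYEE through both EMPLOYEE and STUDENT, yet `Name` appears once in `lattice_functions`; `selective` shows a subtype that inherits everything from PERSON except `Age`.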
Object-Oriented Query Language
- Declarative query language
  - not computationally complete
- Syntax based on SQL (select, from, where)
- Additional flexibility (queries with user-defined operators and types)
SQL3: Object-oriented SQL
- Foundation for several OO database management systems: ORACLE8, DB2, etc.
- New features: relational and object-oriented
  - Relational features: new data types, new predicates, enhanced semantics, additional security, and an active database
  - Object-oriented features: support for functions and procedures
- Set-oriented query language
Object Query Language (OQL)
- Syntax based on SQL (select, from, where):
    select ...
    from ... [, ...]
    where ...
- Path-oriented query language
  - Path: C1.A1.A2. ... .An-1.An, where the attributes A1, A2, ..., An-1 lead to the classes C2, C3, ..., Cn
  - Path expression: C1.A1.A2. ... .An-1.An = v
Example of OQL query
The following is a sample query: what are the names of the black products?

    select distinct p.name
    from products p
    where p.color = "black"

Valid in both SQL and OQL, but the results are different.
Result of the query (SQL)
Original table:

    Product no | Name          | Color
    P1         | Ford Mustang  | Black
    P2         | Toyota Celica | Green
    P3         | Mercedes SLK  | Black

- The statement queries a relational database.
=> Returns a table with rows.
Result:

    Name
    Ford Mustang
    Mercedes SLK
Result of the query (OQL)
Original table:

    Product no | Name          | Color
    P1         | Ford Mustang  | Black
    P2         | Toyota Celica | Green
    P3         | Mercedes SLK  | Black

- The statement queries an object-oriented database.
=> Returns a collection of objects.
Result:

    String: Ford Mustang
    String: Mercedes SLK

Comparison
- Queries look very similar in SQL and OQL; sometimes they are the same.
- In fact, the results they give are very different.
Query returns:

    OQL                   | SQL
    Object                | Tuple
    Collection of objects | Table
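The difference in result shape can be mimicked in Python. This is only an analogy under stated assumptions: the relational result is modelled as a one-column table of row tuples, the object result as a set of string objects, and the `Product` class and its field names are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    product_no: str
    name: str
    color: str


products = [
    Product("P1", "Ford Mustang", "Black"),
    Product("P2", "Toyota Celica", "Green"),
    Product("P3", "Mercedes SLK", "Black"),
]

# SQL-style result: a table, i.e. a header plus a list of row tuples.
sql_result = {
    "columns": ("name",),
    "rows": sorted({(p.name,) for p in products if p.color == "Black"}),
}

# OQL-style result: a collection (here a set) of string objects.
oql_result = {p.name for p in products if p.color == "Black"}
```

Both results carry the same two names, but `sql_result["rows"]` holds tuples of a table while `oql_result` holds the string objects themselves.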
Index organizations for OODBS
Path index (PX):
- a path $P = C_1.A_1.A_2.\cdots.A_{n-1}.A_n$
- a path index (PX) on $P$ with $C_i$, $1 \le i \le n$:
  $$\{(v, S) \mid v \in \mathrm{DOM}(A_n) \text{ and } S = \{O_i.O_{i+1}.\cdots.O_n \mid O_1.O_2.\cdots.O_n.v \text{ is an instantiation of } P\}\}$$
Nested index (NX):
- a path $P = C_1.A_1.A_2.\cdots.A_{n-1}.A_n$
- a nested index (NX) on $P$:
  $$\{(v, S) \mid v \in \mathrm{DOM}(A_n) \text{ and } S = \{O \mid O_1.O_2.\cdots.O_n.v \text{ is an instantiation of } P,\ O_i = O,\ 1 \le i \le n\}\}$$
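A minimal Python sketch can make the two definitions concrete for a path of length two, Employee.workfor.name. Everything in it is an illustrative assumption: the object store, the OIDs, single-valued references, and the choice i = 1, so that the path index keeps whole instantiations O1.O2 and the nested index keeps only the starting object O1.

```python
from collections import defaultdict

# Hypothetical object store: OID -> attribute values (references are OIDs).
objects = {
    "e1": {"workfor": "d1"},
    "e2": {"workfor": "d2"},
    "e3": {"workfor": "d1"},
    "d1": {"name": "CS"},
    "d2": {"name": "Math"},
}


def instantiations(start_oids, attrs):
    """Yield ((O1, ..., On), v) for the path C1.A1. ... .An."""
    for oid in start_oids:
        chain = [oid]
        current = oid
        for attr in attrs[:-1]:          # follow references A1 .. An-1
            current = objects[current][attr]
            chain.append(current)
        yield tuple(chain), objects[current][attrs[-1]]  # v = On.An


def build_indexes(start_oids, attrs):
    path_index = defaultdict(set)    # v -> {(O1, ..., On)}
    nested_index = defaultdict(set)  # v -> {O1}
    for chain, value in instantiations(start_oids, attrs):
        path_index[value].add(chain)
        nested_index[value].add(chain[0])
    return path_index, nested_index


px, nx = build_indexes(["e1", "e2", "e3"], ["workfor", "name"])
```

A lookup of `"CS"` in `nx` returns the employees directly, whereas the same lookup in `px` also exposes the intermediate department object of each instantiation.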
Index organizations for OODBS
Multi-index (MX):
- a path $P = C_1.A_1.A_2.\cdots.A_{n-1}.A_n$
- a multi-index (MX) on $P$:
  $$\bigcup_{1 \le i \le n} \{I_{i,1}, I_{i,2}, \ldots, I_{i,n_i}\}$$
  where $I_{i,j}$, $1 \le i \le n$, $1 \le j \le n_i$, is a single index on path $C_{i,j}.A_i$ and $n_i$ is the number of subclasses rooted at $C_i$
- a single index for $C_{i,j}.A_i$ is
  $$\{(O, S) \mid O \in \mathrm{DOM}(A_i) \text{ and } S = \{O' \mid O'.A_i = O\}\}$$
- Indexes $I_{i,j}$, $1 \le i$ ...
Index organizations for OODBS
Inherited multi-index (IMX):
- a path $P = C_1.A_1.A_2.\cdots.A_{n-1}.A_n$
- an inherited multi-index (IMX) on $P$: $\bigcup_{1 \le i \le n} \{I_i\}$, where $I_i$ is a class-hierarchy index on path $C_i.A_i$
- a class-hierarchy index associates with each value of an attribute $A_i$ the OIDs of the instances of a class $C_i$ and of all its subclasses
- an inherited multi-index differs from the multi-index in that it maintains a single index for all classes belonging to the same inheritance hierarchy
- this technique always requires a number of indexes equal to the path length
Query Optimization in OODBS
- Algebraic transformation-based query optimization
- Graph-based query optimization (using path indexes)
- Method materialisation
Algebraic Transformation-based query optimization
- The object algebra is a many-sorted algebra.
- Algebraic operators are defined for the various kinds of value sets.
- Operators can be classified as constructors, projection operators (performing access to components of a complex value), selection, and iteration.
Object algebra
Algebraic optimization rules
- validate the defined operators
- represent semantically equivalent query transformations
- allow algebraic expressions to be transformed into semantically equivalent, but more efficiently executable ones
Example of algebraic query
The following is a sample OQL query: what are the names of the employees who work for the CS department?

    select distinct p.name
    from employee p
    where p.workfor.name = "CS"

Algebraic query:iS[S .name .v(s)]( P[Pname (V(D(workfor (V(p)))))=CS](employee))
Graph-based query optimization using path indexes
- Access Path Selection
- Generalized Index Intersection
- Query Graph Reductions
- Generation of Least-Cost Evaluation Plan
Access Path Selection
- Eligible indexes for Q, denoted by EI(Q), are the indexes that are useful in query processing.
- The eligible indexes for a condition that compares path_i with a value are the indexes constructed on any subpath of path_i.
- Predicates that can be processed by indexes are called index processible predicates (IP); predicates that cannot are called residual predicates (RP).
Access Path Selection: Query Graph
[Figure: query graph. a_i/j denotes the link (i.e., the attribute) that connects the classes C_i and C_j; a dashed line (----) denotes the path index constructed on the corresponding path expression.]
Access Path Selection
- The problem of determining eligible indexes in query optimization has exponential time complexity.
- Use a simple index selection heuristic: select all eligible indexes and pointers.
  - this takes full advantage of the path indexes, which is not compromised by the proposed index selection heuristic
Generalized Index Intersection (for simple indexes)
Generalized Index Intersection (for path indexes)
Query Graph Reductions
Objective of reductions:
- determine the classes that are replaced by the index scans and remove them from the query graph
- use a Higraph for modeling the process of the query graph reduction
- a Higraph has one extra element, called a supernode, that contains one or more subnodes (classes)
Query Graph Reductions
The query graph reduction algorithm consists of the following three steps:
1. For the query graph QG, determine the set of eligible indexes EI(QG).
2. For each IDX(path_i) in EI(QG):
   1) remove all primitive classes and edges in path_i;
   2) create a new supernode that contains all user-defined classes in path_i; the supernode denotes the OID tuples of its subnodes that satisfy the predicates matched with IDX(path_i).
   (Note: the user-defined classes on path_i are not removed, since residual predicates may exist for them.)
3. If two supernodes (relations) T1 and T2 have a common subnode, perform a natural join on them. The join result is denoted by another supernode T12, and the nodes T1 and T2 are removed. Repeat this step until no two supernodes in the query graph share a subnode.
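Step 3 is a fixed-point merge, which the following Python sketch illustrates. It is a simplification under stated assumptions: each supernode is modelled only as the set of class names it contains, so the natural join of the OID tuples is reduced to a set union, and the function name is hypothetical.

```python
def merge_supernodes(supernodes):
    """Repeatedly join supernodes that share a subnode (step 3)."""
    nodes = [frozenset(s) for s in supernodes]
    merged = True
    while merged:
        merged = False
        for a in range(len(nodes)):
            for b in range(a + 1, len(nodes)):
                if nodes[a] & nodes[b]:           # T1 and T2 share a subnode
                    joined = nodes[a] | nodes[b]  # T12 stands for the join
                    # Remove T1 and T2, then add T12.
                    nodes = [n for k, n in enumerate(nodes) if k not in (a, b)]
                    nodes.append(joined)
                    merged = True
                    break
            if merged:
                break
    return nodes


reduced = merge_supernodes([{"C1", "C2"}, {"C2", "C3"}, {"C4"}])
```

Here the first two supernodes share C2 and collapse into one supernode over C1, C2, and C3, while the supernode containing only C4 is left unchanged.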
Generation of Least-Cost Evaluation Plan
The search algorithm generates all possible join orders (or alternative plans) from the RQG, then estimates the evaluation cost of each join order, and finally chooses the least-cost join order based on the cost model.
(1) Generation of the search tree
(2) Cost estimation and generation of the least-cost evaluation plan
Generation of Least-Cost Evaluation Plan
Generation of Search Tree
Generation of Least-Cost Evaluation Plan
Cost Estimation
- The joins of the branch $\langle C_1, C_2, \ldots, C_n \rangle$ can be processed as a sequence of binary joins.
- The cost formula for the binary join of $C_i$ and $C_{i+1}$ (using the pointer-based sort-merge join algorithm):
  $$\mathrm{cost}(C_i \ \mathrm{JN}_{a_i}\ C_{i+1}) = \mathrm{cost}(C_i) + \mathrm{sort}(C_i, a_i) + \mathrm{cost}(C_{i+1})$$
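The plan selection can be sketched in Python as a minimum over candidate branches. The slides do not define cost(C) or sort(C, a), so the two cost functions below (one I/O per page for a scan, and pages times the rounded-up base-2 logarithm of pages for a sort) are placeholder assumptions, as is the choice to sum the binary join costs along a branch.

```python
import math


def scan_cost(pages):
    # Assumed cost(C): one I/O per page of the class extent.
    return pages


def sort_cost(pages):
    # Assumed sort(C, a): pages * ceil(log2(pages)), zero for a single page.
    return pages * math.ceil(math.log2(pages)) if pages > 1 else 0


def binary_join_cost(pages_i, pages_next):
    # cost(Ci JN Ci+1) = cost(Ci) + sort(Ci, ai) + cost(Ci+1)
    return scan_cost(pages_i) + sort_cost(pages_i) + scan_cost(pages_next)


def branch_cost(branch, pages):
    """Sum the binary join costs along a branch <C1, C2, ..., Cn>."""
    return sum(
        binary_join_cost(pages[a], pages[b]) for a, b in zip(branch, branch[1:])
    )


def least_cost_plan(branches, pages):
    return min(branches, key=lambda branch: branch_cost(branch, pages))


pages = {"C1": 8, "C2": 2, "C3": 4}  # hypothetical extent sizes in pages
branches = [("C1", "C2", "C3"), ("C3", "C2", "C1")]
best = least_cost_plan(branches, pages)
```

With these hypothetical sizes, starting from the smaller class C3 avoids sorting the largest extent, so the second branch is selected.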
Method Materialisation
Method materialisation consists of the following:
- compute the result of a method once
- store the method's result persistently in the database
- use the persistent result value when the method is invoked
- maintain the materialised results: update the values of materialised methods when the objects used for computing them (base objects) change
Method Materialisation (Continued)
- reduces an application's response time for accessing a method's result, especially when the method takes a long time to execute
- adds method maintenance cost
- in order to improve a system's performance, only the right set of methods should be materialised
- method materialisation (precomputation, caching) was proposed in the context of indexing techniques and query optimisation
Method Materialisation
Two important issues arise for method materialisation:
(1) what technique to use for method materialisation, and
(2) which methods to materialise?
Use the dynamic hierarchical method materialisation technique:
- if the method m_i is materialised, then the other methods called by m_i are also materialised
- the system decides whether or not to materialise a given method based on the gathered statistics (method reads and updates of base objects)
Method Materialisation: Storage Structures
The Materialised Methods Dictionary (MMD) contains information about all methods:
- a method name and class
- the array of input arguments
- a method return type
- a method implementation
- a flag indicating whether the method was materialised
Method Materialisation: Storage Structures (Continued)
The Materialised Method Results Structure (MMRS) stores the following information about every materialised method:
(1) the identifier of the method,
(2) the identifier of the object the method was invoked for,
(3) the array of input argument values the method was invoked with,
(4) the value returned by the method when executed for the given object and the given array of input argument values.
- When a materialised method m_i is required, the MMRS is searched in order to get the result of m_i. If it is not found, the value of m_i is computed and stored in the MMRS.
- When an object used to compute the materialised value of m_i is updated or deleted, the materialised value becomes invalid and is removed from the MMRS.
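The MMRS behaves like a cache keyed by method, object, and argument values, with invalidation driven by base-object changes. A minimal in-memory Python sketch follows. The class layout, the explicit `base_oids` argument (in the slides, base objects are discovered by the system), and the example method `total_salary` are all assumptions for illustration.

```python
class MMRS:
    """In-memory sketch of the Materialised Method Results Structure."""

    def __init__(self):
        self._results = {}  # (method id, OID, argument values) -> result
        self._base = {}     # same key -> OIDs of base objects used

    def call(self, method_id, oid, args, compute, base_oids):
        key = (method_id, oid, tuple(args))
        if key not in self._results:
            # Not materialised yet: compute once and store the result.
            self._results[key] = compute(oid, *args)
            self._base[key] = set(base_oids)
        return self._results[key]

    def invalidate(self, changed_oid):
        """Drop every result computed from the updated or deleted object."""
        stale = [k for k, base in self._base.items() if changed_oid in base]
        for key in stale:
            del self._results[key]
            del self._base[key]
        return len(stale)


executions = []  # records real executions of the hypothetical method


def total_salary(oid, bonus):
    executions.append(oid)
    return 100 + bonus


mmrs = MMRS()
first = mmrs.call("total_salary", "d1", [5], total_salary, {"d1", "e1"})
second = mmrs.call("total_salary", "d1", [5], total_salary, {"d1", "e1"})
```

The second call is answered from the stored result, so `total_salary` runs only once; after `mmrs.invalidate("e1")`, the next call computes and stores the value again.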
Method Materialisation: Storage Structures (Continued)
- The Graph of Method Calls (GMC) represents dependencies between methods, where one method calls another.
- The GMC stores pairs of values: the identifier of a calling method and the identifier of the method being called.
- The GMC is used by the procedure that maintains the materialised results of methods.
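One plausible use of the GMC during maintenance is to find every method that directly or indirectly calls a method whose result has become invalid. The slides do not spell out the maintenance procedure, so the Python sketch below is an assumption about how the stored (calling, called) pairs could be traversed; the method identifiers are hypothetical.

```python
from collections import defaultdict


def callers_to_invalidate(gmc_pairs, changed_method):
    """Methods that directly or indirectly call changed_method."""
    callers = defaultdict(set)  # called method -> methods that call it
    for calling, called in gmc_pairs:
        callers[called].add(calling)

    affected = set()
    stack = [changed_method]
    while stack:
        method = stack.pop()
        for caller in callers[method]:
            if caller not in affected:
                affected.add(caller)
                stack.append(caller)
    return affected


# GMC as pairs (identifier of calling method, identifier of called method).
gmc = [("m1", "m2"), ("m2", "m3"), ("m4", "m3"), ("m5", "m6")]
affected = callers_to_invalidate(gmc, "m3")
```

If the result of `m3` becomes invalid, the materialised results of `m2` and `m4` (which call it) and of `m1` (which calls `m2`) are affected, while `m5` and `m6` are not.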
Method Materialisation: Storage Structures (Continued)
- In order to invalidate dependent methods, the system must also be able to find inverse references in the object composition hierarchy.
- These references are maintained in a data structure called the Inverse References Index (IRI).
Method Materialisation: Storage Structures (Continued)
The Method Value Index (MVI) is an index defined on the results of methods. Every method of a class has its own MVI. The index stores the following:
(1) the value of a method input argument,
(2) a method result, and
(3) the identifier of the object the method was invoked for.
By using this index, the system is able to quickly find answers to queries that use methods. The MVI is filled with data when methods are materialised.
Dynamic Method Materialisation
The dynamic method materialisation technique consists of:
(1) gathering method usage statistics, and
(2) based on the statistics, finding the methods whose materialisation increases the system's performance and the methods whose materialisation deteriorates it.
A software module, called the method analyser and optimiser, monitors method access patterns, gathers execution statistics, and makes the final selection of methods for materialisation.
Dynamic Method Materialisation (Continued)
Tuning of a system is performed in the following two steps.
Step 1:
- select the set S_M of methods for materialisation
- materialise the results of these methods on their first calls
- monitor the usage of the methods and gather execution statistics for the set of transactions using m_i and its materialised values, called the batch transaction set
- the size of the batch transaction set is parameterised by a system administrator
Step 2:
- identify the methods whose materialisation increases the system's performance
- automatically dematerialise the methods whose materialisation deteriorates the system's performance
Dynamic Method Materialisation: Gathering method usage statistics
For a given method m_i the execution statistics include:
- method execution times and the number of disk accesses for every object and every set of input argument values
- the number of base object updates
- the number of reads of the materialised values of m_i
- method invalidation times and the number of disk accesses for every object and every set of input argument values
- method recomputation times and the number of disk accesses for every object and every set of input argument values
- the time and the number of disk accesses required for finding an already materialised value
Dynamic Method Materialisation: Selecting methods for materialisation
Cost model
- r: number of transactions reading the materialised value v of method m_i
- u: number of transactions updating a base object of m_i
- r + u: number of transactions in the batch transaction set
- t_RMAT: time of reading a materialised value of m_i using the MMRS
- t_EXEC: execution time of the non-materialised method m_i
- t_REMAT: time of rematerialising the value v of m_i after its base object was updated
All the discussed times include I/O as well as CPU time.
Dynamic Method Materialisation: Selecting methods for materialisation
The materialisation of method m_i will reduce query response time if the following holds:
[Formula 1: not preserved in this transcript]
The coefficient in this condition represents the factor by which the overall system response time is to be reduced. It takes its value from the range (0, 1) and is a tuning parameter set by an administrator.
Dynamic Method Materialisation: Selecting methods for materialisation (Continued)
In the worst case, i.e. when all branches in the GMC have to be invalidated, the rematerialisation time (t_REMAT) includes:
- t_INV: invalidation time of a materialised result
- t_EXEC: time of computing a method result
- t_WMAT: time of writing the materialised result to disk
Thus t_REMAT can be expressed as follows:
[Formula 2: not preserved in this transcript]
Dynamic Method Materialisation: Selecting methods for materialisation (Continued)
- Formula 1 and Formula 2 together yield Formula 3, which relates the number of updates to the number of reads.
- For a given method m_i and a given batch transaction set, if the inequality in Formula 3 is true, then the materialisation of m_i increases the system's performance. Otherwise, m_i has to be dematerialised.
Object Oriented Databases
Advantages
- Good integration with Java, C++, etc.
- Can store complex information
- Fast to recover whole objects
- Has the advantages of the (familiar) object paradigm
Disadvantages
- There is no underlying theory to match the relational model
- Can be more complex and less efficient
- OODB queries tend to be procedural, unlike SQL
Object Relational Databases
Extend an RDBMS with object concepts
- Data values can be objects of arbitrary complexity
- These objects have inheritance etc.
- You can query the objects as well as the tables
An object relational database
- Retains most of the structure of the relational model
- Needs extensions to query languages (SQL or relational algebra)