
Fast track to Oracle9i

Paper 131

PAPER NUMBER: 131 - XML IN THE DATABASE:
STORING XML WITH ALL YOUR OTHER CRITICAL DATA

Mark D. Drake / Oracle Corporation. Oracle Open World 2001

INTRODUCTION

Oracle9i introduces a new system-defined datatype called XMLType. The objective of the XMLType datatype is to make it possible for the database to store and manage XML Documents in the same way that it stores and manages other datatypes like Strings and Integers.

Traditionally, a programmer is faced with two options when using a relational database to store and manage XML Documents. The first is to use a parser to deconstruct the XML Document and store the information it contains as a set of rows in one or more tables. The second is to store the XML Document intact, as a CLOB (Character Large Object). There are significant advantages and disadvantages to both of these approaches.

Deconstructing the document and storing it as a set of rows in relational tables means that the full power of the database can be brought to bear on the information contained in the document. This approach has the disadvantage that the integrity of the information contained in the original XML Document is lost. Storing the document as a CLOB has the advantage of ensuring that the integrity of the information contained in the original XML Document is maintained. This approach has the disadvantage that little of the power of the database can be brought to bear on the information contained within the XML Document.

By introducing the XMLType datatype, Oracle offers the developer the best of both worlds. The XMLType datatype allows an XML Document to be stored intact in the database. It also introduces a number of XML-specific methods and functions that allow all the power of the relational database to be brought to bear on the information contained within the document. This paper provides a brief introduction to the use of the XMLType datatype and its associated methods and functions.

The XMLType datatype is a system-defined datatype that is part of the standard Oracle9i Database. As a new datatype, XMLType can be used when defining columns in tables and views. It can also be used to define parameters, return values, and variables when programming in PL/SQL. The XMLType datatype has built-in member functions that provide the developer with powerful mechanisms for creating, processing, and indexing XML data stored in Oracle9i. With XMLType and these capabilities, SQL developers can leverage the power of the relational database while working in the context of XML.

XMLType and its methods also offer significant advantages for the XML Developer. The methods defined by XMLType utilize the standards developed by the W3C. This allows the XML Developer to easily manipulate and access the content of XML Documents stored in the database.


THE PURCHASEORDER EXAMPLE

This paper will use an XML Document class called PurchaseOrder as the basis for most of the examples. This document class is described by the following DTD.

<!ELEMENT PurchaseOrder (Reference,Actions,Reject?,Requestor,User,
                         CostCenter,ShippingInstructions,SpecialInstructions,LineItems)>
<!ELEMENT Reference (#PCDATA)>
<!ELEMENT Actions (Action)+>
<!ELEMENT Action (User,Date)>
<!ELEMENT Date (#PCDATA)>
<!ELEMENT Reject (User?,Date?,Comment?)>
<!ELEMENT Requestor (#PCDATA)>
<!ELEMENT User (#PCDATA)>
<!ELEMENT Comment (#PCDATA)>
<!ELEMENT CostCenter (#PCDATA)>
<!ELEMENT ShippingInstructions (name,address,telephone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT telephone (#PCDATA)>
<!ELEMENT SpecialInstructions (#PCDATA)>
<!ELEMENT LineItems (LineItem)+ >
<!ELEMENT LineItem (Description,Part)+ >
<!ATTLIST LineItem ItemNumber CDATA #IMPLIED>
<!ELEMENT Description (#PCDATA)>
<!ELEMENT Part EMPTY>
<!ATTLIST Part Id        CDATA #REQUIRED
               UnitPrice CDATA #REQUIRED
               Quantity  CDATA #REQUIRED>

Figure 1. DTD Description of the PurchaseOrder Document Class.


An example instance of the PurchaseOrder document class is:

<PurchaseOrder>
  <Reference>ADAMS-20011116221203351PST</Reference>
  <Actions>
    <Action>
      <User>SCOTT</User>
      <Date/>
    </Action>
  </Actions>
  <Reject/>
  <Requestor>Julie P. Adams</Requestor>
  <User>ADAMS</User>
  <CostCenter>R20</CostCenter>
  <ShippingInstructions>
    <name>Richard J Jones</name>
    <address>600 Oracle Parkway Redwood Shores CA 94065 USA</address>
    <telephone>650 506 7600</telephone>
  </ShippingInstructions>
  <SpecialInstructions>Next Day Air</SpecialInstructions>
  <LineItems>
    <LineItem ItemNumber="1">
      <Description>Brief Encounter</Description>
      <Part Id="037429150726" UnitPrice="39.95" Quantity="2"/>
    </LineItem>
    <LineItem ItemNumber="2">
      <Description>The Passion of Joan of Arc</Description>
      <Part Id="037429139820" UnitPrice="39.95" Quantity="1"/>
    </LineItem>
    <LineItem ItemNumber="3">
      <Description>The Bank Dick</Description>
      <Part Id="715515010627" UnitPrice="29.95" Quantity="3"/>
    </LineItem>
    <LineItem ItemNumber="4">
      <Description>The Wages of Fear</Description>
      <Part Id="037429134924" UnitPrice="29.95" Quantity="2"/>
    </LineItem>
    <LineItem ItemNumber="5">
      <Description>The 39 Steps</Description>
      <Part Id="037429135228" UnitPrice="39.95" Quantity="4" />
    </LineItem>
  </LineItems>
</PurchaseOrder>

Figure 2. Example Instance Document for the PurchaseOrder Document Class.


STORING XML DOCUMENTS IN THE DATABASE USING XMLTYPE

In the same way that numeric values are stored as NUMBER, and character data is stored as VARCHAR2 or CLOB, XML Documents are stored as XMLType. The first step in using the Oracle9i Database to store XML Documents is to create the table or tables that will store the documents. These tables are created just like any other table in the database, using the CREATE TABLE statement. The columns that will contain XML Documents are declared as being of type XMLType. A table may contain one or more XMLType columns; it may also contain a mixture of XMLType and other SQL datatype columns.

In the following example a simple table, called PURCHASEORDER, is created. The table consists of a single column, PODOCUMENT. The datatype of the PODOCUMENT column is XMLType.

create table PURCHASEORDER (
  PODOCUMENT sys.XMLTYPE
)
XMLType column PODOCUMENT store as CLOB (
  STORAGE(INITIAL 12000 NEXT 12000)
  CHUNK 12000
  CACHE
);

Figure 3. Create Table Statement.

In Oracle9i Release 1, the underlying storage for XMLType is CLOB. The standard clauses that make it possible to fine tune how the database manages LOB datatypes can be specified for each XMLType column in the table.


STORING AND RETRIEVING INSTANCE DOCUMENTS

In order to store an XML Document in the database, the document must first be converted into an instance of XMLType. XMLType instances are created using the createXML() method. createXML() is a static method provided by the XMLType datatype. The createXML() method expects to be passed a VARCHAR2 or CLOB containing the XML Document to be converted. It returns an instance of XMLType. If the argument passed to createXML() is not a valid XML Document, an error will be generated.

The following example shows how to load an XML Document from the native file system of the machine hosting the Database Server into an XMLType column.

create directory XMLFILES as 'C:\My XML Files\TestData\Samples';

declare
  CONTENT CLOB := ' ';
  SOURCE  bfile := bfilename('XMLFILES','Sample1.xml');
begin
  -- Open the file and copy its entire content into the CLOB.
  DBMS_LOB.fileOpen(SOURCE,DBMS_LOB.file_readonly);
  DBMS_LOB.loadFromFile(CONTENT,SOURCE,DBMS_LOB.getLength(SOURCE),1,1);
  DBMS_LOB.fileClose(SOURCE);
  -- Convert the CLOB into an XMLType instance and store it.
  insert into PURCHASEORDER (PODOCUMENT)
  values (sys.xmltype.createXML(CONTENT));
  commit;
end;
/

Figure 4. Inserting an XML Document stored on the Server’s native File System.

In this example the contents of the file “C:\My XML Files\TestData\Samples\Sample1.xml” will be stored in the PODOCUMENT column of the PURCHASEORDER table. The steps required to store this document as an XMLType are as follows:

First, create a SQL Directory object that maps the directory containing the target document. In order to be able to create the SQL Directory object, the user must have been granted the privilege CREATE ANY DIRECTORY.

Next, use the DBMS_LOB package to open the target file, and load its content into a temporary CLOB.

Finally, use the createXML() method to create an instance of the XMLType datatype from the CLOB, insert the XMLType into the target table and commit the transaction.
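Since createXML() also accepts character data, a short document can be passed as a literal instead of being loaded from a file. The following is a minimal, untested sketch that assumes the PURCHASEORDER table from Figure 3. The abbreviated document and its values are hypothetical and do not satisfy the DTD in Figure 1, so the sketch also assumes that createXML() only requires its argument to be well-formed.

```sql
-- Hypothetical, abbreviated document used only for illustration.
insert into PURCHASEORDER (PODOCUMENT)
values (
  sys.XMLType.createXML(
    '<PurchaseOrder><Reference>EXAMPLE-0001</Reference><User>EXAMPLE</User></PurchaseOrder>'
  )
);

-- Discard the illustrative row so that later examples are not affected.
rollback;
```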


Having stored an XML Document in a table the next step is to be able to retrieve it. The following example shows how to retrieve an XML Document that has been stored in an XMLType column.

set long 10000
set pagesize 0

select p.PODOCUMENT.getClobVal()
from PURCHASEORDER p;

Figure 5. Simple Select Statement.

By default, when an XMLType column is selected, SQL*Plus displays it as an XMLType object. In order to see the contents of the XMLType object it is necessary to invoke XMLType’s getClobVal() method. Executing these statements generates the following output.

P.PODOCUMENT.GETCLOBVAL()
----------------------------------------------------------------------
<PurchaseOrder>
  <Reference>ADAMS-20011116221203351PST</Reference>
  <Actions>
    <Action>
      <User>SCOTT</User>
      <Date/>
    </Action>
  </Actions>
  <Reject/>
  <Requestor>Julie P. Adams</Requestor>
  <User>ADAMS</User>
  <CostCenter>R20</CostCenter>
  <ShippingInstructions>
    <name>Richard J Jones</name>
    <address>600 Oracle Parkway Redwood Shores CA 94065 USA</address>
    <telephone>650 506 7600</telephone>
  </ShippingInstructions>
  <SpecialInstructions>Next Day Air</SpecialInstructions>
  <LineItems>
    <LineItem ItemNumber="1">
      <Description>Brief Encounter</Description>
      <Part Id="037429150726" UnitPrice="39.95" Quantity="2"/>
    </LineItem>
    …
  </LineItems>
</PurchaseOrder>

Figure 6. Output from a Simple Select Statement.

Note that when using SQL*Plus it is also necessary to set long to a value greater than the length of the rendered document and pagesize to zero in order to view the complete document without page breaks and headers.


SEARCHING WITHIN A DOCUMENT AND ACROSS MANY DOCUMENTS

The W3C standard for navigating an XML Document is XPath. Oracle9i allows XPath expressions to be used for navigating an XMLType instance, and for searching across multiple instances of XMLType. XPath support is provided via XMLType’s extract() and existsNode() methods. Supporting XPath provides the XML Developer with a powerful and familiar metaphor for working with XML Documents.

The extract() method returns an instance of XMLType containing the node or set of nodes that matches the specified XPath expression. Depending on the nature of the XPath expression, the XMLType returned may contain a valid XML Document or a Document Fragment. A Document Fragment is a piece of XML containing multiple root nodes. Document Fragments are generated when there are multiple nodes in the document that match the specified XPath expression.

If the XPath expression evaluates to a single text node or attribute value, the methods getStringVal(), getNumberVal() or getClobVal() can be used to access the appropriate data value. If the XPath expression evaluates to a set of nodes, or a node that has child nodes, the getClobVal() method can be used to render the nodes as an XML Document. The following examples show how the extract() method can be used in the select list.

select extract(p.PODOCUMENT,
               '/PurchaseOrder/Reference/text()').getStringVal() "P.O. Reference"
from PURCHASEORDER p
where rownum = 1;

set long 10000;

select extract(p.PODOCUMENT,
               '/PurchaseOrder/LineItems/LineItem[1]').getClobVal() "Line Item #1"
from PURCHASEORDER p
where rownum = 1;

Figure 7. Use of extract() in the SELECT LIST.

In the first example extract() is used in the select list to obtain the value of the text node belonging to the element identified by the XPath expression ‘/PurchaseOrder/Reference’. In the second example extract() is used to obtain the set of nodes belonging to the element identified by the XPath expression ‘/PurchaseOrder/LineItems/LineItem[1]’. The next example shows how the extract() method can be used in the where clause to limit which rows are returned by a query.

select p.PODOCUMENT.getClobVal() "Document"
from PURCHASEORDER p
where extract(p.PODOCUMENT,
              '/PurchaseOrder/User/text()').getStringVal() = 'SMITH';

Figure 8. Use of extract() in the WHERE CLAUSE.

In this case the extract operation ensures that only documents where the element ‘/PurchaseOrder/User’ contains a text node with a text value of ‘SMITH’ are returned.
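The existsNode() method can express a similar filter without extracting a value. The following is a minimal, untested sketch that assumes the PURCHASEORDER table from Figure 3, and assumes that existsNode() returns 1 when at least one node matches the XPath expression and 0 otherwise. The column alias is illustrative.

```sql
-- Count the documents whose User element has the text value SMITH.
select count(*) "Orders for SMITH"
from PURCHASEORDER p
where p.PODOCUMENT.existsNode('/PurchaseOrder[User="SMITH"]') = 1;
```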

INDEXING XMLTYPE COLUMNS

The problem with using the extract() method in the where clause of a select statement is that the database has to perform the extract operation on each document in the table in order to determine whether or not that document should be included in the result set. In order to perform the extract processing, a DOM has to be constructed for each document. This means that if there are many documents in the table being queried, using the extract() method in the where clause can be very expensive.

When querying data stored in relational tables, the standard way to improve query performance is to create indexes on the most commonly queried columns. The same approach is used when querying data stored in XMLType columns. Indexes are built on the most commonly used XPath expressions. These indexes are created using the Oracle9i database’s Functional Index feature in conjunction with XMLType’s extract() method.

Creating functional indexes on XMLType columns radically reduces the costs associated with extract() inside the where clause. Once the XMLType column has been indexed, the database’s optimizer is able to evaluate whether or not the where clause terms associated with that column can be resolved using the available indexes. If an index can be used, the database does not have to perform the expensive DOM processing on each document in the table in order to determine which of the rows in the table should be included in the result set.

An index should be created for each of the most commonly used XPath expressions. Multiple functional indexes can be created on a single XMLType column. Functional Indexes are maintained just like standard b-tree indexes. The index is dynamically updated to reflect the current state of the table as rows are added, updated and deleted.

In order to use functional indexes, the following ALTER SYSTEM statements need to be executed each time the database is restarted. These system settings can also be set in the initSID.ora file for the database instance in question.

alter system set QUERY_REWRITE_ENABLED = TRUE;
alter system set QUERY_REWRITE_INTEGRITY = TRUSTED;

Figure 9. Enabling Functional Indexes.

In addition to configuring the database to allow functional indexes, the user performing the query must also have been granted the QUERY REWRITE privilege. The following examples show how to build XPath-based Functional Indexes on XMLType columns. The XPath expression is used to identify which node in the XML Document should be indexed.

create unique index IPURCHASEORDERREFERENCE
on PURCHASEORDER p (
  substr(sys.xmltype.getStringVal(
           sys.xmltype.extract(p.PODOCUMENT,
                               '/PurchaseOrder/Reference/text()')),1,26)
);

Figure 10. Example Create Index Statement (I).

This example creates a unique index on the text node belonging to the element ‘/PurchaseOrder/Reference’.


create index IPURCHASEORDERUSER
on PURCHASEORDER p (
  substr(sys.xmltype.getStringVal(
           sys.xmltype.extract(p.PODOCUMENT,
                               '/PurchaseOrder/User/text()')),1,10)
);

Figure 11. Example Create Index Statement (II).

This example builds a non-unique index on the text node belonging to the element ‘/PurchaseOrder/User’.

In order for the optimizer to recognize that a Functional Index can be used to resolve a component of the where clause, the expression in the where clause must be identical to the expression that was used to create the index. The following examples show the query plans generated for some simple queries against an XMLType column.

select p.PODOCUMENT.getClobVal() "Document"
from PURCHASEORDER p
where substr(sys.xmltype.getStringVal(
               sys.xmltype.extract(p.PODOCUMENT,
                                   '/PurchaseOrder/CostCenter/text()')),
             1,10) = 'D20';

Plan Table
----------------------------------------------------------------------------------
| Operation            | Name               | Rows | Bytes| Cost | Pstart| Pstop |
----------------------------------------------------------------------------------
| SELECT STATEMENT     |                    |  100 | 190K | 1205 |       |       |
|  TABLE ACCESS FULL   | PURCHASEORDER      |  100 | 190K | 1205 |       |       |
----------------------------------------------------------------------------------

Figure 12. Non-Indexed select …. where extract(…).

In the first example there is no index that can be used to resolve a query on the contents of the text node associated with the element ‘/PurchaseOrder/CostCenter’. As can be seen from the Explain Plan output, the database is forced to perform a table scan to resolve this query. In order to generate the required result set it will create a DOM and evaluate the results of the extract() operation for every document in the table. This would be a very expensive and poorly performing query.


select p.PODOCUMENT.getClobVal() "Document"
from PURCHASEORDER p
where substr(sys.xmltype.getStringVal(
               sys.xmltype.extract(p.PODOCUMENT,
                                   '/PurchaseOrder/User/text()')),
             1,10) = 'SMITH';

Plan Table
----------------------------------------------------------------------------------
| Operation            | Name               | Rows | Bytes| Cost | Pstart| Pstop |
----------------------------------------------------------------------------------
| SELECT STATEMENT     |                    |    1 |   1K |    1 |       |       |
|  TABLE ACCESS        | PURCHASEORDER      |    1 |   1K |    1 |       |       |
|   BY INDEX ROW       |                    |      |      |      |       |       |
|   INDEX RANGE SCAN   | IPURCHASEORDERUSER |    1 |      |    1 |       |       |
----------------------------------------------------------------------------------

Figure 13. Indexed select …. where extract(…).

In the second example the expression in the where clause is identical to the expression that was used to create the index IPURCHASEORDERUSER. In this case, the optimizer is able to use the index to resolve the query on the contents of the text node associated with the element ‘/PurchaseOrder/User’. This means that the full power of the database is brought to bear on resolving this query, and that the correct results are returned cheaply and efficiently.
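Plans such as those in Figures 12 and 13 can be produced with the EXPLAIN PLAN statement. The following untested sketch assumes that a PLAN_TABLE already exists in the current schema and uses a hypothetical statement identifier, PO_USER. The second query returns a simplified column list, not the exact report layout shown in the figures.

```sql
-- Record the plan for the indexed query under a hypothetical identifier.
explain plan set statement_id = 'PO_USER' for
select p.PODOCUMENT.getClobVal() "Document"
from PURCHASEORDER p
where substr(sys.xmltype.getStringVal(
               sys.xmltype.extract(p.PODOCUMENT,
                                   '/PurchaseOrder/User/text()')),
             1,10) = 'SMITH';

-- List the recorded plan steps in order.
select ID, OPERATION, OPTIONS, OBJECT_NAME, COST
from PLAN_TABLE
where STATEMENT_ID = 'PO_USER'
order by ID;
```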


RELATIONAL ACCESS TO DATA IN XMLTYPE COLUMNS

One of the major advantages of using relational databases is that there are many powerful analysis and decision support tools that understand the relational metaphor. There are very few tools available that understand XML. One of the major advantages of using XMLType to store and manage XML Documents is that the methods provided by XMLType make it possible to expose the content of these documents as relational views.

create or replace view PURCHASEORDERVIEW as
select substr(sys.xmltype.getStringVal(
                sys.xmltype.extract(p.PODOCUMENT,
                                    '/PurchaseOrder/Reference/text()')),
              1,26) "REFERENCE",
       substr(sys.xmltype.getStringVal(
                sys.xmltype.extract(p.PODOCUMENT,
                                    '/PurchaseOrder/User/text()')),
              1,10) "USERID",
       substr(sys.xmltype.getStringVal(
                sys.xmltype.extract(p.PODOCUMENT,
                                    '/PurchaseOrder/ShippingInstructions/name/text()')),
              1,20) "SHIPTO"
from PURCHASEORDER p;

SQL> describe PURCHASEORDERVIEW;
 Name                                      Null?    Type
 ----------------------------------------- -------- --------------------
 REFERENCE                                          VARCHAR2(26)
 USERID                                             VARCHAR2(10)
 SHIPTO                                             VARCHAR2(20)

Figure 14. Creating a Simple Relational View over an XMLType column.

The example above shows how XMLType’s extract() method can be used to create a simple view which exposes the contents of the text nodes belonging to the elements ‘/PurchaseOrder/Reference’, ‘/PurchaseOrder/User’ and ‘/PurchaseOrder/ShippingInstructions/name’ as the columns REFERENCE, USERID and SHIPTO.


Once the view has been created, it looks and behaves like a standard database view. This means that any tool that can access data via views can now access data from the underlying XML Documents.

SQL> select * from PURCHASEORDERVIEW
  2  where rownum < 10;

REFERENCE                  USERID     SHIPTO
-------------------------- ---------- -------------------
SMITH-20011116221203541PST SMITH      Mark D. Smith
ADAMS-20011116221203351PST ADAMS      Richard J Jones
ADAMS-20011116194107148PST ADAMS      Thomas D. Martin
ADAMS-20011116194108851PST ADAMS      Julie P. Adams
ADAMS-20011116194108891PST ADAMS      Julie P. Adams
ADAMS-2001111619410921PST  ADAMS      Julie P. Adams
ADAMS-20011116194109522PST ADAMS      Julie P. Adams
ADAMS-20011116194109812PST ADAMS      Julie P. Adams
ADAMS-20011116194110103PST ADAMS      Julie P. Adams

Figure 15. Querying against a view defined on an XMLType column.
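Because the view behaves like any other relational view, ordinary SQL such as grouping and aggregation can be applied to it. The following is a small, untested sketch that assumes the PURCHASEORDERVIEW view from Figure 14. The ORDERS alias is illustrative.

```sql
-- Count the purchase orders belonging to each user.
select USERID, count(*) "ORDERS"
from PURCHASEORDERVIEW
group by USERID
order by USERID;
```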

The next example shows how, by using a PL/SQL function, information from multiple nodes within the XML Document can be aggregated and exposed via a simple view.

create or replace function getTotalValue(PURCHASEORDER in sys.XMLTYPE)
return number
is
  LINEITEMS sys.XMLTYPE;
  LINEITEM  sys.XMLTYPE;
  ELEMENT   sys.XMLTYPE;
  COST      number;
  QUANTITY  number;
  TOTAL     number := 0;
  I         binary_integer := 1;
begin
  -- Build a Document Fragment containing every LineItem element.
  LINEITEMS := PURCHASEORDER.extract('//LineItem');
  loop
    -- Select the I-th LineItem; stop when there are no more.
    LINEITEM := LINEITEMS.extract('/LineItem['||I||']');
    exit when LINEITEM is null;
    ELEMENT  := LINEITEM.extract('/LineItem/Part/@UnitPrice');
    COST     := ELEMENT.getNumberVal();
    ELEMENT  := LINEITEM.extract('/LineItem/Part/@Quantity');
    QUANTITY := ELEMENT.getNumberVal();
    -- Add the value of this LineItem to the running total.
    TOTAL := TOTAL + (COST * QUANTITY);
    I := I + 1;
  end loop;
  return TOTAL;
end getTotalValue;
/

Figure 16. Simple PL/SQL function on XMLType.

This function calculates the total value of a PurchaseOrder document. The first step is to create a Document Fragment containing all of the LineItem elements. The next step is to iterate through the collection of LineItem elements, using an XPath expression that will return each LineItem in turn. The function then uses additional XPath expressions to obtain the values of the UnitPrice and Quantity attributes from the Part element associated with the LineItem. These values are used to calculate the total value of each LineItem. The total value of the PurchaseOrder is calculated by summing the value of each of the LineItem elements. This value becomes the return value for the function.
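As a worked check, the five LineItem elements of the document in Figure 2 carry Quantity and UnitPrice values of 2 at 39.95, 1 at 39.95, 3 at 29.95, 2 at 29.95 and 4 at 39.95. The sum the function is expected to produce for that document can be reproduced with plain arithmetic against DUAL. The column alias is illustrative.

```sql
-- Quantity * UnitPrice for each LineItem in Figure 2, summed.
-- 79.90 + 39.95 + 89.85 + 59.90 + 159.80 = 429.40
select (2 * 39.95) + (1 * 39.95) + (3 * 29.95) + (2 * 29.95) + (4 * 39.95) "TOTAL"
from dual;
```

This agrees with the TOTAL of 429.4 reported for the reference ADAMS-20011116221203351PST in Figure 17.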


The next step is to incorporate the function into a view.

create or replace view PURCHASEORDERVIEW as
select substr(sys.xmltype.getStringVal(
                sys.xmltype.extract(p.PODOCUMENT,
                                    '/PurchaseOrder/Reference/text()')),
              1,26) "REFERENCE",
       substr(sys.xmltype.getStringVal(
                sys.xmltype.extract(p.PODOCUMENT,
                                    '/PurchaseOrder/User/text()')),
              1,10) "USERID",
       substr(sys.xmltype.getStringVal(
                sys.xmltype.extract(p.PODOCUMENT,
                                    '/PurchaseOrder/ShippingInstructions/name/text()')),
              1,20) "SHIPTO",
       getTotalValue(p.PODOCUMENT) "TOTAL"
from PURCHASEORDER p;

SQL> select * from PURCHASEORDERVIEW
  2  where rownum < 10;

REFERENCE                  USERID     SHIPTO                    TOTAL
-------------------------- ---------- -------------------- ----------
SMITH-20011116221203541PST SMITH      Mark D. Smith             229.7
ADAMS-20011116221203351PST ADAMS      Richard J Jones           429.4
ADAMS-20011116194107148PST ADAMS      Thomas D. Martin        1278.15
ADAMS-20011116194108851PST ADAMS      Julie P. Adams            469.3
ADAMS-20011116194108891PST ADAMS      Julie P. Adams           599.15
ADAMS-2001111619410921PST  ADAMS      Julie P. Adams            529.2
ADAMS-20011116194109522PST ADAMS      Julie P. Adams          1158.34
ADAMS-20011116194109812PST ADAMS      Julie P. Adams            489.3
ADAMS-20011116194110103PST ADAMS      Julie P. Adams           1018.6

Figure 17. Use of Simple PL/SQL function.

The above example shows how the view that was created in the earlier example can be modified to include the value returned by the PL/SQL function.


A slightly more complex approach is required in order to expose a collection of elements as a set of rows. The first step is to create a SQL’99 Object Type that defines the row that will be returned for each element in the collection. The second step is to create a Table Object that allows the set of rows created from a given document to be returned as a single aggregation.

create or replace type PURCHASEORDER_LINEITEM_ROW as object (
  LINENO      number,
  DESCRIPTION varchar2(128),
  PARTID      varchar2(14),
  QUANTITY    number,
  UNITCOST    number,
  TOTAL       number
);
/

create or replace type PURCHASEORDER_LINEITEM_TABLE
as table of PURCHASEORDER_LINEITEM_ROW;
/

Figure 18. SQL’99 Objects required in order to return a set of values from a collection of elements.

In this example, the PURCHASEORDER_LINEITEM_ROW object defines the row that will be generated for each LineItem element. The PURCHASEORDER_LINEITEM_TABLE object represents the set of rows that will be generated from a given PurchaseOrder document.


The next step is to create a PL/SQL Function that will return the aggregation of row objects, given the source document. This function generates a row object for each element in the target collection. The row object is returned to the calling program using the PIPE ROW statement. By declaring the function as PIPELINED and using PIPE ROW, the function is able to return the generated row objects asynchronously. This has the advantage of improving initial response time (the time taken to return the first row) and reducing the overall memory usage of the function.

create or replace function getLineItems(PURCHASEORDER in sys.XMLTYPE)
return PURCHASEORDER_LINEITEM_TABLE pipelined
is
  LINEITEMS   sys.XMLTYPE;
  LINEITEM    sys.XMLTYPE;
  ELEMENT     sys.XMLTYPE;
  COST        number;
  QUANTITY    number;
  TOTAL       number := 0;
  I           binary_integer := 1;
  DESCRIPTION varchar2(128);
  PARTID      varchar2(14);
begin
  -- Build a Document Fragment containing every LineItem element.
  LINEITEMS := PURCHASEORDER.extract('//LineItem');
  loop
    -- Select the I-th LineItem; stop when there are no more.
    LINEITEM := LINEITEMS.extract('/LineItem['||I||']');
    exit when LINEITEM is null;
    ELEMENT     := LINEITEM.extract('/LineItem/Description/text()');
    DESCRIPTION := ELEMENT.getStringVal();
    PARTID      := LINEITEM.extract('/LineItem/Part/@Id').getStringVal();
    ELEMENT     := LINEITEM.extract('/LineItem/Part/@UnitPrice');
    COST        := ELEMENT.getNumberVal();
    ELEMENT     := LINEITEM.extract('/LineItem/Part/@Quantity');
    QUANTITY    := ELEMENT.getNumberVal();
    TOTAL       := COST * QUANTITY;
    -- Return one row object for this LineItem to the caller.
    pipe row (PURCHASEORDER_LINEITEM_ROW(I,DESCRIPTION,
                                         PARTID,QUANTITY,
                                         COST,TOTAL));
    I := I + 1;
  end loop;
  return;
end getLineItems;
/

Figure 19. PL/SQL Pipelined Table function over XMLType.

This example shows how to return a collection of PURCHASEORDER_LINEITEM_ROW objects from an XML Document containing a PurchaseOrder. The function iterates through the collection of LineItem elements using the same technique as the previous example. Once the set of values for a particular LineItem has been obtained, it is returned to the calling program as an instance of the PURCHASEORDER_LINEITEM_ROW object.
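
The function can also be called directly from SQL, without a view, which is a convenient way to try it out. The following is a minimal sketch only: the PurchaseOrder document is an invented fragment containing just the nodes that getLineItems reads (Description, and the Id, UnitPrice and Quantity attributes of Part), and the enclosing LineItems element is an assumption about the document layout.

```sql
-- Hypothetical test document: all values are invented for illustration.
-- The query is expected to return one row per LineItem element.
select F.LINENO, F.DESCRIPTION, F.PARTID, F.QUANTITY, F.UNITCOST, F.TOTAL
  from table(getLineItems(sys.xmltype.createXML(
         '<PurchaseOrder>
            <LineItems>
              <LineItem>
                <Description>Example Title One</Description>
                <Part Id="000000000001" UnitPrice="19.95" Quantity="2"/>
              </LineItem>
              <LineItem>
                <Description>Example Title Two</Description>
                <Part Id="000000000002" UnitPrice="29.95" Quantity="1"/>
              </LineItem>
            </LineItems>
          </PurchaseOrder>'))) F;
```

Note that the function as written calls getStringVal() and getNumberVal() on the extracted nodes without a null check, so a LineItem that lacks one of these nodes would raise an error rather than produce a row with null columns.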

The final step is to create a view that exposes the result set generated by the function as a relational view. The TABLE operator is used in the FROM clause to treat the collection returned by the function as a table that can be joined with other data.

create or replace view LINEITEMSVIEW
as
select substr(sys.xmltype.getStringVal(
                sys.xmltype.extract(p.PODOCUMENT,
                                    '/PurchaseOrder/Reference/text()')),
              1,26) "REFERENCE",
       F.LINENO, F.DESCRIPTION, F.PARTID,
       F.QUANTITY, F.UNITCOST, F.TOTAL
  from PURCHASEORDER p,
       table ( getLineItems ( p.PODOCUMENT ) ) F;

Figure 20. View based on a PL/SQL Pipelined Table function.

In this case the result set returned by the function is joined with the value of the text node belonging to the element ‘/PurchaseOrder/Reference’ to create the view LINEITEMSVIEW. Adding the REFERENCE column makes it possible to join rows in LINEITEMSVIEW with the corresponding rows in the PURCHASEORDERVIEW view. As can be seen from the following query, LINEITEMSVIEW looks and behaves just like a normal relational view.

SQL> select REFERENCE, LINENO, PARTID, TOTAL
  2    from LINEITEMSVIEW
  3   where rownum < 11;

REFERENCE                      LINENO PARTID              TOTAL
-------------------------- ---------- -------------- ----------
SMITH-20011116221203541PST          1 715515010122        159.8
SMITH-20011116221203541PST          2 37429155820         29.95
SMITH-20011116221203541PST          3 37429128022         39.95
ADAMS-20011116221203351PST          1 037429150726         79.9
ADAMS-20011116221203351PST          2 037429139820        39.95
ADAMS-20011116221203351PST          3 715515010627        89.85
ADAMS-20011116221203351PST          4 037429134924         59.9
ADAMS-20011116221203351PST          5 037429135228        159.8
ADAMS-20011116194107148PST          1 037429121726        39.95
ADAMS-20011116194107148PST          2 037429124529         79.9

10 rows selected.

Figure 21. Select from a view based on a PL/SQL Table Function.
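
Because LINEITEMSVIEW behaves like any other relational view, ordinary SQL such as aggregation can be applied to it. As a small sketch (the column aliases LINE_COUNT and ORDER_TOTAL are illustrative only), the following query summarizes each PurchaseOrder by its REFERENCE:

```sql
-- Number of line items and overall value of each PurchaseOrder document.
select REFERENCE,
       count(*)   "LINE_COUNT",
       sum(TOTAL) "ORDER_TOTAL"
  from LINEITEMSVIEW
 group by REFERENCE
 order by REFERENCE;
```

Each query against the view re-evaluates getLineItems for every document it touches, so the cost of such a query grows with the number and size of the stored PurchaseOrder documents.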

RETRIEVING RELATIONAL DATA AS XML

The previous section of this paper explained how information stored using an XMLType can be exposed as data in a set of relational views. This section addresses the opposite problem: how to efficiently generate XML from data stored in relational tables. Oracle9i introduced a number of new features explicitly designed to improve the generation of XML from SQL queries. Two of the most important are the functions SYS_XMLGEN and SYS_XMLAGG.

SYS_XMLGEN generates a valid XML Document from a scalar value or SQL Object Type. If the argument passed to SYS_XMLGEN is a scalar value, the function generates a simple XML Document consisting of a single root element with a text node containing the value passed. If the argument passed to SYS_XMLGEN is an instance of a SQL’99 object type, the function generates a complex document, consisting of a root node containing one child node for each attribute defined by the object type. This behavior allows SYS_XMLGEN to be used to generate complex, multi-level XML Documents. The names of the nodes in the generated document are derived from the names of the attributes defined by the SQL’99 Object Type. If the name of the attribute starts with an ‘@’ sign, the value is treated as an attribute of the root node of the generated document; otherwise it is treated as a child of the root node.

SYS_XMLAGG allows a set of documents created using SYS_XMLGEN to be aggregated into a single document. Both SYS_XMLGEN and SYS_XMLAGG are implemented as native ‘C’ functions in Oracle9i. This means that they provide the SQL developer with an extremely efficient method of generating XML from SQL result sets.

The following example shows how to use SYS_XMLAGG and SYS_XMLGEN to generate XML Documents from data stored in relational tables. Take the following relational table:

SQL> describe DVDTITLES;
 Name                                            Null?    Type
 ----------------------------------------------- -------- --------------
 UPCCODE                                         NOT NULL VARCHAR2(14)
 TITLE                                                    VARCHAR2(64)
 CONTRIBUTOR                                              VARCHAR2(256)
 YEAR                                                     VARCHAR2(4)
 STUDIO                                                   VARCHAR2(32)
 WIDESCREEN                                               RAW(1)
 ANAMORPHIC                                               RAW(1)
 BLACKANDWHITE                                            RAW(1)
 DUBBED                                                   RAW(1)
 SUBTITLED                                                RAW(1)
 AUDIOFORMAT                                              VARCHAR2(12)
 UNITCOST                                                 NUMBER(10,2)
 COUNTRY                                                  VARCHAR2(64)
 SPINE                                                    NUMBER(6)
 NOTES                                                    VARCHAR2(2048)

Figure 22. Describe of Relational Table DVDTITLES.

The table contains information about currently available DVDs, including the item’s UPC code, Title, and the Studio that publishes it. The requirement is to publish the information in this table as a set of XML Documents. There should be one document for each Studio, containing a collection of DVD elements, representing the titles published by that Studio.
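
Before building the per-studio documents, the basic behavior of the two functions can be tried directly against DVDTITLES. The following is a small sketch rather than part of the original example set, and the column aliases are illustrative. With a scalar column as its argument, SYS_XMLGEN should produce one small document per row, with a root element named after the column; SYS_XMLAGG then gathers those documents under a single root element (ROWSET unless a different format is supplied).

```sql
-- One document per row: a single root element whose text node holds the title.
select sys_xmlgen(TITLE).getStringVal() "TITLE_XML"
  from DVDTITLES
 where rownum < 3;

-- One document per studio: the per-row documents aggregated under one root.
select STUDIO,
       sys_xmlagg(sys_xmlgen(TITLE)).getClobVal() "TITLES_XML"
  from DVDTITLES
 group by STUDIO;
```

This flat form is enough for simple lists. The object types introduced next are what allow the generated document to carry attributes and nested collections.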

In order to generate the required XML, the following steps are required. First, create the SQL’99 Object types that define the structure of the required XML. Typically one object type will be required for each complex type (type with child nodes) in the required output. Additional Object Types will also be required for each collection that appears in the generated document.

create or replace type DVD as object
(
  "@UPC"               VARCHAR2(14),
  "Title"              VARCHAR2(64),
  "PrimaryContributor" VARCHAR2(256),
  "Year"               VARCHAR2(4),
  "WideScreen"         VARCHAR2(5),
  "Anamorpic"          VARCHAR2(5),
  "BlackAndWhite"      VARCHAR2(5),
  "Dubbed"             VARCHAR2(5),
  "Subtitled"          VARCHAR2(5),
  "Audio"              VARCHAR2(12),
  "Price"              NUMBER(10,2),
  "Country"            VARCHAR2(64),
  "Spine"              NUMBER(6),
  "Notes"              VARCHAR2(2048)
);
/
create or replace type TITLES_TYPE AS VARRAY(32767) of DVD;
/
create or replace type STUDIOCATALOG as object
(
  "@Studio" VARCHAR2(64),
  "Titles"  TITLES_TYPE
);
/

Figure 23. XML generation from Relational Tables using SYS_XMLGEN.

In this example three object types need to be defined in order to generate the required XML Documents. The first, DVD, defines the set of elements that will be generated for each DVD. Note how the attribute ‘@UPC’ starts with an @. This means that the UPC information will appear as an attribute of the DVD element, rather than as one of its child elements. The second, TITLES_TYPE, defines a collection of DVD objects that will represent the DVDs published by a particular Studio. The third, STUDIOCATALOG, defines the XML Document that will contain the set of DVDs published by each Studio. Again, note the use of the ‘@’ in the definition of the Studio attribute. This will force the studio name to be generated as an attribute of the Catalog element.
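
The effect of these type definitions can be checked without touching any table by constructing an object instance by hand and passing it to SYS_XMLGEN. This is a sketch under stated assumptions: the studio name and all DVD values are invented, and the constructor arguments simply follow the attribute order of the DVD type defined above.

```sql
-- Hypothetical data: one STUDIOCATALOG holding a single DVD.
-- "@Studio" and "@UPC" should be rendered as XML attributes,
-- the remaining object attributes as child elements.
select sys_xmlgen(
         STUDIOCATALOG(
           'Example Studio',
           TITLES_TYPE(
             DVD('000000000001', 'Example Title', 'Example Director', '2001',
                 'true', 'true', 'false', 'false', 'false',
                 'DD 5.1', 19.95, 'USA', 1, null))),
         sys.xmlgenformattype.createFormat('Catalog')).getClobVal() "Catalog Document"
  from dual;
```

The createFormat('Catalog') argument names the root element of the generated document, which is why the output is rooted at Catalog rather than at a name derived from the object type.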

Next, create a view that will provide access to the generated XML Documents. The select statement instantiates the appropriate SQL’99 Objects. SYS_XMLGEN is then used to render the required XML Documents from the objects that were created by the select statement.

create or replace view STUDIOCATALOGVIEW
as
select STUDIO,
       sys_xmlgen(
         STUDIOCATALOG(
           STUDIO,
           CAST(MULTISET(
                  select DVD(UPCCODE, TITLE,
                             CONTRIBUTOR, YEAR,
                             decodeBoolean(WIDESCREEN),
                             decodeBoolean(ANAMORPHIC),
                             decodeBoolean(BLACKANDWHITE),
                             decodeBoolean(DUBBED),
                             decodeBoolean(SUBTITLED),
                             AUDIOFORMAT, UNITCOST,
                             COUNTRY, SPINE, NOTES)
                    from DVDTITLES y
                   where X.STUDIO = Y.STUDIO)
                as TITLES_TYPE)),
         sys.xmlgenformattype.createFormat('Catalog')) "CATALOG"
  from DVDTITLES x
 GROUP BY STUDIO;

Figure 24. Using SYS_XMLGEN to generate XML from Relational Tables.

In this example, the following processing is required. First, instantiate an instance of the DVD object type for each DVD published by the target studio. Next, aggregate the collection of DVD objects for a given studio into an instance of TITLES_TYPE using the CAST(MULTISET(…) AS TITLES_TYPE) operator. Next, combine the TITLES_TYPE object with the STUDIO column to create an instance of the STUDIOCATALOG object. Finally, use the SYS_XMLGEN function to generate an instance of XMLType from the information contained in the STUDIOCATALOG object.

As can be seen from the example below, when a query is performed against the view the SYS_XMLGEN function generates an XMLType containing the required XML. The contents of the XMLType can be rendered as an XML Document by invoking XMLType’s getClobVal() method.

SQL> select CATALOG
  2    from STUDIOCATALOGVIEW
  3   where STUDIO = 'Dreamworks Home Video';

CATALOG()
---------------------------------------------
XMLTYPE()

SQL> select c.CATALOG.getCLOBVal() "Catalog Document"
  2    from STUDIOCATALOGVIEW c
  3   where STUDIO = 'Dreamworks Home Video';

Catalog Document
------------------------------------------------------------------------
<?xml version="1.0"?>
<Catalog Studio="Dreamworks Home Video">
  <Titles>
    <DVD UPC="667068416121">
      <Title>Small Soldiers</Title>
      <PrimaryContributor>Joe Dante</PrimaryContributor>
      <Year>1998</Year>
      <WideScreen>true</WideScreen>
      <Anamorpic>true</Anamorpic>
      <Audio>DD 5.1</Audio>
    </DVD>
    …
    <DVD UPC="667068443325">
      <Title>Saving Private Ryan</Title>
      <PrimaryContributor>Steven Spielberg</PrimaryContributor>
      <Year>1998</Year>
      <WideScreen>true</WideScreen>
      <Anamorpic>true</Anamorpic>
      <Audio>DD 5.1</Audio>
    </DVD>
  </Titles>
</Catalog>

Figure 25. Sample XML output generated using SYS_XMLGEN.

USING XML TO UPDATE RELATIONAL TABLES

The last section explained how to generate XML Documents from data stored in Relational Tables. This section explains how to use the information contained in an XML file to modify the contents of Relational Tables. This makes it possible to insert or update rows in relational tables by inserting XML Documents into the appropriate views. In order to update a relational table based on an attempt to insert or update a view, INSTEAD OF TRIGGERS need to be created on the view in question. The following example shows how to create an INSTEAD OF INSERT TRIGGER on a view.

create or replace trigger EXPLODECATALOG
instead of insert on STUDIOCATALOGVIEW
for each row
declare
  UPCCODE        VARCHAR2(14);
  TITLE          VARCHAR2(64);
  CONTRIBUTOR    VARCHAR2(256);
  YEAR           VARCHAR2(4);
  STUDIO         VARCHAR2(32);
  COSTCENTER     VARCHAR2(3);
  WIDESCREEN     RAW(1);
  ANAMORPHIC     RAW(1);
  BLACKANDWHITE  RAW(1);
  DUBBED         RAW(1);
  SUBTITLED      RAW(1);
  AUDIOFORMAT    VARCHAR2(12);
  UNITCOST       NUMBER(10,2);
  COUNTRY        VARCHAR2(64);
  SPINE          NUMBER(6);
  NOTES          VARCHAR2(2048);
  NOT_A_CATALOG  exception;
  I              binary_integer;
  DOCUMENT       sys.XMLTYPE;
  FRAGMENT       sys.XMLTYPE;
  DVDELEMENT     sys.XMLTYPE;
  ELEMENT        sys.XMLTYPE;
begin
  DOCUMENT := :new.CATALOG;
  -- Reject any document whose root element is not Catalog.
  if (DOCUMENT.existsNode('/Catalog') = 0) then
    raise NOT_A_CATALOG;
  end if;
  STUDIO   := DOCUMENT.extract('/Catalog/@Studio').getStringVal();
  FRAGMENT := DOCUMENT.extract('/Catalog//DVD');
  if (FRAGMENT is not null) then
    I := 1;
    loop
      -- Select the I-th DVD element; stop when there are no more.
      DVDELEMENT := FRAGMENT.extract('/DVD['||I||']');
      exit when DVDELEMENT is null;
      ELEMENT       := DVDELEMENT.extract('//@UPC');
      UPCCODE       := getStringVal(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//Title/text()');
      TITLE         := getStringVal(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//PrimaryContributor/text()');
      CONTRIBUTOR   := getStringVal(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//Year/text()');
      YEAR          := getStringVal(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//WideScreen/text()');
      WIDESCREEN    := encodeBoolean(ELEMENT);
      -- Element names must match the DVD object type exactly, including
      -- its spelling "Anamorpic" and the capitalization of "Subtitled".
      ELEMENT       := DVDELEMENT.extract('//Anamorpic/text()');
      ANAMORPHIC    := encodeBoolean(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//BlackAndWhite/text()');
      BLACKANDWHITE := encodeBoolean(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//Dubbed/text()');
      DUBBED        := encodeBoolean(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//Subtitled/text()');
      SUBTITLED     := encodeBoolean(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//Audio/text()');
      AUDIOFORMAT   := getStringVal(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//Price/text()');
      UNITCOST      := getNumberVal(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//Country/text()');
      COUNTRY       := getStringVal(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//Spine/text()');
      SPINE         := getNumberVal(ELEMENT);
      ELEMENT       := DVDELEMENT.extract('//Notes/text()');
      NOTES         := getStringVal(ELEMENT);
      insert into DVDTITLES
      values (UPCCODE, TITLE, CONTRIBUTOR,
              YEAR, STUDIO, WIDESCREEN,
              ANAMORPHIC, BLACKANDWHITE,
              DUBBED, SUBTITLED, AUDIOFORMAT,
              UNITCOST, COUNTRY, SPINE, NOTES);
      I := I + 1;
    end loop;
  end if;
exception
  when NOT_A_CATALOG then
    raise_application_error(-20005,
      'Only DVD Catalog documents can be stored in this table.');
end EXPLODECATALOG;
/

Figure 26. Relational Insert of XML Content using extract().

This example shows how, by placing an INSTEAD OF INSERT TRIGGER on the view created in the previous section, it is possible to allow the DVDTITLES table to be updated by inserting new instances of the Catalog Document class into the STUDIOCATALOGVIEW view. When an attempt is made to insert a new Catalog Document, the trigger fires and the Studio Name and collection of DVD elements are extracted from the XML Document. The trigger then iterates through the collection of DVD elements, creating a new row in the DVDTITLES table for each DVD element in the catalog. If the root element of the XML Document being inserted is not Catalog, an appropriate error message is raised.
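
With the trigger in place, loading a catalog is a single insert against the view. The statement below is a sketch using invented data: the studio name, UPC codes and titles are hypothetical, and elements that are omitted from a DVD are expected to arrive as nulls because the helper functions return null for a missing node.

```sql
-- Hypothetical catalog containing two DVD elements.
insert into STUDIOCATALOGVIEW (STUDIO, CATALOG)
values ('Example Studio',
        sys.xmltype.createXML(
          '<Catalog Studio="Example Studio">
             <Titles>
               <DVD UPC="000000000001">
                 <Title>Example Title One</Title>
                 <Year>2001</Year>
                 <WideScreen>true</WideScreen>
               </DVD>
               <DVD UPC="000000000002">
                 <Title>Example Title Two</Title>
                 <Year>2001</Year>
               </DVD>
             </Titles>
           </Catalog>'));

-- Each DVD element should now appear as a row in the base table.
select UPCCODE, TITLE, YEAR
  from DVDTITLES
 where STUDIO = 'Example Studio';
```

The trigger takes the studio name from the Studio attribute of the document, not from the STUDIO column supplied in the insert, so the two should be kept consistent by the caller.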

DATABASE VALIDATION OF XMLTYPE

One problem that an organization encounters when deciding to use XML as a method of managing mission-critical data is how to ensure the integrity of these documents, and how to protect them against accidental or malicious damage. DTD and XML Schema based validation provides the ability to enforce rules relating to the structure of a document. However, these techniques offer little that can be used to enforce rules relating to the content of the document, beyond what can be achieved with simple pattern matching. Something as simple as ensuring that the value of an attribute or text node is unique across all the members of a Document Class is beyond both XML Schema and DTD validation. Neither XML Schema nor DTDs make it possible to ensure that a text node or attribute value matches a value contained in another XML Document or some other data source outside the document. Even with XML Schema or DTD based validation it is difficult to protect against a simple, accidental modification of the structure of an XML Document using a text editor like notepad or vi, if the documents are simply stored in a file system.

By using Oracle9i to store and manage XML Documents an organization can ensure that the structure and content of their documents is valid. Storing XML Documents as instances of XMLType makes it possible to use database integrity checking to enforce rules relating to the structure and content of XML Documents. For example, the uniqueness of a text node or attribute value can be enforced by creating the appropriate unique index. Database triggers can be used to enforce more complex validation rules, such as structure validation, based on DTD or XML Schema, and content validation based on cross-checking attribute and node values against other data sources.

This section shows how, by storing documents as instances of XMLType, database triggers and indexes can be used to ensure the validity of the documents in question.
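
As a sketch of the unique index approach, the following statement indexes the Reference text node of each PurchaseOrder, using the same expression as the REFERENCE column of LINEITEMSVIEW. The index name is illustrative, and it is an assumption that the environment meets the prerequisites for function-based indexes (privileges and optimizer settings vary by release).

```sql
-- Enforce uniqueness of /PurchaseOrder/Reference across all stored documents.
-- substr() bounds the length of the indexed value, since getStringVal()
-- on its own returns a string too long to be used as an index key.
create unique index PURCHASEORDER_REFERENCE_IDX
    on PURCHASEORDER p
       (substr(sys.xmltype.getStringVal(
                 sys.xmltype.extract(p.PODOCUMENT,
                                     '/PurchaseOrder/Reference/text()')),
               1,26));
```

Once such an index exists, an attempt to insert a second document with the same Reference value would be rejected by the database with a unique constraint violation, with no application code involved.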

The first example shows how, by adding a BEFORE INSERT TRIGGER to the PURCHASEORDER table, it is possible to ensure that all of the documents stored in the PODOCUMENT column conform to the PurchaseOrder DTD. The trigger uses the Oracle9i XML Parser for PL/SQL to validate the documents being inserted. Since the PL/SQL implementation of the XML Parser is based on the Oracle XML Parser for Java, the user invoking the Parser must have been granted the JAVASYSPRIV privilege. They must also have been granted EXECUTE permission on the packages SYS.XMLPARSER and SYS.XMLDOM.

create or replace trigger PURCHASEORDERVALIDATION
before insert on PURCHASEORDER
for each row
declare
  PARSER       xmlparser.PARSER;
  DTD_SOURCE   clob;
  DTD_DOCUMENT xmldom.DOMDocumentType;
begin
  select DTD
    into DTD_SOURCE
    from DTDSTORE
   where DOCTYPE = 'PurchaseOrder';
  PARSER := xmlparser.newParser;
  xmlparser.setValidationMode( PARSER , false );
  xmlparser.parseDTDClob( PARSER , DTD_SOURCE , 'PurchaseOrder' );
  DTD_DOCUMENT := xmlparser.getDoctype( PARSER );
  xmlparser.setValidationMode( PARSER , true );
  xmlparser.setDoctype( PARSER , DTD_DOCUMENT );
  xmlparser.parseClob( PARSER , :new.PODOCUMENT.getClobVal() );
end;
/

Figure 27. DTD Validation using PL/SQL XML Parser.

When a row is inserted into the PURCHASEORDER table the trigger fires. The DTD is retrieved from the DTDSTORE table. In this example, the relationship between the DTD and the column is hard-coded into the trigger. The information as to which DTD is to be used could just as easily be obtained from the instance document or the insert statement itself. The PL/SQL XML Parser is then instantiated. The parser is used to create an instance of the DOMDocumentType object from the CLOB that was retrieved from the DTDSTORE table. The DOMDocumentType is then used to validate the XML Document that is being inserted into the PURCHASEORDER table. If the document being inserted does not conform to the DTD, the appropriate error message is generated.
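
The trigger assumes a DTDSTORE table keyed by document type. Its definition is not shown in this section, so the following is only a hypothetical layout that is consistent with the select statement in the trigger; the column sizes and constraints are assumptions.

```sql
-- Hypothetical definition: one row per document type, holding the DTD text.
create table DTDSTORE
(
  DOCTYPE varchar2(64) not null primary key,
  DTD     clob         not null
);

-- Quick check of which DTDs have been loaded and how large they are.
select DOCTYPE, dbms_lob.getlength(DTD) "DTD_LENGTH"
  from DTDSTORE;
```

Keeping the DTD in a table rather than in the trigger source means it can be replaced without recompiling the trigger, and the same table can serve validation triggers for other document types.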

In the following example the XML Document contained in the CLOB content does not conform to the PurchaseOrder DTD. The document is missing the mandatory element ‘/PurchaseOrder/User’.

insert into PURCHASEORDER (PODOCUMENT)
values(sys.xmltype.createXML(CONTENT));
commit;

ERROR at line 1:
ORA-20100: Error occurred while parsing: Invalid element 'CostCenter' in
content of 'PurchaseOrder', expected elements '[User]'.
ORA-06512: at "SYS.XMLPARSER", line 22
ORA-06512: at "SYS.XMLPARSER", line 98
ORA-06512: at "XMLUSER.PURCHASEORDERVALIDATION", line 20
ORA-04088: error during execution of trigger
'XMLUSER.PURCHASEORDERVALIDATION'
ORA-06512: at line 8

Figure 28. DTD Validation failing at Runtime.

When an attempt is made to insert the XMLType into the PURCHASEORDER table, the trigger is fired and the parser is used to validate the document. Validation fails, the appropriate error message is generated, and the insert operation fails. In this way the integrity of the data in the PODOCUMENT column is enforced.

In the next example the PurchaseOrder validation trigger is replaced with a trigger that enforces more traditional, database-style referential integrity checks on the contents of the XML Document that is being inserted.

create or replace trigger PURCHASEORDERVALIDATION
before insert on PURCHASEORDER
for each row
declare
  USERID              varchar2(24);
  UPC                 varchar2(14);
  LINEITEMS           sys.XMLTYPE;
  LINEITEM            sys.XMLTYPE;
  USER_ELEMENT        sys.XMLTYPE;
  UPC_ATTRIBUTEVALUE  sys.XMLTYPE;
  NOT_A_PURCHASEORDER exception;
  MISSING_USERID      exception;
  MISSING_LINEITEMS   exception;
  INVALID_USERID      exception;
  INVALID_LINEITEM    exception;
  cursor VALIDUSER (USERID in varchar2) is
    select 'TRUE'
      from SCOTT.EMP
     where ENAME = USERID;
  cursor VALIDDVD (UPC in varchar2) is
    select 'TRUE'
      from DVDTITLES
     where UPCCODE = UPC;
  VALIDVALUE          varchar2(4);
  I                   binary_integer;
begin
  if (:new.PODOCUMENT.existsnode('/PurchaseOrder') = 0) then
    raise NOT_A_PURCHASEORDER;
  end if;
  if (:new.PODOCUMENT.existsnode('/PurchaseOrder/User') > 0) then
    USER_ELEMENT := :new.PODOCUMENT.extract('/PurchaseOrder/User/text()');
  else
    raise MISSING_USERID;
  end if;
  USERID := USER_ELEMENT.getStringVal();
  open VALIDUSER(USERID);
  fetch VALIDUSER into VALIDVALUE;
  if (VALIDUSER%notfound) then
    close VALIDUSER;
    raise INVALID_USERID;
  end if;
  close VALIDUSER;
  LINEITEMS := :new.PODOCUMENT.extract('//LineItem');
  if (LINEITEMS is null) then
    raise MISSING_LINEITEMS;
  end if;
  I := 1;
  loop
    LINEITEM := LINEITEMS.extract('/LineItem['||I||']');
    exit when LINEITEM is null;
    UPC_ATTRIBUTEVALUE := LINEITEM.extract('//LineItem/Part/@Id');
    UPC := UPC_ATTRIBUTEVALUE.getStringVal();
    open VALIDDVD(UPC);
    fetch VALIDDVD into VALIDVALUE;
    if (VALIDDVD%notfound) then
      close VALIDDVD;
      raise INVALID_LINEITEM;
    end if;
    close VALIDDVD;
    I := I + 1;
  end loop;
exception
  when NOT_A_PURCHASEORDER then
    raise_application_error(-20005,
      'Only PurchaseOrder documents can be stored in this column.');
  when MISSING_USERID then
    raise_application_error(-20001,
      'PurchaseOrder must include Requestor');
  when MISSING_LINEITEMS then
    raise_application_error(-20002,
      'PurchaseOrder must include one or more Line Items.');
  when INVALID_USERID then
    raise_application_error(-20003,
      'Invalid Requestor ('|| USERID ||') encountered.');
  when INVALID_LINEITEM then
    raise_application_error(-20004,
      'Invalid UPC Code ('|| UPC || ') encountered.');
end PURCHASEORDERVALIDATION;
/

Figure 29. Referential Integrity Checking using PL/SQL and extract().

The trigger uses PL/SQL code to enforce the following rules on documents being inserted into the PURCHASEORDER table:

1. The root node of the document must be PurchaseOrder.
2. The document must contain the node ‘/PurchaseOrder/User’.
3. The value of the text node associated with the node ‘/PurchaseOrder/User’ must match one of the values in the ENAME column in the table SCOTT.EMP.
4. The document must contain at least one LineItem element.
5. The Id attribute of each Part element must match one of the values in the UPCCODE column in the DVDTITLES table.

If any of these conditions is not met, the trigger raises the appropriate error and the document is not stored in the database.
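
The first two rules can be exercised with deliberately invalid documents. The statements below are a sketch using invented content; each is expected to be rejected by the trigger with the application error noted in the comment, based on the exception handlers in Figure 29.

```sql
-- Rule 1: the root element is not PurchaseOrder,
-- so the trigger should raise ORA-20005.
insert into PURCHASEORDER (PODOCUMENT)
values (sys.xmltype.createXML('<Invoice/>'));

-- Rule 2: there is no /PurchaseOrder/User node,
-- so the trigger should raise ORA-20001.
insert into PURCHASEORDER (PODOCUMENT)
values (sys.xmltype.createXML(
          '<PurchaseOrder><Reference>EXAMPLE-1</Reference></PurchaseOrder>'));
```

Rules 3 and 5 depend on the contents of SCOTT.EMP and DVDTITLES, so test documents for them need a User value and Part Id values chosen against the data actually present in those tables.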

APPENDIX I – HELPER FUNCTIONS

Here are the helper functions used by the code examples in this paper. They are provided here for reference purposes.

create or replace function encodeBoolean(input sys.xmlType)
return raw deterministic
is
begin
  if (input is null) then
    return null;
  end if;
  if (input.getStringVal() = 'true') then
    return hexToRaw('01');
  end if;
  return hexToRaw('00');
end;
/
create or replace function getStringVal(input sys.xmlType)
return string deterministic
is
begin
  if (input is null) then
    return null;
  end if;
  return input.getStringVal();
end;
/
create or replace function getNumberVal(input sys.xmlType)
return number deterministic
is
begin
  if (input is null) then
    return null;
  end if;
  return input.getNumberVal();
end;
/
create or replace function decodeBoolean(input raw)
return varchar2 deterministic
is
begin
  if (input is null) then
    return null;
  end if;
  if (input = hexToRaw('01')) then
    return 'true';
  end if;
  return 'false';
end;
/

Figure 30. Helper Functions used in the code samples.
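
The helpers can be checked on their own from SQL. The queries below are a small sketch: the column aliases are illustrative, and the WideScreen fragment is an invented one-element document used only to produce a text node for encodeBoolean.

```sql
-- decodeBoolean maps the RAW(1) flags stored in DVDTITLES to the strings
-- used in the generated XML: '01' gives 'true', any other value gives 'false'.
select decodeBoolean(hexToRaw('01')) "FLAG_SET",
       decodeBoolean(hexToRaw('00')) "FLAG_CLEAR"
  from dual;

-- encodeBoolean performs the reverse mapping, starting from a text node;
-- rawToHex is used only to make the RAW result readable.
select rawToHex(
         encodeBoolean(
           sys.xmltype.createXML('<WideScreen>true</WideScreen>').extract('/WideScreen/text()')
         )) "ENCODED"
  from dual;
```

Because each helper returns null for a null input, a node that is absent from a document ends up as a null column value rather than causing the calling trigger to fail.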