SQL Server 2005: Deep Dive On XML And XQuery
Michael Rys
DAT405
Program Manager, SQL Server XML Technologies
Microsoft Corporation

XML And Relational Data Today
[Diagram: XML documents in the file system alongside several relational data stores, joined by a "Query and Combine" step.]

XML Scenarios
- Data Exchange
  - Business-to-business (B2B), business-to-consumer (B2C), application-to-application (A2A)
  - XML is a ubiquitous, extensible, platform-independent transport format
- Document Management
  - XHTML, Office XML documents
- Messaging
  - Simple Object Access Protocol (SOAP), RSS
- Mid-Tier Collaboration
- Ad-hoc modeling of semistructured data
  - Storing objects with sparse or multi-valued properties that do not fit well into traditional relational schemata
→ Transport, store, and query XML data

XML Or Relational?

Data characteristic          | XML | Relational
-----------------------------|-----|---------------------------------------------
Flat structured data         |     |
Hierarchical structured data |     | Not first class: PK-FK with cascading delete
Semi-structured data         |     | Not first class
Mark-up data                 |     | Not first class: FTS
Order preservation           |     | Not first class
Recursion                    |     | (Recursive query)

XML And Relational!

Scenario                 | XML                                   | Relational
-------------------------|---------------------------------------|-------------------------------------------
Relational data exchange | Use as transport, shred to relational | Storage and query
Document management      | Use as markup, store natively         | Provides framework to manage collections and relationships; provides full-text search
Semi-structured data     | Represent semi-structured parts       | Represent structured parts
Message audit            | Store natively                        | Used for querying over promoted properties
Object serialization     | Store natively                        | Used for querying over promoted properties

SQL Server 2005 XML Architecture
[Architecture diagram. XML side: XML passes through the XML Parser and Validation (against the XML Schemata held in a Schema Collection) into the XML data type (binary XML). Relational side: rowsets. OpenXML/nodes() turns XML into rowsets; FOR XML with the TYPE directive turns rowsets into XML. XQuery and XML-DML operate on the XML data type. The PRIMARY XML INDEX (node table) carries the PATH, PROP, and VALUE indexes.]
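
The rowset-to-XML arrow in the diagram (FOR XML with the TYPE directive) can be sketched as follows. This is a minimal illustration, not taken from the talk: the table variable @Customers and its columns are hypothetical, and the statements are written for SQL Server 2005 (separate DECLARE/SET, one row per INSERT).

```sql
-- Hypothetical sample data, used only for this illustration.
DECLARE @Customers TABLE (id int PRIMARY KEY, name nvarchar(50));
INSERT INTO @Customers VALUES (1, N'Janine');
INSERT INTO @Customers VALUES (2, N'Nils');

-- Rowset to XML: the TYPE directive returns an xml instance rather than
-- a string, so the XQuery methods can be applied to the result.
DECLARE @x xml;
SET @x = (SELECT id AS [@id], name
          FROM @Customers
          ORDER BY id
          FOR XML PATH('Customer'), ROOT('Customers'), TYPE);

-- The constructed instance is queried like any other xml value.
SELECT @x AS customers_xml,
       @x.query('/Customers/Customer[@id = 2]/name') AS second_name;
```

Without TYPE the subquery result would be a string that has to be converted back to xml before query() could be used on it.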

Why XQuery?
- SQL does not understand XML
- XPath 1.0
  - W3C Recommendation
  - Used in SQL Server 2000: SQLXML and OpenXML
  - Navigation, no reshaping
  - Limited knowledge about types
- XSLT
  - W3C Recommendation
  - Data-driven reshaping (uses XPath)
  - MSXML, System.XML
  - Hard to author and optimize for large amounts of data
- No XML data modification language (DML)

What Is XQuery?
- Queries and transforms trees
- Functional, declarative query language
- Combines XPath with node construction
- Operates on (XML Schema-)typed and unconstrained XML
- Designed to operate on large amounts of data
  - Optimizable
- Current status: in final Last Call
  - Recommendations in H2 CY2006
  - Full-text and DML extensions will follow later

Key XQuery Features
- FLWOR: FOR / LET / WHERE / ORDER BY / RETURN
- Includes XPath 2.0 (/doc[@id = 123])
- Element constructors (<topic>{…}</topic>)
- Order-preserving operators
  - Input order (FLWR)
  - Document order (XPath, union)
- Statically (or dynamically) typed
  - Strong typing with schema, weak typing without schema
- FLWOR compared with SQL:
    FOR      ~ FROM
    LET      ~ WITH and SET
    WHERE    ~ WHERE
    ORDER BY ~ ORDER BY
    RETURN   ~ SELECT
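
A small sketch of these features through the query() method: a path expression with a predicate, a FLWOR iteration, and an element constructor. The document content and the rank attribute are invented for the example. The sketch leaves out let because, as far as we are aware, the SQL Server 2005 XQuery subset does not implement it.

```sql
DECLARE @doc xml;
SET @doc = N'<doc id="123">
  <topic rank="2">Indexing</topic>
  <topic rank="1">XQuery</topic>
  <topic rank="3">Typing</topic>
</doc>';

SELECT @doc.query('
  for $t in /doc[@id = 123]/topic      (: XPath step with a predicate :)
  where $t/@rank < 3                   (: filter, like SQL WHERE :)
  order by $t/@rank                    (: reorder, like SQL ORDER BY :)
  return <topic>{ data($t) }</topic>   (: element constructor :)
') AS top_topics;
```

The instance is untyped, so the rank values are compared without schema knowledge; single-digit ranks are used here so that string and numeric ordering agree.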

XQuery Type System
- 3 classes of item types:
  - Node types: element(), attribute(), comment(), etc.
  - Element content types: xs:anyType, user-defined (e.g., my:CustomerT)
  - Atomic types: built-in and user-defined (e.g., xs:int, my:hatSize)
- XQuery uses XML Schema for content and atomic types
  - "Untyped" data have special types (e.g., xdt:untypedAtomic)
- XML Schema (W3C standard)
  - Rich mechanism for type definitions and validation constraints
  - Can be used to constrain XML documents
  - XML Schema Collections will be used for typing (meta-data)
- Benefits of typed data
  - Guarantees shape of data
  - Provides type-specific semantics
  - Allows storage and query optimizations
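
How an XML Schema Collection types an xml instance can be sketched like this. The collection name CustomerSchemas and the schema itself are assumptions made up for the illustration; only CREATE XML SCHEMA COLLECTION and the xml(collection) type syntax come from the product.

```sql
-- Hypothetical schema: a Customer with a name and an optional integer hatSize.
CREATE XML SCHEMA COLLECTION CustomerSchemas AS N'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="hatSize" type="xs:int" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>';
GO

-- A typed variable: assignments are validated against the collection.
DECLARE @typed xml(CustomerSchemas);
SET @typed = N'<Customer><name>Janine</name><hatSize>7</hatSize></Customer>';

-- This assignment would be rejected, because "large" is not an xs:int:
-- SET @typed = N'<Customer><name>Janine</name><hatSize>large</hatSize></Customer>';

-- hatSize is known to be xs:int, so it maps cleanly to a SQL int.
SELECT @typed.value('(/Customer/hatSize)[1]', 'int') AS hat_size;
```

The same xml(CustomerSchemas) type can be used for a table column, which is where the storage and query optimizations listed above would apply.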

Static Typing In XQuery
- Type inference: infers the type of an expression during compilation
- Type check: the inferred type must be a subtype of the expected type
- Benefits:
  - Compile-time type error discovery
  - Guarantees correct type at runtime
  - More efficient execution
- Costs:
  - Sometimes type inference is less precise than the data will be (inferring a list on /a[1]/b, although there will always be only one b)
  - Requires more explicit casts and "pick first" (/a[1]/b[1])
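
The cost side can be seen with the value() method, which requires a static singleton. A minimal sketch on an untyped instance invented for the example:

```sql
DECLARE @x xml;
SET @x = N'<a><b>1</b></a>';

-- Rejected at compile time: /a/b is statically a sequence of b elements,
-- even though this particular instance holds exactly one.
-- SELECT @x.value('/a/b', 'int');

-- "Pick first" makes the expression a static singleton.
SELECT @x.value('(/a/b)[1]', 'int') AS b_parenthesized,
       @x.value('/a[1]/b[1]', 'int') AS b_stepwise;
```

The two accepted forms differ in meaning on other data: (/a/b)[1] is the first b in the whole document, while /a[1]/b[1] is the first b under the first a.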

XML Data Modification
- XQuery extensions: insert, update, and delete
- XML sub-tree modification:
  - Add or delete XML sub-trees
  - Update values
- Generate consistent state

XML-DML: insert, delete, replace value of
[Diagram: a Customer element with a child name (xs:string, "Janine") and a child Order containing id (xs:int, 42); each statement is shown with its effect on the tree.]
- insert:
    insert <notes/> into /Customer
    insert <notes/> as last into /Customer
    insert <notes/> as first into /Customer
    insert <notes/> before /Customer/name
    insert <notes/> after /Customer/name
- delete:
    delete /Customer/Order[id = 42]
- replace value of (changes the name from "Janine" to "Nils"):
    replace value of (/Customer/name)[1] with "Nils"
- Target needs to be statically one node
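
The three operations can be tried on an xml variable through the modify() method. This is a sketch on an untyped copy of the slide's Customer tree, which is why it differs from the typed statements above in two places: the insert target is wrapped as (/Customer)[1] so that it is statically one node, and replace value of addresses the text node of name, since an untyped element has no simple typed value to replace.

```sql
DECLARE @c xml;
SET @c = N'<Customer><name>Janine</name><Order><id>42</id></Order></Customer>';

-- insert: position keywords are as first / as last / before / after.
SET @c.modify('insert <notes/> as first into (/Customer)[1]');
SET @c.modify('insert <notes/> after (/Customer/name)[1]');

-- replace value of: the target must be statically a single node.
SET @c.modify('replace value of (/Customer/name/text())[1] with "Nils"');

-- delete: removes every node the expression selects.
SET @c.modify('delete /Customer/Order[id = 42]');

SELECT @c AS modified_customer;
```

Each SET runs one XML-DML statement; there is no way in this syntax to combine several operations in a single modify() call.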

XQuery And XML-DML In SQL Server 2005
- Subset of XQuery implemented
  - Is aligned with July 2004 XQuery working draft
- Added XML Data Modification
- Applies to single XML data type instance
- Methods on XML data type:
  - query(), value(), exist(), modify(), nodes()
- Use SQL to iterate over collection of instances (XML-typed column)
- Can refer to relational data
- Take advantage of Schema-collection information to operate on typed XML data
- Will make use of XML indices for optimization

XQuery Methods
- query() creates a new, untyped XML data type instance
- value() extracts an XQuery value into the SQL value and type space
  - Expression has to statically be a singleton
  - String value of atomized XQuery item is cast to SQL type
  - SQL type has to be a SQL scalar type (no XML or CLR UDT)
- exist() returns 1 if the XQuery expression returns at least one item, 0 otherwise
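
The three methods side by side, as a minimal sketch. The table variable @Docs and its contents are hypothetical; SQL iterates over the rows while each method works on one xml instance.

```sql
DECLARE @Docs TABLE (id int PRIMARY KEY, x xml);
INSERT INTO @Docs VALUES (1, N'<a><b id="42">first</b><b id="7">second</b></a>');
INSERT INTO @Docs VALUES (2, N'<a><b id="7">other</b></a>');

SELECT id,
       x.query('/a/b[@id = 42]')       AS matching_b,  -- new untyped xml instance
       x.value('(/a/b/@id)[1]', 'int') AS first_b_id   -- SQL int, static singleton
FROM @Docs
WHERE x.exist('/a/b[@id = 42]') = 1;                   -- 1 if at least one item
```

Only the first row qualifies, because the second instance has no b element with id 42.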

XQuery: nodes()
- Provides OpenXML-like functionality on XML data type column in SQL Server 2005
- Returns a row per selected node
- Each row contains a special XML data type instance that
  - References one of the selected nodes
  - Preserves the original structure and types
  - Can only be used with the XQuery methods (not modify()), count(*), and IS (NOT) NULL
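
A sketch of nodes() shredding an xml column into rows, one per selected node. The table variable and the alias n(b) are invented for the example; CROSS APPLY is used so that nodes() is evaluated once per row of the base table.

```sql
DECLARE @Docs TABLE (id int PRIMARY KEY, x xml);
INSERT INTO @Docs VALUES (1, N'<a><b id="42">first</b><b id="7">second</b></a>');
INSERT INTO @Docs VALUES (2, N'<a><b id="7">other</b></a>');

-- n.b is the special xml instance that references one selected b node;
-- it can only be consumed through the XQuery methods.
SELECT d.id,
       n.b.value('@id', 'int')        AS b_id,
       n.b.value('.', 'nvarchar(50)') AS b_text
FROM @Docs AS d
CROSS APPLY d.x.nodes('/a/b') AS n(b);
```

Paths inside value() are evaluated relative to the referenced node, so '@id' and '.' need no leading /a/b.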

sql:column()/sql:variable()
- Map SQL value and type into XQuery values and types in context of XQuery or XML-DML
- sql:variable(): accesses a SQL variable/parameter
    declare @value int
    set @value = 42
    select * from T
    where T.x.exist('/a/b[@id=sql:variable("@value")]') = 1
- sql:column(): accesses another column value
    -- tables: T([key] int, x xml), S([key] int, val int)
    select * from T join S on T.[key] = S.[key]
    where T.x.exist('/a/b[@id=sql:column("S.val")]') = 1
- Restrictions in SQL Server 2005:
  - No XML, CLR UDT, datetime, or deprecated text/ntext/image

XQuery: modify()
- Used with SET:
    declare @xdoc xml
    set @xdoc.modify('delete /a/b[@id="42"]')

    update T set T.xdoc.modify('insert <b/> into /a')
    where T.id = 1
- Relational row-level concurrency: whole XML instance is locked

Combined SQL And XQuery/DML Processing
[Processing diagram for a statement such as SELECT x.query('…'), y FROM T WHERE …
 Static phase: the SQL Parser and the XQuery Parser each feed Static Typing and Algebrization, drawing on metadata and the XML Schema Collection; the result is static optimization of a combined logical and physical operation tree.
 Dynamic phase: runtime optimization and execution of the physical op tree, using XML and relational indices.]

XML Indices
- Create XML index on XML column
    CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc)
- Create secondary indexes on tags, values, paths
- Speed up queries
  - Results can be served directly from index
  - SQL's cost-based optimizer will consider index
- Primary and Secondary Indices will be efficiently maintained during updates
  - Only subtree that changes will be updated
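
A sketch of the full set of index statements for the docs table named above. The table definition and the secondary index names are assumptions; the primary XML index needs a clustered primary key on the base table, and each secondary index is declared against the primary one.

```sql
-- Hypothetical base table; PRIMARY KEY is clustered by default.
CREATE TABLE docs (id int PRIMARY KEY, xDoc xml);

-- Primary XML index: one per xml column.
CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc);

-- Secondary XML indexes, built on top of the primary XML index.
CREATE XML INDEX idx_1_path  ON docs (xDoc) USING XML INDEX idx_1 FOR PATH;
CREATE XML INDEX idx_1_value ON docs (xDoc) USING XML INDEX idx_1 FOR VALUE;
CREATE XML INDEX idx_1_prop  ON docs (xDoc) USING XML INDEX idx_1 FOR PROPERTY;
```

Dropping the primary XML index also removes the secondary indexes that depend on it.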

Example Index Contents
insert into Person values (42,
'<book ISBN="1-55860-438-3">
  <section>
    <title>Bad Bugs</title>
    Nobody loves bad bugs.
  </section>
  <section>
    <title>Tree Frogs</title>
    All right-thinking people <bold>love</bold> tree frogs.
  </section>
</book>')

Primary XML Index
CREATE PRIMARY XML INDEX PersonIdx ON Person (Pdesc)
Assumes typed data; columns and values are simplified, see VLDB 2004 paper for details.

PK | XID   | TAG ID      | Node      | Type-ID       | VALUE                     | HID
---|-------|-------------|-----------|---------------|---------------------------|---------------------
42 | 1     | 1 (book)    | Element   | 1 (bookT)     | null                      | #book
42 | 1.1   | 2 (ISBN)    | Attribute | 2 (xs:string) | 1-55860-438-3             | #@ISBN#book
42 | 1.3   | 3 (section) | Element   | 3 (sectionT)  | null                      | #section#book
42 | 1.3.1 | 4 (title)   | Element   | 2 (xs:string) | Bad Bugs                  | #title#section#book
42 | 1.3.3 | --          | Text      | --            | Nobody loves bad bugs.    | #text()#section#book
42 | 1.5   | 3 (section) | Element   | 3 (sectionT)  | null                      | #section#book
42 | 1.5.1 | 4 (title)   | Element   | 2 (xs:string) | Tree frogs                | #title#section#book
42 | 1.5.3 | --          | Text      | --            | All right-thinking people | #text()#section#book
42 | 1.5.5 | 7 (bold)    | Element   | 4 (boldT)     | love                      | #bold#section#book
42 | 1.5.7 | --          | Text      | --            | tree frogs                | #text()#section#book

Architectural Blueprint: Indexing
[Diagram: XML column in table T(id, x), with rows id = 1, 2, 3 each holding binary XML.
 Primary XML Index (1 per XML column), clustered on primary key (of table T), XID; columns PK, XID, NID, TID, VALUE, LVALUE, HID, xsinil, …, with several index rows per base-table row.
 Non-clustered secondary indices (n per primary index): Value Index, Path Index, Property Index.]

Take-Away: XML Indices
- PRIMARY XML Index: use when there is a lot of XQuery
- FOR VALUE: useful for queries where values are more selective than paths, such as //*[.="Seattle"]
- FOR PATH: useful for path expressions; avoids joins by mapping paths to hierarchical index (HID) numbers. Example: /person/address/zip
- FOR PROPERTY: useful when the optimizer chooses another index (e.g., on a relational column, or FT index) in addition, so the row is already known
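
The query shapes each index kind is aimed at, as a hedged sketch. The persons table, its columns, and the literal values are invented; whether the optimizer actually uses a given index depends on its cost estimates, and the matching secondary index (CREATE XML INDEX … USING XML INDEX … FOR VALUE, PATH, or PROPERTY) is assumed to exist in each case.

```sql
-- Hypothetical table with a primary XML index on its xml column.
CREATE TABLE persons (id int PRIMARY KEY, pdoc xml);
CREATE PRIMARY XML INDEX persons_px ON persons (pdoc);

-- VALUE-style query: the value is known, the path is a wildcard.
SELECT id FROM persons
WHERE pdoc.exist('//*[. = "Seattle"]') = 1;

-- PATH-style query: a fully specified path, optionally with a value.
SELECT id FROM persons
WHERE pdoc.exist('/person/address/zip[. = "98052"]') = 1;

-- PROPERTY-style query: the row is already found through the relational
-- key, and a property of that one instance is retrieved.
SELECT pdoc.value('(/person/address/zip)[1]', 'nvarchar(10)') AS zip
FROM persons
WHERE id = 42;
```

Comparing the actual execution plans with and without each secondary index is the practical way to decide which ones are worth their maintenance cost.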

Session Summary
- SQL Server 2005 provides XQuery and XML DML on the XML data type
  - XQuery subset based on July 2004 WD
  - Typing provided by XML Schema collections on XML data type
  - Node-based Data Manipulation Language (DML)
  - Integrates with relational processing
- Optimization:
  - Using extended relational algebra and query optimizer
  - Indexing of XML data type

Community Resources
- At PDC
  - DAT Track lounge: I'll be there daily
- After PDC
  - MSDN dev center: http://msdn.microsoft.com/SQL/2005
  - XML and Databases whitepapers: http://msdn.microsoft.com/XML/BuildingXML/XMLandDatabase/
  - Online WebCasts: http://msdn.microsoft.com/sql/2005/2005webcasts/
  - Newsgroups & Forum: news:microsoft.public.sqlserver.xml and http://forums.microsoft.com/msdn/ShowForum.aspx?ForumID=89
  - My E-mail: [email protected]
  - My Weblog: http://www.sqljunkies.com/weblog/mrys
- Please fill out Session Evaluation