Using UML To Define XML Document Types
W. Eliot Kimber, John D. Heintz
Agenda
Problem Definition
– What we are and are not building
– General system and document modeling
– Modeling information structures (e.g., DTDs)
– Difficult to integrate DTDs with rest of system model
Solution
– Catalysis-style refinement
– DTD models as implementation refinement of abstract business objects
– Types and stereotypes
Simple Example
Summary
Future Work
Transcend Syntax
Problem Definition
How do we integrate traditional system engineering modeling practice with traditional
SGML and XML document analysis and modeling?
What Are We Building
We build standards-based information management systems, primarily SGML- and XML-based
E.g., documentation authoring, production, and delivery
Often integrated with other core business processes:
– Product engineering
– Marketing support
– Legislation
XML-based data is the primary work product of system users
Typical System Requirements
Must support many document types:
– Reflect complex (and often arcane) business rules
– Reflect distinct cultures and practices of authors
– Form families of related document types
– May need to integrate with industry standards (ATA 2100, DocBook, etc.)
Tens or hundreds of thousands of individual documents
Hyperlinking and use by reference
Must integrate with other information systems and business processes
Multiple outputs from a single source: print, HTML, etc.
Long life cycle documents (20-100+ years)
What We Are Not Building
Not using XML just for simple object marshalling
Not using XML just for messaging among system components
System and Document Modeling
Want to use UML-based data and object modeling to define our systems
Traditional document analysis does not use formal data modeling
Impedance mismatch similar to storing a program in a relational database
Two ways to solve:
– Define mapping mechanism from DTDs to object models
– Define mapping mechanism from object models to DTDs
Mapping from DTDs to Object Models
This approach is problematic:
– DTDs are not really data models…
– …only weak syntax constraints
– DTDs provide no way to capture abstraction across models
– DTDs are implementation views of some higher-level abstraction
– May be many ways to interpret a given XML structure as objects
– May need multiple related DTDs for the same business object
– Authoring vs. delivery, different languages or cultures, etc.
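The point that one XML structure can be read as objects in more than one way can be illustrated with a small sketch. The sample element, the dictionary shapes, and the function names below are all hypothetical and are not taken from the presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance fragment; nothing in the markup says which reading is intended.
SAMPLE = '<part number="A-100">Left bracket</part>'


def as_part_object(xml_text: str) -> dict:
    """Reading 1: the element *is* a Part business object with two properties."""
    elem = ET.fromstring(xml_text)
    return {"type": "Part", "number": elem.get("number"), "name": elem.text}


def as_part_reference(xml_text: str) -> dict:
    """Reading 2: the element is a reference to a Part that lives elsewhere."""
    elem = ET.fromstring(xml_text)
    return {
        "type": "PartReference",
        "target_key": elem.get("number"),
        "display_text": elem.text,
    }


if __name__ == "__main__":
    print(as_part_object(SAMPLE))
    print(as_part_reference(SAMPLE))
```

Both readings are consistent with the same syntax, which is why the object model, not the DTD, has to say which one is meant.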
Practical Difficulties of DTD Development
Tools for developing DTDs not integrated with other system design tools
No standard graphical representation
Difficult to engineer system objects and models from DTDs
Difficult to integrate DTD documentation with DTD definition
DTDs are not modular, making management of related DTDs difficult…
…DTDs are not inherently shareable.
We Had To Reject Traditional Approach
We wanted to apply formal system modeling to XML-based systems
With focus on DTDs, XML document type components could be bound to implementation definition only through documentation strings
No automated traceability from requirements to XML rules:
– Difficult to define relationships among XML data and related code objects
– Difficult to define relationships among different XML components (architectures provide some but not all)
Difficult to capture re-usable XML parts of designs
DTD documentation became unmanageable
Solution: Map Object Models to XML Document Types
Where we realize that DTDs are just implementation refinements of
higher-level abstractions
Mapping Objects to Document Type Definitions
DTD becomes implementation view of higher-level abstractions…
…Design focus is on business objects not data representation details
Traceability from system objects and formal requirements to DTD implementation
Can use facilities of modeling language not available in DTDs
Can bind documentation directly to model
Can use formal constraints to define semantic and syntactic constraints
Refinement: Relating Layers of Abstraction
A complete system model will have several layers of abstraction:
– High-level system model
– Functional requirements model
– Implementation design model
Objects in one level will be reflected in other levels, but not necessarily directly
Design traceability requires formal mapping from objects in one level to objects in the adjacent models
Any number of implementation models can refine a given functional model
Refinement from High to Low Abstraction
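[Figure: an abstract system design refined into two implementation models]
Abstract System Design: types A, B
System Implementation 1: types A, B, C, D
– Refinement: A -> A, C, D; B -> B
System Implementation 2: types B, E, F
– Refinement: A -> E, F; B -> B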
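The refinement relationships in the figure can be recorded as data so that traceability can be queried in both directions. This is a minimal sketch assuming a plain dictionary representation; the names `REFINEMENTS`, `trace_down`, and `trace_up` are hypothetical and do not come from the presentation or from Catalysis.

```python
from __future__ import annotations

# Refinement mappings as given in the figure:
# Implementation 1: A -> A, C, D; B -> B.  Implementation 2: A -> E, F; B -> B.
REFINEMENTS: dict[str, dict[str, list[str]]] = {
    "System Implementation 1": {"A": ["A", "C", "D"], "B": ["B"]},
    "System Implementation 2": {"A": ["E", "F"], "B": ["B"]},
}


def trace_down(implementation: str, abstract_type: str) -> list[str]:
    """Return the implementation types that refine an abstract type."""
    return REFINEMENTS[implementation][abstract_type]


def trace_up(implementation: str, impl_type: str) -> list[str]:
    """Return the abstract types that a given implementation type refines."""
    return [
        abstract
        for abstract, refined in REFINEMENTS[implementation].items()
        if impl_type in refined
    ]


if __name__ == "__main__":
    print(trace_down("System Implementation 1", "A"))
    print(trace_up("System Implementation 2", "F"))
```

Keeping one mapping per implementation model reflects the statement that any number of implementation models can refine the same functional model.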
Problem: How To Bind Types to XML Syntax?
UML data models define types
Must have formal, computer-sensible way to map UML types to XML DTD syntactic constructs: elements, attributes, content models, notations
A fixed mapping from UML graphical components to XML components won’t work:
– Some types will be element types
– Some types will be attributes
– Some types will be notations
No direct analog of content models in UML language
Solution: UML Stereotypes
Stereotypes characterize UML syntactic components to add specialized semantics:
– <<element>> Book
– <<attribute>> Author
UML does not define how a set of stereotypes is formally defined
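A sketch of how stereotype tags could drive generation of DTD declarations is given here for illustration. It assumes that Author is an attribute of Book, that the content model is left as ANY, and that the attribute is declared CDATA #IMPLIED; none of these details are stated in the slides. The `ModelType` class and `to_dtd` function are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelType:
    """A UML type tagged with a stereotype (hypothetical representation)."""
    name: str
    stereotype: str  # e.g. "element" or "attribute"
    members: List["ModelType"] = field(default_factory=list)


def to_dtd(model_type: ModelType) -> str:
    """Emit DTD declarations for an element type and its attribute members."""
    if model_type.stereotype != "element":
        raise ValueError(f"{model_type.name} is not stereotyped as an element")
    # Assumption: content model left open (ANY) in this sketch.
    lines = [f"<!ELEMENT {model_type.name} ANY>"]
    for member in model_type.members:
        if member.stereotype == "attribute":
            # Assumption: every attribute is optional character data.
            lines.append(
                f"<!ATTLIST {model_type.name} {member.name} CDATA #IMPLIED>"
            )
    return "\n".join(lines)


book = ModelType("Book", "element", [ModelType("Author", "attribute")])

if __name__ == "__main__":
    print(to_dtd(book))
```

The stereotype, not the UML graphical construct, decides which XML construct is produced, which is the reason a fixed mapping from UML components is not needed.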
Components of Our Solution
We had to define the set of stereotypes needed to enable mapping to XML DTD syntax
Had to define the semantics of those stereotypes:
– Defined the stereotypes as UML types in their own package
– Formal constraints on these types plus prose documentation define the semantics
– The XML stereotypes in turn map back to the formal models for XML as defined by ISO and/or W3C
The stereotypes reflect the abstract model for XML DTD declarations (element type, attribute, notation, etc.)…
…therefore, can map to any XML DTD representation syntax (markup declarations, XML Schema, etc.)
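Because the stereotypes describe abstract declarations rather than a concrete syntax, one declaration can be rendered in more than one representation. The sketch below renders a single attribute declaration (Sect's change_status of type Nmtoken, taken from the example later in the deck) as a DTD markup declaration and as an XML Schema fragment. The `AttributeDecl` class, the function names, the `#IMPLIED` default, and the type table are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeDecl:
    """Abstract attribute declaration, independent of representation syntax."""
    element: str
    name: str
    xml_type: str  # DTD attribute type keyword, e.g. "NMTOKEN" or "CDATA"


# Assumed correspondence between DTD attribute types and XML Schema built-ins.
XSD_TYPES = {"CDATA": "xs:string", "NMTOKEN": "xs:NMTOKEN", "ID": "xs:ID"}


def as_markup_declaration(decl: AttributeDecl) -> str:
    """Render as a DTD attribute-list declaration (optional attribute assumed)."""
    return f"<!ATTLIST {decl.element} {decl.name} {decl.xml_type} #IMPLIED>"


def as_xml_schema(decl: AttributeDecl) -> str:
    """Render as an xs:attribute fragment, to be placed inside the complex
    type of the owning element in a schema document."""
    return f'<xs:attribute name="{decl.name}" type="{XSD_TYPES[decl.xml_type]}"/>'


change_status = AttributeDecl("Sect", "change_status", "NMTOKEN")

if __name__ == "__main__":
    print(as_markup_declaration(change_status))
    print(as_xml_schema(change_status))
```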
Document Analysis Produces Business Object Design
Document analysis now results in document business object models
For us, document analysis is part of the larger system analysis task…
…documents are just another kind of business object…
…may or may not be represented in XML in implementation.
Document Analysis (cont)
Focus of document analysis stays on business requirements, not syntax details
Document analysis results in abstract information model for business objects that are documents (in the everyday sense)
From this model, multiple implementations (“DTDs”) may be refined
XML syntax details defined as part of implementation task, not system analysis task
Can have multiple XML or non-XML implementations of same document business objects with full design traceability
Simple Example
A trivial but representative example of an abstract
information model and an implementation refinement
Document Business Object Model
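[Figure: UML class diagram of the document business object model]
Classes:
– Report
– Report_Metadata (+Title: String, +Revision_Date: Date)
– Division
– Division_Metadata (+title: String, +change_status: Status_Type)
– Div_or_Div_Components <<select>> Div_Components
– Paragraph
– Table
Association roles and multiplicities:
– metadata, body 1..*, frontmatter *, backmatter *
– metadata, body 1..*, introduction 0..1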
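One possible reading of this model as plain data classes is sketched below. The class and attribute names follow the diagram, but the association targets cannot be recovered from the transcript, so the target types (for example, that a Report body holds Divisions and that an introduction is a Paragraph) are assumptions, as are the `text` and `rows` fields.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Union


@dataclass
class Paragraph:
    text: str = ""  # hypothetical field; the diagram shows no attributes


@dataclass
class Table:
    rows: List[List[str]] = field(default_factory=list)  # hypothetical field


@dataclass
class DivisionMetadata:
    title: str
    change_status: str  # stands in for Status_Type


@dataclass
class Division:
    metadata: DivisionMetadata
    body: List[Union[Division, Paragraph, Table]]  # 1..* (assumed targets)
    introduction: Optional[Paragraph] = None       # 0..1 (assumed target)

    def __post_init__(self) -> None:
        if not self.body:
            raise ValueError("Division.body needs at least one item (1..*)")


@dataclass
class ReportMetadata:
    title: str
    revision_date: date


@dataclass
class Report:
    metadata: ReportMetadata
    body: List[Division]                                        # 1..* (assumed target)
    frontmatter: List[Division] = field(default_factory=list)  # * (assumed target)
    backmatter: List[Division] = field(default_factory=list)   # * (assumed target)

    def __post_init__(self) -> None:
        if not self.body:
            raise ValueError("Report.body needs at least one Division (1..*)")
```

Nothing in this sketch mentions elements or attributes; that choice belongs to the implementation refinement, not to the business object model.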
XML Implementation Model Top Level
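[Figure: UML class diagram of the top-level XML implementation model]
– Report «element», with «attribute» +revision_date: Date
– Title «element»
– Front «element»
– Body «element»
– Back_Matter «element»
– Sect «element», with +change_status: Nmtoken
– Title_Text «pcdata»
Multiplicities shown: 1..*, 1..*, 1..*, 0..1, 0..1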
<P> Element Type
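[Figure: UML class diagram of the P element type]
– P «element»
– Paragraph_Components «model_group»
– Text «pcdata»
– Part_Number_Reference «element»
– XRef «element»
– «refine» relationship to Report_Data_Model.Paragraph
Multiplicity shown: 0..*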
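A model group that mixes «pcdata» with «element» members and repeats 0..* corresponds naturally to a DTD mixed-content declaration. The sketch below assumes that mapping; the function name `mixed_content_declaration` is hypothetical, and the presentation does not show generated DTD output.

```python
from typing import List


def mixed_content_declaration(element: str, alternatives: List[str]) -> str:
    """Build a DTD mixed-content declaration.

    XML requires #PCDATA to come first and the whole group to repeat with '*'.
    """
    group = " | ".join(["#PCDATA", *alternatives])
    return f"<!ELEMENT {element} ({group})*>"


if __name__ == "__main__":
    print(mixed_content_declaration("P", ["Part_Number_Reference", "XRef"]))
```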
Re-Use of OASIS Table Model
[Figure: UML class diagram showing re-use of the OASIS table model]
– Paragraph_Stuff «model_group»
– P «element»
– OASIS_Table_Model.Table «element» (+pgwide: Boolean_Att, +frame: Name_Group)
Full DTD View
[Figure: UML class diagram of the complete DTD model]
– Elements: Report (with «attribute» +revision_date: Date), Title, Body, Back_Matter, Front, Sect (+change_status: Nmtoken), Sect_Body, Intro, P, Part_Number_Reference, XRef, OASIS_Table_Model.Table (+pgwide: Boolean_Att, +frame: Name_Group)
– Model groups: Sect_Content, Intro_and_Subsects, Paragraph_Components, Paragraph_Stuff
– PCDATA: Title_Text, Part_Number_Value, Target_Title, Text
– Notations: My_DB_Query (+get_value_by_key()), URI
– Other: href «reference», part_number «value_reference_attribute», Cross_Reference
– Relationship labels: +source, «value_reference», +governing_notation, «source», +governing_notation, +subject, +mark, «anchor_address»
– Multiplicities shown: 1..*, 1..*, 1..*, 0..*, 1..*
Summary
Benefits for System Design and Implementation
Offers traceability from abstract system modeling to XML implementation
Offers rich set of features for managing DTD definitions:
– Provides modularity through UML packages
– Provides all of UML’s typing to XML components
– Provides formal syntactic and semantic constraints through UML Object Constraint Language (or equivalent)
Focus of document analysis is on business objects, not on implementation technology or representation syntax
Same business models can be refined into XML, CORBA IDL, Java objects, RDBMS tables, etc….
Benefits for XML Practitioner
XML design completely integrated into larger system design
Can use existing tools to develop and maintain DTD definition (e.g., Rose, ObjectDomain, etc.)
Provides design documentation in form easily understood by implementors
Get graphical representations of DTDs for free
Documentation can be bound directly to model
Elevates XML to first-class citizen in system design
No need to choose a particular DTD representation syntax (e.g., DTD declarations vs. XML Schema)…
…both are simply generated from UML model
Work to Be Done
Implement DTD syntax output generators
Better understand how UML packages, Catalysis refinement, OO frameworks, and SGML architectures interact
Understand how to map these models to groves at different levels of abstraction (e.g., groves that reflect the business object model, not the XML syntax model)
Expand model to include hyperlink representation
Apply approach in practice