Using UML to define XML document types

Using UML To Define XML Document Types W. Eliot Kimber John D. Heintz

Transcript of Using UML to define XML document types

Page 1: Using UML to define XML document types

Using UML To Define XML Document Types

W. Eliot Kimber, John D. Heintz

Page 2: Using UML to define XML document types

Agenda

Problem Definition
– What we are and are not building
– General system and document modeling
– Modeling information structures (e.g., DTDs)
– Difficult to integrate DTDs with rest of system model

Solution
– Catalysis-style refinement
– DTD models as implementation refinement of abstract business objects
– Types and stereotypes

Simple Example
Summary
Future Work

Page 3: Using UML to define XML document types

Transcend Syntax

Page 4: Using UML to define XML document types

Problem Definition

How do we integrate traditional system engineering modeling practice with traditional SGML and XML document analysis and modeling?

Page 5: Using UML to define XML document types

What Are We Building

We build standards-based information management systems, primarily SGML- and XML-based
E.g., documentation authoring, production, and delivery
Often integrated with other core business processes:
– Product engineering
– Marketing support
– Legislation
XML-based data is the primary work product of system users

Page 6: Using UML to define XML document types

Typical System Requirements

Must support many document types:
– Reflect complex (and often arcane) business rules
– Reflect distinct cultures and practices of authors
– Form families of related document types
– May need to integrate with industry standards (ATA 2100, DocBook, etc.)
Tens or hundreds of thousands of individual documents
Hyperlinking and use by reference
Must integrate with other information systems and business processes
Multiple outputs from a single source: print, HTML, etc.
Long life cycle documents (20-100+ years)

Page 7: Using UML to define XML document types

What We Are Not Building

Not using XML just for simple object marshalling

Not using XML just for messaging among system components

Page 8: Using UML to define XML document types

System and Document Modeling

Want to use UML-based data and object modeling to define our systems
Traditional document analysis does not use formal data modeling
Impedance mismatch similar to storing a program in a relational database
Two ways to solve:
– Define mapping mechanism from DTDs to object models
– Define mapping mechanism from object models to DTDs

Page 9: Using UML to define XML document types

Mapping from DTDs to Object Models

This approach is problematic:
– DTDs are not really data models…
– …only weak syntax constraints
– DTDs provide no way to capture abstraction across models
– DTDs are implementation views of some higher-level abstraction
– May be many ways to interpret a given XML structure as objects
– May need multiple related DTDs for the same business object
– Authoring vs. delivery, different languages or cultures, etc.

Page 10: Using UML to define XML document types

Practical Difficulties of DTD Development

Tools for developing DTDs not integrated with other system design tools
No standard graphical representation
Difficult to engineer system objects and models from DTDs
Difficult to integrate DTD documentation with DTD definition
DTDs are not modular, making management of related DTDs difficult…
…DTDs are not inherently shareable.

Page 11: Using UML to define XML document types

We Had To Reject Traditional Approach

We wanted to apply formal system modeling to XML-based systems
With focus on DTDs, XML document type components could be bound to implementation definition only through documentation strings
No automated traceability from requirements to XML rules:
– Difficult to define relationships among XML data and related code objects
– Difficult to define relationships among different XML components (architectures provide some but not all)
Difficult to capture re-usable XML parts of designs
DTD documentation became unmanageable

Page 12: Using UML to define XML document types

Solution: Map Object Models to XML Document Types

Where we realize that DTDs are just implementation refinements of higher-level abstractions

Page 13: Using UML to define XML document types

Mapping Objects to Document Type Definitions

DTD becomes implementation view of higher-level abstractions…

…Design focus is on business objects not data representation details

Traceability from system objects and formal requirements to DTD implementation

Can use facilities of modeling language not available in DTDs

Can bind documentation directly to model

Can use formal constraints to define semantic and syntactic constraints

Page 14: Using UML to define XML document types

Refinement: Relating Layers of Abstraction

A complete system model will have several layers of abstraction:
– High-level system model
– Functional requirements model
– Implementation design model

Objects in one level will be reflected in other levels, but not necessarily directly

Design traceability requires formal mapping from objects in one level to objects in the adjacent models

Any number of implementation models can refine a given functional model

Page 15: Using UML to define XML document types

Refinement from High to Low Abstraction

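[Figure: an Abstract System Design containing objects A and B is refined into two implementations. System Implementation 1 contains A, B, C, and D, with refinement A -> A, C, D and B -> B. System Implementation 2 contains B, E, and F, with refinement A -> E, F and B -> B.]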
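The two refinements in the figure can be sketched as plain lookup tables that support tracing in both directions. This is only an illustration in Python: the `Refinement` class and its `trace_down` and `trace_up` methods are hypothetical names, not part of the authors' tooling, and only the object names and mappings come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Refinement:
    """Hypothetical record of how one abstract model is refined into one implementation."""
    abstract_model: str
    implementation_model: str
    # abstract object name -> implementation object names that realize it
    mapping: Dict[str, List[str]] = field(default_factory=dict)

    def trace_down(self, abstract_object: str) -> List[str]:
        """Implementation objects that refine the given abstract object."""
        return list(self.mapping.get(abstract_object, []))

    def trace_up(self, implementation_object: str) -> List[str]:
        """Abstract objects that the given implementation object helps realize."""
        return sorted(
            abstract
            for abstract, implementations in self.mapping.items()
            if implementation_object in implementations
        )


# The two refinements shown on the slide.
impl1 = Refinement(
    "Abstract System Design", "System Implementation 1",
    {"A": ["A", "C", "D"], "B": ["B"]},
)
impl2 = Refinement(
    "Abstract System Design", "System Implementation 2",
    {"A": ["E", "F"], "B": ["B"]},
)
```

Calling `impl1.trace_down("A")` lists the implementation objects for A, and `impl2.trace_up("F")` walks back from an implementation object to the abstract object it refines, which is the traceability the previous slide asks for.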

Page 16: Using UML to define XML document types

Problem: How To Bind Types to XML Syntax?

UML data models define types
Must have formal, computer-sensible way to map UML types to XML DTD syntactic constructs: elements, attributes, content models, notations
A fixed mapping from UML graphical components to XML components won’t work:
– Some types will be element types
– Some types will be attributes
– Some types will be notations
No direct analog of content models in UML language

Page 17: Using UML to define XML document types

Solution: UML Stereotypes

Stereotypes characterize UML syntactic components to add specialized semantics:

<<element>>Book

<<attribute>>Author

UML does not define how a set of stereotypes is formally defined
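As a rough illustration of a stereotype as a tag attached to a type, the Python sketch below models the two examples from this slide. The `Stereotype` enum and `ModelType` class are hypothetical names chosen for the example; they are not UML API and not the authors' stereotype package.

```python
from dataclasses import dataclass
from enum import Enum


class Stereotype(Enum):
    """Hypothetical subset of XML-oriented stereotypes named in the slides."""
    ELEMENT = "element"
    ATTRIBUTE = "attribute"
    NOTATION = "notation"


@dataclass(frozen=True)
class ModelType:
    """A UML type together with the stereotype that says how it maps to XML."""
    name: str
    stereotype: Stereotype

    def label(self) -> str:
        # Render the type the way the slide writes it, e.g. <<element>>Book.
        return f"<<{self.stereotype.value}>>{self.name}"


book = ModelType("Book", Stereotype.ELEMENT)
author = ModelType("Author", Stereotype.ATTRIBUTE)
```

The same UML box can therefore end up as an element type or an attribute depending only on its stereotype, which is why a fixed mapping from UML shapes to XML constructs is not enough.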

Page 18: Using UML to define XML document types

Components of Our Solution

We had to define the set of stereotypes needed to enable mapping to XML DTD syntax
Had to define the semantics of those stereotypes:
– Defined the stereotypes as UML types in their own package
– Formal constraints on these types plus prose documentation define the semantics
– The XML stereotypes in turn map back to the formal models for XML as defined by ISO and/or W3C
The stereotypes reflect the abstract model for XML DTD declarations (element type, attribute, notation, etc.)…
…therefore, can map to any XML DTD representation syntax (markup declarations, XML Schema, etc.)
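To make the last point concrete, here is a minimal Python sketch of one abstract declaration rendered in two representation syntaxes. It is deliberately limited to a text-only element with string attributes. The names `TextElementDecl`, `AttributeDecl`, `to_dtd`, and `to_xsd`, and the `Note` element with its `status` attribute, are all hypothetical; the slides list output generators as work still to be done.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AttributeDecl:
    """Hypothetical abstract attribute declaration (string-valued only)."""
    name: str
    required: bool = False


@dataclass
class TextElementDecl:
    """Hypothetical abstract declaration of an element containing only text."""
    name: str
    attributes: List[AttributeDecl] = field(default_factory=list)


def to_dtd(decl: TextElementDecl) -> str:
    """Render the declaration as XML markup declarations."""
    lines = [f"<!ELEMENT {decl.name} (#PCDATA)>"]
    for att in decl.attributes:
        default = "#REQUIRED" if att.required else "#IMPLIED"
        lines.append(f"<!ATTLIST {decl.name} {att.name} CDATA {default}>")
    return "\n".join(lines)


def to_xsd(decl: TextElementDecl) -> str:
    """Render the same declaration as an XML Schema element declaration."""
    lines = [
        f'<xs:element name="{decl.name}">',
        "  <xs:complexType>",
        "    <xs:simpleContent>",
        '      <xs:extension base="xs:string">',
    ]
    for att in decl.attributes:
        use = "required" if att.required else "optional"
        lines.append(
            f'        <xs:attribute name="{att.name}" type="xs:string" use="{use}"/>'
        )
    lines += [
        "      </xs:extension>",
        "    </xs:simpleContent>",
        "  </xs:complexType>",
        "</xs:element>",
    ]
    return "\n".join(lines)


# Hypothetical example declaration, not taken from the slides.
note = TextElementDecl("Note", [AttributeDecl("status")])
```

Both `to_dtd(note)` and `to_xsd(note)` read the same abstract object, so the choice of representation syntax becomes an output option rather than a design decision.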

Page 19: Using UML to define XML document types

Document Analysis Produces Business Object Design

Document analysis now results in document business object models

For us, document analysis is part of the larger system analysis task…

…documents are just another kind of business object…

…may or may not be represented in XML in implementation.

Page 20: Using UML to define XML document types

Document Analysis (cont.)

Focus of document analysis stays on business requirements, not syntax details

Document analysis results in abstract information model for business objects that are documents (in the everyday sense)

From this model, multiple implementations (“DTDs”) may be refined

XML syntax details defined as part of implementation task, not system analysis task

Can have multiple XML or non-XML implementations of same document business objects with full design traceability

Page 21: Using UML to define XML document types

Simple Example

A trivial but representative example of an abstract information model and an implementation refinement

Page 22: Using UML to define XML document types

Document Business Object Model

[Class diagram. Classes: Report; Report_Metadata (+Title: String, +Revision_Date: Date); Division; Division_Metadata (+title: String, +change_status: Status_Type); Div_or_Div_Components and Div_Components (with a «select» stereotype); Paragraph; Table. Association roles shown: metadata; body 1..*, frontmatter *, backmatter *; metadata, body 1..*, introduction 0..1.]

Page 23: Using UML to define XML document types

XML Implementation Model Top Level

[Class diagram. Report «element» with «attribute» +revision_date: Date; Title «element»; Front «element»; Body «element»; Back_Matter «element»; Sect «element» with +change_status: Nmtoken; Title_Text «pcdata». Multiplicities shown: 1..* (three times) and 0..1 (twice).]

Page 24: Using UML to define XML document types

<P> Element Type

[Class diagram. P «element»; Paragraph_Components «model_group»; Text «pcdata»; Part_Number_Reference «element»; XRef «element»; multiplicity 0..*; a «refine» relationship to Report_Data_Model.Paragraph.]
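One plausible reading of this diagram is that P contains a repeatable (0..*) choice among Text, Part_Number_Reference, and XRef, which in DTD terms is a mixed content model. The Python sketch below renders that reading as an element declaration. Treating the three types as members of Paragraph_Components is an assumption about the diagram, and `Member` and `mixed_content_declaration` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Member:
    """Hypothetical member of a «model_group»: a name plus its stereotype."""
    name: str
    stereotype: str  # "element" or "pcdata"


def mixed_content_declaration(element_name: str, members: List[Member]) -> str:
    """Render a repeatable choice group as a DTD element declaration."""
    child_names = [m.name for m in members if m.stereotype == "element"]
    has_text = any(m.stereotype == "pcdata" for m in members)
    # XML requires #PCDATA to be the first token of a mixed content model.
    tokens = (["#PCDATA"] if has_text else []) + child_names
    if not tokens:
        return f"<!ELEMENT {element_name} EMPTY>"
    return f"<!ELEMENT {element_name} ({' | '.join(tokens)})*>"


# Assumed membership of Paragraph_Components, read off the slide.
paragraph_components = [
    Member("Text", "pcdata"),
    Member("Part_Number_Reference", "element"),
    Member("XRef", "element"),
]

declaration = mixed_content_declaration("P", paragraph_components)
```

The «pcdata» member does not become an element name in the output; it only switches the declaration to mixed content, which shows why content models need their own stereotype rather than a direct UML analog.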

Page 25: Using UML to define XML document types

Re-Use of OASIS Table Model

[Class diagram. Paragraph_Stuff «model_group»; P «element»; OASIS_Table_Model.Table «element» with +pgwide: Boolean_Att and +frame: Name_Group.]

Page 26: Using UML to define XML document types

Full DTD View

[Class diagram combining the previous views. Types shown: Report «element» with «attribute» +revision_date: Date; Title, Front, Body, and Back_Matter «element»; Title_Text «pcdata»; Sect «element» with +change_status: Nmtoken (shown twice); Sect_Content «model_group»; Sect_Body «element»; Intro «element»; Intro_and_Subsects «model_group»; href «reference»; My_DB_Query «notation» with +get_value_by_key(); P «element»; Paragraph_Components «model_group»; Paragraph_Stuff «model_group»; part_number «value_reference_attribute»; Part_Number_Reference «element»; Part_Number_Value «pcdata»; Target_Title «pcdata»; Text «pcdata»; URI «notation»; XRef «element»; OASIS_Table_Model.Table «element» with +pgwide: Boolean_Att and +frame: Name_Group; Cross_Reference. Relationship labels shown: +source, «value_reference», +governing_notation, «source», +subject, +mark, «anchor_address». Multiplicities shown: 1..* and 0..*.]

Page 27: Using UML to define XML document types

Summary

Page 28: Using UML to define XML document types

Benefits for System Design and Implementation

Offers traceability from abstract system modeling to XML implementation

Offers rich set of features for managing DTD definitions:
– Provides modularity through UML packages
– Provides all of UML’s typing to XML components
– Provides formal syntactic and semantic constraints through UML object constraint language (or equivalent)

Focus of document analysis is on business objects, not on implementation technology or representation syntax

Same business models can be refined into XML, CORBA IDL, Java objects, RDBMS tables, etc.

Page 29: Using UML to define XML document types

Benefits for XML Practitioner

XML design completely integrated into larger system design
Can use existing tools to develop and maintain DTD definition (e.g., Rose, ObjectDomain, etc.)
Provides design documentation in form easily understood by implementors
Get graphical representations of DTDs for free
Documentation can be bound directly to model
Elevates XML to first-class citizen in system design
No need to choose a particular DTD representation syntax (e.g., DTD declarations vs. XML Schema)…
…both are simply generated from UML model

Page 30: Using UML to define XML document types

Work to Be Done

Implement DTD syntax output generators

Better understand how UML packages, Catalysis refinement, OO frameworks, and SGML architectures interact

Understand how to map these models to groves at different levels of abstraction (e.g., groves that reflect the business object model, not the XML syntax model)

Expand model to include hyperlink representation

Apply approach in practice