Document Interchange Specification

Office Open XML Document Interchange Specification. Base Document. Submitted to Ecma TC45 by Microsoft, December 2005.

Transcript of Document Interchange Specification

  • Office Open XML

    Document Interchange Specification

    Base Document

    Submitted to Ecma TC45 by Microsoft

    December 2005

  • Table of Contents

    Introduction

    1. Scope

    2. Conformance

    3. Normative references

    4. Definitions

    5. Notational conventions

    6. Acronyms and abbreviations

    7. General description

    8. Overview

    8.1 Packages and parts

    8.2 Consumers and producers

    8.3 WordprocessingML

    8.4 SpreadsheetML

    8.5 PresentationML

    8.6 Supporting MLs

    8.6.1 DrawingML

    8.6.2 OMML

    8.6.3 VML

    8.6.4 Other namespaces

    9. Packages

    9.1 Parts

    9.1.1 Part naming rules

    9.1.2 Part Addressing

    9.2 Relationships

    9.2.1 Bidirectional relationship traversal

    9.2.2 Relationship markup

    9.2.3 Representing relationships

    9.3 Content type

    9.3.1 Content-type item markup

    9.3.2 Setting the ContentType value of a part

    9.3.3 Getting the ContentType value of a part

    9.4 ZIP archive mapping

    9.4.1 Mapping part names to ZIP archive item names

    9.4.2 Mapping ZIP archive item names to part names

    9.4.3 Limitations

    10. WordprocessingML

    10.1 Glossary of WordprocessingML-specific terms

    10.2 Package structure

    10.3 Part summary

    10.3.1 Alternative Format Import part

    10.3.2 Attached Toolbar Data part

    10.3.3 Comments part

    10.3.4 Document Building Blocks part

    10.3.5 Document Settings part

    10.3.6 Endnotes part

    10.3.7 Font Table part

    10.3.8 Footer part

    10.3.9 Footnotes part

    10.3.10 Glossary Document part

    10.3.11 Header part

    10.3.12 Keyboard and Toolbar Customizations part

    10.3.13 List Definitions part

    10.3.14 Main Document part

    10.3.15 Main Template part

    10.3.16 Style Definitions part

    10.3.17 Web Settings part

    10.4 Master documents and subdocuments

    10.5 Framesets

    11. SpreadsheetML

    11.1 Glossary of SpreadsheetML-specific terms

    11.2 Package structure

    11.3 Part summary

    11.3.1 Attached Toolbar Data part

    11.3.2 Background Image part

    11.3.3 Calculation Chain part

    11.3.4 Chartsheet part

    11.3.5 Comments part

    11.3.6 Connections part

    11.3.7 Custom Property part

    11.3.8 Custom XML Mappings part

    11.3.9 Dialogsheet part

    11.3.10 Drawings part

    11.3.11 External Workbook References part

    11.3.12 Metadata part

    11.3.13 Pivot Table part

    11.3.14 Pivot Table Cache Definition part

    11.3.15 Pivot Table Cache Records part

    11.3.16 Printer Settings part

    11.3.17 Query Table part

    11.3.18 Shared String Table part

    11.3.19 Shared Workbook Revision Headers part

    11.3.20 Shared Workbook Revision Log part

    11.3.21 Shared Workbook User Data part

    11.3.22 Single Cell Table Definitions part

    11.3.23 Styles part

    11.3.24 Table Definition part

    11.3.25 Volatile Dependencies part

    11.3.26 Workbook part

    11.3.27 Worksheet part

    11.4 External workbooks

    12. PresentationML

    12.1 Glossary of PresentationML-specific terms

    12.2 Package structure

    12.3 Part summary

    12.3.1 Comment Authors part

    12.3.2 Comments part

    12.3.3 Handout Master part

    12.3.4 Notes Master part

    12.3.5 Notes Slide part

    12.3.6 Presentation part

    12.3.7 Presentation Properties part

    12.3.8 Slide part

    12.3.9 Slide Layout part

    12.3.10 Slide Master part

    12.3.11 View Properties part

    13. DrawingML

    13.1 Glossary of DrawingML-specific terms

    13.2 Part summary

    13.2.1 Chart part

    13.2.2 Chart Drawing part

    13.2.3 Diagram Colors part

    13.2.4 Diagram Data part

    13.2.5 Diagram Layout Definition part

    13.2.6 Diagram Style part

    13.2.7 Theme part

    13.2.8 Theme Override part

    13.2.9 Table Styles part

    14. Shared parts

    14.1 Glossary of shared part-specific terms

    14.2 Part summary

    14.2.1 ActiveX Control Binary Data part

    14.2.2 ActiveX Control Persistence part

    14.2.3 Audio part

    14.2.4 Custom XML Data Storage part

    14.2.5 Custom XML Data Storage Properties part

    14.2.6 Embedded Object part

    14.2.7 Embedded Package part

    14.2.8 File properties

    14.2.9 Font part

    14.2.10 Legacy Diagram Text part

    14.2.11 Legacy Drawing part

    14.2.12 Legacy Drawing Text Information part

    14.2.13 Video part

    14.3 Thumbnails

    14.4 Images

    14.5 Hyperlinks

    15. XML Schema References

    15.1 DrawingML - Charts

    15.1.1 Top-Level Elements

    15.1.2 Element List

    15.1.3 Simple Types

    15.2 DrawingML - Chart Drawings

    15.2.1 Element List

    15.2.2 Simple Types

    15.3 DrawingML - Legacy Drawing Compatibility

    15.4 Dublin Core Metadata

    15.5 Dublin Core Metadata

    15.6 Dublin Core Metadata

    15.7 Application Specific File Properties

    15.7.1 Top-Level Elements

    15.7.2 Element List

    15.8 Custom File Properties

    15.8.1 Top-Level Elements

    15.8.2 Element List

    15.9 File Properties - Variant Types

    15.9.1 Top-Level Elements

    15.9.2 Element List

    15.9.3 Simple Types

    15.10 DrawingML - Diagrams

    15.10.1 Top-Level Elements

    15.10.2 Element List

    15.10.3 Simple Types

    15.11 Custom XML Data Properties

    15.11.1 Top-Level Elements

    15.11.2 Element List

    15.12 ActiveX Control Properties

    15.12.1 Top-Level Elements

    15.12.2 Element List

    15.12.3 Simple Types

    15.13 DrawingML - Main

    15.13.1 Top-Level Elements

    15.13.2 Element List

    15.13.3 Simple Types

    15.14 Office Document Legacy Drawing

    15.14.1 Top-Level Elements

    15.14.2 Element List

    15.14.3 Simple Types

    15.15 OMML

    15.15.1 Top-Level Elements

    15.15.2 Element List

    15.15.3 Simple Types

    15.16 Package Content Types Item

    15.16.1 Top-Level Elements

    15.16.2 Element List

    15.17 Core File Properties

    15.17.1 Top-Level Elements

    15.17.2 Element List

    15.18 Package Relationship Items

    15.18.1 Top-Level Elements

    15.18.2 Element List

    15.18.3 Simple Types

    15.19 Package Relationship References

    15.19.1 Simple Types

    15.20 PresentationML

    15.20.1 Top-Level Elements

    15.20.2 Element List

    15.20.3 Simple Types

    15.21 DrawingML - Picture

    15.21.1 Element List

    15.22 Legacy Drawing (VML)

    15.22.1 Top-Level Elements

    15.22.2 Element List

    15.22.3 Simple Types

    15.23 DrawingML - WordprocessingML Drawing

    15.23.1 Top-Level Elements

    15.23.2 Element List

    15.23.3 Simple Types

    15.24 WordprocessingML

    15.24.1 Top-Level Elements

    15.24.2 Element List

    15.24.3 Simple Types

    15.25 WordprocessingML Document Legacy Drawing

    15.25.1 Top-Level Elements

    15.25.2 Element List

    15.25.3 Simple Types

    15.26 SpreadsheetML

    15.26.1 Top-Level Elements

    15.26.2 Element List

    15.26.3 Simple Types

    15.27 DrawingML - SpreadsheetML Drawing

    15.27.1 Element List

    15.28 Custom XML Schema References

    15.28.1 Top-Level Elements

    15.28.2 Element List

    Annex A. Non-normative references

    Annex B. Index

  • Introduction

    Introduction

    This Standard is based on a submission from Microsoft Corporation. It describes a family of XML schemas, collectively called Office Open XML, which define the XML vocabularies consumed and produced by the "Office 12" versions of the Microsoft Office products Microsoft Word, Microsoft Excel, and Microsoft PowerPoint. It also describes the packaging of documents that conform to these schemas.

  • Non-normative references

    1. Scope

    This Standard defines Office Open XML's vocabularies and document representation and packaging. It also specifies requirements for consumers and producers of XML that is valid according to Office Open XML's schemas.

    2. Conformance

    Office Open XML conformance is of interest to the following audiences:

    Those designing, implementing, or maintaining Office Open XML consumers or producers.

    Governmental or commercial entities wishing to procure Office Open XML consumers or producers.

    Testing organizations wishing to provide an Office Open XML conformance test suite.

    Programmers wishing to interact programmatically with Office Open XML consumers or producers.

    Educators wishing to teach about Office Open XML consumers or producers.

    Authors wanting to write about Office Open XML consumers or producers.

    As such, conformance is most important, and the bulk of this Standard is aimed at specifying the characteristics that make Office Open XML consumers or producers strictly conforming ones.

    The text in this Standard that specifies requirements is considered normative. All other text in this specification is informative; that is, for information purposes only. Unless stated otherwise, all text is normative (see §7).

    This Standard does not contain any undefined or unspecified behavior (see §4).

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this Standard are to be interpreted as described in RFC 2119 (Annex A).

    A strictly conforming consumer or producer SHALL use only those features of Office Open XML specified in this Standard as being REQUIRED. It SHALL NOT act in a manner that is dependent on any unspecified, undefined, or implementation-defined behavior. A strictly conforming consumer SHALL accept any valid Office Open XML document. The Office Open XML documents generated by a strictly conforming producer SHALL be valid.

    A strictly conforming consumer or producer SHALL interpret characters in conformance with the Unicode Standard and ISO/IEC 10646-1 as required by the XML 1.0 standard. A strictly conforming consumer or producer SHALL accept Unicode source files encoded with either the UTF-8 or UTF-16 encoding forms as required by the XML 1.0 standard.

    A strictly conforming consumer SHALL produce at least one diagnostic message if its input document is invalid. A document is invalid if any of its contents violate any rule of syntax or any negative requirement in this Standard.

    A (non-strictly) conforming consumer or producer is one having capabilities that are a superset of those described in this Standard, provided these capabilities do not alter the behavior that is required by a strictly conforming consumer or producer. Conforming consumers and producers are REQUIRED to diagnose Office Open XML documents containing extensions that are outside the scope of this Standard. However, having done so, they MAY continue to consume or produce such documents.

    A conforming consumer or producer SHALL be accompanied by a document that defines all implementation-defined characteristics, and all extensions.

    3. Normative references

    The following normative documents contain provisions, which, through reference in this text, constitute provisions of this Standard. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this Standard are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references, the latest edition of the normative document referred to applies. Members of ISO and IEC maintain registers of currently valid International Standards.

    ISO/IEC 2382.1:1993, Information technology - Vocabulary - Part 1: Fundamental terms.

    ISO/IEC 10646 (all parts), Information technology - Universal Multiple-Octet Coded Character Set (UCS).

    4. Definitions

    For the purposes of this Standard, the following definitions apply. Other terms are defined where they appear in italic type or on the left side of a syntax rule. Terms explicitly defined in this Standard are not to be presumed to refer implicitly to similar terms defined elsewhere.

    behavior: External appearance or action.

    behavior, implementation-defined: Unspecified behavior where each implementation documents how the choice is made.

    behavior, undefined: Behavior, upon use of a non-portable or erroneous construct or of erroneous data, for which this Standard imposes no requirements. [Possible handling of undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)].

    behavior, unspecified: Behavior where this Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance.

    component: A unit in a package. The two kinds of component are relationship item and part. (See package-relationship item and part-relationship item.)

    consumer: A tool that can read and parse a package.

    content type: A description of the type of content stored in a part. A content type defines a media type, a subtype, and an optional set of parameters. (See also, content-type item.)

    content-type item: An XML representation of the mappings from part names to content types, stored as an item in a package. A content-type item is not itself a part.

    item: In the context of a package, item is a synonym for ZIP item.

    Office Open XML document: A rendition of a data stream formatted using any of the following Microsoft Office interchange file formats: Microsoft Word, Microsoft Excel, and Microsoft PowerPoint. Such a document is represented as a package.

    package: A ZIP archive that contains all the relationship items and parts of an Office Open XML document, such that those parts are reachable via a set of relationships defined in the relationship items.

    package relationship: A relationship whose target is a part and whose source is the package as a whole. (See also, package-relationship item.)

    package-relationship item: An XML representation of one or more package relationships, stored as an item in a package. A package-relationship item is not itself a part.

    part: A package component that has associated common properties. A part corresponds to an item in a package.

    part, well-known: A part with a well-known relationship, enabling that part to be found without knowing the location of other parts.

    part relationship: A relationship whose target is a part and whose source is another part. (See also, part-relationship item.)

    part-relationship item: An XML representation of one or more part relationships, stored as an item in a package. A part-relationship item is not itself a part.

    PresentationML: An environment for consuming or producing any package defining a Microsoft PowerPoint Office Open XML document.

    producer: A tool that can create a package.

    relationship: A representation of the kind of connection between a source and a target part in a package. Relationships make the connections between parts directly discoverable without looking at the content in the parts, and without altering the parts themselves. (See package relationship and part relationship.)

    SpreadsheetML: An environment for consuming or producing any package defining a Microsoft Excel Office Open XML document.

    WordprocessingML: An environment for consuming or producing any package defining a Microsoft Word Office Open XML document.

    ZIP archive: A ZIP file as defined in the ZIP file format specification but excluding all elements of that specification related to encryption or decryption. A ZIP archive contains ZIP items.

    ZIP item: Any file stored in a ZIP archive. (See also, item.)

    5. Notational conventions

    The following typographical conventions are used in this Standard:

    1. First occurrence of a term: is considered normative.

    2. Definition: behavior External

    3. XML element: The root element is wordDocument.

    4. XML element attribute: an Id attribute.

    5. XML element attribute value: value of CommentReference.

    6. XML element type: as values of the xsd:anyURI data type.

    6. Acronyms and abbreviations

    This clause is informative.

    The following acronyms and abbreviations are used throughout this Standard:

    IEC: the International Electrotechnical Commission

    ISO: the International Organization for Standardization

    W3C: World Wide Web Consortium

    End of informative text.

    7. General description

    This Standard is intended for use by implementers, academics, and application programmers. As such, it contains a considerable amount of explanatory material that, strictly speaking, is not necessary in a formal language specification.

    This Standard is divided into the following subdivisions:

    1. Front matter (clauses 1–7);

    2. Overview (clause 8);

    3. Main body (clauses 9–15);

    4. Annexes

    Examples are provided to illustrate possible forms of the constructions described. References are used to refer to related clauses. Notes are provided to give advice or guidance to implementers or programmers. Rationale provides explanatory material as to why something is or is not in this Standard. Annexes provide additional information and summarize the information contained in this Standard.

    Clauses 1–5, 7, 9–15, and Annex A form a normative part of this Standard; and Introduction, clauses 6 and 8, annexes, notes, examples, rationale, and the index, are informative.

    Except for whole clauses or annexes that are identified as being informative, informative text that is contained within normative text is indicated in the following ways:

    1. [Example: code fragment, possibly with some narrative end example]

    2. [Note: narrative end note]

    3. [Rationale: narrative end rationale]

    8. Overview

    This clause is informative.

    This clause contains an overview of Office Open XML. Clauses 9–15 contain corresponding normative definitions.

    8.1 Packages and parts

    An Office Open XML document is represented as a series of related parts that are stored in a container called a package. Information about the relationships between a package and its parts is stored in the package's package-relationship item. Information about the relationships between two parts is stored in the part-relationship item for the source part. A package is an ordinary ZIP archive whose items correspond directly to those related parts. (Packages are discussed in detail in §9.)

    A WordprocessingML document contains a part for the body of the text; it might also contain a part for an image referenced by that text, and parts defining document characteristics, styles, and fonts. A SpreadsheetML document contains a separate part for each worksheet; it might also contain parts for images. A PresentationML document contains a separate part for each slide. In each case, the document is represented by a ZIP archive whose items are parts and relationship items.

    8.2 Consumers and producers

    A tool that can read and understand a package is called a consumer, while one that can create a package is called a producer. An application can be a consumer, a producer, or both. For example, when a word processor creates a new document, it is a producer. When it is used to open an existing document for reading or search purposes, it acts as a consumer. When it is used to open an existing document, edit it, and save the result, it acts as both consumer and producer. Similar scenarios exist for spreadsheet and presentation applications.

    8.3 WordprocessingML

    This subclause introduces the overall form of a WordprocessingML package, and identifies some of its main element types.

    A WordprocessingML package has a relationship of type officeDocument, which specifies the location of the main part in the package. For a WordprocessingML document, that part contains the contents of the document.

    A WordprocessingML package's main part, the document part, starts with a word processing root element. That element contains a body, which, in turn, contains one or more paragraphs (as well as tables, pictures, and the like). A paragraph contains one or more runs, where a run is a container for one or more pieces of text having a consistent set of properties. Like many collection element types, each run and paragraph can have associated with it a set of properties. For example, a run might have the property bold, which indicates that run's text is to be displayed in a bold typeface.
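
    The body, paragraph, and run containment described above can be pictured with a small in-memory model. This is only a sketch: the class names Body, Paragraph, and Run and the bold field are hypothetical illustrations, not element or attribute names taken from the schemas.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Run:
    """A piece of text whose characters share one set of properties."""
    text: str
    bold: bool = False  # hypothetical stand-in for the "bold" run property


@dataclass
class Paragraph:
    """A paragraph holds one or more runs."""
    runs: List[Run] = field(default_factory=list)

    def plain_text(self) -> str:
        # Concatenate the text of every run, ignoring formatting.
        return "".join(run.text for run in self.runs)


@dataclass
class Body:
    """The body holds one or more paragraphs."""
    paragraphs: List[Paragraph] = field(default_factory=list)


# One paragraph made of a plain run followed by a bold run.
body = Body([Paragraph([Run("Hello, "), Run("world", bold=True)])])
bold_text = [run.text for p in body.paragraphs for run in p.runs if run.bold]
```

    Splitting a paragraph into runs wherever the properties change is what lets a single paragraph mix bold and non-bold text.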

    A WordprocessingML document is organized into sections, and the layout of a page on which the text appears within a section is controlled by that section's properties. For example, each section can have its own headers and footers.

    One relationship from the document part specifies the document's styles. A style defines a text display format. A style can have properties, which can be applied to individual paragraphs or runs. Styles reduce the amount of text that has to be produced and the amount of work required to make changes to the document's appearance. With styles, the appearance of all the pieces of text that share a common style can be changed in only one place, in that style's definition.

    A series of paragraphs can be organized into a list by applying to them a list style.

    Data in a WordprocessingML document can be organized in a table, a two-dimensional grid of cells organized into rows and columns. Cells and whole tables can have associated properties. A cell can contain text, paragraphs, and lists, for example.

    A WordprocessingML document can be linked via bookmarks and hyperlinks. A bookmark is a way to mark a particular place in a document. A hyperlink has two components: the hyperlink itself (the text the user will click) and the target for the link. Potential targets include external files, e-mail addresses, web sites, and bookmarks.

    Other features that a WordprocessingML document can contain include, but are not limited to, the following: boiler-plate text and graphics, borders and shading, cross-references, charts and figures with captions, colors, fonts, index, reviewer comments, table of contents, and watermarks.

    A WordprocessingML document is not stored as one large body in a single part; instead, the elements that implement certain groupings of functionality are stored in separate parts. For example, all footnotes in a document are stored in one footnote part, while each section can have up to three different header parts and three different footer parts, to support headers and footers on odd-numbered pages, even-numbered pages, and the first page.

    8.4 SpreadsheetML

    This subclause introduces the overall form of a SpreadsheetML package, and identifies some of its main element types.

    A SpreadsheetML package has a relationship of type officeDocument, which specifies the location of the main part in the package. For a SpreadsheetML document, that part contains the workbook definition.

    A SpreadsheetML package's main part, the workbook part, starts with a spreadsheet root element. That element is a workbook, which refers to one or more worksheets, which, in turn, contain the data. A worksheet is a two-dimensional grid of cells that are organized into rows and columns.

    The cell is the primary place in which data is stored and operated on. A cell can have a number of characteristics, such as numeric, text, date, or time formatting; alignment; font; color; and a border. Each cell is identified by a cell reference, a combination of its column and row headings.

    Each horizontal set of cells in a worksheet is called a row, and each row has a heading numbered sequentially, starting at 1. Each vertical set of cells in a worksheet is called a column, and each column has an alphabetic heading named sequentially from A–Z, then AA–AZ, BA–BZ, and so on.
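
    The column-heading sequence amounts to a base-26 numbering that has no zero digit. The sketch below illustrates that sequence; the function names column_heading and cell_reference, and the choice of 1-based indexes, are assumptions made for this example and are not defined by this Standard.

```python
def column_heading(index: int) -> str:
    """Return the alphabetic heading for a 1-based column index.

    1 maps to "A", 26 to "Z", 27 to "AA", 52 to "AZ", 53 to "BA".
    """
    if index < 1:
        raise ValueError("column index must be 1 or greater")
    letters = []
    while index > 0:
        # Subtract 1 first because the sequence has no zero digit.
        index, remainder = divmod(index - 1, 26)
        letters.append(chr(ord("A") + remainder))
    return "".join(reversed(letters))


def cell_reference(column: int, row: int) -> str:
    """Combine a column heading and a row heading into a cell reference."""
    if row < 1:
        raise ValueError("row headings start at 1")
    return f"{column_heading(column)}{row}"
```

    For instance, cell_reference(2, 4) yields "B4", the cell in the second column of the fourth row.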

    Instead of data, a cell can contain a formula, which is a recipe for calculating a value. Some formulas, called functions, are predefined, while others are user-defined. Examples of predefined formulas are AVERAGE, MAX, MIN, and SUM. A function takes one or more arguments on which it operates, producing a result. For example, in the formula =SUM(B1:B4), there is one argument, B1:B4, which is the range of cells B1–B4, inclusive.
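
    To make the inclusive range in =SUM(B1:B4) concrete, the sketch below expands a range argument into its individual cell references and sums made-up values. The helper names, the regular expression, and the sample cell values are assumptions for illustration only; this is not how any particular consumer evaluates formulas.

```python
import re
from typing import Dict, List


def _column_number(letters: str) -> int:
    """Convert a column heading such as "B" or "AA" to a 1-based number."""
    number = 0
    for ch in letters:
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def _column_letters(number: int) -> str:
    """Convert a 1-based column number back to its heading."""
    letters = []
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters.append(chr(ord("A") + remainder))
    return "".join(reversed(letters))


def expand_range(reference: str) -> List[str]:
    """Expand a range such as "B1:B4" into every cell it covers, inclusive."""
    match = re.fullmatch(r"([A-Z]+)(\d+):([A-Z]+)(\d+)", reference)
    if match is None:
        raise ValueError(f"not a cell range: {reference!r}")
    first_col, first_row, last_col, last_row = match.groups()
    columns = range(_column_number(first_col), _column_number(last_col) + 1)
    rows = range(int(first_row), int(last_row) + 1)
    return [f"{_column_letters(c)}{r}" for c in columns for r in rows]


# Made-up cell values, used only to show the SUM over the expanded range.
cells: Dict[str, int] = {"B1": 10, "B2": 20, "B3": 30, "B4": 40}
total = sum(cells[name] for name in expand_range("B1:B4"))
```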

    Other features that a SpreadsheetML document can contain include the following: charts, comments, hyperlinks, images, and sorted and filtered tables.

    A SpreadsheetML document is not stored as one large body in a single part; instead, the elements that implement certain groupings of functionality are stored in separate parts. For example, all the data for a worksheet is stored in that worksheet's part, all string literals from all worksheets are stored in a single shared string part, and each worksheet having comments has its own comments part.

    8.5 PresentationML

    This subclause introduces the overall form of a PresentationML package, and identifies some of its main element types.

    A PresentationML package has a relationship of type officeDocument, which specifies the location of the main part in the package. For a PresentationML document, that part contains the presentation definition.

    A PresentationML package's main part, the presentation part, starts with a presentation root element. That element contains a presentation, which, in turn, refers to a slide list, a slide master list, a notes master list, and a handout master list. The slide master list refers to all of the slides in the presentation; the notes master contains information about the formatting of notes pages; and the handout master describes how a handout should look. A handout is a printed set of slides that can be handed out to an audience for future reference.

    As well as text and graphics, each slide can contain comments and notes, can have a layout, and can be part of one or more custom presentations. (A comment is an annotation intended for the person maintaining the presentation slide deck. A note is a reminder or piece of text intended for the presenter or the audience.)

    Other features that a PresentationML document can contain include the following: charts and diagrams (with or without animation), audio, video, and transitions between slides.

    A PresentationML document is not stored as one large body in a single part; instead, the elements that implement certain groupings of functionality are stored in separate parts. For example, all comments in a document are stored in one comment part while each slide has its own part.

    8.6 Supporting MLs

    This subclause introduces the set of markup languages used across package types. The three markup languages described above define the structure of a package that is either a document (WordprocessingML), a spreadsheet (SpreadsheetML), or a presentation (PresentationML). However, there is also a set of shared markup languages used for common elements such as charts, diagrams, and drawing objects, which are defined using a set of standard markup styles. These MLs are discussed below.

    8.6.1 DrawingML

    This subclause introduces the overall form of DrawingML, and identifies some of its main element types.

    DrawingML specifies the location and appearance of drawing elements in a package. For example, these elements could be, but are not limited to, shapes, pictures, and tables. The root element of a DrawingML XML fragment specifies the presence of a drawing at this location in the document.

    A shape is a geometric object such as a circle, square, or rectangle; a picture is an image file presented inside the document; and a table is a two-dimensional grid of cells organized into rows and columns. Cells and whole tables can have associated properties. A cell can contain text, for example.

    DrawingML also specifies the location and appearance of charts in a package. The root element of a ChartML part is chart, and specifies the appearance of the chart at this location in the document.

    A chart is a presentation of data in a graphical fashion, such as a pie chart, bar chart, or line chart, in order to make trends and exceptions in the data more visually apparent.

    DrawingML also specifies the location and appearance of diagrams in a document. A diagram is a presentation of content in a graphical fashion, in order to present this information in a manner that is clearer to the consumer of the information by using a visual metaphor to create relationships between individual pieces of this information.

    Together, the following four parts (§9.1) define a diagram:

    The data part (§13.2.4) specifies the individual pieces of information that are presented in this diagram. Typically, each of these pieces is a simple line of text, but based on the diagram, each may also be an image.

    The layout part (§13.2.5) specifies how the data is laid out to create the resulting diagram. The part describes the layout of shapes which, when presented as specified, result in the desired diagram.

    The colors part (§13.2.3) specifies the color that is applied to each individual shape in the diagram.

    The styles part (§13.2.6) defines how each individual shape in the diagram maps to the document's theme.

    In addition, DrawingML specifies package-wide appearance characteristics, such as the package's theme. The theme of a document specifies the color scheme, fonts, and effects, which can be referenced by parts of the document (such as text, drawings, charts, and diagrams) in order to create a consistent visual presentation.

    8.6.2 OMML

    This subclause introduces the overall form of OMML, and identifies some of its main element types.

    OMML specifies the structure and appearance of equations in a document; it is specified with a root element of math.

    8.6.3 VML

    This subclause introduces the overall form of VML, and identifies some of its main element types.

    VML specifies the appearance and content of legacy shapes in a document. This is used for shapes such as text boxes, as well as shapes that must be stored to maintain compatibility with earlier versions of consumer/producer applications.

    A shape definition is typically specified using two elements: shapeData, which stores information about the legacy shape, and shape, which stores the shape definition and appearance directly.

    8.6.4 Other namespaces

    A number of additional namespaces are shared by elements across WordprocessingML, SpreadsheetML, and PresentationML. These namespaces are introduced in the following subclauses.

    8.6.4.1 ActiveX control properties

    These properties pertain to ActiveX controls in a package when such controls are made to persist.

    8.6.4.2 Custom XML data properties

    A user can store arbitrary XML in a package, along with schema information used by that XML.

    8.6.4.3 File properties

    The core file properties of a package enable users to discover, get, and set well-known and common sets of properties from within that package, regardless of whether it is a WordprocessingML, SpreadsheetML, or PresentationML package. Such properties include creator name, creation date, title, and description.

    Application-defined file properties are specific to the ML type of the package. For example, for a WordprocessingML package, these properties include the number of characters, words, lines, paragraphs, and pages in the document. For a SpreadsheetML package, these properties include worksheet titles. For a PresentationML package, these properties include presentation format, the number of slides, the number of notes, and whether or not any slides are hidden.

    Custom file properties are defined by the user. Examples include the name of the client for whom the document was prepared, a date/time on which some event happened, a document number, or some Boolean status flag. Each custom file property has a value, and that value has a type.

    End of informative text.

    9. Packages

    A package is a container for a collection of components, which are composed, processed, and persisted according to a set of rules. There are two kinds of components: parts and relationship items. Parts can have relationships to each other, as well as to the package itself. These relationships are defined using XML in one or more relationship items. Each part has a content type and is unambiguously addressed using well-defined naming guidelines. Content-type information is recorded in a content-type item.

    A package is implemented as a ZIP archive, with each component in that package corresponding to an item in that archive. A ZIP archive is a ZIP file as defined in the ZIP file format specification, but excluding all elements of that specification related to encryption or decryption.

    The purpose of a package is to aggregate all of the pieces of a document into a single object. [Example: A package holding a simple WordprocessingML document with a picture might contain a number of parts: an XML markup part representing the document, a part containing page header information, a part containing footnotes, and a part representing the picture in JPEG form. end example] A package provides a convenient way to distribute a document with all of its component pieces, such as images, fonts, and data.

    All XML content of components defined in this Standard MUST be encoded using either UTF-8 or UTF-16. 15

    If any such component includes an encoding declaration (as defined in 4.3.3 of the XML specification), 16

    that declaration MUST NOT name any encoding other than UTF-8 or UTF-16. 17

    Parts are defined further in 9.1, relationships in 9.2, and content type in 9.3. Further information on ZIP 18

    archive processing is in 9.4. 19

    9.1 Parts

    Each part has a part name. Part names refer to parts within a package, typically as part of a URI reference. Like file names in a file system and URIs, part names are hierarchical. Part names consist of segments, each representing a level in the hierarchy. For example, the part name /hello/world/doc.xml contains three segments: hello, world, and doc.xml. Segments form a tree structure. This is similar to file systems, where all of the non-leaf nodes in the tree are folders and the leaf nodes are files, which contain actual content. The folder (that is, non-leaf) nodes in the tree serve a similar function: they organize the parts of the package.
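As an illustration of the segment hierarchy described above, the following Python sketch splits a part name into its segments and lists the conceptual folder nodes above the leaf. The function names are hypothetical, and the check for a leading forward slash is an assumption based on the examples in this clause rather than a rule stated here.

```python
from typing import List


def part_name_segments(part_name: str) -> List[str]:
    """Split a part name such as /hello/world/doc.xml into its segments."""
    # Assumption: part names start with a forward slash, as in the examples.
    if not part_name.startswith("/"):
        raise ValueError("part name is expected to start with '/'")
    segments = part_name[1:].split("/")
    if any(segment == "" for segment in segments):
        raise ValueError("part name contains an empty segment")
    return segments


def folder_nodes(part_name: str) -> List[str]:
    """Return the conceptual folder (non-leaf) nodes above the part."""
    segments = part_name_segments(part_name)
    return ["/" + "/".join(segments[:i]) for i in range(1, len(segments))]


print(part_name_segments("/hello/world/doc.xml"))
print(folder_nodes("/hello/world/doc.xml"))
```

The folder nodes returned here exist only as names; nothing in the sketch implies that a corresponding item is stored in the package.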

    Folder nodes exist only as a concept in the naming hierarchy. No actual folders, or directory thereof, need exist in the package. However, a producer MAY create a package having an explicit representation of folders, and this representation MAY be in the form of a hierarchical directory. [Example: Here is an example of such a hierarchical folder structure for a simple WordprocessingML document:

    /[Content_Types].xml
    /_rels/.rels
    /docProps/app.xml
    /docProps/core.xml
    /word/_rels/document.xml.rels
    /word/document.xml
    /word/fontTable.xml
    /word/settings.xml
    /word/styles.xml
    /word/theme/theme1.xml

    end example]

    9.1.1 Part naming rules

    A package allows unrelated software systems to manipulate their own parts without colliding with each other. To allow this, all producers and consumers MUST adhere to the following part naming rules:


    • The part name MUST NOT be derived from another part name by appending segments to it. [Example: If a package contains a part named /segment1/segment2/.../segmentN, then other parts in that package MUST NOT have the following names: /segment1, /segment1/segment2, or /segment1/segment2/.../segmentN-1. end example]

    • It is REQUIRED that producers adding parts to an existing package do so in a new folder of the naming hierarchy, rather than placing parts directly in the root or in a pre-existing folder (unless the producer created that folder). In this way, the possibility of naming collisions is limited to the first segment of the part name. Parts created within the new folder can be named without risking a collision with existing parts.

    • In the event that the preferred name for the folder is already used by an existing part, producers MUST adopt a strategy for choosing alternative folder names. Producers MAY append digits to the preferred name until an available folder name is found.

    • Consumers MUST NOT attempt to locate a part via a well-known part name (9.2). Instead, producers MUST create a package relationship to at least one part in each folder they create. Consumers MUST use these package relationships to locate parts. [Note: While a user can store any kind of file in a package, consumers and producers that process such a package are free to ignore any files that are not reachable via a relationship. When such a package is written out on save, those extraneous files need not be retained. end note]

    • Once a consumer has found at least one part in a folder (via one of the package relationships) it MAY use conventions about well-known part names to find other parts in that folder or its subfolders.

    Provided a part name is well formed, this Standard imposes no requirements on the spelling of a part name. As such, a producer is free to have its own hierarchical spelling conventions. A consumer MUST be able to handle any well-formed part name.
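Two of the naming rules above lend themselves to a mechanical check. The following Python sketch, offered under stated assumptions, detects part names that are derived from another part name by appending segments, and picks a new top-level folder name by appending digits to a preferred name. Both function names are hypothetical, and the digit-appending strategy is only the one the text permits with MAY; a producer is free to use another.

```python
from typing import Iterable, List, Tuple


def find_prefix_conflicts(part_names: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (shorter, longer) pairs where longer is shorter plus more segments."""
    names = sorted(set(part_names))
    conflicts = []
    for shorter in names:
        prefix = shorter + "/"
        for longer in names:
            if longer.startswith(prefix):
                conflicts.append((shorter, longer))
    return conflicts


def pick_folder_name(preferred: str, existing_part_names: Iterable[str]) -> str:
    """Append digits to the preferred name until the first segment is unused."""
    # The first segment of every existing part name counts as taken, whether
    # it names a folder or a part stored directly in the root.
    used = {name.split("/")[1] for name in existing_part_names if name.startswith("/")}
    if preferred not in used:
        return preferred
    counter = 1
    while "{}{}".format(preferred, counter) in used:
        counter += 1
    return "{}{}".format(preferred, counter)


print(find_prefix_conflicts(["/a/b", "/a/b/c.xml", "/d.xml"]))
print(pick_folder_name("custom", ["/custom/x.xml", "/custom1/y.xml"]))
```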

    9.1.2 Part Addressing

    Parts often contain references to other parts. For example, a package may contain two parts: an XML markup file and an image. The markup file holds a reference to the image so that when the markup file is processed, the associated image can be identified and located.

    The terms relative reference and base URI are used in accordance with RFC 3986.

    A relative reference in a package is a reference to a part expressed so that the address of the referenced part is determined relative to the part containing the reference.

    Relative references from a part are interpreted relative to the base URI of that part. By default, the base URI of a part is derived from that part's name.

    Parts MAY contain Unicode strings representing references to other parts. In particular, XML markup may contain such Unicode strings as values of the xsd:anyURI data type. These Unicode strings MUST be converted to ASCII strings before resolving them relative to the base URI of the part containing the Unicode string.

    Some types of content provide a way to override the default base URI by specifying a different base in the content; for example, XML Base or HTML. In the presence of one of these overrides, the specified base URI MUST be used instead of the default.

    [Example: Consider a package that has parts having the following names:

    /markup/page.xml
    /images/picture.jpg

    If /markup/page.xml contains a reference to ../images/picture.jpg, then this reference MUST be interpreted as referring to the part name /images/picture.jpg. end example]
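The resolution in the example above can be sketched in Python with the standard urllib.parse module. This is an illustration under assumptions, not the procedure of this Standard: the function names are hypothetical, the part name itself is used as the default base URI, and the conversion to ASCII is assumed to be percent-encoding of the UTF-8 bytes of non-ASCII characters, which this clause does not spell out.

```python
from urllib.parse import quote, urljoin

# Characters left untouched during conversion (assumed set: URI reserved
# characters plus "%"). Unreserved ASCII characters are always kept by quote().
_SAFE = "/:?#[]@!$&'()*+,;=%"


def to_ascii_reference(reference: str) -> str:
    """Percent-encode non-ASCII characters of a reference (assumed conversion)."""
    return quote(reference, safe=_SAFE)


def resolve_part_reference(source_part_name: str, reference: str) -> str:
    """Resolve a relative reference against the name of the containing part."""
    return urljoin(source_part_name, to_ascii_reference(reference))


print(resolve_part_reference("/markup/page.xml", "../images/picture.jpg"))
```

An override such as XML Base would replace the first argument of resolve_part_reference with the specified base; the sketch does not detect such overrides.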

    9.2 Relationships

    Parts often contain references to other parts in a package and to resources outside of the package. However, in general, these references are represented inside the referring part in ways that are specific to the content type of the part; that is, in arbitrary markup or an application-specific encoding. This effectively hides the internal and external linkages between parts from consumers that do not understand the content types of the parts containing such references.

    The package uses relationships, a higher-level mechanism to describe references from parts to other internal or external resources. A relationship represents the kind of connection between a source and a target resource. If the source is a part, the relationship is referred to as a part relationship. If the source is the package itself, the relationship is referred to as a package relationship. Relationships make the connections directly discoverable without looking at the content in the parts, so they are independent of content-specific schema and faster to resolve. A well-known part is a part with a well-known relationship, enabling that part to be found without knowing the location of other parts.

    Certain relationships SHALL exist only if they are explicitly referenced in the XML of the source part. [Example: A document part can only have a relationship to an image if that image is actually referenced by the document part's XML. end example] Such relationships are called explicit relationships. All other relationships are implicit relationships and SHALL NOT have such explicit XML references.

    Relationships provide a second important function: relating parts without modifying them. Sometimes this information acts as a label where the content type of the labeled part does not define a way to attach the given information.

    Package relationships are represented in XML in a package-relationship item. A package SHALL have exactly one package-relationship item. Part relationships are represented in XML in a part-relationship item. A package MAY have one or more part-relationship items, as needed. A relationship item itself MUST NOT have relationships to any part. Although a relationship item is not itself a part, it does have some part-like characteristics; specifically, it is URI-addressable, and it can be opened, read, and deleted.

    Package relationships are used to identify the starting parts in a package for a given context. This approach avoids relying on naming conventions for finding parts in a package.

    The naming conventions for package and part-relationship items are described in 9.2.3.

    9.2.1 Bidirectional relationship traversal

    Most relationships represent a directed connection between two parts within a package. Because of the way in which it is represented, it is efficient to traverse a relationship from its source part. (It is trivial to find the relationship item for any given part.) However, it is not efficient to traverse relationships backward from the target of the relationship, since the only way to find all of the relationships to a part is to look through all of the relationships in the package.

    In order to make backward traversal of a relationship possible, a new relationship SHOULD be used to represent the other (traversable) direction.
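To make the cost of backward traversal concrete, the following Python sketch builds an index of incoming relationships by scanning every relationship in a package once. The tuple layout and the function name are hypothetical; the input is assumed to be relationship data that has already been read from all relationship items and whose targets have already been resolved to part names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical record: (source part name, relationship Id, target part name).
Relationship = Tuple[str, str, str]


def index_by_target(relationships: List[Relationship]) -> Dict[str, List[Tuple[str, str]]]:
    """Group relationships by target; requires a pass over the whole package."""
    incoming = defaultdict(list)
    for source, rel_id, target in relationships:
        incoming[target].append((source, rel_id))
    return dict(incoming)


relationships = [
    ("/word/document.xml", "rId1", "/word/styles.xml"),
    ("/word/document.xml", "rId2", "/word/media/image1.jpg"),
    ("/word/header1.xml", "rId1", "/word/media/image1.jpg"),
]
print(index_by_target(relationships)["/word/media/image1.jpg"])
```

A second relationship pointing from the target back to the source, as recommended above, avoids the need for such a scan.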

    9.2.2 Relationship markup

    Relationships are represented using one or more Relationship elements nested in a single Relationships element. These elements are defined in the Relationships namespace.

    Every Relationship element MUST have an Id attribute, the value of which must be unique within the relationship item. The Id type is xsd:ID and its value MUST conform to the naming restrictions for that type.

    Relationship elements are identified by their Type attribute. These types are defined in the same way that namespaces are defined for XML namespaces. Specifically, by using types patterned after the Internet domain-name space, non-coordinating parties can safely create non-conflicting relationship types.

    Relationship types may be compared to determine whether two Relationships belong to the same type. This comparison is conducted in the same way as when comparing URIs identifying XML namespaces: the two URIs are treated as strings and considered identical if and only if the strings have the same sequence of characters. The comparison is case-sensitive and no triplet-encoded escaping is done or undone.


    The Target attribute of the Relationship element holds a URI that points to a target resource. Where the URI is expressed as a relative reference, it is resolved against the base URI of the Relationship's source part. The xml:base attribute SHALL NOT be used to specify a base URI for relationship XML content.

    The namespace for a relationship item SHALL be "http://schemas.openxmlformats.org/package/2006/relationships".

    [Example: Here is an example of a package-relationship item, /_rels/.rels, for a simple WordprocessingML document; for brevity, the leading part of each Type value has been omitted:

    [XML markup not preserved in this copy]

    Here is the part-relationship item, /word/_rels/document.xml.rels, for the part /word/document.xml:

    [XML markup not preserved in this copy]

    end example]
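The markup rules of this subclause can be exercised with a short Python sketch that reads a relationship item, checks that each Id is present and unique, and compares Type values as plain strings. The sample markup is hypothetical: its Ids, Type URIs (under example.com), and Targets are invented for illustration and are not taken from this Standard. Only the namespace and the element and attribute names come from the text above.

```python
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# Hypothetical relationship item; Ids, Types and Targets are invented.
SAMPLE = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://example.com/relationships/main" Target="word/document.xml"/>
  <Relationship Id="rId2" Type="http://example.com/relationships/props" Target="docProps/core.xml"/>
</Relationships>"""


def read_relationships(xml_text):
    """Return a list of dicts with the Id, Type and Target of each Relationship."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}Relationships" % REL_NS:
        raise ValueError("root element is not Relationships in the expected namespace")
    seen_ids = set()
    result = []
    for element in root.findall("{%s}Relationship" % REL_NS):
        rel_id = element.get("Id")
        if rel_id is None or rel_id in seen_ids:
            raise ValueError("missing or duplicate Id attribute")
        seen_ids.add(rel_id)
        result.append({"id": rel_id, "type": element.get("Type"), "target": element.get("Target")})
    return result


def same_relationship_type(type_a, type_b):
    """Case-sensitive string comparison; no escaping is done or undone."""
    return type_a == type_b


for relationship in read_relationships(SAMPLE):
    print(relationship["id"], relationship["target"])
```

The sketch does not validate that Id values conform to xsd:ID; a real consumer would need that check as well.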

    9.2.3 Representing relationships

    Relationships are represented in XML in a relationship item. Each part in a package that is the source of one or more relationships has an associated part-relationship item. This item holds the list of relationships for the source part. In the case of a package, the package-relationship item holds the list of relationships for that package.

    Relationship items use the following naming convention: First, the relationship item is stored in a sub-folder called _rels, which is directly subordinate to the folder of the source of the relationship item. (The source of the package-relationship item is the root folder.) Second, the name of the relationship item is formed by appending ".rels" to the name of the original part. [Example: A WordprocessingML package contains a package-relationship item called /_rels/.rels, and a main part called /word/document.xml. As such, the name of the part relationships item for that part is made of the following segments: the part folder name, "_rels", the original part name, and ".rels"; that is, /word/_rels/document.xml.rels. end example]
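The naming convention above can be written down as a small Python function. The function name is hypothetical, and passing None to mean "the package itself" is a convention of this sketch only.

```python
import posixpath


def relationship_item_name(source_part_name=None):
    """Return the relationship item name for a part, or for the package if None."""
    if source_part_name is None:
        # The source of the package-relationship item is the root folder.
        return "/_rels/.rels"
    folder, name = posixpath.split(source_part_name)
    # The _rels sub-folder sits directly under the folder of the source part.
    return posixpath.join(folder, "_rels", name + ".rels")


print(relationship_item_name())
print(relationship_item_name("/word/document.xml"))
```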

    Items with names that conform to this naming convention MUST have the content type for a relationship item.

    Relationships can target resources outside of the package at some absolute location and resources located relative to the current location of the package. [Example: The following part-relationship item specifies relationships that connect a part to pic1.jpg at an absolute external location, and to my_house.jpg at an external location relative to the location of the package:

    [XML markup not preserved in this copy; the only surviving fragment is the target http://www.custom.com/images/pic1.jpg]

    end example]

    Two or more relationships, each using unique Ids, MAY share the same Target and Type.

    9.3 Content type

    Every part has a content type, which identifies the type of content that is stored in that relationship item or part. A content type defines a media type, a subtype, and an optional set of parameters, as defined in RFC 2045. Examples of content types include image/jpeg and application/xml.

    The attribute syntax for the content type of a package part follows the definition for content types for Hypertext Transfer Protocol (RFC 2616); that is, a content type is a well-structured ASCII string using a limited set of characters.

    A content type MAY include comments, which have no semantic content and SHOULD be ignored during processing.

    All of the content types used by a package SHALL be contained in a content-type item, which is stored in the ZIP archive as an item called /[Content_Types].xml. A package SHALL have exactly one content-type item.

    The content type of a relationship item SHALL be "application/vnd.openxmlformats-package.relationships+xml".

    The namespace for a content-type item SHALL be "http://schemas.openxmlformats.org/package/2006/content-types".

    9.3.1 Content-type item markup

    The content-type item contains XML with a top-level Types element, and one or more Default and Override child elements. The Default elements define default mappings from the extensions of part names (that is, file extensions) to content types. This takes advantage of the fact that file extensions often correspond to content type. Override elements are used to specify content types on parts that are not covered by, or are not consistent with, the default mappings. Package producers MAY use pre-defined Default elements to reduce the number of Override elements on a part, but are not required to do so.

    The content-type item maps content types and package part names. As both content types and part names are ASCII strings, the values of the Extension, ContentType and PartName attributes are ASCII strings.

    For every part in the package, the content-type item MUST contain one of the following:

    • One matching Default element,
    • one matching Override element, or
    • both a matching Default element and a matching Override element, in which case, the Override element takes precedence.

    There MUST NOT be more than one Default element for any given extension, and there MUST NOT be more than one Override element for any given part name.

    The order of Default and Override elements in the content-type item is not significant.

    Default content-type mappings MAY be defined in the content-type item even though no parts use them.

    [Example: Here is a sample of content-type item markup:

    [XML markup not preserved in this copy]

    Based on this markup, for the following list of parts, the corresponding content types are:

    Part Name               Content Type
    /a/b/sample1.txt        text/plain
    /a/b/sample2.jpeg       image/jpeg
    /a/b/sample3.picture    image/gif
    /a/b/sample4.picture    image/jpeg

    end example]

    Content-type elements and attributes and content-type types are defined in [cross-reference missing in the source].

    9.3.2 Setting the ContentType value of a part

    When adding a new part to a package, the following steps MUST be performed to fill the content-type item:

    1. Convert the part name to a normalized Unicode string.
    2. Create a part name from the Unicode normalized string. The resulting part name can be considered a normalized part name, as it uniquely represents the entire set of equal part names.
    3. Get the extension from the resulting part name by taking the substring to the right of the rightmost occurrence of the dot character (.) from the rightmost segment.
    4. Compare the resulting extension with the values specified for the Extension attributes of the Default elements in the content-type item. The comparison must be performed in the following way:
        • Convert the value of the Extension attribute to a normalized Unicode string.
        • Convert the resulting normalized Unicode string to an ASCII string.
        • Compare the resulting ASCII string with the extension obtained in Step 3. The comparison MUST be case-sensitive and locale-invariant.
    5. If there is a Default element with a matching Extension attribute, then the content type of the new part MUST be compared with the value of the ContentType attribute. The comparison MUST be case-sensitive and locale-invariant. The content-types value comparison MUST NOT take into account the Content-Type grammar as defined in RFC 2616.
        • If the content types match, no further action is required.
        • If the content types do not match, then a new Override element MUST be added to the content-type item.
    6. If there is no Default element with a matching Extension attribute, then a new Default element or Override element MUST be added to the content-type item.
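The steps above can be sketched in Python using two dictionaries as a stand-in for the content-type item. This is an illustration under assumptions: the function names are hypothetical, Unicode normalization form NFC is assumed because this clause does not name a form, the conversion to ASCII is skipped because the sample names are already ASCII, and in Step 6 the sketch adds a Default element when the part name has an extension and an Override element otherwise, which is only one of the choices the text allows.

```python
import unicodedata


def extension_of(part_name):
    """Text right of the rightmost dot in the rightmost segment, or '' if none."""
    last_segment = part_name.rsplit("/", 1)[-1]
    if "." not in last_segment:
        return ""
    return last_segment.rsplit(".", 1)[1]


def register_part(defaults, overrides, part_name, content_type):
    """Update defaults (extension -> type) and overrides (part name -> type)."""
    name = unicodedata.normalize("NFC", part_name)  # assumed normalization form
    extension = extension_of(name)
    if extension in defaults:
        # Plain string comparison: case-sensitive, no Content-Type grammar.
        if defaults[extension] != content_type:
            overrides[name] = content_type
    elif extension:
        defaults[extension] = content_type
    else:
        overrides[name] = content_type


defaults, overrides = {}, {}
register_part(defaults, overrides, "/a/b/sample3.picture", "image/gif")
register_part(defaults, overrides, "/a/b/sample4.picture", "image/jpeg")
print(defaults, overrides)
```

With the two calls shown, the first part creates a Default mapping for its extension and the second, whose content type differs from that mapping, receives an Override.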

    9.3.3 Getting the ContentType value of a part

    To get the content-type value for a specified part, the following steps MUST be performed:

    1. Convert the part name to a normalized Unicode string.
    2. Create a part name from the normalized Unicode string.
    3. Get the extension from the resulting part name by taking the substring to the right of the rightmost occurrence of the dot character (.) from the rightmost segment.
    4. Compare the normalized part name obtained in Step 2 with the values specified for the PartName attribute of the Override elements. The comparison MUST be performed in the following way:
        • Convert the value of the PartName attribute to a normalized Unicode string.
        • Convert the resulting normalized Unicode string to an ASCII string.
        • Compare the resulting ASCII string with the normalized part name. The comparison MUST be case-sensitive and locale-invariant.
    5. If there is an Override element with a matching PartName attribute, then return the value of its ContentType attribute. No further action is required.
    6. If there is no Override element with a matching PartName attribute, then look through the Default elements of the content-type item, comparing the extension obtained in Step 3 with the value of the Extension attribute. The comparison MUST be performed as it is defined in Step 4 of 9.3.2.
    7. If there is a Default element with a matching Extension attribute, then return the value of its ContentType attribute. No further action is required.
    8. If neither Override nor Default elements with matching attributes were found for the specified part, the implementation MUST consider this an error case.
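The lookup order above (Override first, then Default, otherwise an error) can be sketched in Python against a content-type item. The sample markup is an assumption: it is written to be consistent with the part-name table in the example of 9.3.1, using the element names, attribute names, and namespace given in this clause, but it is not the markup of that example. The function name is hypothetical, NFC is an assumed normalization form, and the conversion to ASCII is skipped because the sample values are already ASCII.

```python
import unicodedata
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Assumed markup, chosen to agree with the table in the example of 9.3.1.
SAMPLE_TYPES = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="txt" ContentType="text/plain"/>
  <Default Extension="jpeg" ContentType="image/jpeg"/>
  <Default Extension="picture" ContentType="image/gif"/>
  <Override PartName="/a/b/sample4.picture" ContentType="image/jpeg"/>
</Types>"""


def content_type_of(types_xml, part_name):
    """Return the content type of a part, or raise LookupError (the error case)."""
    root = ET.fromstring(types_xml)
    name = unicodedata.normalize("NFC", part_name)
    # Steps 4 and 5: an Override with a matching PartName wins.
    for override in root.findall("{%s}Override" % CT_NS):
        if unicodedata.normalize("NFC", override.get("PartName", "")) == name:
            return override.get("ContentType")
    # Steps 3, 6 and 7: fall back to a Default with a matching Extension.
    last_segment = name.rsplit("/", 1)[-1]
    if "." in last_segment:
        extension = last_segment.rsplit(".", 1)[1]
        for default in root.findall("{%s}Default" % CT_NS):
            if unicodedata.normalize("NFC", default.get("Extension", "")) == extension:
                return default.get("ContentType")
    # Step 8: nothing matched.
    raise LookupError("no content type found for " + part_name)


print(content_type_of(SAMPLE_TYPES, "/a/b/sample4.picture"))
```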

    9.4 ZIP archive mapping

    ZIP archive item names are case-sensitive Unicode strings that MUST conform to the ZIP archive file names grammar. Item names MUST be unique within a given archive.

    In order to support very large packages, producers and consumers SHALL provide Zip64 support.

    9.4.1 Mapping part names to ZIP archive item names

    To map part names to ZIP item names the following steps MUST be performed in order:

    1. Convert the part name to a logical item name.
    2. Remove the leading forward slash (/) from the logical item name.

    9.4.2 Mapping ZIP archive item names to part names

    To map ZIP item names to part names the following steps MUST be performed in order:

    1. Map the ZIP archive item name to a logical item name by adding a forward slash (/) to the ZIP archive item name.
    2. Map the obtained logical item name to a part name.
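The two mappings can be sketched in Python with the standard zipfile module. The function names are hypothetical. The conversion between a part name and a logical item name is not defined in this clause, so the sketch assumes they are the same string; that assumption is marked in the code. The archive is built in memory purely for demonstration.

```python
import io
import zipfile


def part_name_to_zip_item_name(part_name):
    """9.4.1: part name -> logical item name -> ZIP item name."""
    logical_item_name = part_name  # assumption: identity mapping
    if not logical_item_name.startswith("/"):
        raise ValueError("logical item name is expected to start with '/'")
    return logical_item_name[1:]


def zip_item_name_to_part_name(item_name):
    """9.4.2: ZIP item name -> logical item name -> part name."""
    logical_item_name = "/" + item_name
    return logical_item_name  # assumption: identity mapping


buffer = io.BytesIO()
# allowZip64=True is the zipfile default; it is spelled out because Zip64
# support is required above.
with zipfile.ZipFile(buffer, "w", allowZip64=True) as archive:
    for part_name in ["/word/document.xml", "/word/styles.xml"]:
        archive.writestr(part_name_to_zip_item_name(part_name), b"<placeholder/>")

with zipfile.ZipFile(buffer) as archive:
    print([zip_item_name_to_part_name(name) for name in archive.namelist()])
```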

    9.4.3 Limitations

    The combined length of the item name, Extra field, and Comment fields MUST NOT exceed 65,535 bytes in the ZIP archive. Accordingly, part names stored in ZIP archives are limited to some length less than 65,535 characters, depending on the size of the Extra and Comment fields.

    Producers SHOULD accommodate limitations of file systems when creating names for parts that may be stored in ZIP files. Two examples of these limitations are:

    • On one popular file system, the asterisk (*) and colon (:) characters are invalid, so parts named with these characters will not unzip successfully.
    • Many programs dealing with one popular file system can only handle file names that are less than 256 characters, including the full path; parts with longer names might not behave properly once unzipped.
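A producer could screen part names against these limitations before writing them. The following Python sketch is one possible check, with assumptions stated plainly: the function name and its parameters are hypothetical, the item name is measured in UTF-8 bytes, and the folder into which the package might later be unzipped is unknown, so it is passed in as an optional prefix.

```python
from typing import List

ZIP_FIELD_LIMIT = 65535        # combined item name, Extra field and Comment
INVALID_CHARACTERS = "*:"      # the two characters named in the text above
COMMON_PATH_LIMIT = 256        # file names must be shorter than this


def portability_problems(part_name: str, extra_len: int = 0,
                         comment_len: int = 0, unzip_root: str = "") -> List[str]:
    """Return human-readable descriptions of limitations the name runs into."""
    item_name = part_name[1:] if part_name.startswith("/") else part_name
    problems = []
    if len(item_name.encode("utf-8")) + extra_len + comment_len > ZIP_FIELD_LIMIT:
        problems.append("item name, Extra field and Comment exceed 65,535 bytes")
    bad = sorted(set(item_name) & set(INVALID_CHARACTERS))
    if bad:
        problems.append("characters invalid on some file systems: " + " ".join(bad))
    if len(unzip_root) + len(item_name) >= COMMON_PATH_LIMIT:
        problems.append("full path is 256 characters or longer")
    return problems


print(portability_problems("/word/document.xml"))
print(portability_problems("/notes/a:b*.xml"))
```

The first check reflects a MUST NOT in the text; the other two reflect the SHOULD about file-system limitations, so a producer might treat them as warnings.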


    10. WordprocessingML

    This clause contains specifications for relationship items and parts that are specific to WordprocessingML. Parts that can occur in a WordprocessingML document, but are not WordprocessingML-specific, are specified in 13. Unless stated explicitly, all references to relationship items, content-type items, and parts in this clause refer to WordprocessingML items and parts.

    10.1 Glossary of WordprocessingML-specific terms

    The following terms are used in the context of a Wordprocessin