epicenter2010 Open Xml
Trinity College, Dublin: 8 – 11 June 2010
An Introduction to Open XML
CRAIG MURPHY
Housekeeping
Mobile ‘phones
Fire Exits
Toilets
Session Overview
This session will provide an explanation and demonstration of how we can programmatically create and use WordML and ExcelML documents
I will be using the Open XML SDK to make life easier:
No manual creation and management of .zip files / containers
Let System.IO.Packaging, etc. take care of that
Avoids a discussion about code bloat, XML bloat and performance (which is actually very good)
It won’t be a political view of the “document wars” debate:
There will be no XPS vs PDF vs Open XML vs ODF / OpenDocument content!
If you learn one thing from my session…
On this day…June 8th…
1978: Woman takes world sailing record
Yachtswoman Naomi James breaks the solo round-the-world sailing record by two days.
Office 2010 – First Run
Disclaimer
This session includes some content from Microsoft slide decks
Not going to be an in-depth look at the Open XML API:
Code and demonstrations to get you started
Simplified version of the methods I use to generate custom reports in a non-production version of a production application!
I’m a developer, not a designer!
No flashy graphics or fancy documents
Let’s ignore the i4i injunction a Judge in Texas imposed on Microsoft Word!
About Me
60+ presentations delivered:
IMTC 2008, epicenter 2009
NRW06, NRW07
DeveloperDeveloperDeveloper (UK / Ireland Community Events)
Scottish Developers
Agile Scotland
British Computer Society (BCS)
UK Borland User Group (DDG)
Visual Basic User Group (VBUG)
VBUG .net Winter 2001 conference
XML One 2001
60+ articles/book reviews published:
The Delphi Magazine
developers’ magazine (Dotnet Developers’ Group - DDG)
ASPToday.com (now Wiley, previously Wrox)
ASP.NET Pro, International Developer
CSharpCorner, DeveloperFusion
Open XML
XML
XSLT
XQuery
XML Schema
SOAP
WML
IntraWeb
Web Services
C# InterOp with Delphi
RUP
UML
TDD in C#, VB.net and Delphi 8
Scrum
Agenda
Motivation
The Tools
What: Open XML SDK 2, API Design
How: Demos, Code Generation, Injection, Content Controls
Why: Summary
Resources
Motivation
There are times when we are too focused on application development:
New/useful tools and techniques are passed by
60-90 minute sessions like these, personally, help me save time by:
Identifying new/useful tools and techniques
Demonstrating new/useful tools and techniques
Your takeaway: is Open XML something you should be investigating further, or not, as the case may be?
I have been using Excel automation (COM type libraries) for report creation…since 1999:
Gone through the “macro dilemma” – to use macros or not?
For Win32 Borland Delphi applications
For Win32 .net C# applications
The Tools
Visual Studio 2010 Professional
Open XML SDK 2 RTM (March 2010)
Sits inside the .NET 3.5 SP1 space (more about this later on); SDK makes use of LINQ
Office 2010 Standard:
Only required for viewing documents
Unlike COM-based automation, an Office client is not required
A boon if you are preparing reports server-side
Previously used:
Visual Studio 2008 Professional
Office 2007
Open XML SDK CTPs
Agenda
Motivation
The Tools
What: Open XML SDK 2, API Design
How: Demos - Manual, Code Generation, Injection
Why: Summary
Resources
Open XML SDK 2
Productivity Tool
DocumentReflector: code generation
OpenXMLClassExplorer: explore the Open XML markup and the ECMA 376 specification
OpenXMLDiff: graphically compare Open XML files
OpenXMLValidator: validate entire documents or “document parts” against Office 2007 or Office 2010 file formats
What is Open XML?
…an open standard for word-processing documents, presentations, and spreadsheets that can be freely implemented by multiple applications on different platforms
…faithful representation of existing word-processing documents, presentations, and spreadsheets that are encoded in binary formats defined by Microsoft® Office applications, i.e. tightly coupled
…purpose of the Open XML standard is to de-couple documents created by Microsoft Office applications so that they can be manipulated by other applications independent of proprietary formats and without the loss of data
https://connect.microsoft.com/content/content.aspx?ContentID=9521&SiteID=589&wa=wsignin1.0
Before…Open XML SDK V2
Namespaces, element names and attributes were irksome to remember and to get right
Generally, constants were used to make managing namespaces, etc. that bit easier
Lack of strong typing:
Code would compile
May produce incorrect results at run-time
<w:document xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'><w:body><w:p><w:r><w:t>some text</w:t></w:r></w:p></w:body></w:document>
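The "compiles but may be wrong at run-time" problem can be sketched outside the SDK. The following is a minimal illustration in Python using only the standard library, not the C# code from the session; the function name `paragraph_texts` and its `paragraph_name` parameter are made up for this example. Nodes are located by namespace-qualified strings, so a misspelt element name raises no error and simply finds nothing.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace in ElementTree's {uri}name notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

SAMPLE = (
    "<w:document xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>"
    "<w:body><w:p><w:r><w:t>some text</w:t></w:r></w:p></w:body></w:document>"
)


def paragraph_texts(xml_text, paragraph_name="p"):
    """Return the text of each paragraph, locating nodes by string name."""
    root = ET.fromstring(xml_text)
    body = root.find(W + "body")
    texts = []
    for para in body.findall(W + paragraph_name):
        # Concatenate every w:t descendant of the paragraph.
        texts.append("".join(t.text or "" for t in para.iter(W + "t")))
    return texts


correct = paragraph_texts(SAMPLE)            # element name spelled correctly
misspelt = paragraph_texts(SAMPLE, "para")   # typo: still runs, matches nothing
```

Here `correct` holds the paragraph text while `misspelt` is an empty list, with no exception to flag the mistake.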
Now…Open XML SDK V2
Strongly Typed Object Model:
Node identification using strings is a thing of the past
Loosely typed System.Xml.Linq.XElement usage can be replaced
e.g. DocumentFormat.OpenXml.Wordprocessing.Paragraph
Spelling mistakes are caught by compile-time type checking
Obviously strong typing is preferable
BEFORE
var paragraphs = doc.MainDocumentPart
    .GetXDocument()
    .Element(w + "document")
    .Element(w + "body")
    .Elements(w + "p")
    .Select
AFTER
var paragraphs = doc.MainDocumentPart.Document.Body.Elements<Paragraph>().Select
API Design: System Support
.NET Framework 3.5 – The Open XML SDK leverages the advanced technology provided by .NET Framework 3.5, especially LINQ to XML, which makes manipulating XML much easier and more intuitive
System.IO.Packaging – The Open XML SDK needs to be able to add/remove parts contained within the Open XML Format packages. Included as part of .NET Framework 3.0 was a set of generic packaging APIs capable of adding and removing parts of OPC (Open Packaging Conventions) conforming packages. Given that Open XML Formats are based on OPC, the SDK uses System.IO.Packaging APIs to open, edit and save Open XML packages
Open XML Schemas – The Open XML SDK is based on Open XML Formats, which are represented and described as schemas. These schemas make up the foundation of the Open XML SDK, since the SDK enables Open XML developers to build solutions on top of Open XML Formats
API Design: Open XML File Format Base Level
Stream Reading/Writing:
Includes stream reader and writer interfaces targeting Open XML elements and attributes
Similar to XmlReader/XmlWriter, but easier to use as the interfaces are Open XML aware
Open XML Low Level DOM:
Manipulate the Open XML tree directly by working with strongly typed objects and classes instead of traditional XML nodes
Awareness of namespaces as well as element/attribute names is reduced
Intellisense for properties, etc.
Leverages LINQ
Open XML Packaging API:
Sits above System.IO.Packaging (.NET 3.0)
Allows developers to manipulate Open XML parts with strongly typed classes and objects
Shipped in Open XML SDK v1.0
API Design: Validation & Helpers
Validation Layer:
Open XML base layer does not guarantee creation of valid Open XML documents!
Our reliance on XML Schema, XSD files, is reduced if not removed
The SDK takes care of it on our behalf
Helper Functions:
Work directly on the XML elements and are functionally limited by the file format standard
e.g. deletion of a WordML paragraph – a helper function may ensure that all additional steps are taken to leave the document in a valid state…
The Importance of Validation
http://blogs.msdn.com/brian_jones/archive/2009/04/08/announcing-the-release-of-the-open-xml-sdk-version-2-april-2009-ctp.aspx
Valid (the text sits inside a run):
<w:body> <w:p> <w:r> <w:t>hello world</w:t> </w:r> </w:p> ... </w:body>
Invalid (the text sits directly inside the paragraph):
<w:body> <w:p> <w:t>hello world</w:t> </w:p> ... </w:body>
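The difference between the two fragments (a `w:t` inside a `w:r`, versus a `w:t` placed directly in the `w:p`) is the kind of mistake a validation layer catches. As a rough sketch only, the check below tests that single rule in Python with the standard library. It is not the SDK's validator, which checks documents against the full schema, and the function name `misplaced_text_elements` is invented for this example.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def misplaced_text_elements(body_xml):
    """Count w:t elements whose parent is not a w:r (run)."""
    root = ET.fromstring(body_xml)
    count = 0
    for parent in root.iter():
        for child in parent:
            if child.tag == W + "t" and parent.tag != W + "r":
                count += 1
    return count


# The two fragments from the slide, with the namespace declared so they parse.
WITH_RUN = f"<w:body xmlns:w='{W_NS}'><w:p><w:r><w:t>hello world</w:t></w:r></w:p></w:body>"
WITHOUT_RUN = f"<w:body xmlns:w='{W_NS}'><w:p><w:t>hello world</w:t></w:p></w:body>"

problems_with_run = misplaced_text_elements(WITH_RUN)
problems_without_run = misplaced_text_elements(WITHOUT_RUN)
```

Both fragments are well-formed XML, so an ordinary XML parser accepts either one; only a format-aware check reports the second.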
WordML Document Structure
Take a .docx, an .xlsx or a .pptx file, rename it as a .zip file
Open using Compressed Folders or your favourite zip utility
Very readable, but without the SDK, difficult to manage, especially in code
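The rename-to-.zip trick works because the package is an ordinary zip archive, so any zip library can list its parts without renaming anything. A minimal sketch in Python (standard library only) follows. The in-memory package is a stand-in with placeholder contents, and the part names and helper names (`make_sample_package`, `list_parts`) are assumptions for illustration rather than output from a real Word file.

```python
import io
import zipfile

# Typical part names for a Word package; contents here are placeholders only.
SAMPLE_PARTS = (
    "[Content_Types].xml",
    "_rels/.rels",
    "word/document.xml",
    "word/styles.xml",
)


def make_sample_package():
    """Build an in-memory stand-in for a .docx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name in SAMPLE_PARTS:
            package.writestr(name, "<placeholder/>")
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return (name, uncompressed size) for every entry in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return [(info.filename, info.file_size) for info in package.infolist()]


parts = list_parts(make_sample_package())
```

For a file on disk, passing its path straight to `zipfile.ZipFile` should behave the same way, whatever the file extension.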
Document Parts
A document part is…
analogous to a file on the file system
stored inside the package in a specific location reachable via a URI
stored with a specific content type:
mainly XML but other native types as well
Images, sounds, video, OLE objects
Content type is enforced:
Example: cannot tag JPEG part as GIF
[Open Excel - sample file – look for the image]
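Content types are declared in the package's `[Content_Types].xml` part, either per file extension (`Default`) or per part (`Override`). The sketch below, in Python with the standard library, resolves a part's content type from such a declaration. The sample declaration, the part names and the function name `content_type_of` are illustrative assumptions, not taken from the demo file.

```python
import xml.etree.ElementTree as ET

CT = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# A cut-down, hypothetical [Content_Types].xml for a spreadsheet with a JPEG image.
CONTENT_TYPES = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Default Extension="jpeg" ContentType="image/jpeg"/>
  <Override PartName="/xl/workbook.xml"
            ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
</Types>"""


def content_type_of(part_name, content_types_xml=CONTENT_TYPES):
    """Resolve a part's content type: Override entries win, then Default by extension."""
    root = ET.fromstring(content_types_xml)
    for override in root.findall(CT + "Override"):
        if override.get("PartName") == part_name:
            return override.get("ContentType")
    extension = part_name.rsplit(".", 1)[-1].lower()
    for default in root.findall(CT + "Default"):
        if default.get("Extension").lower() == extension:
            return default.get("ContentType")
    return None  # no declared content type for this part


image_type = content_type_of("/xl/media/image1.jpeg")
workbook_type = content_type_of("/xl/workbook.xml")
undeclared = content_type_of("/xl/media/image1.gif")
```

In this sample a `.gif` part has no declared type at all, which is one way to see why a JPEG part cannot simply be relabelled as a GIF.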
ExcelML Document Structure
Relationships are stored in XML streams in the package:
Ties elements inside the package to each other
Allows navigation of document without parsing parts
Package relationships stream URI: /_rels/.rels
Part relationships stream URI: _rels/[partname].rels
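Navigation by relationship can be sketched without the SDK. Assuming a package relationships stream like the hypothetical one below, this Python example (standard library only) finds the main part from `/_rels/.rels` and works out where that part's own relationships stream would live. The helper names `main_part` and `rels_stream_for` are invented for the example.

```python
import posixpath
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
REL = "{" + REL_NS + "}"
OFFICE_DOCUMENT = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
)

# Hypothetical contents of /_rels/.rels for a spreadsheet package.
PACKAGE_RELS = (
    f'<Relationships xmlns="{REL_NS}">'
    f'<Relationship Id="rId1" Type="{OFFICE_DOCUMENT}" Target="xl/workbook.xml"/>'
    "</Relationships>"
)


def main_part(package_rels_xml):
    """Return the part name targeted by the officeDocument relationship, if any."""
    root = ET.fromstring(package_rels_xml)
    for rel in root.findall(REL + "Relationship"):
        if rel.get("Type") == OFFICE_DOCUMENT:
            return "/" + rel.get("Target").lstrip("/")
    return None


def rels_stream_for(part_name):
    """Map a part name to its relationships stream: <folder>/_rels/<name>.rels."""
    folder, name = posixpath.split(part_name)
    return posixpath.join(folder, "_rels", name + ".rels")


workbook = main_part(PACKAGE_RELS)
workbook_rels = rels_stream_for(workbook)
```

Only the small relationships stream is parsed here; the workbook part itself is never opened.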
demo
WordML and ExcelML
Content Controls
New in Word 2007
Manageable via the Word Content Control Toolkit
Programmatic access to specific “fields” within a document
“Bindable”: can be bound to XML nodes
Makes use of the customXML folder
Enabling the Developer ribbon – Word 2007
Enabling the Developer ribbon – Word 2010
Why Use Content Controls?
In situations where small amounts of information are collected from many users:
How often have you seen a spreadsheet being e-mailed to hundreds of users, asking them to fill in “some” cells?
Give them a Word document with Content Controls
Use a custom-written .NET application that aggregates the information in the Content Controls into an Excel spreadsheet
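The aggregation idea can be sketched as follows. This is not the .NET application from the demo: it is a Python illustration using only the standard library. It assumes each returned document holds content controls (`w:sdt`) identified by a `w:tag`, and it writes CSV as a stand-in for the Excel spreadsheet. The tag names "Name" and "Hours" and all helper names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def sdt(tag, value):
    """Markup for one block-level content control holding a single paragraph."""
    return (
        f'<w:sdt><w:sdtPr><w:tag w:val="{tag}"/></w:sdtPr>'
        f"<w:sdtContent><w:p><w:r><w:t>{value}</w:t></w:r></w:p></w:sdtContent></w:sdt>"
    )


def make_document(name, hours):
    """Hypothetical document.xml as returned by one user."""
    return (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        + sdt("Name", name) + sdt("Hours", hours)
        + "</w:body></w:document>"
    )


def read_content_controls(document_xml):
    """Map each content control's tag to the text it contains."""
    values = {}
    for control in ET.fromstring(document_xml).iter(W + "sdt"):
        tag = control.find(W + "sdtPr/" + W + "tag")
        content = control.find(W + "sdtContent")
        if tag is None or content is None:
            continue  # untagged or empty control: nothing to collect
        values[tag.get(W + "val")] = "".join(t.text or "" for t in content.iter(W + "t"))
    return values


def aggregate_to_csv(documents, columns):
    """Write one CSV row per document, one column per content-control tag."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for document_xml in documents:
        values = read_content_controls(document_xml)
        writer.writerow([values.get(column, "") for column in columns])
    return out.getvalue()


report = aggregate_to_csv(
    [make_document("Ann", "7"), make_document("Bob", "5")], ["Name", "Hours"]
)
```

Each returned document becomes one row, so the people filling in the form never touch the spreadsheet.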
demo
Content Controls
CustomXML
in Word 2007 / Word 2010
Deployment
All that you need to deploy are:
Your OpenXML-enabled application
DocumentFormat.OpenXml.dll
WindowsBase.dll
.NET (VPC test…)
c:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\WindowsBase.dll
http://blogs.msdn.com/dmahugh/archive/2006/12/14/finding-windowsbase-dll.aspx
Summary
Open XML is little more than a moderately complex XML document
XML is readily accessible:
in the .NET framework
in VB6
in Java
in Python, etc.
An Office installation is not required:
Office client not required on the server
Enables Office document creation from non-Microsoft platforms
“…it’s just zip, it’s just XML…” - Doug Mahugh http://channel9.msdn.com/posts/AdamKinney/Open-XML-File-Formats
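To make the "just zip, just XML" point concrete, here is a small sketch in Python, using nothing beyond the standard library and no Office installation. It writes a three-part WordprocessingML package into memory and reads the paragraph text back. The helper names `build_docx` and `read_paragraphs` are invented for the example. The package is deliberately minimal (no styles, settings or document properties), so treat it as a starting point rather than a production-ready document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'

CONTENT_TYPES = (
    XML_DECL
    + '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)
PACKAGE_RELS = (
    XML_DECL
    + '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/'
    '2006/relationships/officeDocument" Target="word/document.xml"/>'
    "</Relationships>"
)


def build_docx(paragraphs):
    """Return the bytes of a minimal .docx with one run of text per paragraph."""
    body = "".join(
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>" for text in paragraphs
    )
    document = f'{XML_DECL}<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", PACKAGE_RELS)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


def read_paragraphs(docx_bytes):
    """Read the paragraph texts back out of the package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    w = "{" + W_NS + "}"
    return ["".join(t.text or "" for t in p.iter(w + "t")) for p in root.iter(w + "p")]


round_trip = read_paragraphs(build_docx(["Hello", "Tom & Jerry"]))
```

Writing the returned bytes to a file with a .docx extension should give a document that Word or another Open XML consumer can open.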
Summary
Start from a template document:
Easy replication of existing [client] documents
Use the DocumentReflector to generate Open XML code:
Refactor your report data into the generated code
Learn from the reflected / generated code
Open XML code is cleaner, more readable and more maintainable than its COM counterpart
Open XML documents can be consumed using applications and platforms from vendors other than Microsoft
Resources (web-sites & blogs)
Open XML Format SDK 2.0 http://url.ie/tik
Microsoft’s Open XML portal http://www.openxmldeveloper.org/
If you are interested in Open XML / ODF conversion http://sourceforge.net/projects/odf-converter
http://www.twitter.com/openxml
Microsoft folks:
Brian Jones http://blogs.msdn.com/brian_jones/
Doug Mahugh http://blogs.msdn.com/dmahugh/
Kevin Boske http://blogs.msdn.com/kevinboske/
Erika Ehrli http://blogs.msdn.com/erikaehrli/
Eric White http://blogs.msdn.com/ericwhite/
Resources (web-sites & blogs)
Word 2007 Content Control Toolkit on CodePlex http://www.codeplex.com/dbe
Matthew Scott’s Content Controls and CustomXML Channel 9 video http://url.ie/u05
Wouter van Vugt http://blogs.code-counsel.net/Wouter/default.aspx
A collection of Open XML resources: http://www.craigmurphy.com/blog/?p=871
Including these slides and C# source code
Resources (Books)
Open XML Explained
Wouter van Vugt
http://openxmldeveloper.org/articles/1970.aspx
Contact Information
Craig Murphy
http://www.twitter.com/CAMURPHY
Updated slides, notes and source code:
http://www.CraigMurphy.com
http://www.CraigMurphy.com/blog
Questions