XML Workflows for Book Publishers
XML Workflows and You
Thad McIlroy
The Future of Publishing
San Francisco & Vancouver
Presented to The Association of Canadian Publishers CPDS
Digital Publishing Workshop
Thursday, December 10, 2009
Copyright 2009 by Thad McIlroy, The Future of Publishing
Outline
• My background
• My XML Thesis
• The Vision!
• Coping with a digital world
• Thinking (a lot) about XML
• Complexity
• eBooks
• Implementing XML workflows
My Background
• 8 years in bookselling & publishing in Canada; 4 in the U.S. (15 in SF)
• 20+ years studying the intersection of technology and print publishing, working with publishers, printers & vendors
• 5 years with Seybold Seminars
• 14 books and 200+ articles
More Recent Background
• 10 years studying the impact of the Internet on graphic communications
• Major focus now: the future of publishing; workflow; eBooks and other media; publishing automation (XSL-FO, etc.)
• Writing for PrintAction, Learned Publishing, TheFutureOfPublishing.com
TheFutureofPublishing.com
My XML Thesis
These are all real problems!
• Print production
• The Web
• Repurposing to multiple media
• The semantic world (metadata)
• Recombining (or subsetting) for new products
• Archiving
• Accessibility
XML Provides Real Solutions
• But it is a big, ugly, unwieldy bear
• And its conceptual metaphors bear little resemblance to those in book publishing
• It’s based on 25-year-old thinking about techdoc
• Yet its ubiquity makes it hard to shake ...as does its mindshare
• We’ve got a challenge on our hands!
The Vision: Smart Documents
Authors w. templates
Text and vector graphics in XML
Editors working electronically
Proofing done digitally
Knows where it’s been and where it’s going:
• print & bind
• Web & PDA
• distribution
File contains all preflight info & revision history
Contains multiple language versions
Some ACP Survey Results
• Most title production remains inhouse
• Few expect this to change soon
• Few are “very aware” of XML workflows; many questions remain
• Not certain where the ROI will come from
• Keen on semantic tagging but without a strong concept of the value
ACP Survey Result Questions
• 50% of the software used inhouse is neither QE nor ID. What is it?
• Why the major interest in semantic tagging?
Dealing with a Digital World
Workflow Can Be Controlled
• PDF is a big part of the answer
• Predictable (potentially) and independent
• Web-enabled
• JDF (Job Definition Format)
• We need to resolve the proofing issue
Color can be Controlled
• CMSs are working
• Color control on press has been here for half a decade
• Press manufacturers are starting to support color control
• Closed-loop color control is here now
20 Years Later
What do You Mean, “Erratum”?
$10k for an “Exculsive” Look
What is to Be Done?
My Essential Point
If you’ve not got your current workflows fully digital and debugged, forget XML entirely
Workflow Must Be Charted
The Tenets of Automation
• Full digitization: nothing on paper
• Full commitment: from management to sales to all operating staff
• All the software: the right applications (from creative through DAM/CMS and workflow enablers)
• Standards: full support for the standards that enable automation
6 projects that could change publishing for the better
Michael Tamblyn, CEO, BookNet Canada
BookNet Canada TechForum 09
an XML publishing workflow that doesn’t suck
OR
a publishing workflow that offers all of the benefits of XML, yet doesn’t suck
XML & Publishers’ Workflows
• Most publishers are still in the dark ages
• This is NOT simple to train and support
• Large offshore component
XML is the Answer
A New Breed of Data Standard, a Single Standard Able to Represent:
1. All manner of content
2. The structure of content
3. The “meaning” of content (through smart tag names and metadata)
4. Production/workflow requirements
5. Rights data
6. Repurposing requirements (cross-media)
2005
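To make the six-item list concrete, here is a minimal Python sketch of one small XML file carrying all six kinds of information at once. Every element and attribute name in it is invented for illustration; it is not drawn from DocBook, DITA, or any publisher's real schema.

```python
import xml.etree.ElementTree as ET

# All element and attribute names below are hypothetical, for illustration only.
BOOK = """<book>
  <metadata>
    <subject>Cooking</subject>
    <rights territory="Canada"/>
    <production trim="6x9in" proof="soft"/>
    <outputs><target>print</target><target>web</target></outputs>
  </metadata>
  <chapter>
    <title>Black Beans</title>
    <recipe><ingredient>black beans</ingredient><ingredient>salt</ingredient></recipe>
  </chapter>
</book>"""

root = ET.fromstring(BOOK)

# One lookup per item in the list above.
summary = {
    "content": root.findtext("chapter/title"),
    "structure": [child.tag for child in root.find("chapter")],
    "meaning": [item.text for item in root.iter("ingredient")],
    "production": dict(root.find("metadata/production").attrib),
    "rights": root.find("metadata/rights").get("territory"),
    "repurposing": [target.text for target in root.iter("target")],
}

for key, value in summary.items():
    print(f"{key}: {value}")
```

The point of the sketch is only that the same file answers editorial, production, rights, and cross-media questions; a real workflow would use an agreed schema rather than ad hoc tags.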
XML
• “Composition is the ‘low-hanging fruit’”
• XML stands for Extremely Mixed-up Language
• Suited to reference, non-fiction, education, multipurposing
• “XML is like violence. If you’re not getting the result you want, you have to use more.”
The Importance of XML
• eXtensible Markup Language
• XML enables content management
• Combines the power of style sheets with the power of databases
• Style sheets with meaning
Format vs. Structure
• Format describes how content is intended to look when it is displayed or printed
• Structure describes the purpose or meaning of content
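The distinction can be sketched in a few lines of Python: the XML below records only structure, and two separate rule sets decide the format for two outputs. The tag names (`title`, `para`, `note`) and both rule sets are assumptions made up for this example, standing in for real style sheets such as XSLT or XSL-FO.

```python
import xml.etree.ElementTree as ET

# Structural markup: the tags say what each piece of content IS.
# Tag names are hypothetical.
SOURCE = """<section>
  <title>Format vs. Structure</title>
  <para>Format describes how content is intended to look.</para>
  <note>Structure describes the purpose or meaning of content.</note>
</section>"""

# Two illustrative "style sheets": each maps a structural tag to a format.
HTML_RULES = {
    "title": "<h1>{}</h1>",
    "para": "<p>{}</p>",
    "note": '<p class="note"><em>{}</em></p>',
}
TEXT_RULES = {
    "title": "== {} ==",
    "para": "{}",
    "note": "NOTE: {}",
}

def render(xml_source, rules):
    """Apply one set of formatting rules to the same structured source."""
    root = ET.fromstring(xml_source)
    return "\n".join(rules[child.tag].format(child.text.strip()) for child in root)

print(render(SOURCE, HTML_RULES))
print(render(SOURCE, TEXT_RULES))
```

The source is written once; adding a third output (say, a mobile layout) means adding a third rule set, not re-editing the content.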
The Information Avalanche
Doubling the knowledge base:
1750 – 1900: 150 years to double
1900 – 1950: 50 years to double
1950 – 1960: 10 years to double
1960 – 1992: 5 years to double
By 2020, information is expected to double about every 73 days!
Paper can’t provide data in a cost-effective and timely fashion
Growth in Electronic Documents
• 1995: 12 trillion electronic and paper documents
• 90% of all documents were printed (in 1998)
• 2005: 20 trillion documents
• 2005: About 50% will be printed
• Ratio of offset to digital print: 40:60
• Offset @ 40% of today’s volume
Source: Gary Starkweather, Microsoft Research (and inventor of the laser printer)
W3C XML Schema Definition Language (XSD) 1.1 12-03-09
…Part 1: Structures” specifies the XML Schema Definition Language, offering facilities for describing the structure and constraining the contents of XML documents, including those which exploit the XML Namespace facility. The schema language, which is itself represented in an XML vocabulary and uses namespaces, substantially reconstructs and extends the capabilities found in XML document type definitions (DTDs). The second publication, “Datatypes,” defines facilities for defining datatypes to be used in XML Schemas as well as other XML specifications. Comments welcome through 31-12.
DocBook (docbook.org)
What is DocBook?
“DocBook is a schema (available in several languages including RELAX NG, SGML and XML DTDs, and W3C XML Schema) maintained by the DocBook Committee of OASIS. It is particularly well suited to books and papers about computer hardware and software...”
700 pages of “dense” documentation
DITA 1.1 August-07
• The Darwin Information Typing Architecture (DITA) is an XML-based architecture for authoring, producing, and delivering information. Its main use is for technical publications
• The documentation is 593 pages
• Maintained by OASIS-open.org
• And it won’t work for all your titles
Do Not Be Tricked!
Metadata Enters the Process
Data that describes other data
The Bean Analogy
From: A Manager’s Introduction to Adobe eXtensible Metadata Platform
Written by Andrew Salop
Bean Metadata
ELEMENT NUMBER | CATEGORY OF INFORMATION | VALUE OF CATEGORY IN THIS INSTANCE (What appears on the label) | DATA TYPE
1 | The maker | Trader Joe’s | String
2 | The contents | Black Beans | String
3 | A notation of distinctive food value | A low fat food | String
4 | A second notation of distinctive food value | An excellent source of dietary fiber | String
5 | Directions for finding nutritional information | See side panel for nutritional information | String
6 | A notation of weight, in English and metric units | Net Wt. 15 oz (415g) | Formatted numbers
7 | A marketing narrative | Trader Joe’s Black Beans have a rich, hearty taste and soft texture. They are wonderful in soups and stews, with rice, and in salads with colorful vegetables and Southwestern or Caribbean flavors. Black beans have gained in popularity due to their high dietary fiber and protein content. They are a cholesterol-free and low fat food. | Long string
More Bean Metadata
8 | A declaration of wholesomeness | No preservatives, no artificial colors, no artificial flavors | String
9 | A list of ingredients | black beans, water, salt, calcium chloride | List separated by commas
10 | The ID of distributor and seller | Dist. & Sold Exclusively by Trader Joe’s, So. Pasadena, CA 91031 | String
11 | A tracking code, in Roman | 0009 6362 | Integer
12 | Same tracking code in bar-code-readable format | | Bit map
13 | The nutritional facts, in standard order and format | | Structured table
Nutritional Facts
Serving Size 1/2 cup (130g)
Servings per container about 3
Amount per serving
Calories 130 Fat Cal 5
% Daily Value
Total Fat 0.5g 0%
Saturated Fat 0g 0%
Cholesterol 0mg 0%
Sodium 260mg 11%
Total Carbohydrates 22g 7%
Dietary Fiber 5g 22%
Sugars 0g
Protein 10g 20%
Vitamin A 0% ° Vitamin C 0%
Calcium 4% ° Iron 10%
• Percent Daily Values are based on a 2,000 calorie diet
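As a rough illustration of the bean analogy, the Python sketch below stores four of the label fields as XML metadata, each tagged with its data type, and then reads them back as typed values. The element names and the `type` attribute are hypothetical; this is not the Adobe XMP vocabulary, only the same idea in miniature.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; the values come from the bean label table.
fields = [
    ("maker", "string", "Trader Joe's"),
    ("contents", "string", "Black Beans"),
    ("ingredients", "list", "black beans, water, salt, calcium chloride"),
    ("trackingCode", "integer", "0009 6362"),
]

record = ET.Element("product")
for name, data_type, value in fields:
    element = ET.SubElement(record, name, type=data_type)
    element.text = value

def typed_value(element):
    """Convert an element's text according to its declared data type."""
    kind = element.get("type")
    if kind == "list":
        return [item.strip() for item in element.text.split(",")]
    if kind == "integer":
        # Note: converting to an integer drops the leading zeros.
        return int(element.text.replace(" ", ""))
    return element.text

# Serialize, then parse again, as a second system receiving the file would.
xml_text = ET.tostring(record, encoding="unicode")
parsed = {element.tag: typed_value(element) for element in ET.fromstring(xml_text)}
print(parsed)
```

Because each value travels with its category and data type, software that has never seen the label can still sort, search, or validate it.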
Structured Tagging by Authors?
Typéfi sample approach
XML Tagging
Semantic tagging requires human judgment
<!-- the resource links in the ProcessGroup define the input resources that must be available for the ProcessGroup to be submitted and the output resources that are produced by the ProcessGroup -->
<ResourceLinkPool>
  <!-- print input media -->
  <MediaLink Usage="Input" rRef="L2"/>
  <GatheringParamsLink Usage="Input" rRef="L4"/>
  <!-- gathered output components -->
  <ComponentLink Usage="Output" rRef="L7"/>
</ResourceLinkPool>
<JDF ID="J2" Status="Waiting" Type="DigitalPrinting">
  <ResourceLinkPool>
    <GatheringParamsLink Usage="Input" rRef="L4"/>
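For readers who want to see what software does with a fragment like this, here is a small Python sketch that reads a ResourceLinkPool and sorts its links into inputs and outputs. It assumes a self-contained pool with no XML namespace; real JDF files declare a namespace and carry far more detail, so treat this as an illustration rather than a JDF parser.

```python
import xml.etree.ElementTree as ET

# A well-formed pool modelled on the slide's fragment (namespace omitted).
POOL = """<ResourceLinkPool>
  <MediaLink Usage="Input" rRef="L2"/>
  <GatheringParamsLink Usage="Input" rRef="L4"/>
  <ComponentLink Usage="Output" rRef="L7"/>
</ResourceLinkPool>"""

def links_by_usage(pool_xml):
    """Group (link type, referenced resource ID) pairs by their Usage value."""
    grouped = {}
    for link in ET.fromstring(pool_xml):
        grouped.setdefault(link.get("Usage"), []).append((link.tag, link.get("rRef")))
    return grouped

result = links_by_usage(POOL)
print("Inputs: ", result.get("Input", []))
print("Outputs:", result.get("Output", []))
```

This kind of tagging is machine-oriented and can be generated automatically; the semantic tagging of book content is the part that still needs human judgment.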
Templated Designs
How much of XML-tagged content can be composed automatically?
Typéfi sample approach
If you show this to most editors... “they’re going to start drinking at their desks”
Digital Asset Management
XML’s role in metadata and taxonomies
The Cross-Media Challenge
Print
Web
Mobile
We’re Thinking Small
The Human Factor
New Internal Roles, Skills & Positions
• The production skill set changes substantially
• Much of the existing knowledge base changes or becomes obsolete
• The move from design & composition & production management to content & product architecting and engineering
• There is an enormous training challenge ahead
• And a need for certification
The Tipping Point
How Little Things Can Make a Big Difference
“...a book that presents a new way of understanding why change so often happens as quickly and as unexpectedly as it does...Ideas and behavior and messages and products sometimes behave just like outbreaks of infectious disease. They are social epidemics.”
— Malcolm Gladwell
Crossing the Chasm
Innovators: Techies
Early Adopters: Visionaries
Early Majority: Pragmatists
Late Majority: Conservatives
Laggards: Skeptics
Source: www.chasmgroup.com
Thank you
[email protected]@theFutureofPublishing.com
Copyright 2009 by Thad McIlroy, The Future of Publishing
May be redistributed and re-used for non-commercial purposes, provided author attribution is provided and a link to www.thefutureofpublishing.com