Journals and Magazines and Books, Oh My! A Look at ACS' Use of NLM Tagsets

American Chemical Society Journals and Magazines and Books, Oh My! A Look at ACS' Use of NLM Tagsets Dan O'Brien, ACS Publications Presented at JATS-Con, 1-Nov-2010


Page 1: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets

American Chemical Society

Journals and Magazines and Books, Oh My!

A Look at ACS' Use of NLM Tagsets

Dan O'Brien, ACS Publications

Presented at JATS-Con, 1-Nov-2010

Page 2: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


What We'll Cover

•Intro

– ACS, Products, Processes

– Framework & terminology for discussing customizations

•ACS Pubs' Use of NLM Tagsets

– Overall Approach

– Journals

– Books

– Magazine

• Successes & Lessons Learned

Page 3: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Character Introductions

• ACS & ACS Pubs

• Journals

• Books

• Magazine

• Processes

• Terminology

Page 4: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Introductions: ACS

•Professional membership organization

– Chartered by U.S. Congress in 1876

– Non-profit

– Over 160,000 members

•ACS Publications Division ("ACS Pubs")

– Journals

– Magazine

– Books

– On a quest

Page 5: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Introductions: ACS Journals

•40 peer-reviewed titles

•300,000 annual published pages

•~50% volume published weekly

•Among highest ISI impact factors

•"King" of publishing forest

Page 6: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Introductions: Books

•Symposium Series

•Around 30 titles published annually

•Around 25 chapters per book

•Hard covers, rigid content format

Page 7: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Introductions: C&EN Magazine

•Chemical & Engineering News

•Weekly Print & Web issues

•Daily Online News

•"BusinessWeek" for chemists

•Flexible format, loose content definitions

•More than meets the eye

Page 8: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Introductions, cont.

•Pressure for product innovation: Wicked Witch of the West

•NLM Tagsets – has the answers: Wizard of Oz

Page 9: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Introductions: Processes

•Journals & Books:

– Standard scholarly publishing model

– XML-first article/chapter based production

• Automated Pre-Editing (Inera AutoRedact)

• Technical Editing

• Automated Post-Editing & Validations

– Article ASAP publication (Journals)

– Issue/Book publication (Journals & Books)

•Magazine:

– Staff writers vs. authors

– Feature articles, Thematic issues

– Story Online News? Issue?

– Edit-to-Fit

Page 10: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Introductions: Journal Process

Page 11: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Introductions: Books Process

Page 12: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Introductions: Magazine Process

Page 13: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Terminology

•Tag – a bit of XML markup: an element, attribute, etc.

•Tag Definition – the coding (in DTD or XSD syntax) that declares the tag name and what it's allowed to do.

•Module – a way of logically organizing tag definitions, allowing reuse for multiple schemas.

•Tagset – a collection of related tag definitions forming a complete vocabulary, usually stored within a set of interrelated modules

•Schema – an application of a tagset to form a specific content model

Page 14: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Terminology

[Diagram labels: Tagset; Modules; Tag definitions A, B, C, D; Tag definition dependencies; Schema (DTD, XSD, etc.)]

Page 15: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Terminology – "Customization Levels"

Tagset is used "As-Is" without customizations

Tagset not directly used; just "informs" your approach

Page 16: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Terminology – "Customization Levels"

As-Is Extended Reduced Customized Built From Informed By

Page 17: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Terminology – "Customization Levels"

•As-Is: Public version is used without changes or modification

•Extended: Superset of public tagset is used

•Reduced: Subset of the public schema is used

•Customized: Combo of Extensions + Reductions

•Built From: Substantial changes: renamed tags, altered tag hierarchies, etc.

•Informed By: Only the design philosophy of public tagset is used

<xyz> <a/> <b/> </xyz>

<xyz> <a/> <b/> <c/></xyz>

<xyz> <a/> <b/> </xyz>

<xyz> <a/> <b/> <c/></xyz>

<abc> <a> <b/> </a></abc>

<abc> <aa/> <bb/> <cc/></abc>

XML is compatible

Public Custom

Public Custom

XML not compatible?

XML not compatible!

XML not compatible!
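The first four levels can be told apart roughly by comparing tag names alone. The following is a minimal sketch of that idea, not anything ACS uses: `customization_level` and the sample tag sets are hypothetical, and a name-only comparison cannot see altered hierarchies, so "Built From" and "Informed By" still need human judgment.

```python
def customization_level(public_tags, custom_tags):
    """Classify a custom vocabulary against the public one by tag names only."""
    public, custom = set(public_tags), set(custom_tags)
    if custom == public:
        return "As-Is"
    if custom > public:      # strict superset: additions only
        return "Extended"
    if custom < public:      # strict subset: removals only
        return "Reduced"
    if custom & public:      # shared tags, plus both additions and removals
        return "Customized"
    # No shared names at all: beyond what a name comparison can decide.
    return "Built From or Informed By"


public = {"xyz", "a", "b", "c"}
levels = [
    customization_level(public, {"xyz", "a", "b", "c"}),
    customization_level(public, {"xyz", "a", "b", "c", "d"}),
    customization_level(public, {"xyz", "a", "b"}),
    customization_level(public, {"xyz", "a", "d"}),
    customization_level(public, {"abc", "aa", "bb"}),
]
print(levels)
```

The function only looks at names; a tagset that keeps every public name but rearranges the hierarchy would be reported as "As-Is" here even though its XML is not compatible.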

Page 18: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Terminology – "Customization Implementation Methods"

Overrides, leaving the original public tag definitions intact

Modified original public tag definitions
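The difference between the two ends of this spectrum can be sketched with plain dictionaries standing in for modules. This is only an analogy under stated assumptions: the tag names and content models below are made up, and in a real DTD the override layer would typically be parameter entity declarations read before the public module.

```python
from collections import ChainMap

# Hypothetical public module: tag name -> content model (DTD-like strings).
PUBLIC_MODULE = {
    "fig": "(label?, caption?, graphic*)",
    "abstract": "(title?, p+)",
}

# "Overrides": custom definitions live in their own layer. Lookups hit the
# custom layer first and fall back to the untouched public layer.
custom_overrides = {"fig": "(label?, caption?, graphic*, chemical-name*)"}
override_schema = ChainMap(custom_overrides, PUBLIC_MODULE)

# "Modified": take a copy of the public definitions and edit the copy itself.
# The original definition of "fig" no longer exists anywhere in this copy.
modified_schema = dict(PUBLIC_MODULE)
modified_schema["fig"] = "(label?, caption?, graphic*, chemical-name*)"

print(override_schema["fig"])       # custom definition wins
print(override_schema["abstract"])  # falls through to the public definition
print(PUBLIC_MODULE["fig"])         # public definition is still intact
```

With the override layer, picking up a new public release means swapping `PUBLIC_MODULE` and re-checking a small set of overrides; with the modified copy, every edit has to be re-applied by hand.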

Page 19: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Terminology – "Customization Implementation Methods"

Overrides Mixed Modified

[Diagram labels: Custom Schema (DTD, XSD, etc.); Public Schema (DTD, XSD, etc.); Tagset; Modules; Tag definitions A, B, C, D; Tag definition dependencies]

Page 20: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Terminology – "Customization Implementation Methods"

Overrides Mixed Modified

[Diagram labels: Custom Schema (DTD, XSD, etc.); Public Schema (DTD, XSD, etc.); Tagset; Modules; Tag definitions A, B, C, D; Tag definition dependencies]

Page 21: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Terminology – "Customization Profile"

[Grid: Customization Levels (As-is, Extended, Reduced, Customized, Built from, Informed by) against Customization Implementation Methods (Overrides, Mixed, Modifications)]

Page 22: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets

The Journey: ACS Pubs' Use of NLM Tagsets

Page 23: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Pubs' Use of NLM Tagsets – Overview & Approach

•Leverage a public schema, or develop one from scratch?

•If we use a public schema, would customization be needed? (i.e., where on the "Customization Levels" spectrum?)

– Product drivers !!

– Process drivers !!

– ACS Terminology !?

•If customization would be needed:

– How much customization is needed? (scoping)

– What customizations are needed? (details)

– How to implement the customizations? (i.e., where on the "Implementation Methods" spectrum?)

Page 24: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journals' Use of NLM Tagsets

• Production vs. Delivery

• What we use and why

• Customization Profile

• Highlights of Customizations

Page 25: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: What we use

•Custom-built DTD based loosely on NLM Journal Archiving & Interchange v2.2

•~2005, as NLM tagset was beginning to increase in prominence for STM publishing

•Pre-2010: Monolithic tagset & schema used for editing, page composition, interchange with web delivery and 3rd parties

•Late 2010: New version of tagset supporting multiple schema flavors:

– "X" – External & Delivery Interchange

– "P" – Internal Production

– "L" – Page Layout

Page 26: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: What we use

ACS Journal v1.03 Tagset: Core tagset modules

ACS Journal v1.03 DTDs: External/Interchange DTD

Page 27: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: What we use

ACS Journal v1.03 Tagset:

– Core tagset modules

– Production-specific tagset features extend core modules

– Overrides of tag definitions

ACS Journal v1.03 DTDs: External/Interchange DTD, Production DTD

Page 28: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: What we use

ACS Journal v1.03 Tagset:

– Core tagset modules

– Production-specific tagset features extend core modules

– Page layout specific tagset features extend production-specific modules

– Overrides of tag definitions

ACS Journal v1.03 DTDs: External/Interchange DTD, Production DTD, Layout DTD

Page 29: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: Why

•No public tagset met the minimum requirements for

– ACS Journal Product – without undesirable product limitations

– ACS Journal Process – without increasing costs

– Allowing ACS Pubs Terminology

• Without significant staff training & documentation updates

• Without risking rejection

•NLM's Journal tagset came closest

– Could have used massive extensions?

– ACS Pubs Terminology pushed us into "Built From"

Page 30: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: Customization Profile

Customization Levels

Customization Implementation Methods

Overrides Mixed Modifications

As-is

Extended

Reduced

Customized

Built from: ACS Journal Production

Informed by

Page 31: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: Customizations – Terminology

NLM: <fig> with @fig-type

ACS: <fig>, <chart>, <scheme>

NLM: <abstract> with @abstract-type

ACS: <abstract>, <synopsis>, <dek>

NLM: <graphic> with @content-type

ACS: <abstract-graphic>, <toc-graphic>, <title-page-graphic>, <bio-pic>

NLM: <media>

ACS: <weo>, <toc-weo>
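Going from ACS terminology back to NLM terminology is mostly a rename plus a type attribute. The sketch below illustrates that with Python's standard library; the `ACS_TO_NLM` table and the attribute values in it are assumptions made for the example, not the mapping ACS actually uses.

```python
import xml.etree.ElementTree as ET

# Assumed mapping: ACS element -> (NLM element, attribute name, attribute value).
# The attribute values are illustrative guesses, not taken from either tagset.
ACS_TO_NLM = {
    "chart": ("fig", "fig-type", "chart"),
    "scheme": ("fig", "fig-type", "scheme"),
    "synopsis": ("abstract", "abstract-type", "synopsis"),
    "toc-graphic": ("graphic", "content-type", "toc-graphic"),
}


def acs_to_nlm(root):
    """Rename ACS-specific elements in place to NLM elements plus a type attribute."""
    for elem in root.iter():
        if elem.tag in ACS_TO_NLM:
            tag, attr, value = ACS_TO_NLM[elem.tag]
            elem.tag = tag
            elem.set(attr, value)
    return root


doc = ET.fromstring('<body><chart id="c1"/><scheme id="s1"/><fig id="f1"/></body>')
acs_to_nlm(doc)
print(ET.tostring(doc, encoding="unicode"))
```

Elements that already use the NLM name, such as the plain `<fig>` above, pass through unchanged.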

Page 32: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: Customizations – Process

NLM:

<article>

  <front>

     <journal-meta>

     <article-meta>

  <body>

  <back>

ACS:

<document>

  <metadata>

     <journal-meta>

     <document-meta>

     <processing-meta>

  <body>

  <back>

NLM: <sub-article>, <response>

ACS: <sec> beefed up to act as quasi "sub-article"

Page 33: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: Customizations – Product

NLM: <nlm-citation> (v2.3), <element-citation> (v3.0)

ACS: <acs-titles>, <acs-no-titles>, <acs-biochem>

NLM: n/a

ACS:

– <chemical-name>, <chemical-process>, <caution>

– <live-change> and related tags

– <tie-bar-start/>, <tie-bar-end/>

Page 34: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Production: Customizations – Product, cont.

NLM: n/a

ACS: MathML 2 extensions:

– <ACS:marker>

– <object-group>

(now available in MathML 3)

NLM: n/a

ACS: CALS Table extensions:

– @row-type = list of types to receive special handling

– @indent-left = amount + unit

– @indent-left-style = {"full", "first-line", "hanging"}

– @spacing-before, @spacing-after

Page 35: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Delivery: What we use

• Online delivery system: based on Literatum from Atypon

• Literatum speaks "NLM Journal Archiving & Interchange"

• Common base tagset ≠ XML content compatibility

– Differing schemas

– Differing tagging expectations

...see Figure <xref rid="xfca3"/>.

vs.

...see Figure <xref rid="xfca3">4</xref>.
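Bridging the two tagging expectations above means turning an empty `<xref>` into one that carries its display text. A minimal sketch of such a step follows; it assumes the number lives in a `<label>` child of the target element, which is an assumption for illustration, and `fill_xref_labels` is a hypothetical name rather than part of any ACS tool.

```python
import xml.etree.ElementTree as ET


def fill_xref_labels(root):
    """Give each empty <xref> the text of its target's <label>, if one exists."""
    labels = {}
    for elem in root.iter():
        target_id = elem.get("id")
        label = elem.find("label")
        if target_id and label is not None and label.text:
            labels[target_id] = label.text
    for xref in root.iter("xref"):
        if not (xref.text or "").strip() and xref.get("rid") in labels:
            xref.text = labels[xref.get("rid")]
    return root


doc = ET.fromstring(
    '<body><p>...see Figure <xref rid="xfca3"/>.</p>'
    '<fig id="xfca3"><label>4</label></fig></body>'
)
fill_xref_labels(doc)
para = ET.tostring(doc.find("p"), encoding="unicode")
print(para)
```

An `<xref>` that already has text, or whose target has no label, is left alone.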

Page 36: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Delivery: What we use

• Two-part content interface

1. Production system: "ACS-Delivery-Prep" (export)

2. Delivery system: "ACS2NLM" lexer (import)

Both advantages & disadvantages

+ Insulates Production developers from Delivery intricacies

+ Delivery system tagging can evolve without Production

- Occasional failure point

- New products, production tagging changes = ACS2NLM lexer changes

Page 37: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Journal Delivery: Customization Profile

Customization Levels

Customization Implementation Methods

Overrides Mixed Modifications

As-is

Extended: ACS Journal Delivery System

Reduced

Customized

Built from: ACS Journal Production

Informed by

Page 38: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Books' Use of NLM Tagsets

• What we use and why

• Customization Profile

• Highlights of Customizations

Page 39: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Books: What we use and why - Drivers

•Delivery System: Leverage our new Literatum-based delivery platform.

•Composition: Leverage Arbortext Publishing Engine for highly-automated XML-based page composition.

•Like Journals: Don't re-invent the XML wheel.

•Unlike Journals: Books had unique product characteristics of their own; different type of wheel.

•Book + Chapter production:

– Individual Chapter level: production editing and some composition

– Whole Book level: final book composition, indexing

– Delivery: combination of both book and chapter XML & PDF deliverables.

Page 40: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Books: What we use and why - Answers

•Delivery System:

– Literatum already supported an Extended version of NLM Book v2.3

– Production & Delivery could share a common tagset!

•Composition: Extended NLM Book v2.3 fit the bill

•Like Journals:

– Extended NLM Book v2.3 had CALS table model

– Many elements & structures were similar to ACS Journal tagset, easing adoption

•Unlike Journals: Extended NLM Book v2.3 addressed almost all book-specific metadata & processing needs

•Book + Chapter production: gap! Solution: XInclude

– Allows "link book to chapter" instead of "copy chapter into book"

Page 41: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Books: Customization Profile

Customization Levels

Customization Implementation Methods

Overrides Mixed Modifications

As-is

Extended: ACS Journal & Book Delivery System

Reduced

Customized: ACS Book Production

Built from: ACS Journal Production

Informed by

Page 42: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Books: Customization Highlights

•Addition of XInclude

– Allows a chapter XML to be processed both as stand-alone document AND within context of entire book

•Use of OASIS Table Model

(instead of default XHTML Table model)

•Addition of DocBook <index> Model

•Addition of <book-series-meta> section

(similar to <journal-meta>)

Page 43: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Books: Customization Highlights - XInclude

Book XML (Book DTD), chapters copied in:

<book> <book-series-meta>… <book-meta>… <body> <book-part>… <book-part>… <book-part>…

Book XML (Book DTD), chapters linked with XInclude:

<book> <book-series-meta>… <book-meta>… <body> <xi:include href="ch1.xml"/> <xi:include href="ch2.xml"/> <xi:include href="ch3.xml"/>

Chapter XMLs:

<book-part>…

<book-part>…

<book-part>…
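The "link book to chapter" idea can be tried with Python's standard `xml.etree.ElementInclude` module. This is a sketch only: the chapter contents are invented, the chapters are held in an in-memory dictionary instead of files, and `load_chapter` is a hypothetical loader callback. ACS's own composition runs on Arbortext, not on this code.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# In-memory stand-ins for the chapter files named on the slide.
CHAPTERS = {
    "ch1.xml": '<book-part id="ch1"><body><p>Chapter 1 text</p></body></book-part>',
    "ch2.xml": '<book-part id="ch2"><body><p>Chapter 2 text</p></body></book-part>',
    "ch3.xml": '<book-part id="ch3"><body><p>Chapter 3 text</p></body></book-part>',
}

BOOK = """
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <book-meta/>
  <body>
    <xi:include href="ch1.xml"/>
    <xi:include href="ch2.xml"/>
    <xi:include href="ch3.xml"/>
  </body>
</book>
""".strip()


def load_chapter(href, parse, encoding=None):
    """Loader callback for ElementInclude: return the parsed chapter for href."""
    return ET.fromstring(CHAPTERS[href])


book = ET.fromstring(BOOK)
ElementInclude.include(book, loader=load_chapter)  # resolves xi:include in place
part_ids = [part.get("id") for part in book.find("body")]
print(part_ids)  # ['ch1', 'ch2', 'ch3']
```

Each chapter string is also a complete document in its own right, which is the point of the approach: the same chapter XML can be edited and processed alone or pulled into the whole book.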

Page 44: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS C&EN Magazine's Use of NLM Tagsets

• What we use and why

• Customization Profile

• Highlights of Customizations

Page 45: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Magazine: What we use and why

•What: A customized version of the ACS Journal Tagset

– (Which was "informed by" NLM Journal Tagset)

•Drivers:

– Ability to archive a "content of record" that is format independent

– Ability to serve as technology-neutral "content interchange format"

• Automated web delivery

• External content syndication

•Other contenders: DITA for Publications, DocBook, EPUB, PRISM, NewsML,

Page 46: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Magazine: Customization Profile

Customization Levels

Customization Implementation Methods

Overrides Mixed Modifications

As-is

Extended: ACS Journal & Book Delivery System

Reduced

Customized: ACS Book Production

Built from: ACS C&EN Magazine, ACS Journal Production

Informed by

Page 47: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Magazine: Customization Highlights

•Amorphous, modular content structures: XInclude

– Same content produced as

• Single article in print

• Several distinct pages online

– Web-only articles & article components

– Blur between articles & subarticles

– Graphics, tables, media have separate production lifecycles, joined later

•Non-contiguous Pagination

•Ads

Page 48: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Magazine: Customization Highlights

•Flexible, recursive categorization model

– Print/web name, internal code, source/type

• "CO2 Sequestration" vs. "Carbon Dioxide Sequestration"

– RSS feeds

– Alternate topic-oriented TOCs

•Special content constructs

– Dek

– Eyebrow

– Pull quotes
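A recursive categorization model with separate print and web names can be pictured as a small tree. The sketch below is one possible shape, not the C&EN data model: the `Category` class, the internal codes, and the assignment of "CO2 Sequestration" to print and "Carbon Dioxide Sequestration" to web are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Category:
    code: str          # internal code (hypothetical values below)
    print_name: str
    web_name: str
    children: List["Category"] = field(default_factory=list)

    def walk(self, depth=0):
        """Yield (depth, category) for this category and all its descendants."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def topic_toc(root, medium):
    """Return indented TOC lines using each category's print or web name."""
    lines = []
    for depth, cat in root.walk():
        name = cat.print_name if medium == "print" else cat.web_name
        lines.append("  " * depth + name)
    return lines


topics = Category("ENV", "Environment", "Environment", [
    Category("CO2SEQ", "CO2 Sequestration", "Carbon Dioxide Sequestration"),
])
print(topic_toc(topics, "print"))
print(topic_toc(topics, "web"))
```

The same tree could drive a topic-oriented TOC or select the items for an RSS feed, depending on which name and which branch is requested.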

Page 49: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Magazine: Customization Highlights

Page 50: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


ACS Pubs' Use of NLM Tagsets – Summary

Tagset Lineage & Content Interchange Map

Page 51: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Successes & Lessons Learned

• Tagging & Technology

• People

Page 52: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Successes & Lessons - Technical

1. Monolithic vs. specialized schemas

2. Use of XInclude for Books & Magazine

Page 53: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Successes & Lessons - Technical

3. ACS Pubs' hosted "Validations service"

• Internal staff

• Internal systems

• External vendors

4. Use of XML for ACS Mobile

Page 54: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Successes & Lessons - People

1. Busting the NLM DTD "compatibility" myth

2. "XML as a product" mentality

Page 55: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Successes & Lessons - People

3. Specifying XML requirements via "Three-legged stool" or package:

a) XML DTD/Schema

b) Documentation: Tagging Conventions & Rendering Expectations

c) XML Samples
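The second leg matters because a DTD or schema cannot express every tagging convention. As a rough illustration, a convention check might look like the sketch below; the two rules and the name `check_conventions` are hypothetical examples, not ACS's actual validation service.

```python
import xml.etree.ElementTree as ET


def check_conventions(root):
    """Return human-readable violations of two example tagging conventions."""
    problems = []
    ids = {e.get("id") for e in root.iter() if e.get("id")}
    for xref in root.iter("xref"):
        rid = xref.get("rid")
        if rid not in ids:
            problems.append(f"xref rid={rid!r} has no matching id")
        if not (xref.text or "").strip():
            problems.append(f"xref rid={rid!r} has no display text")
    return problems


sample = ET.fromstring(
    '<body><p>See <xref rid="f1">1</xref> and <xref rid="f2"/>.</p>'
    '<fig id="f1"/></body>'
)
problems = check_conventions(sample)
for line in problems:
    print(line)
```

Running checks like these against the XML samples in the package gives vendors a concrete pass or fail signal in addition to schema validity.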

Page 56: Journals and Magazines and Books, Oh My!  A Look at ACS' Use of NLM Tagsets


Q & A