Getting Started with AnIML 1animl.sourceforge.net/files/pdf/2012_gwk_about_animl.pdfGetting Started...
Transcript of Getting Started with AnIML 1animl.sourceforge.net/files/pdf/2012_gwk_about_animl.pdfGetting Started...
Getting Started with AnIML 1.0
Gary W. Kramer
Biochemical Science Division National Institute of Standards and Technology
< AnIML >
< / AnIML >
This presentation will:
n Describe Why AnIML Standards are Needed n Describe What the AnIML Standards Are n Describe the Components that Make Up AnIML n Describe What AnIML 1.0 Supports n Discuss Where We Are with AnIML 1.0 n Discuss Getting Started with AnIML
We got TROUBLE, my friends, right here in Science City and that starts with T, and it rhymes with D, and it stands
for DATA…
n Can’t Move It l From Instrument to Instrument l From Instrument to Application l From Application to Application
n Can’t Interconvert It n Can’t Find All its Parts n Can’t Look at It Easily n Can’t Use It Easily with Modern
Computing and Networking Technologies
There is Crisis in Archiving and Retrieving Data n We’re drowning in data, yet increasingly we cannot find
our stuff . n Data mining can increase the value of archived data,
but… n Regulatory agencies are now demanding extraordinarily
long data retention n In terms of retrieving archived data, we were better off
with paper forty years ago before the arrival of lab computers…
n We can still read the data in Newton's notebooks today; will folks be able to read ours in 100, 20, or even 5 years?
What Are the AnIML Standards? n The Analytical Information Markup Language (AnIML)
is a markup language based on Extensible Markup Language (XML) that provides a structure, syntax, and format for recording, reporting, exporting, importing, and archiving the result data and result metadata outcomes from analytical chemistry experiments.
n The AnIML Standards are projects of ASTM Committee E13 on Spectroscopy and Separation Science, Subcommittee E13.15 on Analytical Data
The Analytical Information Markup Language (AnIML)
q Has been Developed Expressly for Analytical Chemistry Result Data and Metadata q Defines a Data Format for Representing All Types of Result Data and Metadata q Provides a System for Identifying Content (Data Elements and Attributes) with Labels (Tags) q Provides a Model and the Means for Defining the Structure, Content, and Semantics of Result Data and Metadata in Documents q Expresses Shared Vocabularies q Allows Machines to Carry Out Rules Made by People
Some AnIML Files are Simple
Some AnIML Files are Complex
Key AnIML Features n Extensible – Supports New Vendor and User
Requirements n Verifiable - Supports Digital Signing of Full and Partial
Documents to Ensure Data Integrity n Traceable - Supports Audit Trails to Track Data
Alterations and Create a Dataset “Chain of Custody” n Validatable - Permits Checking of Datasets Against
Data Requisites from Multiple Data Authorities l XML Syntax and Schema Language l AnIML Schemas and Technique Definitions l AnIML Naming and Design Rules l Semantic Rules l Data Requisites from Higher Data Authorities
W3C XML Language
W3C Schema Language
In the Beginning, There was…
+
W3C XML Language
W3C Schema Language
AnIML Schema NDRs
To which We Added…
+ +
W3C XML Language
W3C Schema Language
AnIML Schema NDRs
AnIML Core Schema
And this Begat…
+ +
A Generic Data Container Capable of Holding Any Analytical Data and Metadata
And it was Good…But Lo, It was Too Generic to Support Interoperability
W3C XML Language
W3C Schema Language
AnIML Schema NDRs
AnIML Core Schema
AnIML Technique Schema
And so, We Created a Helper for the Core…
+ +
A Mechanism to Constrain the Core And Make the Tags Meaningful to Technique Users
And It was Good…,But, It still Needed Rules
W3C XML Language
W3C Schema Language
AnIML Schema NDRs
AnIML Core Schema
AnIML Technique Schema
AnIML Technique Definitions
AnIML Technique NDRs
And Rules Enabling It to Constrain the Core…
+ +
Thus the Core could be Constrained to Represent Data & Metadata In Ways Familiar to Analytical Users
And this was better…, but Vendors & Users Still Needed to Add their Own Tags & Constraints
W3C XML Language
W3C Schema Language
AnIML Schema NDRs
AnIML Core Schema
AnIML Technique Schema
Vendor Technique Extension
AnIML Technique Definitions
Institutional Technique Extension
AnIML Technique NDRs
Allowing AnIML to Be Extended…
+ +
Extensions can Supplement, but Not Override the Constraints in Technique Definitions & the Core
And It was Way Good, so AnIML Can Grow & Respond To Vendor & User Needs
W3C XML Language
W3C Schema Language
AnIML Schema NDRs
AnIML Core Schema
AnIML Technique Schema
Vendor Technique Extension
AnIML Technique Definitions
Institutional Technique Extension
AnIML Data File Creation App
Instrument Data, Params, Metadata
AnIML Data File
AnIML Technique NDRs
Then Along Came a Programmer…
+ +
Who Created an App that Put Instrument Data Into the AnIML Framework
And Lo, It was Awesome!
Main AnIML Components n AnIML Core Schema
l One Schema l Maintained by ASTM E13.15
n AnIML Technique Schema l One Schema l Maintained by ASTM E13.15
n AnIML Technique Definition Documents l One or More Instance Documents per Technique l Maintained by ASTM E13 or Appropriate Domain Expertise Organization
n AnIML Technique Definition Extensions l One or More Instance Documents per Technique l Maintained by Vendor, Organization, User, or Whomever Extends the
Technique
n AnIML Result Data Files
What Does AnIML 1.0 Support?
n Storage of Raw or Processed Experimental, Simulated, or Annotated Data & Metadata from Selected Analytical Techniques
n Processed Data and Metadata from Trace-Forming/ Peak-Finding, Cross-Cutting, Peak Integration, and Peak-Table Creation
n Quantitation
AnIML 1.0 Supported Techniques n Chromatography and Other Flow Techniques
l GC Point Detectors – FID, TCD, ECD, NPD, FPD l LC Point Detectors – RI, ELS
n UV-Vis n IR n 1D NMR n MS
n Combinations of the Above Techniques (Hyphenated Techniques)
n Indexed Experiments Using the Above Techniques l Kinetic Runs l Temperature or pH Profiles
AnIML: Where We’re At for Version 1.0 n AnIML Core Schema
l Complete l Frozen l Internally Documented
n AnIML Schema NDRs l Complete
n AnIML Technique Schema l Complete l Frozen l Internally Documented
n AnIML Technique NDRs l In Progress
n AnIML Technique Exemplar l In Progress
AnIML: Where We’re At for Version 1.0 n AnIML Technique Definitions
l UV-Vis Technique Definition – Complete l Chromatography Technique Definition – Complete l Mass Spec Technique Definition – Started l 1D NMR Technique Definition – Started l IR Technique Definition – Started l Point Detector Technique Definitions
u GC Detectors (5) – Complete u LC Detectors (2) – Started
l AnIML Misc. Technique Definitions u Indexing – Complete u Cross-Cutting – Not Started u Trace Forming – Not Started u Peak Table – Not Started
AnIML: Where We’re At for Version 1.0
n AnIML Applications and Tools
l AnIML Data Viewers – Several Complete l AnIML Extensible Validator – Complete l Simple AnIML File Writer for Agilent 8453 UV/Vis – Complete l Simple AnIML File Writer for PE Lambda 900 UV/Vis - Complete l JCAMP-DX to AnIML Data Converters – Several Complete l XSLT Documentation Extractor/Stripper for Schemas – Complete l Documentation Extractor /Creator for Technique Definitions – Complete
AnIML Documents and Standards n AnIML Standard Practice Document – Nearly Complete n AnIML Core and Technique Schema Standard Specification and
Guide Document n AnIML Standard Technique Definition Guide Document n Individual Technique Standard Specification and Guide
Documents l AnIML UV-Vis Technique l AnIML IR Technique l AnIML Chromatography Technique l AnIML Mass Spec Technique l AnIML 1D NMR l AnIML Misc. Techniques
u Indexing u Point Detectors u Cross-Cutting u Trace Forming/Peak Finding u Peak Table
Expediting the Completion of AnIML 1.0
§ Current Volunteer Effort is Too Slow § There is Still Too Much to Do § Need Paid Help to Develop Components & Documentation § AnIML 1.0 Documentation Expedition Project
• Funded by • Agilent Technologies • Amgen • Glaxo SmithKline • PerkinElmer • Waters Corporation
• Administered by ASTM
Getting Started with AnIML?
n Check Out the AnIML Website l http://animl.sourceforge.net l http://www.animl.org
n Delve into the Internally Documented Core Schema n Examine the AnIML Schema NDRs n Check Out Some Technique Definition Docs and
Their Documentation n While There May Be Some Changes Before Vers.
1.0, They Likely Will Be Minor or Organizational
Want to Help Develop AnIML?
n Contact l Gary Kramer, NIST l +1-301-975-4132 l [email protected]
l Burkhard Schäfer, BSSN l +1-888-674-0047 l +49-6136-75-27-60 l [email protected]
n Webex Virtual Meetings l Monthly – 10:30 AM EST l ASTM-sponsored
Acknowledgements
n Alex Rühl n Martin Peschke n Aykut Arslan n Dominick Pötz n Anh Dao Nguyen n Alex Roth
n Ronnie Jopp n Patrick Gleichmann n Kordian Placzek n Frank Masur n Dennis Backhaus
Mark Bean – GlaxoSmithKline Maren Fiege – Waters Informatics GmbH Burkhard Schäfer – BSSN Software Stuart Chalk – University of North Florida Dale O’Neill & Michael Leiber – Agilent Technologies Adam Patkin – PerkinElmer Jamie McQuay – Scimatic Software Peter Linstrom – NIST
Tony Davies – IUPAC Subcommittee on Electronic Data Standards
Reinhold Schäfer – Fachhochschule Wiesbaden
What is a Data Requisite?
n A Data Requisite is the collection of reported result data and result metadata required by a data authority to make a dataset fit for the purpose of an experiment.
n Examples of Data Requisites: l MIAME “Format” – Minimum Information About a Microarray
Experiment l IUPAC Recommendations for the Presentation of NMR Data for
Publication in Chemical Journals l Dublin Core Metadata Set - resource description for publications
and semantic web l Experimental Descriptions in National Pharmacopiae l USEPA/USACE SEDD – Staged Electronic Data Deliverable
What is a Data Authority? n An individual, group, organization, or body—
l Who decides what result metadata are required to accompany the result data from an experiment in order to make the resulting dataset fit for the purpose of the experiment and
l Who enforces that decision.
n Examples of Data Authorities: l Consensus standards bodies l Accreditation agencies l Individual researcher policies on recording experimental results l Major Professor/Group Leader policies on experimental data l QC/QA group requirement policies l Journal editors and editorial boards l Pharmaceutical company policies on experimental results l FDA polices for experimental results supporting NDAs l EPA requirements for environmental analysis reporting
What AnIML Standards Are Not
n AnIML standards are not proscriptive and do not contain lists of data and metadata elements and attributes that are required for recording, reporting, exporting, importing, and archiving the result data and result metadata outcomes from analytical chemistry experiments.
n With very few exceptions, the AnIML standards do not require the inclusion of any specific data or metadata elements and attributes.
n Instead, AnIML dictates how data and metadata, if present, must be represented and located within a dataset.
What are Result Data? n Result Data are the Qualitative and Quantitative
Outcomes from an Experiment.
What are Result Metadata?
n Result Metadata are the Data about the Result Data that Describe Parameters, Attributes, and Conditions Relating to the Result Data, their Collection, and their Manipulation.
What is a Dataset? n A Dataset is the Combination of the Reported Result
Data and Attending Result Metadata from Carrying Out an Experiment or Experiments.