January 19, 2011 Sherri de Coronado, Semantic Services Center for Bioinformatics and Information Technology

Page 1: January 19, 2011 Sherri de Coronado, Semantic Services Center for Bioinformatics and Information Technology.

January 19, 2011

Sherri de Coronado, Semantic Services, Center for Bioinformatics and Information Technology

Page 2:

Interoperability

• Interoperability: The ability of a system...to use the parts or equipment of another system
Source: Merriam-Webster web site

• Interoperability: The ability of two or more systems or components to exchange information and to use the information that has been exchanged.
Source: IEEE Standard Computer Dictionary, 1990

Semantic interoperability

Syntactic interoperability

Page 3:

NCI Design for Interoperability

- Common API Integration: Part of the syntactic component of interoperability.

- Vocabularies/Terminologies/Ontologies: Provides semantic interoperability, used to record information in and about systems and data.

- Information Models: Describe the structure of the data maintained in a system, such as a Tissue Repository.

- Data Elements: or Metadata, provides a description of the meaning of recorded information in addition to its value. For example “Patient Temperature” would describe both a meaning and what constitutes a valid value for patient temperature (such as a number range measured in degrees Fahrenheit).
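To make the "Patient Temperature" example concrete, here is a minimal sketch of a data element that carries both a meaning and a rule for valid values. The class name, field names, definition text, and the numeric range are assumptions chosen for illustration; they are not taken from any NCI data standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """A data element: a meaning plus a rule for what counts as a valid value."""
    name: str
    definition: str
    unit: str
    min_value: float
    max_value: float

    def is_valid(self, value: float) -> bool:
        # A value is valid when it falls inside the permitted range.
        return self.min_value <= value <= self.max_value


# Hypothetical range, used only to illustrate the idea of a permitted value domain.
patient_temperature = DataElement(
    name="Patient Temperature",
    definition="Body temperature of a patient at the time of measurement",
    unit="degrees Fahrenheit",
    min_value=90.0,
    max_value=110.0,
)
```

With this element, `patient_temperature.is_valid(98.6)` is accepted, while a Celsius reading such as `37.0` is rejected. That is the kind of mismatch a shared data element is meant to expose.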

Page 4:

Extending Interoperability Beyond the Enterprise

• cancer Biomedical Informatics Grid (caBIG)
- Shared infrastructure, applications and data
- Enable cancer research community to focus on innovation and move research from bench to bedside and back
- Shared vocabulary, data elements, data models facilitate information exchange
- Interoperable applications developed to common standard
- Making research data available for mining and integration

• Several new ARRA initiatives leveraging this infrastructure to extend interoperability principles to the broader healthcare community

Page 5:

Semantic Infrastructure Futures

Page 6:

Soup to Nuts Terminology Services for NCI & Collaborators

Page 7:

Background

Page 8:

High Value Use Cases

Page 9:

EVS Resources

Page 10:

NCI Thesaurus (NCIt)

Page 11:

NCI Thesaurus (3)

Page 12:

What's in NCIt?

Page 13:

Semantic Diversity

plants; fungus; virus; bacterium; eukaryote; archaeon; animal; vertebrates; amphibian; bird; fish; reptile; mammal; human

embryonic structure; anatomical abnormality; anatomical structure; medical device; laboratory tests; body parts & organs; congenital abnormality; language; clinical drug; tissue; sign or symptoms; nucleic acid; findings

regulation or law; gene; geographic area; research activity; cells; mental process; molecular sequence; disease or syndrome; neoplastic process; experimental model of disease; genetic function; therapeutic or preventative procedure

educational activity; natural phenomenon; event; behavior; family group; health care activity; activity; organization; laboratory procedure; quantitative concept; element, ion, isotope

Page 14:

Terminology Subsets

Page 15:

FDA-NCI Memorandum of Understanding

Page 16:

Scope of MOU (2)

Page 17:

NCI-FDA Terminology Collaboration

Page 18:

Example: Structured Product Label

FOR IMMEDIATE RELEASE
P05-80
November 2, 2005

Media Inquiries: Kristen Neese, 301-827-6242
Consumer Inquiries: 888-INFO-FDA

FDA Announces the Use of New Electronic Drug Labels to Help Better Inform the Public and Improve Patient Safety

In a continuing effort to use modern information technology to help inform the public and health care providers and to further improve patient safety, the Food and Drug Administration (FDA) today began requiring drug manufacturers to submit prescription drug label information to FDA in a new electronic format. This electronic format will allow healthcare providers and the general public to more easily access the product information found in the FDA-approved package inserts ("labels") for all approved medicines in the United States.

Pharmaceutical Companies must provide information for electronic labels to FDA using controlled terminology

Page 19:

FDA Structured Product Labels

Page 20:

SPL in NCIt

Page 21:

Concept details from Browser

Page 22:

Concept details from Browser (2)

Page 23:

CDISC Terminology

Page 24:

Federal Register / Volume 71, No. 237 / Monday, December 11, 2006

The Food and Drug Administration is proposing to amend the regulations governing the format in which clinical study data and bioequivalence data are required to be submitted for new drug applications (NDAs), biological license applications (BLAs), and abbreviated new drug applications (ANDAs). The proposal would revise our regulations to require that data submitted for NDAs, BLAs, and ANDAs, and their supplements and amendments be provided in an electronic format that FDA can process, review, and archive. The proposal would also require the use of standardized data structure, terminology, and code sets contained in current FDA guidance (the Study Data Tabulation Model (SDTM) developed by the Clinical Data Interchange Standards Consortium) to allow for more efficient and comprehensive data review.

Page 25:

NCI Thesaurus
http://ncit.nci.nih.gov

Search Box

Version information

Choices, choices...

Page 26:

Term search

Search on term - mg - 5 results

Page 27:

Code Search

6 sources

Search on Code - 1 result

Page 28:

Concept Code: A unique, permanent identifier

Terms

Term Source

Term Source

Additional Source Data

Concept Code

mammal? spy? chemistry measurement? chocolate sauce? skin lesion?

Page 29:

Concept Code: A unique, permanent identifier (2)

Terms

Term Source

Term Source

Additional Source Data

Concept Code

Page 30:

Unambiguous Meaning

Semantic Type: Quantitative Concept
Code: C42539
Definition: A unit of amount of substance, one of the seven base units of the International System of Units (Systeme International d'Unites, SI). It is the amount of substance that contains as many elementary units as there are atoms in 0.012 kg of carbon-12. When the mole is used, the elementary entities must be specified and may be atoms, molecules, ions, electrons, other particles, or specified groups of such particles.

Semantic Type: Neoplastic Process
Code: C7570
Definition: A neoplasm composed of melanocytes that usually appears as a dark spot on the skin.

Semantic Type: Mammal
Code: C14876
Definition: A small, furry creature of the family Talpidae that lives underground and feeds on small invertebrates. The mole has tiny covered eyes that are believed to be able to distinguish night from day, and not much else.

Semantic Type: Occupation or Discipline
Definition: [No use case for this term yet, but welcome CIA inquiries].

Semantic Type: Food or Food Product
Definition: [No use case for this term yet, but welcome inquiries accompanied by samples].
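The point of this slide can be sketched in a few lines: one term string resolves to several concepts, while each concept code resolves to exactly one meaning. The codes and semantic types below are the ones listed above. The search term "mole" is inferred from the definitions, and the function names and in-memory index are hypothetical, not the EVS browser's actual search implementation.

```python
from typing import List, NamedTuple, Optional


class Concept(NamedTuple):
    code: str
    semantic_type: str


# Concepts from the slide; all of them share one term (assumed here to be "mole").
CONCEPTS = [
    Concept("C42539", "Quantitative Concept"),
    Concept("C7570", "Neoplastic Process"),
    Concept("C14876", "Mammal"),
]

# Hypothetical term index: lowercase term -> concepts that carry that term.
TERM_INDEX = {"mole": CONCEPTS}


def search_by_term(term: str) -> List[Concept]:
    """A term search may return several concepts, because terms are ambiguous."""
    return TERM_INDEX.get(term.lower(), [])


def search_by_code(code: str) -> Optional[Concept]:
    """A code search returns at most one concept, because codes are unique."""
    for concept in CONCEPTS:
        if concept.code == code:
            return concept
    return None
```

Storing `C7570` in a record, instead of the string "mole", removes the need to guess which meaning was intended.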

Page 31:

Concept Relationships & Associations

Subset Associations: How concepts are "bundled"

Page 32:

NCIt: Example Concept (1 of 2)

Preferred Name: Gastric Mucosa-Associated Lymphoid Tissue Lymphoma
Code: C5266
Semantic Type: Neoplastic Process

Parent Concepts:
- Extranodal Marginal Zone B-Cell Lymphoma of Mucosa-Associated Lymphoid Tissue
- Gastric Non-Hodgkin's Lymphoma

Synonyms & Abbreviations (subset):
- Gastric MALT Lymphoma
- Gastric MALToma
- MALT Lymphoma of the Stomach
- MALToma of the Stomach
- Primary Gastric MALT Lymphoma
- Primary Gastric B-Cell MALT Lymphoma
- Primary MALT Lymphoma of the Stomach

Definition: A low grade, indolent B-cell lymphoma, usually associated with Helicobacter Pylori infection. Morphologically it is characterized by a dense mucosal atypical lymphocytic (centrocyte-like cell) infiltrate with often prominent lymphoepithelial lesions and plasmacytic differentiation. Approximately 40% of gastric MALT lymphomas carry the t(11;18)(q21;q21). Such cases are resistant to Helicobacter Pylori therapy.
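One practical use of the synonym list is resolving free-text names to a single concept code. The following is a minimal sketch using only the names shown on this slide. The dictionary layout, the normalization rule (case-insensitive, whitespace-collapsed), and the function names are assumptions for illustration; a real terminology server would match far more flexibly.

```python
from typing import Dict, Optional

# Names and code copied from the slide; the structure is illustrative.
CONCEPT = {
    "code": "C5266",
    "preferred_name": "Gastric Mucosa-Associated Lymphoid Tissue Lymphoma",
    "synonyms": [
        "Gastric MALT Lymphoma",
        "Gastric MALToma",
        "MALT Lymphoma of the Stomach",
        "MALToma of the Stomach",
        "Primary Gastric MALT Lymphoma",
        "Primary Gastric B-Cell MALT Lymphoma",
        "Primary MALT Lymphoma of the Stomach",
    ],
}


def normalize(text: str) -> str:
    """Lowercase and collapse runs of whitespace so trivial variants match."""
    return " ".join(text.lower().split())


def build_index(concept: dict) -> Dict[str, str]:
    """Map every normalized name (preferred or synonym) to the concept code."""
    names = [concept["preferred_name"], *concept["synonyms"]]
    return {normalize(name): concept["code"] for name in names}


INDEX = build_index(CONCEPT)


def resolve(text: str) -> Optional[str]:
    """Return the concept code for a known name, or None if it is not indexed."""
    return INDEX.get(normalize(text))
```

Every spelling in the list maps to the same code, so records that use different synonyms can still be recognized as describing the same disease.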

Page 33:

NCIt: Role Relationships (Gastric MALT Lymphoma)

Role Relationships (subset) for Gastric Mucosa-Associated Lymphoid Tissue Lymphoma:

Molecular abnormalities:
- Disease_May_Have_Cytogenetic_Abnormality: Trisomy 3
- Disease_May_Have_Cytogenetic_Abnormality: Trisomy 18
Role group 1:
- Disease_May_Have_Cytogenetic_Abnormality: t(11;18)(q21;q21)
- Disease_May_Have_Molecular_Abnormality: AP12-MLT Fusion Protein Expression

Histogenesis:
- Disease_Has_Normal_Cell_Origin: Post-Germinal Center Marginal Zone B-Lymphocyte

Pathology:
- Disease_Has_Abnormal_Cell: Centrocyte-Like Cell
- Disease_May_Have_Abnormal_Cell: Neoplastic Monocytoid B-Lymphocyte
- Disease_May_Have_Abnormal_Cell: Neoplastic Plasma Cell
- Disease_May_Have_Finding: Lymphoepithelial Lesion

Anatomy:
- Disease_Has_Primary_Anatomic_Site: Stomach
- Disease_Has_Normal_Tissue_Origin: Gut Associated Lymphoid Tissue

Clinical information:
- Disease_Has_Finding: Primary Lesion
- Disease_May_Have_Finding: Indolent Clinical Course
- Disease_May_Have_Associated_Disease: Hepatitis C
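Role relationships are easy to treat as (role, target) pairs, which makes them queryable. Below is a minimal sketch built from the roles listed on this slide. The list-of-tuples representation and the helper names are assumptions for illustration, not the LexEVS data model. The split between `Disease_Has_` and `Disease_May_Have_` simply follows the role names as written.

```python
from typing import List, Tuple

Role = Tuple[str, str]  # (role name, target concept name)

# Role relationships copied from the slide for concept C5266.
ROLES: List[Role] = [
    ("Disease_May_Have_Cytogenetic_Abnormality", "Trisomy 3"),
    ("Disease_May_Have_Cytogenetic_Abnormality", "Trisomy 18"),
    ("Disease_May_Have_Cytogenetic_Abnormality", "t(11;18)(q21;q21)"),
    ("Disease_May_Have_Molecular_Abnormality", "AP12-MLT Fusion Protein Expression"),
    ("Disease_Has_Normal_Cell_Origin", "Post-Germinal Center Marginal Zone B-Lymphocyte"),
    ("Disease_Has_Abnormal_Cell", "Centrocyte-Like Cell"),
    ("Disease_May_Have_Abnormal_Cell", "Neoplastic Monocytoid B-Lymphocyte"),
    ("Disease_May_Have_Abnormal_Cell", "Neoplastic Plasma Cell"),
    ("Disease_May_Have_Finding", "Lymphoepithelial Lesion"),
    ("Disease_Has_Primary_Anatomic_Site", "Stomach"),
    ("Disease_Has_Normal_Tissue_Origin", "Gut Associated Lymphoid Tissue"),
    ("Disease_Has_Finding", "Primary Lesion"),
    ("Disease_May_Have_Finding", "Indolent Clinical Course"),
    ("Disease_May_Have_Associated_Disease", "Hepatitis C"),
]


def targets(role_name: str) -> List[str]:
    """All target concepts linked to the disease by exactly this role."""
    return [target for role, target in ROLES if role == role_name]


def with_prefix(prefix: str) -> List[Role]:
    """All roles whose name starts with the prefix, e.g. 'Disease_May_Have_'."""
    return [(role, target) for role, target in ROLES if role.startswith(prefix)]
```

For example, `targets("Disease_Has_Primary_Anatomic_Site")` answers "where does this disease occur?", and `with_prefix("Disease_May_Have_")` lists the features that are possible but not always present.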

Page 34:

LexEVS Terminology Server

• Hosts multiple coding schemes/ terminologies including NCIt

• Uses LexGrid Model (now extended to comply with the draft CTS2 spec)

• OWL, RRF and other loaders to convert and load terminologies

• LexGrid 6.0 just released, adds value set, pick list and mapping capabilities

• Documentation, see: LexEVS on caBIG Vocabulary Knowledge Center Wiki

Page 35:

NCI Metathesaurus

Page 36:

NCI Metathesaurus
https://ncim.nci.nih.gov

3,600,000 terms

76 Sources

1,400,000 concepts

Page 37:

NCI Metathesaurus

11 Sources

Choose your source

Page 38:

NCI Term Browser
http://nciterms.nci.nih.gov

Sources

Page 39:

EVS Products & Services Are Open

• NCI Thesaurus is Open Content
http://evs.nci.nih.gov/terminologies

• NCI Metathesaurus is Mostly Open Source (See Each Source’s License)
http://ncim.nci.nih.gov/ncimbrowser/pages/source_help_info.jsf

• NCI EVS Servers Are Freely Accessible
- On the Web: http://nciterms.nci.nih.gov, http://ncimeta.nci.nih.gov
- Via API: https://cabig.nci.nih.gov/tools/LexEVS_API, https://cabig.nci.nih.gov/workspaces/Architecture/caGrid

• All Software Developed by NCI EVS is Public Open Source and Free for the Asking:
http://ncicb.nci.nih.gov/download/#ETools

Page 40:

Methods of Data Retrieval

Page 41:

NCIt ftp site
http://evs.nci.nih.gov/ftp1

You can download the entire NCIt in various formats

Page 42:

Shared Content Standards

Page 43:

Consolidated Content Services

SNOMED CT®

FedMed

UCUM

Page 44:

NCIt Editing Priorities for 2011

• Terminology associated with standardized case report forms of all kinds
• Safety reporting (drug, device, food)
• EHR related terminology for the caCIS project
• Terminology in support of the NCPDP SCRIPT standard for e-prescribing
• Structured product labeling
• Nanotechnology
• Imaging

Page 45:

LexEVS

• LexEVS is a collection of programmable interfaces and services that provide users with the ability to access controlled terminologies and value sets supplied by the NCI Enterprise Vocabulary Services (EVS) Project.

• Services support both the publishing and processing of terminology and value set content as defined by the Semantic Infrastructure.

Page 46:

LexEVS Terminology Server: Ver 5.0

Includes the following components:

- Java API: Java interface based on the LexGrid 5.0 Object Model.

- REST/HTTP Interface: Offers an HTTP based query mechanism. Results are returned in either XML or HTML format.

- SOAP/Web Services Interface: Provides a programming language neutral Service-Oriented Architecture (SOA).

- Distributed LexBIG (DLB) API: A Java interface that relies on a LexEVS Proxy and Distributed LexEVS Adapter to provide remote clients access to the native LexEVS API.

- LexEVS 5.0 Grid Service: An interface which uses the caGRID infrastructure to provide access to the native LexEVS API via the caGRID Services.

Page 47:

LexEVS 6.0 / CTS2 – What is CTS2?

• Common Terminology Services - Release 2 specifies a set of service interfaces to standardize necessary functional operations of a terminology service.
- Administration
- Search/Query
- Mapping Support
- Authoring/Maintenance

• Focused on extending existing Health Level 7 (HL7) Common Terminology Services (CTS) specification based on consensus requirements from the user community (including LexEVS users).

• Developed as an HL7 Service Functional Model (SFM); accepted as an HL7 draft standard for trial use (DSTU) and is currently an Object Management Group (OMG) RFP. OMG vote expected in June 2011

Page 48:

What’s new in LexEVS 6.0

• LexEVS 6.0 will add comprehensive support for CTS 2 functionalities that are either partially supported or unsupported in LexEVS 5.1.

• Provide expanded support for value sets

• Develop the ability to provide local extensions to code sets

• Provide expanded mapping ability among code sets

• Develop other capabilities called for in the CTS 2 specification
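The three capabilities named above (value sets, local extensions, and mappings among code sets) can be illustrated with a small in-memory sketch. This is not the LexEVS or CTS2 API. Every class, function, value set name, the `LocalScheme` code system, and the `LOCAL-0001` code are hypothetical; only the two NCIt codes come from earlier slides.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

CodeRef = Tuple[str, str]  # (code system name, code)


@dataclass(frozen=True)
class ValueSet:
    """A named set of codes that are allowed in some context."""
    name: str
    codes: FrozenSet[str]

    def contains(self, code: str) -> bool:
        return code in self.codes

    def extended_with(self, name: str, local_codes: Iterable[str]) -> "ValueSet":
        # A local extension adds site-specific codes without altering the base set.
        return ValueSet(name, self.codes | frozenset(local_codes))


def map_code(mappings: Dict[CodeRef, CodeRef], source: CodeRef) -> Optional[CodeRef]:
    """Translate a code from one code system to another, if a mapping exists."""
    return mappings.get(source)


# Hypothetical base value set using two NCIt codes shown earlier in the deck.
base = ValueSet("Example NCIt subset", frozenset({"C42539", "C7570"}))

# Hypothetical local extension and mapping table.
local = base.extended_with("Example subset plus local codes", {"LOCAL-0001"})
mappings = {("NCIt", "C42539"): ("LocalScheme", "LOCAL-0001")}
```

Keeping the base value set immutable and returning a new set for the extension mirrors the idea that local additions should not change the shared, published content.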

Page 49:

LexEVS 6.0 Updates (highlights)

Association/Mapping Functionality
• Association Administrative Functionality
• Association Search / Query Functionality
• Association Author / Curation Functionality

Search / Query Functionality
• Value Set Search / Query
• Concept Domain Search / Query
• Local Extension Search / Query

Authoring / Curation Functionality
• Code System Authoring / Curation
• Value Set Authoring / Curation
• Concept Domain and Usage Context Authoring / Curation

Page 50:

Contact Information