MMHCC Informatics Providing Innovative and Integrative Informatics Solutions Johnita Beasley (SAIC) Dana Zhang (SAIC) Sharon Settnek (SAIC)


MMHCC Informatics

Providing Innovative and Integrative Informatics Solutions

Johnita Beasley (SAIC)

Dana Zhang (SAIC)

Sharon Settnek (SAIC)


Cancer Models Database (caMDB)

• The Cancer Models Database allows both intramural and extramural researchers to search and submit mouse models

– All models submitted by extramural researchers are curated to ensure data integrity


Model Search

• A simple search is available to allow researchers to search by model name, tumor, organ system, tissue, or species

• An advanced search includes genetic descriptions, carcinogenic agents, the model phenotype, cell lines, and therapeutic approaches
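
The simple search can be pictured as a filter over model records. The following is only a sketch under stated assumptions, not the caMDB implementation: `ModelSearchSketch`, `ModelRecord`, and `simpleSearch` are hypothetical names, the sample records are made up, and combining criteria with a logical AND is an assumption. Only the simple-search fields named above are modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; none of these names come from the caMDB code base.
class ModelSearchSketch {

    // Flattened view of a model holding only the simple-search fields.
    static final class ModelRecord {
        final String modelName, tumor, organSystem, tissue, species;

        ModelRecord(String modelName, String tumor, String organSystem,
                    String tissue, String species) {
            this.modelName = modelName;
            this.tumor = tumor;
            this.organSystem = organSystem;
            this.tissue = tissue;
            this.species = species;
        }
    }

    // Case-insensitive substring match; a null or empty criterion matches anything.
    static boolean matches(String value, String criterion) {
        if (criterion == null || criterion.isEmpty()) {
            return true;
        }
        return value != null
                && value.toLowerCase(Locale.ROOT).contains(criterion.toLowerCase(Locale.ROOT));
    }

    // Assumption: all supplied criteria must match (logical AND).
    static List<ModelRecord> simpleSearch(List<ModelRecord> models, String modelName,
                                          String tumor, String organSystem,
                                          String tissue, String species) {
        List<ModelRecord> hits = new ArrayList<>();
        for (ModelRecord m : models) {
            if (matches(m.modelName, modelName) && matches(m.tumor, tumor)
                    && matches(m.organSystem, organSystem)
                    && matches(m.tissue, tissue) && matches(m.species, species)) {
                hits.add(m);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up records, used only to exercise the filter.
        List<ModelRecord> models = List.of(
                new ModelRecord("Example model A", "adenocarcinoma", "respiratory", "lung", "mouse"),
                new ModelRecord("Example model B", "lymphoma", "hematopoietic", "spleen", "mouse"));
        for (ModelRecord m : simpleSearch(models, null, null, null, "lung", "mouse")) {
            System.out.println(m.modelName);
        }
    }
}
```

An advanced search could extend the same idea with further criteria (genetic descriptions, carcinogenic agents, and so on), but those fields are left out here to keep the sketch short.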


Search Results

• Model search results are displayed in table format

– Model details can be displayed by selecting the model descriptor link

– Specific Model components can be viewed by selecting each component Tab


Image Constructs

• Search results include any associated image constructs and annotations

– Full image views with zoom and pan capabilities are available


Model Submission

• Users can submit new models via the Add new model link

• Users can edit previously submitted models by selecting the model descriptor

• Users can clone or delete a model that they previously submitted


Model Components

• Submitting a new model involves entering model information including:

– General Information

– Genetic Descriptions

– Carcinogenic Agents

– Publications

– Histopathology

– Therapeutic Approaches

– Cell Lines

– Images

– Microarray Data

• Required fields appear in bold red text

• Data is saved throughout the submission process


Model Curation

• The Cancer Models review process is automated in the CMD Admin Function

– A model coordinator assigns users to review specific models

– Reviewers inform the coordinator of any recommendations and modify the status of the model

– The coordinator contacts the model originator concerning any modifications
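
The three review steps above can be read as a small state machine. The sketch below is illustrative only: the status names (`SUBMITTED`, `UNDER_REVIEW`, `CHANGES_REQUESTED`, `APPROVED`), the `Submission` class, and the method names are assumptions, since the slides do not list the actual statuses or the Admin Function's internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the curation workflow; names are not from caMDB.
class CurationSketch {

    // Assumed status values; the real set is not given in the slides.
    enum Status { SUBMITTED, UNDER_REVIEW, CHANGES_REQUESTED, APPROVED }

    static final class Submission {
        final String modelName;
        final String originator;
        String reviewer;                       // set by the coordinator
        Status status = Status.SUBMITTED;
        final List<String> recommendations = new ArrayList<>();

        Submission(String modelName, String originator) {
            this.modelName = modelName;
            this.originator = originator;
        }
    }

    // Step 1: the coordinator assigns a reviewer to a specific model.
    static void assignReviewer(Submission s, String reviewer) {
        if (s.status != Status.SUBMITTED) {
            throw new IllegalStateException("cannot assign in status " + s.status);
        }
        s.reviewer = reviewer;
        s.status = Status.UNDER_REVIEW;
    }

    // Step 2: the reviewer records a recommendation and modifies the status.
    static void review(Submission s, String recommendation, boolean approve) {
        if (s.status != Status.UNDER_REVIEW) {
            throw new IllegalStateException("cannot review in status " + s.status);
        }
        if (recommendation != null && !recommendation.isEmpty()) {
            s.recommendations.add(recommendation);
        }
        s.status = approve ? Status.APPROVED : Status.CHANGES_REQUESTED;
    }

    // Step 3: the coordinator builds the message for the model originator.
    static String noticeToOriginator(Submission s) {
        return "To " + s.originator + ": model '" + s.modelName + "' is "
                + s.status + "; recommendations: " + s.recommendations;
    }

    public static void main(String[] args) {
        Submission s = new Submission("Example model", "originator@example.org");
        assignReviewer(s, "reviewer-1");
        review(s, "Add histopathology images", false);
        System.out.println(noticeToOriginator(s));
    }
}
```

Guarding each transition with a status check keeps a model from being reviewed before it has been assigned, which is one plausible reading of what the slide calls an automated process.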


New Features

• In recent months we’ve added or enhanced several caMDB features, including:

– Updating the delete and clone capabilities

– Adding the Model Vocabulary

– Adding the ability to interface with the GEDP

– Adding the Comment capability

• To stay abreast of all new features, visit our “What’s New” link on the caMDB home page.


Model Vocabulary

• The Enterprise Vocabulary System (EVS) is a set of services and resources that address NCI’s needs for controlled vocabulary

• The CMD utilizes EVS to access controlled vocabulary

– Organ/Tissue

– Diagnosis


GEDP Interface

• Users can now connect to the Gene Expression Data Portal (GEDP) to submit microarray data they’d like to associate with a model.


Comments

• This newly added feature allows users of the database to comment on models and to add data to models previously entered by other labs.


caMDB Object Model

• The caMDB object model is represented with Unified Modeling Language (UML).

• The object model evolved through a use-case driven, architecture-centric, iterative, and incremental process.

• The object model establishes a standard set of genomic components related to an animal model.

• The object model is extensible and employs reuse (for example, the caBIO API is used to provide access to other data sources).


caMDB Object Model

[UML class diagram; only the labels below are recoverable from the figure]

• Interfaces: gov.nih.nci.caBIO.bean.EVSInterface, gov.nih.nci.caBIO.bean.TaxonInterface, gov.nih.nci.caBIO.bean.AgentInterface, ApprovalInterface

• Classes: AnimalModel, Agent, Disease, GeneticAlteration, Organ, Role, ContactInfo, TreatmentSchedule, GeneDelivery, EnvironmentalFactor, InducedMutation, Therapy, MicroArrayData, CellLine, ApprovalStatus, TargetedModification, Image, Availability, Histopathology, GenomicSegment, Transgene, Publication, Species, Xenograft, Person, CarcinogenicIntervention, SexDistribution, Phenotype

• Association roles (multiplicity): drug (1), treatmentSchedule (0..1), organ (1), availability (1), matastasisOf (0..1), diagnoses (0..*), histopathologies (0..*), geneticAlteration (0..*), publications (0..*), species (1), hostSpecies (1), roles (1..*), contactInfo (1), intervention (1), protocol (0..1), geneDelivery (1), environmentalFactor (1), inducedMutations (0..*), therapies (0..*), microarrayData (0..*), cellLines (0..*), status (1), targetedModifications (0..*), images (0..*), releaseDate (1), genomicSegments (0..*), transgenes (0..*), xenografts (0..*), principalInvestigator (1), submitter (1), carcinogenicInterventions (0..*), phenotype (0..1)
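
To illustrate how UML multiplicities such as 1 and 0..* might translate into Java, here is a minimal sketch. It is not the published caMDB API: the constructor, accessors, and the simplified `Species`, `Person`, `Publication`, and `CellLine` classes are hypothetical. Attaching the `species`, `principalInvestigator`, `submitter`, `publications`, and `cellLines` roles to `AnimalModel` is also an assumption, because the extracted diagram text does not show which class owns each role.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch; simplified stand-ins for classes named in the diagram.
class ObjectModelSketch {

    static final class Species {
        final String name;
        Species(String name) { this.name = name; }
    }

    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static final class Publication {
        final String title;
        Publication(String title) { this.title = title; }
    }

    static final class CellLine {
        final String name;
        CellLine(String name) { this.name = name; }
    }

    static final class AnimalModel {
        // Multiplicity 1: required, enforced in the constructor.
        private final Species species;
        private final Person principalInvestigator;
        private final Person submitter;
        // Multiplicity 0..*: start empty and grow as components are added.
        private final List<Publication> publications = new ArrayList<>();
        private final List<CellLine> cellLines = new ArrayList<>();

        AnimalModel(Species species, Person principalInvestigator, Person submitter) {
            this.species = Objects.requireNonNull(species, "species");
            this.principalInvestigator =
                    Objects.requireNonNull(principalInvestigator, "principalInvestigator");
            this.submitter = Objects.requireNonNull(submitter, "submitter");
        }

        void addPublication(Publication p) {
            publications.add(Objects.requireNonNull(p, "publication"));
        }

        void addCellLine(CellLine c) {
            cellLines.add(Objects.requireNonNull(c, "cellLine"));
        }

        Species getSpecies() { return species; }
        Person getPrincipalInvestigator() { return principalInvestigator; }
        Person getSubmitter() { return submitter; }

        // Read-only views so callers cannot bypass the add methods.
        List<Publication> getPublications() {
            return Collections.unmodifiableList(publications);
        }

        List<CellLine> getCellLines() {
            return Collections.unmodifiableList(cellLines);
        }
    }

    public static void main(String[] args) {
        AnimalModel model = new AnimalModel(
                new Species("Mus musculus"), new Person("Example PI"), new Person("Example submitter"));
        model.addCellLine(new CellLine("example-line"));
        System.out.println(model.getSpecies().name + ", cell lines: " + model.getCellLines().size());
    }
}
```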


caMDB API

• The caMDB API provides a means of accessing animal model data submitted via the Cancer Models Database (caMDB) application.

• The API is based on the caMDB Object Model

• The caMDB objects, through their relationships, simulate the behaviors of an animal model. The model components include:

– Cell Lines

– Histopathologies

– Images

– Genetic Descriptions (Transgenes, Genomic Segments, Targeted Modifications, and Induced Mutations)

– Carcinogenic Interventions

– Therapeutic Approaches

– Microarray Data

– Publications


caMDB API

• The caMDB objects can simulate the behavior of actual genomic components linked to an animal model (such as genes, organs, and diseases) by accessing other genomic data sources such as:

- caBIO

- Enterprise Vocabulary Services (EVS)


caMDB API Architecture

[Architecture diagram; only the labels below are recoverable from the figure]

• Column headings: Clients, Presentation Layer, Object Layer, Data Sources

• Client labels: Browsers, Other Apps, External Java Apps, Internal Java Apps

• Protocol labels: HTML/HTTP, XML/HTTP, RMI, JDBC, HTTP, FTP, SOAP

• Component labels: Web Server, Servlet Container, JSPs, Servlets, UI Bean, XML Builder, XSLT Engine, SOAP Engine, XML Docs, DTDs, XSL Style Sheet, Object Managers, Domain Objects, Data Access Objects

• Domain object labels: Histopathologies, Phenotypes, Publications, CellLines, InducedMutations, Transgenes, Therapies, Other

• Data source labels: Databases, URLs, Flat Files


caMDB API Architecture

• The caMDB API models the n-tiered architecture of caBIO with client interfaces, server components, back-end objects and data sources

• Clients (browsers, applications) can receive information (HTML and XML) from back-end objects over HTTP

– Client applications can also communicate with back-end objects via Java RMI (Java applications)

– Non-Java based applications will communicate via SOAP

• Server components communicate with back-end objects via Java RMI

• Back-end objects communicate directly with data sources (database, URLs, flat files)
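
The separation between back-end objects and data sources can be sketched as follows. This is a simplified illustration under stated assumptions, not the caMDB API: `CellLineDao`, `InMemoryCellLineDao`, and `CellLineManager` are hypothetical names, the in-memory map stands in for a real database, and the RMI, SOAP, and HTTP transports are left out entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the object layer; names are not from the caMDB API.
class LayeringSketch {

    // Domain object.
    static final class CellLine {
        final String modelId;
        final String name;

        CellLine(String modelId, String name) {
            this.modelId = modelId;
            this.name = name;
        }
    }

    // Data access object: the only layer that knows how a data source is reached.
    interface CellLineDao {
        List<CellLine> findByModelId(String modelId);
    }

    // Stand-in data source; a real DAO might read a database, URL, or flat file.
    static final class InMemoryCellLineDao implements CellLineDao {
        private final Map<String, List<CellLine>> rows = new LinkedHashMap<>();

        void insert(CellLine c) {
            rows.computeIfAbsent(c.modelId, k -> new ArrayList<>()).add(c);
        }

        @Override
        public List<CellLine> findByModelId(String modelId) {
            // Return a copy so callers cannot modify the stored rows.
            return new ArrayList<>(rows.getOrDefault(modelId, new ArrayList<>()));
        }
    }

    // Object manager: the entry point that server components or clients would call.
    static final class CellLineManager {
        private final CellLineDao dao;

        CellLineManager(CellLineDao dao) {
            this.dao = dao;
        }

        List<String> cellLineNames(String modelId) {
            List<String> names = new ArrayList<>();
            for (CellLine c : dao.findByModelId(modelId)) {
                names.add(c.name);
            }
            return names;
        }
    }

    public static void main(String[] args) {
        InMemoryCellLineDao dao = new InMemoryCellLineDao();
        dao.insert(new CellLine("model-1", "example-line-1"));
        dao.insert(new CellLine("model-1", "example-line-2"));
        System.out.println(new CellLineManager(dao).cellLineNames("model-1"));
    }
}
```

Because the manager depends only on the `CellLineDao` interface, the data source behind it could be swapped without touching callers, which is the usual motivation for an n-tiered layout.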


Image Portal

• The NCICB has developed an image portal to allow researchers to search and submit rodent and human images with annotations


Image Portal

• Image annotations may include a detailed description, species, organ, diagnosis, strain, and image dimensions


Imaging Technologies

• The NCICB is investigating imaging technologies to facilitate efficient image retrieval and annotation integration

• Imaging technologies include JPEG 2000, DICOM 3, and Image Content Servers

– JPEG 2000 is a standard currently under development that defines a set of lossless (bit-preserving) and lossy compression methods for coding continuous-tone, bi-level, gray-scale, or color digital still images

– DICOM 3 (Digital Imaging and Communications in Medicine) is the industry standard for the transfer of radiology images and other medical information between computers

  • DICOM-SR (Structured Reporting) is a UML and XML representation of the DICOM specification

– Image Content Servers provide a mechanism to speed image transmission and improve image quality

• NCICB is exploring interfacing with the MIRC Project (RSNA)


Image Annotation Standards

• To facilitate image sharing, a “minimal” set of image annotations is necessary

• Image annotations should leverage existing standards and may be derived from use cases for image retrieval and analysis

• Annotations should include parent-child relationships
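
One way to picture annotations with parent-child relationships is as a tree. The sketch below is an assumption-laden illustration rather than the caIMAGE design: the `Annotation` class, its methods, and the species, organ, diagnosis nesting used in the example are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of hierarchical image annotations.
class AnnotationSketch {

    static final class Annotation {
        final String name;
        final String value;
        final List<Annotation> children = new ArrayList<>();

        Annotation(String name, String value) {
            this.name = name;
            this.value = value;
        }

        // Attach a child and return this node so calls can be chained.
        Annotation add(Annotation child) {
            children.add(child);
            return this;
        }

        // Depth-first search for the first annotation with the given name.
        Annotation find(String wanted) {
            if (name.equals(wanted)) {
                return this;
            }
            for (Annotation c : children) {
                Annotation hit = c.find(wanted);
                if (hit != null) {
                    return hit;
                }
            }
            return null;
        }

        // Render the subtree as indented "name: value" lines.
        void render(String indent, StringBuilder out) {
            out.append(indent).append(name).append(": ").append(value).append('\n');
            for (Annotation c : children) {
                c.render(indent + "  ", out);
            }
        }
    }

    public static void main(String[] args) {
        // Illustrative nesting only; the real annotation hierarchy is not specified.
        Annotation root = new Annotation("species", "mouse")
                .add(new Annotation("organ", "lung")
                        .add(new Annotation("diagnosis", "adenoma")));
        StringBuilder out = new StringBuilder();
        root.render("", out);
        System.out.print(out);
    }
}
```

A tree like this lets a retrieval use case ask for everything under a given organ, for instance, without each image repeating the full annotation path.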


Image Object Model


caIMAGE Architecture

[Architecture diagram; labels recoverable from the figure: Browser Client, caIMAGE Web Application Server, LizardTech Image Content Server, Network File System, caIMAGE Database, Lizard Image Converter, Images, Image Annotations, Sid Files]


MMHCC Links

• EMICE Website: http://emice.nci.nih.gov

• Cancer Models Database (caMDB): http://cancermodels.nci.nih.gov

• Cancer Image Portal (caImage): http://cancerimages.nci.nih.gov

• caMDB API (including JavaDocs and Object Model): http://emice.nci.nih.gov/MMHCC/mmhcc_organization/members/bioinformatics