The Functional Genomics Experiment Object Model (FuGE)



Page 1: The Functional Genomics Experiment Object Model (FuGE)

The Functional Genomics Experiment Object Model (FuGE)

Andrew Jones, School of Computer Science, University of Manchester

MGED Society

Page 2: The Functional Genomics Experiment Object Model (FuGE)

What is FuGE?

• Various groups have tried to fuse MAGE and PEDRo in the past
– Such a model would be difficult to manage
• FuGE is a model of the common components of functional genomics experiments
• Aims to help the development of data standards
• Should allow some cross-compatibility between different ‘omics experiments
• Microarray & proteome standards will use parts of FuGE for some data formats

Page 3: The Functional Genomics Experiment Object Model (FuGE)

So, what is FuGE?

• An object model in UML (close to 1st stable release)

• An XML Schema (in development)
• A software API (will be created from UML)
• FuGE uses ontologies extensively, such as MGED Ontology or its successor (FuGO)

Developed by members of MGED / PSI with input from cross-omics experimentalists e.g. RSBI

Page 4: The Functional Genomics Experiment Object Model (FuGE)

What is FuGE not…?

• Not an effort to create one data standard for all lab techniques
– This problem is hard at the technical level, and it is very hard to get agreement from all groups
• Not a model for metabolomics metadata
– But it might help in the development of one
– …and we would like to encourage input from the metabolomics community

Page 5: The Functional Genomics Experiment Object Model (FuGE)

FuGE Structure

• 2 sections: Common and Bio
• Common – components that aid the development of a rich data standard
– Protocols, external references, auditing and security settings
• Bio – biological specific components
– Biological (or chemical) materials, bio sequences
– Summary of an investigation structure
– References to data model specific to each domain

Page 6: The Functional Genomics Experiment Object Model (FuGE)

Protocols

• Protocols have a set of ordered atomic actions
– Actions are user-entered text or ontology terms
• Protocols can be associated with Software and Equipment
• Protocols, Software and Equipment can have a set of defined Parameters
• Mechanism for defining a standard protocol, and an instance of a protocol (date, operator…)
• Nested protocols can be defined for representing complex procedures
– An Action can be a reference to another Protocol
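The protocol structure on this slide can be sketched in code. The following is a minimal illustration in Python, not the FuGE API: the class and field names (`Parameter`, `Action`, `Protocol`, `ProtocolApplication`, `flatten`) and all example values are assumptions chosen to mirror the bullets, with a separate class for the "instance of a protocol" that records a date and an operator.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Parameter:
    """A named parameter that a Protocol (or Software, Equipment) can define."""
    name: str
    default: Optional[str] = None


@dataclass
class Action:
    """One ordered step: free text, an ontology term, or a nested Protocol."""
    ordinal: int
    text: Optional[str] = None
    ontology_term: Optional[str] = None
    child_protocol: Optional["Protocol"] = None


@dataclass
class Protocol:
    """The reusable, standard definition of a procedure (hypothetical layout)."""
    name: str
    actions: List[Action] = field(default_factory=list)
    parameters: List[Parameter] = field(default_factory=list)
    software: List[str] = field(default_factory=list)
    equipment: List[str] = field(default_factory=list)

    def flatten(self, depth: int = 0) -> List[str]:
        """List the actions in ordinal order, expanding nested protocols."""
        lines: List[str] = []
        indent = "  " * depth
        for action in sorted(self.actions, key=lambda a: a.ordinal):
            if action.child_protocol is not None:
                lines.append(f"{indent}{action.ordinal}. [{action.child_protocol.name}]")
                lines.extend(action.child_protocol.flatten(depth + 1))
            else:
                label = action.text or action.ontology_term or ""
                lines.append(f"{indent}{action.ordinal}. {label}")
        return lines


@dataclass
class ProtocolApplication:
    """One concrete run of a Protocol: when, by whom, with which values."""
    protocol: Protocol
    date: str
    operator: str
    parameter_values: Dict[str, str] = field(default_factory=dict)


# Made-up example data: a nested "wash" protocol used as one action.
wash = Protocol("wash", actions=[Action(1, text="Rinse in buffer"),
                                 Action(2, text="Dry")])
extraction = Protocol(
    "extraction",
    actions=[Action(2, child_protocol=wash), Action(1, text="Lyse cells")],
    parameters=[Parameter("temperature", default="4 C")],
)
run = ProtocolApplication(extraction, date="2006-01-01", operator="operator A",
                          parameter_values={"temperature": "4 C"})
```

Calling `extraction.flatten()` lists the steps in order and indents the steps of the nested `wash` protocol, which is one way to read "an Action can be a reference to another Protocol".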

Page 7: The Functional Genomics Experiment Object Model (FuGE)

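FuGE Workflow

[Figure: workflow diagram. Boxes labelled Material are linked by Treatment steps to further Materials; a Data Acquisition step then yields Data, and a Data Transformation step yields further Data. Legend: one symbol = Inputs and outputs of Protocols; the other symbol = Instance of some Protocol.]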

Page 8: The Functional Genomics Experiment Object Model (FuGE)

FuGE Workflow

[Figure: the workflow diagram from the previous page (Material, Treatment, Data Acquisition, Data, Data Transformation).]

• Materials defined using terms from ontologies
• Treatments defined by Protocols
• Data represented in domain specific format
• FuGE is the “glue” for sticking components together
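The workflow can be read as a chain in which materials and data are the inputs and outputs of protocol instances. As a rough sketch of that "glue" idea (the names `Item`, `Step` and `provenance`, and the sample values, are hypothetical and not taken from FuGE), the Python below traces a data item back through the steps that produced it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Item:
    """A Material or a piece of Data flowing through the workflow."""
    name: str
    kind: str  # "material" or "data"


@dataclass
class Step:
    """One instance of some protocol, with its inputs and outputs."""
    protocol: str
    inputs: List[Item]
    outputs: List[Item]


def provenance(target: str, steps: List[Step]) -> List[str]:
    """Walk backwards from an item and list the protocols that led to it."""
    produced_by: Dict[str, Step] = {
        out.name: step for step in steps for out in step.outputs
    }
    chain: List[str] = []
    visited = set()
    frontier = [target]
    while frontier:
        step = produced_by.get(frontier.pop())
        if step is None or id(step) in visited:
            continue  # a source item, or a step already recorded
        visited.add(id(step))
        chain.append(step.protocol)
        frontier.extend(item.name for item in step.inputs)
    return chain


# Made-up linear workflow: material -> material -> data -> data.
tissue = Item("tissue", "material")
extract = Item("extract", "material")
raw = Item("raw spectrum", "data")
clean = Item("normalised spectrum", "data")

steps = [
    Step("treatment", [tissue], [extract]),
    Step("data acquisition", [extract], [raw]),
    Step("data transformation", [raw], [clean]),
]
history = provenance("normalised spectrum", steps)
```

Here `history` lists the steps from the most recent back to the first treatment. The data items themselves would stay in their domain-specific formats; only the links between them are modelled.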

Page 9: The Functional Genomics Experiment Object Model (FuGE)

Other useful components

• Each object can be tagged with audit info:
– Who made a change, when, what type of change
• Security information:
– Users, groups for accessing/changing data
• Consistent mechanism for identifying objects
– Life Science Identifiers (LSIDs) used to uniquely ID components
– Objects can be referenced across documents
• Mechanism for linking to external databases, literature refs and ontologies
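To make the identification and audit bullets concrete, here is a small Python sketch. It relies on the general LSID layout `urn:lsid:authority:namespace:object[:revision]`; the class names (`LSID`, `AuditEntry`, `Identifiable`), the `record` method and the example identifier are illustrative assumptions, not the FuGE classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class LSID:
    """A parsed Life Science Identifier."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None

    @classmethod
    def parse(cls, urn: str) -> "LSID":
        parts = urn.split(":")
        if (len(parts) not in (5, 6)
                or parts[0].lower() != "urn"
                or parts[1].lower() != "lsid"):
            raise ValueError(f"not an LSID: {urn!r}")
        return cls(*parts[2:])


@dataclass
class AuditEntry:
    """Who made a change, when, and what type of change."""
    who: str
    when: str
    action: str  # e.g. "creation" or "modification"


@dataclass
class Identifiable:
    """Any object that carries an identifier and an audit trail."""
    identifier: LSID
    audit_trail: List[AuditEntry] = field(default_factory=list)

    def record(self, who: str, when: str, action: str) -> None:
        self.audit_trail.append(AuditEntry(who, when, action))


# Made-up identifier under the reserved example.org domain.
sample = Identifiable(LSID.parse("urn:lsid:example.org:material:sample42"))
sample.record("operator A", "2006-01-01", "creation")
sample.record("operator B", "2006-01-02", "modification")
```

Because the identifier is independent of any one document, another document could refer to `sample` by its LSID alone.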

Page 10: The Functional Genomics Experiment Object Model (FuGE)

Investigation model

• Stores a summary of the investigation to facilitate queries

• Purpose of investigation (hypothesis)
• Design of the investigation
– e.g. strain differences, gene knockout, drug doses, time course
• Stores the important variables
– Values from ontology e.g. gene names, units etc…
• Links from variables to relevant data items
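A summary of this kind is what makes simple queries possible. The Python sketch below shows one possible shape for it, assuming hypothetical names (`Investigation`, `Factor`, `FactorValue`, `data_for`) and invented example values; it is not the FuGE Investigation package.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FactorValue:
    """One value of an experimental variable, linked to data items."""
    value: str
    unit: Optional[str] = None
    data_refs: List[str] = field(default_factory=list)


@dataclass
class Factor:
    """An important variable of the investigation, e.g. a drug dose."""
    name: str
    values: List[FactorValue] = field(default_factory=list)


@dataclass
class Investigation:
    hypothesis: str
    design_types: List[str]
    factors: List[Factor] = field(default_factory=list)

    def data_for(self, factor_name: str, value: str) -> List[str]:
        """Return references to the data items recorded for one factor value."""
        refs: List[str] = []
        for factor in self.factors:
            if factor.name != factor_name:
                continue
            for fv in factor.values:
                if fv.value == value:
                    refs.extend(fv.data_refs)
        return refs


# Invented example: a dose-response design with two dose levels.
study = Investigation(
    hypothesis="Response changes with dose",
    design_types=["drug doses"],
    factors=[Factor("dose", [
        FactorValue("10", unit="mg", data_refs=["run-1", "run-2"]),
        FactorValue("50", unit="mg", data_refs=["run-3"]),
    ])],
)
low_dose_runs = study.data_for("dose", "10")
```

In a fuller model the values and units would be ontology terms rather than plain strings.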

Page 11: The Functional Genomics Experiment Object Model (FuGE)

Benefits of shared components

• Queries over common annotation
– Samples, hypotheses, protocols
• Shared software for experimental annotation and analysis
– Microarrays, proteomics and metabolomics (and other experiments!) performed in same lab
• Developing standards for each technique is a hard problem
– Shared resources could alleviate the problems (audit, security, identifying objects, ontologies)

Page 12: The Functional Genomics Experiment Object Model (FuGE)

Using FuGE in Practice

1. Import parts of the UML or XML Schema and extend them with domain-specific components

• Example: Attempting to integrate FuGE with our Manchester metabolomics database

2. Reference a FuGE entry for investigation structure and bio samples

3. Define ontologies and use FuGE as it is for experimental metadata

• This would not include a format for mass spec or NMR data, which would also be needed

Page 13: The Functional Genomics Experiment Object Model (FuGE)

Conclusions

• FuGE was created to solve the general problem:
– What are the common requirements for a “functional genomics” data standard?

• MGED will use FuGE for generating MAGE version 2

• PSI evaluating FuGE for protein separation standard format

• FuGE-based systems being implemented by a number of organisations

• FuGE could help develop a metabolome format

http://fuge.sourceforge.net

Page 14: The Functional Genomics Experiment Object Model (FuGE)

Acknowledgements

• FuGE has been developed in collaboration with many groups, including:
– Angel Pizarro (U Penn)
– Paul Spellman (Lawrence Berkeley)
– Michael Miller (Rosetta)
– Members of Fred Hutchinson CRC, Seattle
– RSBI
– Various other members of MGED and PSI

http://fuge.sourceforge.net

Page 15: The Functional Genomics Experiment Object Model (FuGE)

Describable and Identifiable

Page 16: The Functional Genomics Experiment Object Model (FuGE)

Common.Description

• Many classes inherit from Describable
• Link to Audit / Security details
• URI and text description

Page 17: The Functional Genomics Experiment Object Model (FuGE)

Protocol

Page 18: The Functional Genomics Experiment Object Model (FuGE)

Audit

Page 19: The Functional Genomics Experiment Object Model (FuGE)

Investigation

Page 20: The Functional Genomics Experiment Object Model (FuGE)

Material

Page 21: The Functional Genomics Experiment Object Model (FuGE)

Common.Data

• Ordered set of Dimensions
• Data stored in Matrix
• Matrix must be extended with subclasses
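The three bullets suggest an abstract matrix that knows its ordered dimensions and leaves storage to subclasses. The sketch below is one way to express that in Python; `Dimension`, `Matrix`, `InMemoryMatrix` and the row-major layout are assumptions made for illustration, and the real Common.Data classes may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Dimension:
    """One axis of the data, e.g. samples or measured features."""
    name: str
    labels: List[str]


class Matrix(ABC):
    """Abstract data matrix over an ordered set of Dimensions."""

    def __init__(self, dimensions: Sequence[Dimension]) -> None:
        self.dimensions = list(dimensions)

    @property
    def shape(self) -> Tuple[int, ...]:
        return tuple(len(d.labels) for d in self.dimensions)

    @abstractmethod
    def get(self, *indices: int) -> float:
        """Return the value at one index per dimension."""


class InMemoryMatrix(Matrix):
    """A subclass that keeps values in a flat, row-major list."""

    def __init__(self, dimensions: Sequence[Dimension],
                 values: Sequence[float]) -> None:
        super().__init__(dimensions)
        expected = 1
        for size in self.shape:
            expected *= size
        if len(values) != expected:
            raise ValueError(f"expected {expected} values, got {len(values)}")
        self.values = list(values)

    def get(self, *indices: int) -> float:
        if len(indices) != len(self.shape):
            raise IndexError("one index per dimension is required")
        offset = 0
        for index, size in zip(indices, self.shape):
            if not 0 <= index < size:
                raise IndexError("index out of range")
            offset = offset * size + index  # row-major position
        return self.values[offset]


# Made-up 2 x 3 example: two samples, three features.
dims = [Dimension("sample", ["s1", "s2"]),
        Dimension("feature", ["f1", "f2", "f3"])]
matrix = InMemoryMatrix(dims, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
```

A domain-specific format would supply its own subclass, for example one that reads values from an external file instead of holding them in memory.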