Modeling a Microbial Community and Biodiversity Assay with OBI and PCO OBO Foundry Ontologies: The...

Modeling a microbial community and biodiversity assay with OBI and PCO: the gains of a modular approach ICBO2014, in Houston Oct 6-9 Philippe Rocca-Serra, Ramona Walls, Jacob Parnell, Rachel Gallery, Jie Zheng, Susanna Assunta Sansone and Alejandra Gonzalez-Beltran


Page 1: Modeling a Microbial Community and Biodiversity Assay with OBI and PCO OBO Foundry Ontologies: The Interoperability Gains of a Modular Approach

Modeling a microbial community and biodiversity assay with OBI and PCO: the gains of a modular approach

ICBO2014, in Houston Oct 6-9

Philippe Rocca-Serra, Ramona Walls, Jacob Parnell, Rachel Gallery, Jie Zheng, Susanna Assunta Sansone and Alejandra Gonzalez-Beltran

Page 2:

Biodiversity in the News

• Grim headlines

• True for many vertebrate species

• Mankind is only now starting to build tools that enable true exploration of diversity

Page 3:

Exploring the world's biodiversity

• Game-changing progress in sequencing technology

– Illumina

– Oxford Nanopore MinION

http://dx.doi.org/10.5524/100102

Page 4:

Microbial Diversity

Page 5:

Biodiversity studies with molecular techniques

• Shotgun sequencing:

– Sequencing as much as possible (probing is limited by the sequencing depth available; the rarer the species, the deeper the sequencing needs to be)

• Targeted sequencing:

– Reliance on a ‘marker gene’ whose variability is used to estimate the distance between species
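The link between rarity and sequencing depth in the shotgun bullet can be made concrete with a rough sketch. It assumes that each read is drawn independently and that the chance of a read coming from a species equals that species' relative abundance; real libraries deviate from this (genome size, extraction and amplification biases), so the numbers are indicative only. The function names are illustrative and do not come from the slides.

```python
import math


def detection_probability(relative_abundance: float, n_reads: int) -> float:
    """Probability that at least one of n_reads originates from the species.

    Assumes independent reads, each hitting the species with probability
    equal to its relative abundance (a simplification).
    """
    return 1.0 - (1.0 - relative_abundance) ** n_reads


def reads_needed(relative_abundance: float, confidence: float = 0.95) -> int:
    """Smallest read count giving at least `confidence` chance of one hit."""
    if not 0.0 < relative_abundance < 1.0:
        raise ValueError("relative_abundance must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - relative_abundance))


if __name__ == "__main__":
    # Rarer species need proportionally deeper sequencing.
    for abundance in (1e-2, 1e-4, 1e-6):
        print(f"abundance {abundance:g}: {reads_needed(abundance)} reads")
```

Under these assumptions the required depth grows roughly in inverse proportion to abundance, which is the point the slide makes qualitatively.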

Page 6:

‘Barcode’ as in Multiplexed Libraries

Credits: http://rdp.cme.msu.edu/wiki/index.php/Pyrosequencing_Help

Genomic DNA isolated from an individual sample is:

- fragmented (shearing)

- ligated to a unique short DNA tag (i.e. the barcode)

- PCR amplified and sequenced

- output as a single collection of reads, which can subsequently be sorted computationally using the short DNA tag (the deconvolution process)
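The deconvolution step described above can be sketched as follows. This is a minimal illustration, not the pipeline used by the authors: it assumes the barcode sits at the start of each read, has a fixed length, and must match exactly. The barcode sequences and sample names are made up for the example.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample map (not taken from the slides).
BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}


def demultiplex(reads, barcodes, barcode_length=4):
    """Sort pooled reads into per-sample bins using their leading barcode.

    Reads whose prefix matches a known barcode are stored without the
    barcode; all other reads are kept whole under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        tag = read[:barcode_length]
        if tag in barcodes:
            bins[barcodes[tag]].append(read[barcode_length:])
        else:
            bins["unassigned"].append(read)
    return dict(bins)


if __name__ == "__main__":
    pooled = ["ACGTTTTT", "TGCAGGGG", "NNNNCCCC", "ACGTAAAA"]
    for sample, sample_reads in demultiplex(pooled, BARCODES).items():
        print(sample, sample_reads)
```

Production tools usually tolerate a small number of mismatches in the barcode; exact matching is used here only to keep the sketch short.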

Page 7:

‘Barcode’ as in Barcode of Life

Credits: http://www.barcodeoflife.org

Page 8:

Ambiguous Language

• What is a barcode, and what is a barcoding experiment?

– Metaphors are impenetrable to computers.

– The representation needs to be unambiguous.

– Barcoding, meaning a technique for processing more samples in one go -> another word for multiplexing

– Barcoding, meaning the creation of a unique profile as a means to identify types of living things

Page 9:

Heaps of sequence data for sure… but

• What is their value in the absence of accompanying descriptors?

• Essential annotation to ascertain identity and origin, sampling conditions and rationale

Page 10:

Helping Data Management

• MIxS guidelines checklist

• SRA XML schema, GenBank records…

• Tabular templates for data collection

• Wealth of RDF conversion tools

– R2RML W3C data standards

• Even when using the same XML schema and the same guidelines, ambiguities persist

Page 11:

Page 12:

ISA templates for Microbial Diversity Studies

• Integrating the MIxS checklist in the ISA framework

• Mapping MIxS entities into the SRA XML schema

– Properties of sample

– Properties of sample processing

– Properties of resulting libraries

– Properties of data processing

Page 13:

Ambiguities: Barcoding

• Library / experiment / sample uniqueness

• Use case: creation of libraries for Bacteria, Fungi and Eukaryota with specific marker genes (16S rRNA, ITS, COI)

• ISA conversion to ENA:

– 1 sample -> 3 libraries

• SRA/ENA submission:

– 3 libraries -> 3 samples
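The cardinality mismatch above can be illustrated with a small sketch. The record layout, field names and identifiers below are hypothetical and do not reproduce the ISA or SRA schemas; the sketch only shows how the apparent number of samples changes when each library is reported with its own sample record.

```python
# Hypothetical records: one soil sample, three marker-gene libraries.
LIBRARIES = [
    {"library": "lib_16S", "marker": "16S rRNA", "source_sample": "soil_1"},
    {"library": "lib_ITS", "marker": "ITS", "source_sample": "soil_1"},
    {"library": "lib_COI", "marker": "COI", "source_sample": "soil_1"},
]


def count_samples_shared(libraries):
    """Count samples when libraries keep a reference to a shared sample."""
    return len({lib["source_sample"] for lib in libraries})


def one_sample_per_library(libraries):
    """Emit one sample record per library, as in the 3 -> 3 case above.

    The link back to the shared source sample is dropped, which is what
    makes the three records look independent afterwards.
    """
    return [{"sample": f"sample_for_{lib['library']}"} for lib in libraries]


if __name__ == "__main__":
    print("shared-sample view:", count_samples_shared(LIBRARIES))
    print("one-per-library view:", len(one_sample_per_library(LIBRARIES)))
```

A consumer who only sees the second view has no way to tell that the three records describe the same physical sample, which inflates the apparent sample size.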

Page 14:

Working with OBI, PCO, SO, ChEBI

Drawn using CMAPtools: http://cmap.ihmc.us

Page 15:

Working with OBI, PCO, SO, ChEBI

Drawn using CMAPtools: http://cmap.ihmc.us

Page 16:

OBI-PCO based representation

• ‘targeted gene survey’

– has_part some ‘library preparation’ (OBI_0000711)

• ‘polymerase chain reaction’ (OBI_0000415) is_part_of ‘library preparation’ (OBI_0000711)

• ‘polymerase chain reaction’ (OBI_0000415)

– has_specified_input some ‘forward pcr primer’ (OBI_0000722)

– has_specified_input some ‘reverse pcr primer’ (OBI_0001951)

– has_specified_input some ‘multiplexing sequence identifier’

– has_specified_input some ‘DNA extract’ (OBI_0001051)

• ‘library preparation’ (OBI_0000711) has_specified_output some ‘single fragment library’ (OBI_0000736)

• ‘library preparation’ (OBI_0000711) precedes ‘DNA sequencing’ (OBI_0000626)

• ‘library sequence deconvolution’ is_preceded_by ‘DNA sequencing’ (OBI_0000626)

• ‘library sequence deconvolution’ is_followed_by ‘sequence analysis data transformation’ (OBI_0200187)

• ‘sequence analysis data transformation’ (OBI_0200187) has_specified_output some ‘data item’ (IAO_0000027) and is about ‘population quality’ (PCO_0000003)
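One way to check that the restrictions above describe a coherent workflow is to encode them as plain (subject, relation, object) tuples and derive the process order from the ordering relations. The sketch below uses term labels only and is not an OWL serialization; the tuple encoding and the function names are assumptions made for illustration. It relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Restrictions from the slide, by label, as (subject, relation, object).
AXIOMS = [
    ("targeted gene survey", "has_part", "library preparation"),
    ("polymerase chain reaction", "is_part_of", "library preparation"),
    ("polymerase chain reaction", "has_specified_input", "forward pcr primer"),
    ("polymerase chain reaction", "has_specified_input", "reverse pcr primer"),
    ("polymerase chain reaction", "has_specified_input",
     "multiplexing sequence identifier"),
    ("polymerase chain reaction", "has_specified_input", "DNA extract"),
    ("library preparation", "has_specified_output", "single fragment library"),
    ("library preparation", "precedes", "DNA sequencing"),
    ("library sequence deconvolution", "is_preceded_by", "DNA sequencing"),
    ("library sequence deconvolution", "is_followed_by",
     "sequence analysis data transformation"),
    ("sequence analysis data transformation", "has_specified_output",
     "data item"),
]


def process_order(axioms):
    """Order processes using precedes / is_preceded_by / is_followed_by."""
    sorter = TopologicalSorter()
    for subject, relation, obj in axioms:
        if relation in ("precedes", "is_followed_by"):
            sorter.add(obj, subject)  # subject happens before obj
        elif relation == "is_preceded_by":
            sorter.add(subject, obj)  # obj happens before subject
    return list(sorter.static_order())


def inputs_of(process, axioms):
    """List the specified inputs asserted for a process."""
    return [o for s, r, o in axioms if s == process and r == "has_specified_input"]


if __name__ == "__main__":
    print(" -> ".join(process_order(AXIOMS)))
    print(inputs_of("polymerase chain reaction", AXIOMS))
```

`static_order()` raises `graphlib.CycleError` if the ordering relations contradict each other, so the same sketch doubles as a basic consistency check on the asserted sequence of steps.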

Page 17:

Conclusions

• We have clarified the OWL representation of several assays commonly used in biodiversity studies.

• We have outlined good practice for serializing biodiversity experimental processes using the ISA, SRA and RDF formats.

• We have shown how synergies between OBO Foundry resources can speed up the development of fit-for-purpose tabular data collection templates, which greatly help compliance with annotation standard guidelines.

Page 18:

Why does it matter?

• Correct sample size assessment

• Assessing independence of samples and sampling events

• Is it really possible to ascertain the identity of samples by relying solely on metadata?

• How can such uncertainties affect downstream analysis / meta-analysis?

Page 19:

Future directions

• Sample collection protocols and procedures as applied in biodiversity studies (field studies, “Marine macrofauna grab sampling method” and so forth)

• Clarify the reporting of actual results

• Continue working with PCO and related OBO Foundry efforts

Page 20:

Acknowledgements

• Dr. Ramona Walls (iPlant, University of Arizona)

• Prof. Paula Mabee (University of South Dakota)

• RCN: Phenotype Ontology Research Coordination Network, National Science Foundation (NSF-DEB-0956049), (2010 - 2015)

• Dr. Jie Zheng and OBI companions

• PCO coworkers and RCN workshop participants

• ISA Team

• You

Page 21:

Acknowledgements 2