Cyclone: the development of Cyclone, an additional API to query and manipulate BioCyc information in an object-oriented manner

Part two

Genoscope, Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva, 12/2005

Ways to access BioCyc data

• Graphical interface
  – Web
  – Pathway Tools interface
• Pathway Tools server
  – Lisp API
  – PerlCyc
  – JavaCyc
• Output files
  – text files (attribute-value or tab-delimited)
  – SBML
  – BioPAX
• BioWarehouse
  – SQL

Ways to access BioCyc data: Lisp API

[Diagram labels: Ocelot (Oracle, MySQL); Lisp; Pathway Tools server, Oct. 2005]

Advantages (+):
• Lisp is the native language of Pathway Tools
• Multi-user editing
• Access to a transaction log of all PGDB edits

Drawbacks (-):
• Need to know Lisp
• Low-level MySQL database schema: SQL queries are almost impossible

Ways to access BioCyc data: JavaCyc and PerlCyc API

[Diagram labels: Ocelot (Oracle, MySQL); Lisp; Pathway Tools server; JavaCyc; PerlCyc]

Advantages (+):
• A first approach to accessing Lisp functions through Java or Perl

Drawbacks (-):
• No implementation of the GFP objects in Perl or Java
• Need to run a special socket server
• Only one such connection can be opened
• Runs on Unix only

BioWarehouse API

[Diagram labels: Ocelot (Oracle, MySQL); BioCyc; loader; BioWarehouse (MySQL); SQL]

Advantages (+):
• Relational schema corresponding to biological concepts
• A solution to query BioCyc with the integration of external data
• Simple installation
• Fast data upload

Drawbacks (-):
• Unidirectional: no modification can be loaded back into BioCyc
• Loss of information with respect to the BioCyc model (transcription unit, citation, compound-pathway classification, no MultiFun, …)
• Querying is a bit tedious, as you will see

Our initial motivation

1. Overcome some limitations of BioCyc
   – dependency on the OS (Unix)
   – no concurrent access
   – memory management (all PGDBs are loaded in memory)
2. Manipulate objects corresponding to biological concepts
3. No loss of information with respect to BioCyc
4. Add import/export features
5. Benefit from existing Java tools and packages

(before release 9.5)

What is Cyclone?

• Cyclone is an object-oriented Cyc (an API designed to manipulate BioCyc biological objects)
• Cyclone is a pipeline
• Cyclone is a relational database in your favorite RDBMS
• Cyclone is an import/export of BioCyc in XML or in your database

From Frame … to Java classes

BioCyc (frame):
  Frame: RXN-1
    Slot: Left
      Value: Glucose
      Annotation1: Compartment
      Annotation1 value: Cytoplasm
    Slot: Parents
      Value: GeneralizedReactions
    Slot: …

Cyclone (Java classes):
  GeneralizedReactions (Id=RXN-1, …)
  Left (Value=Glucose, Compartment=Cytoplasm, …)
  Chemicals (Id=glucose, …)

Mapping rules:
  Distinct Frame-Slot-Parents  ->  Class
  Slot                         ->  Association class
  Slot value                   ->  Attribute "value"
  Annotation                   ->  Attribute name
  Annotation value             ->  Attribute value
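
The mapping rules above can be illustrated with a minimal Java sketch. All names here (`MappingSketch`, `GeneralizedReaction`, `Left`, `exampleFrame`) are hypothetical and hand-written for illustration; Cyclone generates its classes through its own pipeline, and they may well look different. The sketch only shows the idea: a frame becomes an object, a slot becomes an association class, a slot value becomes its `value` attribute, and an annotation becomes a named attribute.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the frame-to-class mapping rules; not Cyclone's generated code. */
final class MappingSketch {

    /** Association class for the "Left" slot: the slot value plus its annotation as an attribute. */
    static final class Left {
        final String value;        // slot value, e.g. "Glucose"
        final String compartment;  // annotation "Compartment", e.g. "Cytoplasm"

        Left(String value, String compartment) {
            this.value = value;
            this.compartment = compartment;
        }
    }

    /** Class derived from the frame's parent "GeneralizedReactions". */
    static final class GeneralizedReaction {
        final String id;                            // frame identifier, e.g. "RXN-1"
        final List<Left> left = new ArrayList<>();  // one Left object per value of the slot

        GeneralizedReaction(String id) {
            this.id = id;
        }
    }

    /** Builds the object corresponding to the example frame RXN-1 from the slide. */
    static GeneralizedReaction exampleFrame() {
        GeneralizedReaction rxn = new GeneralizedReaction("RXN-1");
        rxn.left.add(new Left("Glucose", "Cytoplasm"));
        return rxn;
    }

    public static void main(String[] args) {
        GeneralizedReaction rxn = exampleFrame();
        for (Left l : rxn.left) {
            System.out.println(rxn.id + " left=" + l.value + " compartment=" + l.compartment);
        }
    }
}
```

With this shape, an annotation such as the compartment is read as an ordinary attribute of the association object rather than looked up by name on a generic frame.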

Cyclone's pipeline - initial step: code and database generation

[Diagram labels: BioCyc frame classes; Schema Generator (mapping rules); XML Schema; XmlTools (JAXB); Java classes; OR mapper (Hibernate); RDBMS]

Cyclone's pipeline - daily step: data synchronization

[Diagram labels: BioCyc frame instances; Oo FRS Translator (Bio2Oo, Oo2Bio); Java objects; OR mapper (Hibernate); RDBMS]
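
The daily synchronization step translates frame instances into Java objects (Bio2Oo) and back (Oo2Bio). The following is a minimal sketch of such a round trip, assuming a frame can be reduced to an identifier plus a map from slot names to string values. The names `SyncSketch`, `Frame`, `ReactionObject`, `bio2Oo` and `oo2Bio` are hypothetical, and a real translator would also have to handle annotations, typed values and persistence through Hibernate, all of which are omitted here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a frame <-> object round trip; not the actual Oo FRS Translator. */
final class SyncSketch {

    /** Simplified frame: an identifier and, for each slot name, its list of values. */
    static final class Frame {
        final String id;
        final Map<String, List<String>> slots = new LinkedHashMap<>();

        Frame(String id) {
            this.id = id;
        }
    }

    /** Simplified Java object for a reaction, with one field per supported slot. */
    static final class ReactionObject {
        String id;
        final List<String> left = new ArrayList<>();
        final List<String> right = new ArrayList<>();
    }

    /** Bio2Oo direction: copy slot values from the frame into typed fields. */
    static ReactionObject bio2Oo(Frame frame) {
        ReactionObject rxn = new ReactionObject();
        rxn.id = frame.id;
        rxn.left.addAll(frame.slots.getOrDefault("LEFT", new ArrayList<>()));
        rxn.right.addAll(frame.slots.getOrDefault("RIGHT", new ArrayList<>()));
        return rxn;
    }

    /** Oo2Bio direction: write the typed fields back into slots. */
    static Frame oo2Bio(ReactionObject rxn) {
        Frame frame = new Frame(rxn.id);
        frame.slots.put("LEFT", new ArrayList<>(rxn.left));
        frame.slots.put("RIGHT", new ArrayList<>(rxn.right));
        return frame;
    }

    public static void main(String[] args) {
        Frame original = new Frame("RXN-1");
        original.slots.put("LEFT", Arrays.asList("Glucose", "ATP"));
        original.slots.put("RIGHT", Arrays.asList("Glucose-6-phosphate", "ADP"));

        // Translate to an object and back, then print what would be uploaded.
        Frame back = oo2Bio(bio2Oo(original));
        System.out.println(back.id + " LEFT=" + back.slots.get("LEFT")
                + " RIGHT=" + back.slots.get("RIGHT"));
    }
}
```

A translator of this kind is only lossless for the slots it knows about, which is one reason the mapping rules matter for the "no loss of information" goal.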

Cyclone's architecture

[Diagram labels. Cyclone: Schema Generator; Oo FRS Translator; object model; XmlTools. Sources: Lisp - Pathway Tools with Ocelot (Oracle, MySQL), accessed through JavaCyc. Object-relational mapping: Hibernate to an RDBMS (MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …). Object-XML mapping: JAXB to XML. Legend: Genoscope; software tools; data files.]

Benefits of using Cyclone: a Java object API matching biological concepts

• In your favorite IDE, access all functions of Cyclone objects with one click
• Develop your own business methods

Benefits of using Cyclone: query in OQL (Object Query Language)

Lisp: find all enzymes for which ATP is an inhibitor

    (defun atp-inhibits ()
      ;; We check every instance of the class
      (loop for x in (get-class-all-instances '|Enzymatic-Reactions|)
            ;; Test whether the INHIBITORS-ALL slot contains the compound frame ATP
            when (member-slot-value-p x 'INHIBITORS-ALL 'ATP)
            ;; Whenever the test is positive, we collect the value of the slot ENZYME.
            ;; The collected values are returned as a list once the loop terminates.
            collect (get-slot-value x 'ENZYME)))

    ;; Invoking the query
    (select-organism :org-id 'ECOLI)
    (atp-inhibits)

OQL is a powerful and easy-to-use SQL-like query language with special features for dealing with complex objects, values and methods.

Benefits of using Cyclone: query in OQL (Object Query Language)

PerlCyc: find all enzymes for which ATP is an inhibitor

    use perlcyc;
    my $cyc = perlcyc -> new("ECOLI");
    my @enzrxns = $cyc -> get_class_all_instances("|Enzymatic-Reactions|");
    # We check every instance of the class
    foreach my $er (@enzrxns) {
        # We test whether the INHIBITORS-ALL slot contains the compound frame ATP
        my $bool = $cyc -> member_slot_value_p($er, "Inhibitors-All", "Atp");
        if ($bool) {
            # Whenever the test succeeds, we collect the value of the slot ENZYME.
            # The results are printed in the terminal.
            my $enz = $cyc -> get_slot_value($er, "Enzyme");
            print STDOUT "$enz\n";
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

JavaCyc: find all enzymes for which ATP is an inhibitor

    import java.util.*;

    public class JavacycSample {
        public static void main(String[] args) {
            Javacyc cyc = new Javacyc("ECOLI");
            // We check every instance of the class
            ArrayList enzrxns = cyc.getClassAllInstances("|Enzymatic-Reactions|");
            for (int i = 0; i < enzrxns.size(); i++) {
                String er = (String) enzrxns.get(i);
                // Test whether the INHIBITORS-ALL slot contains the compound frame ATP
                boolean bool = cyc.memberSlotValueP(er, "Inhibitors-All", "Atp");
                if (bool) {
                    String enz = cyc.getSlotValue(er, "Enzyme");
                    System.out.println(enz);
                }
            }
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

BioWarehouse: find all enzymes for which ATP is an inhibitor

    SELECT DISTINCT DBID.xid
    FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,
         EnzReactionInhibitorActivator
    WHERE DataSet.name = 'EcoCyc'
      AND DataSet.wid = EnzymaticReaction.datasetwid
      AND EnzymaticReaction.proteinwid = Protein.wid
      AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid
      AND EnzReactionInhibitorActivator.compoundwid = Chemical.wid
      AND EnzReactionInhibitorActivator.inhibitoractivate = 'I'
      AND Chemical.name = 'ATP'
      AND DBID.otherwid = Protein.wid

Benefits of using Cyclone: query in OQL (Object Query Language)

Cyclone: find all enzymes for which ATP is an inhibitor

    SELECT er FROM ENZYMATICREACTIONS er
    LEFT JOIN er.INHIBITORSALLInternal inhibitors
    WHERE er.ORGANISM.VALUE LIKE :organism
      AND 'Atp' = inhibitors.VALUE

HQL is fully object-oriented, understanding notions like inheritance, polymorphism and association.
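
To make explicit what the query above expresses, here is a plain-Java sketch that applies the same filter to in-memory objects instead of going through Hibernate. The classes `QuerySketch` and `EnzymaticReaction`, their fields, and the sample data are hypothetical stand-ins, not the generated Cyclone classes. The sketch assumes each enzymatic reaction carries an organism, an enzyme and a list of inhibitor names, and it uses exact matching in place of LIKE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory equivalent of the "ATP is an inhibitor" query; not Cyclone's API. */
final class QuerySketch {

    /** Stand-in for an enzymatic-reaction object with the attributes used by the query. */
    static final class EnzymaticReaction {
        final String frame;
        final String organism;
        final String enzyme;
        final List<String> inhibitorsAll;

        EnzymaticReaction(String frame, String organism, String enzyme, List<String> inhibitorsAll) {
            this.frame = frame;
            this.organism = organism;
            this.enzyme = enzyme;
            this.inhibitorsAll = inhibitorsAll;
        }
    }

    /** Returns the reactions of the given organism whose inhibitor list contains the compound. */
    static List<EnzymaticReaction> inhibitedBy(List<EnzymaticReaction> all,
                                               String organism, String inhibitor) {
        List<EnzymaticReaction> result = new ArrayList<>();
        for (EnzymaticReaction er : all) {
            if (er.organism.equals(organism) && er.inhibitorsAll.contains(inhibitor)) {
                result.add(er);
            }
        }
        return result;
    }

    /** Invented placeholder data, only there to exercise the filter. */
    static List<EnzymaticReaction> sampleData() {
        return Arrays.asList(
            new EnzymaticReaction("ENZRXN-A", "ECOLI", "ENZYME-A", Arrays.asList("ATP", "ADP")),
            new EnzymaticReaction("ENZRXN-B", "ECOLI", "ENZYME-B", Arrays.asList("NADH")),
            new EnzymaticReaction("ENZRXN-C", "OTHER", "ENZYME-C", Arrays.asList("ATP")));
    }

    public static void main(String[] args) {
        for (EnzymaticReaction er : inhibitedBy(sampleData(), "ECOLI", "ATP")) {
            System.out.println(er.frame + "\t" + er.enzyme);
        }
    }
}
```

The declarative query does the same thing in four lines and lets the database do the filtering, which is the point the comparison across Lisp, PerlCyc, JavaCyc and BioWarehouse is making.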

Comparative matrix

                              BioCyc                    BioWarehouse        Cyclone
Representation system         FRS                       Relational model    Object model
Storing system                Ocelot (MySQL, Oracle)    MySQL               MySQL, Oracle, Sybase, PostgreSQL, …
Query                         SQL, Lisp (Perl, Java)    SQL                 OQL, SQL, Java
API                           Lisp                      -                   Java
BioCyc compatibility          -                         not full            full (- class)
Import / export to BioCyc     -                         Yes / No            Yes / Yes
Independent of Pathway Tools  No                        Yes                 Yes

What can you do with Cyclone? Bioinformatics applications

• Easy data extraction from BioCyc
  – to mix the data with different types of data (clusters of transcription units, pathways, complexes, RegulonDB)
  – to build FBA models by
    • grouping pathways together for map design in Flux Analyser
    • assembling elementary bricks of the metabolic network based on reactions, genes and compounds found in BioCyc
• Easily link BioCyc to new systems
  – example: Genoscope's LIMS uses the BioCyc compound repository as a language for representing media
  – algorithm for inferring protein complexes
• Modify or implement new housekeeping functions in a PGDB
  – add or update data from a given annotation database (e.g. MultiFun, cellular localization of gene products, PubMed links); currently not feasible with the PathoLogic format
  – delete or annotate pathways known not to exist, and keep that information across new BioCyc updates

Ongoing work - roadmap

• Extend the mapping rules between the FRS and Java objects
• Future work: a software platform dedicated to FBA models
  – Easy management of FBA models (define reactions, objective functions, …)
  – Convenient handling of environments
  – Extensible perturbation framework (systematic mutants)
  – Pluggable analysis modules (topological analyses, phenotype predictions, …)

Open questions

• Are you convinced of the interest of Cyclone?
  – access BioCyc objects with Java
  – store your PGDBs in your favorite RDBMS
  – manipulate classes representing biological concepts
  – use efficient object-relational mapping (Hibernate)
  – …
• If yes to at least one, how should Cyclone be delivered?
  – jar files
  – Eclipse project
  – Eclipse plugin
  – open a new SourceForge project

Supplementary slides

Cyclone

BioWarehouse: another solution to query BioCyc data

Pipeline for the creation of Cyclone

Language   Manipulated concepts
XSD        ComplexType, Element, Attribute
Java       Class, Attribute, Association
SQL        Table, Row, Column
Lisp       Frame, Slot, Annotation

Application: complex inference

Application: extraction of graph

Application: extraction of gene clusters by pathway

[Figure labels: pathway extracted from AcinetoCyc; zoom on the genes involved in this pathway]

Cyclone: rules for mapping the world of XML schema to the Java world: JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb

Cyclone: rules for mapping the Java world to the SQL world: Hibernate

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom - including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.

Cyclone: pipeline for the creation of …

[Diagram labels: BioCyc FRS; JavaCyc; XML; XSD; JAXB; HyperJaxb; Hibernate; SQL; Cyclone; arrows numbered 0, 1, 1a, 1b, 2a, 2b, 3, 4. Legend: made in Genoscope; software tools; data files.]

Steps listed on the slide:
• Extension of JavaCyc: getAnnot()
• Query of the PGDB to build an XML schema
• From XSD to Java classes
• From Java classes to Hibernate mapping
• From Hibernate mapping to SQL design
• Import all frames of a PGDB
• 1. Populate the MySQL database and Java classes
• 2. Upload: save it in MySQL
• 3. Querying and modification of your PGDB
• 4. Upload your Cyclone into BioCyc
• 4a. Extension of JavaCyc: createFrame()

[Author note (FLF): careful, I am mixing up the pipeline side and the user side]

Cyclone: rules for mapping the world of frames to the world of XML schema

    (FRAMEID NIL
      ((SLOT SValue)
       ((SLOT SValue Annotation Avalue))))

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
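
As a rough illustration of the generic schema above, the following sketch serializes one frame into XML with a FRAMEID element followed by SLOT elements, each holding an SValue and an optional Annotation. The class `FrameXmlSketch`, the `Slot` type and the root element name `FRAME` are assumptions made for this example; Cyclone itself relies on JAXB-generated classes rather than hand-built strings, and the annotation is simplified here to a single string.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical hand-written serializer following the generic FRAME/SLOT schema; not Cyclone code. */
final class FrameXmlSketch {

    /** One slot entry: a value and an optional annotation (null when absent). */
    static final class Slot {
        final String value;
        final String annotation;

        Slot(String value, String annotation) {
            this.value = value;
            this.annotation = annotation;
        }
    }

    /** Escapes the characters that are not allowed in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Serializes a frame identifier and its slots; the root element name is an assumption. */
    static String toXml(String frameId, List<Slot> slots) {
        StringBuilder sb = new StringBuilder();
        sb.append("<FRAME>");
        sb.append("<FRAMEID>").append(escape(frameId)).append("</FRAMEID>");
        for (Slot s : slots) {
            sb.append("<SLOT>");
            sb.append("<SValue>").append(escape(s.value)).append("</SValue>");
            if (s.annotation != null) {  // Annotation has minOccurs="0" in the schema
                sb.append("<Annotation>").append(escape(s.annotation)).append("</Annotation>");
            }
            sb.append("</SLOT>");
        }
        sb.append("</FRAME>");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Slot> slots = Arrays.asList(
            new Slot("Glucose", "Cytoplasm"),
            new Slot("GeneralizedReactions", null));
        System.out.println(toXml("RXN-1", slots));
    }
}
```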

Cyclone: BioCyc FRS into XML schema

Cyclone: GeneralizedReactions FRS into XML schema

Cyclone: sample of a frame in XML

[Figure labels: PGDB; Frame; Slot for inheritance management; Slot - SValue; Annot - AValue]

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture:
• growth medium
• mutant strain
• growth phenotype

BioCyc's existing database schema

Tables (column and type, in diagram order; the primary- and foreign-key markers of the diagram are not reproduced):

  char_slot_values     kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue VARCHAR(100), sequence INT(4)
  numeric_slot_values  kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue NUMERIC(20,10), sequence INT(4)
  large_slot_values    kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue TEXT, sequence INT(4)
  frames               kb INT(4), frame VARCHAR(80) BINARY, type BIT, definition TEXT, sequence INT(4), numparents INT(4), tid INT(4)
  class_supers         kb INT(4), class VARCHAR(80) BINARY, super VARCHAR(80) BINARY
  instance_types       kb INT(4), instance VARCHAR(80) BINARY, type VARCHAR(80) BINARY
  indexed_slots        kb INT(4), slotname VARCHAR(80) BINARY
  kbs                  kbid INT(4), kbname VARCHAR(20), kbdata VARCHAR(255), lock_session INT(4), lock_time DATETIME, lock_user VARCHAR(20)
  logtable             kbid INT(4), tid INT(4), opnno INT(4), seqno INT(4), frame VARCHAR(80), logrec VARCHAR(255)
  txntable             kbid INT(4), tid INT(4), userid VARCHAR(40), nopns INT(4), begin_timestamp INT(4) UNSIGNED, merge_timestamp INT(4) UNSIGNED
  deleted_frames       kbid INT(4), frame VARCHAR(80) BINARY, tid INT(4)

Foreign keys: FK_REFERENCE_1 to FK_REFERENCE_6

Table groups in the diagram: managing multi-user concurrent editing; frame model representation storage; history of edits

Advantages (+):
• Fast data access
• Multi-user editing
• Access to a transaction log of all PGDB edits

Drawbacks (-):
• Does not allow querying BioCyc outside the Pathway Tools software
• No tables linking to each biological concept
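
The last drawback of this schema (no table per biological concept) can be illustrated with a short sketch. Assuming rows shaped like `char_slot_values` (kb, instance, slotname, slotvalue, sequence), any biological object has to be reassembled by grouping rows per instance and per slot. The class `SlotRowsSketch`, the `Row` type and the sample rows are hypothetical; no real database is accessed, and rows are assumed to arrive already ordered by sequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: rebuilding frames from generic slot-value rows; not Pathway Tools code. */
final class SlotRowsSketch {

    /** One row shaped like the char_slot_values table. */
    static final class Row {
        final int kb;
        final String instance;
        final String slotname;
        final String slotvalue;
        final int sequence;

        Row(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Groups rows into instance -> slot name -> values, keeping the order in which rows are given. */
    static Map<String, Map<String, List<String>>> regroup(List<Row> rows) {
        Map<String, Map<String, List<String>>> frames = new LinkedHashMap<>();
        for (Row r : rows) {
            frames.computeIfAbsent(r.instance, k -> new LinkedHashMap<>())
                  .computeIfAbsent(r.slotname, k -> new ArrayList<>())
                  .add(r.slotvalue);
        }
        return frames;
    }

    public static void main(String[] args) {
        // Invented sample rows for a single reaction frame.
        List<Row> rows = Arrays.asList(
            new Row(1, "RXN-1", "LEFT", "Glucose", 0),
            new Row(1, "RXN-1", "LEFT", "ATP", 1),
            new Row(1, "RXN-1", "RIGHT", "ADP", 0));
        System.out.println(regroup(rows));
    }
}
```

Every question about a reaction, enzyme or compound has to go through this kind of regrouping first, which is why plain SQL against the Ocelot tables is impractical.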

From FRS to Java classes

[Diagram labels. BioCyc: PGDB; FRAME; SLOT; S VALUE; FACET; ANNOTATION; A VALUE. Cyclone: Generalized Reactions, PolymerSegments, Chemicals, …; Left with Value, Compartment and Coefficient; SValue; Annotation; A Value.]

BioCyc and Cyclone's architecture

[Diagram labels. BioCyc: Ocelot (Oracle, MySQL); Pathway Tools with the Generic Frame Protocol; X server; Web server; Lisp; PerlCyc; JavaCyc. Cyclone: Hibernate with an RDBMS (MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …); XML with JAXB and HyperJaxb. Applications: Flux Analyser, Jung, Cytoscape, Modulo, metabolic model, MaGe, …; graph analysis, cluster analysis, flux balance analysis. Legend: made in Genoscope; software tools; data files.]

Cyclone's architecture

[Diagram labels. Cyclone: Schema Generator; BioCyc (un)marshaller; object model; XmlTools. Applications: Flux Analyser, Jung, Cytoscape, Modulo, metabolic model, MaGe, …; graph analysis, cluster analysis, flux balance analysis. Legend: Genoscope; software tools; data files.]

True query in OQL

    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            // HQL query with two named parameters, :organism and :inhibitor
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where "
                + " er.ORGANISM.VALUE like :organism and "
                + " :inhibitor = inhibitors.VALUE "
                + " ");
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            // Print each reaction frame followed by its first enzyme
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }

Page 2: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

2Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ways to access biocyc data

bull Graphical Interfacendash Webndash PathwayTools Interface

bull Pathway tools serverndash Lisp apindash PerlCycndash JavaCyc

bull Output filesndash text files attribut -value or tab-delimitedndash sbmlndash biopax

bull Biowarehousendash SQL

3Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ways to access biocyc data Lisp Api

bull Need to know Lispbull Low level mySQL database schema SQL queries almost impossible-

bull Lisp is Pathway-tools native language bull Multi-user editing bull Access to a transaction log of all PGDB edits

+

Ocelot

Oracle

MySQL

Lisp

pathway tools

server oct 2005

4Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ways to access biocyc data JavaCyc and PerlCyc Api

Ocelot

Oracle

MySQL

Lisp

pathway tools

server

JavaCyc

PerlCyc

bull No implementation of the GFP objects with Perl or Javabull Need to run a special socket serverbull Only one such connection can be openedbull Run on Unix only

-

+ bullA first approach to access Lisp fonction throw java or perl

5Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Biowarehouse Api

Ocelot

Oracle

MySQL

BioCyc

loader

Biowarehouse

MySQL

bull unidirectional no modification can be loaded back into biocycbull loss of information in regards of BioCyc model ( transcription unit citation compound-pathway classification no multifun hellip)bull querying is a bit tedious as you will see

bull relational schema corresponding to biological conceptsbull a solution to query BioCyc with the integration of external databull simple installation bull fast data upload

-

+

SQL

6Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Our Initial Motivation

1 Overcome some limitations of BioCycndash dependency on the OS (Unix)ndash no concurrent accessndash memory management (load all pgdbs in memory)

2 Manipulate objects corresponding to biological concepts

3 No loss of info with respect to Biocyc

4 Add import export features

5 Benefit from existing Java tools packages

before released 95

7Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What is Cyclone

bull Cyclone is an Object Oriented Cyc (api design to manipulate biocycbiological object )

bull Cyclone is a pipeline

bull Cyclone is a relational database in your favorite RDBMS

bull Cyclone is an import export of Biocyc in xml or in your database

8Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

From Frame hellip hellip to Java classes

CycloneBioCycFrame

SlotValue

Annotation1

Annotation1Value

RXN-1

Left

Glucose

Compartment

Cytoplasm

SlotValue

Parents

GeneralizedReactions

Left

Value=GlucoseCompartment=Cytoplasmhellip

GeneralizedReactions

Id=RXN-1hellip

Chemicals

Id=glucosehellip

Slot hellip

Distinct Frame-Slot-Parents ClassSlot Association classSlot Value Attribut valueAnnotation Attribut nameAnnotationvalueAttribut value

Mapping rules

9Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc Rdbms

Schema Generator

Mappingrules

JaxB

XmlTools

XmlSchema

Hibernate

ORmapper

FrameClassesFrame

ClassesFrame

Classes

JAVAClasses

Cyclonersquos pipeline - initial step code and database generation

10Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc RDBMSJAVAObject

Hibernate

ORmapper

Oo FRSTranslator

Bio2Oo

Oo2Bio

FrameInstanceFrame

InstanceFrame

Instance

Cyclonersquos pipeline - daily step Data Synchronization

11Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclonersquos architecture

RDBMS

Hibernate

Xml

Jaxb

GenoscopeSoftware toolsData Files

MySQL DB2 PostgreSQLOracle Sybase HSQLMicrosoft SQL server hellip

Ocelot Oracle MySQL

JavaCyc

Lisp - Pathway Tools

Object - relational mapping Object - xmlmapping

SchemaGenerator

Oo FRSTranslator

ObjectModel

XmlTools

Cyclone

12Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Java Object Api matching biological concept

in your favorite ide access with one click to all functions of Cyclone objectsdevelopp your own business methods

13Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 (defun atp-inhibits () We check every instance of the class

2 (loop for x in (get-class-all-instances |Enzymatic-Reactions|) Test for whether the INHIBITORS-ALL slot contains the

compound frame ATP3 when (member-slot-value-p x INHIBITORS-ALL ATP)

Whenever the test is positive we collect the value of theslot ENZYME The collected values are returned as a list once

the loop terminates4 collect (get-slot-value x rsquoENZYME))5 )

invoking the query6 (select-organism org-id rsquoECOLI)7 (atp-inhibits)

Lisp Find all enzym for which ATP is an inhibitor

OQL is a powerful and easy-to-use SQL-like query language with special featuresdealing with complex objects values and methods

14Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

1 use perlcyc2 my $cyc = perlcyc -gt new(ECOLI)3 my enzrxns = $cyc -gt get_class_all_instances(|Enzymatic-

Reactions|) We check every instance of the class

4 foreach my $er (enzrxns) We test for whether the INHIBITORS-ALL slot contains the

compound frame ATP5 my $bool = $cyc -gt member_slot_value_p($erInhibitors-

All Atp)6 if ($bool)

Whenever the test is =0 we collect the value of the slot ENZYME The results are printed in theterminal

7 my $enz = $cyc -gt get_slot_value($er Enzyme)8 print STDOUT $enzn9 10

Beneficits of using Cyclone Query in OQL (Object Query Language)

PerlCyc Find all enzym for which ATP is an inhibitor

15Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 import javautil 2 public class JavacycSample 3 public static void main(String[] args) 4 Javacyc cyc = new Javacyc(ECOLI) 5 ArrayList enzrxns = cycgetClassAllInstances(|Enzymatic-

Reactions|) 6 for (int i = 0 i lt enzrxnssize() i++) 7 String er = (String)enzrxnsget(i) 8 boolean bool = cycmemberSlotValueP(er Inhibitors-

AllAtp) 9 if (bool) 10 String enz = cycgetSlotValue(er Enzyme)9 Systemoutprintln(enz) 11 12 13 14

JavaCyc Find all enzym for which ATP is an inhibitor

16Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT DISTINCT DBIDxid2 FROM DBID Protein EnzymaticReaction Chemical DataSet

EnzReactionInhibitorActivator3 WHERE DataSetname=EcoCyc4 AND DataSetwid=EnzymaticReactiondatasetwid5 AND EnzymaticReactionproteinwid = Proteinwid6 AND EnzymaticReactionwid =

EnzReactionInhibitorActivatorenzymaticreactionwid7 AND EnzReactionInhibitorActivatorcompoundwid=Chemicalwid8 AND EnzReactionInhibitorActivatorinhibitoractivate=I9 AND Chemicalname=ATP10 AND DBIDotherwid = Proteinwid

Biowarehouse Find all enzym for which ATP is an inhibitor

17Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT er FROM ENZYMATICREACTIONS er2 LEFT JOIN erINHIBITORSALLInternal inhibitors3 WHERE erORGANISMVALUE LIKE organism

AND 4 Atp = inhibitorsVALUE

Cyclone Find all enzym for which ATP is an inhibitor

HQL is fully object-oriented understanding notions like inheritence polymorphism and association

18Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Comparative Matrix

YesYesNoIndependent of Pathway tools

Yes YesYes No-Import Export to BioCyc

full (- class)not full-BioCycCompatibilty

Java-LispAPI

OQL SQLJava

SQLSQLLisp

(perljava)Query

mySQL Oracle Sybase PostgreSQL hellip

mySQLmySQLoracle

OcelotStoring System

Object modelRelational modelFRSRepresentationSystem

CycloneBiowarehouseBioCyc

19Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What can you do with Cyclone Bioinformatics applications

bull Easy data extraction from BioCycndash to mix them with different types of data (clusters of transcription units pathways

complexes regulon db)ndash to build FBA models by

bull grouping together Pathways for map design in Flux Analyserbull assemble elementary bricks of the metabolic network based on reactions genes

compounds found in BioCyc

bull Link easily Biocyc to new systems ndash example Genoscopersquos LIMS use BioCyc compounds repository as a language

of media representationndash algorithm for inferrence protein complexes

bull Modify or implement new housekeeping functions in a PGDBndash add updates data from a given annotation database (eg multifun cellular

localization of genes products pubmed links) actualy not feasible with pathologic format

ndash delete annotate pathways known not to exist and keep those informationsunless new BioCyc updates

20Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ongoing work - Roadmap

bull Extends the mapping rule between FRS and Java Object

bull Future work a software plateform dedicated to FBA models ndash Easy management of FBA models (define reactions objective

functionshellip)ndash Convenient handling of environmentsndash Extensible perturbation framework (systematic mutants)ndash Pluggable analysis modules (topological analyses phenotype

predictionshellip)

21Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Open questions

bull Are you convince of this interest of Cyclonendash access with java to biocyc objectndash store your PGDBs in your favorite rdbmsndash manipulate classes representing biological conceptndash use efficient mapping object -relational (Hibernate)ndash hellip

bull If yes to at least one how to deliver Cyclonendash jar filesndash eclipse projectndash eclipse pluginndash open a new sourceforge project

Cyclone: Supplementary Slides

BioWarehouse: another solution for querying BioCyc data

Pipeline for the creation of Cyclone

  Language   Manipulated concepts
  LISP       Frame, Slot, Annotation
  XSD        ComplexType, Element, Attribute
  JAVA       Class, Attribute, Association
  SQL        Table, Row, Column

Application: complex inference

Application: graph extraction

Application: extraction of gene clusters by pathway

[Figure: a pathway extracted from AcinetoCyc, with a zoom on the genes involved in this pathway]

Cyclone: rules for mapping the XML Schema world to the Java world (JAXB)

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb

Cyclone: rules for mapping the Java world to the SQL world (Hibernate)

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom - including association, inheritance, polymorphism, composition, and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.

Cyclone: pipeline for the creation of …

[Diagram: FRS (BioCyc), JavaCyc, XML, XSD, JAXB, HyperJaxb, Hibernate, SQL, Cyclone. Legend: made in Genoscope; software tools; data files.]

Generation steps shown in the diagram:
  • Extension of JavaCyc: getAnnot()
  • Query of the PGDB to build an XML schema
  • From XSD to Java classes
  • From Java classes to Hibernate mapping
  • From Hibernate mapping to SQL design

Usage steps shown in the diagram:
  1. Populate the MySQL database and the Java classes
  2. Upload
  3. Query and modify your PGDB
  4. Upload your Cyclone into BioCyc
     4a. Extension of JavaCyc: createFrame()

Other labels: import all frames of a PGDB; save it in MySQL.

Note (FLF): careful, I am mixing up the pipeline side and the user side.

Cyclone: rules for mapping the frame world to the XML Schema world

    (FRAMEID NIL
      ((SLOT SValue)
       ((SLOT SValue Annotation AValue))
      )
    )

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
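To make the frame-to-XML rule concrete, here is a minimal sketch that reads one frame serialized along the lines of the FRAME-CLASS and SLOT-TYPE types. It uses only the JDK's DOM parser rather than the JAXB-generated classes Cyclone relies on. The root element name FRAME, the sample values, and the class name FrameXmlSketch are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Reads a frame document shaped like FRAME-CLASS / SLOT-TYPE (illustrative only). */
final class FrameXmlSketch {

    // Hypothetical instance document. The root element name FRAME is an
    // assumption: the schema fragment only defines the two complex types.
    static final String SAMPLE =
        "<FRAME>"
        + "<FRAMEID>RXN-1</FRAMEID>"
        + "<SLOT><SValue>Glucose</SValue>"
        + "<Annotation>Compartment=Cytoplasm</Annotation></SLOT>"
        + "<SLOT><SValue>ATP</SValue></SLOT>"
        + "</FRAME>";

    /** Returns one "frameId: value [annotation]" line per SLOT element. */
    static List<String> describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element frame = doc.getDocumentElement();
            String id = text(frame, "FRAMEID");
            List<String> lines = new ArrayList<>();
            NodeList slots = frame.getElementsByTagName("SLOT");
            for (int i = 0; i < slots.getLength(); i++) {
                Element slot = (Element) slots.item(i);
                // Annotation is optional (minOccurs="0"), so it may be absent.
                String annotation = text(slot, "Annotation");
                lines.add(id + ": " + text(slot, "SValue")
                    + (annotation == null ? "" : " [" + annotation + "]"));
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse frame XML", e);
        }
    }

    /** Text of the first descendant element with the given tag, or null. */
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) {
        describe(SAMPLE).forEach(System.out::println);
    }
}
```

Because every slot is a generic SLOT element, this schema carries no biological typing by itself; that comes from the per-class schemas generated from the PGDB.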

Cyclone: BioCyc FRS into XML schema

Cyclone: GeneralizedReactions FRS into XML schema

Cyclone: sample of a frame in XML

[Figure labels: PGDB; Frame; slot for inheritance management; Slot - SValue; Annot - AValue]

Cyclone: open discussion - extending the BioCyc ontology to capture mutant phenotypes

How can we capture:

• growth medium
• mutant strain
• growth phenotype

BioCyc's existing database schema

[Diagram groups: frame model representation and storage; managing multi-user concurrent editing; edit history. Foreign keys: FK_REFERENCE_1 to FK_REFERENCE_6.]

char_slot_values
  kb          INT(4)
  instance    VARCHAR(80) BINARY
  slotname    VARCHAR(80) BINARY
  slotvalue   VARCHAR(100)
  sequence    INT(4)
  keys, in diagram order: <pk,fk> <pk> <pk,fk> <pk>

numeric_slot_values
  same columns and keys as char_slot_values, with slotvalue NUMERIC(20,10)

large_slot_values
  same columns and keys as char_slot_values, with slotvalue TEXT

frames
  kb          INT(4)
  frame       VARCHAR(80) BINARY
  type        BIT
  definition  TEXT
  sequence    INT(4)
  numparents  INT(4)
  tid         INT(4)
  keys, in diagram order: <pk> <pk> <pk>

class_supers
  kb          INT(4)               <pk>
  class       VARCHAR(80) BINARY   <pk>
  super       VARCHAR(80) BINARY   <pk>

instance_types
  kb          INT(4)               <pk>
  instance    VARCHAR(80) BINARY   <pk>
  type        VARCHAR(80) BINARY   <pk>

indexed_slots
  kb          INT(4)               <pk>
  slotname    VARCHAR(80) BINARY   <pk>

kbs
  kbid          INT(4)
  kbname        VARCHAR(20)
  kbdata        VARCHAR(255)
  lock_session  INT(4)
  lock_time     DATETIME
  lock_user     VARCHAR(20)

logtable
  kbid        INT(4)
  tid         INT(4)
  opnno       INT(4)
  seqno       INT(4)
  frame       VARCHAR(80)
  logrec      VARCHAR(255)
  keys, in diagram order: <pk,fk1,fk2> <pk,fk1,fk2> <pk> <pk> <fk2>

txntable
  kbid             INT(4)
  tid              INT(4)
  userid           VARCHAR(40)
  nopns            INT(4)
  begin_timestamp  INT(4) UNSIGNED
  merge_timestamp  INT(4) UNSIGNED
  keys, in diagram order: <pk> <pk>

deleted_frames
  kbid        INT(4)               <pk,fk>
  frame       VARCHAR(80) BINARY   <pk>
  tid         INT(4)               <pk,fk>

Advantages (+):
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

Drawbacks (-):
• does not allow querying BioCyc outside the Pathway Tools software
• no tables corresponding to individual biological concepts
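The drawback "no tables corresponding to biological concepts" can be illustrated with a small sketch: in a generic slot-value layout, a frame only exists after the client has regrouped its rows. The code below does that regrouping in memory for rows shaped like char_slot_values. The class and method names are hypothetical, the sample rows are invented, and treating `sequence` as the order of values within a slot is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Rebuilds frame -> slot -> values from generic slot-value rows (illustrative only). */
final class SlotRowsSketch {

    /** One row; field names follow the char_slot_values columns. */
    static final class SlotRow {
        final int kb;
        final String instance;
        final String slotname;
        final String slotvalue;
        final int sequence;

        SlotRow(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Groups rows by frame and slot, ordering values by `sequence` (assumed meaning). */
    static Map<String, Map<String, List<String>>> group(List<SlotRow> rows) {
        List<SlotRow> ordered = new ArrayList<>(rows);
        ordered.sort(Comparator.comparingInt((SlotRow r) -> r.sequence));
        Map<String, Map<String, List<String>>> frames = new LinkedHashMap<>();
        for (SlotRow r : ordered) {
            frames.computeIfAbsent(r.instance, k -> new LinkedHashMap<>())
                  .computeIfAbsent(r.slotname, k -> new ArrayList<>())
                  .add(r.slotvalue);
        }
        return frames;
    }

    /** Invented sample rows, deliberately out of order. */
    static List<SlotRow> sample() {
        return Arrays.asList(
            new SlotRow(1, "RXN-1", "LEFT", "ATP", 1),
            new SlotRow(1, "RXN-1", "LEFT", "Glucose", 0),
            new SlotRow(1, "RXN-1", "RIGHT", "ADP", 0));
    }

    public static void main(String[] args) {
        System.out.println(group(sample()));
    }
}
```

Every biological question asked of such a schema needs this kind of regrouping first, which is the gap a schema with one table per biological class is meant to close.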

From FRS to Java classes

[Diagram. BioCyc side: PGDB, FRAME, SLOT, S-VALUE, FACET, ANNOTATION, A-VALUE. Cyclone side: classes GeneralizedReactions, PolymerSegments, Chemicals, …; association Left with attributes Value, Compartment, Coefficient, …]

BioCyc and Cyclone's architecture

[Diagram. BioCyc: Ocelot (Oracle, MySQL); Pathway Tools (Lisp, Generic Frame Protocol); X server; web server; PerlCyc; JavaCyc. Cyclone: XML with JAXB and HyperJaxb; Hibernate with MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …. Built on Cyclone: MaGe, Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, …, for graph analysis, cluster analysis, and flux balance analysis. Legend: made in Genoscope; software tools; data files.]

Cyclone's architecture

[Diagram. Cyclone components: Schema Generator, BioCyc (Un)marshaller, Object Model, XmlTools. Built on Cyclone: MaGe, Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, …, for graph analysis, cluster analysis, and flux balance analysis. Legend: Genoscope; software tools; data files.]

True query in OQL

    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            // Named parameters :organism and :inhibitor are bound below.
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where "
                + " er.ORGANISM.VALUE like :organism and "
                + " :inhibitor = inhibitors.VALUE ");
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            // Print each enzymatic reaction frame with its first enzyme.
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }



Ways to access BioCyc data: JavaCyc and PerlCyc API

[Diagram: Ocelot (Oracle, MySQL); Lisp (Pathway Tools server); JavaCyc; PerlCyc]

Advantages (+):
• A first approach to accessing Lisp functions through Java or Perl

Drawbacks (-):
• No implementation of the GFP objects in Perl or Java
• Need to run a special socket server
• Only one such connection can be opened
• Runs on Unix only

BioWarehouse API

[Diagram: Ocelot (Oracle, MySQL); BioCyc loader; BioWarehouse (MySQL); SQL]

Advantages (+):
• relational schema corresponding to biological concepts
• a solution for querying BioCyc together with external data
• simple installation
• fast data upload

Drawbacks (-):
• unidirectional: no modification can be loaded back into BioCyc
• loss of information with respect to the BioCyc model (transcription units, citations, compound-pathway classification, no MultiFun, …)
• querying is a bit tedious, as you will see

Our Initial Motivation

1. Overcome some limitations of BioCyc
   – dependency on the OS (Unix)
   – no concurrent access
   – memory management (all PGDBs are loaded in memory)
2. Manipulate objects corresponding to biological concepts
3. No loss of information with respect to BioCyc
4. Add import/export features
5. Benefit from existing Java tools and packages

(before release 9.5)

What is Cyclone?

• Cyclone is an object-oriented Cyc (an API designed to manipulate BioCyc biological objects)
• Cyclone is a pipeline
• Cyclone is a relational database in your favorite RDBMS
• Cyclone is an import/export of BioCyc in XML or in your database

From Frame … to Java classes

BioCyc frame, generic form: Frame; Slot, Value; Annotation1, Annotation1 value
BioCyc frame, example: RXN-1; slot Parents, value GeneralizedReactions; slot Left, value Glucose; annotation Compartment, value Cytoplasm

Cyclone objects, example: GeneralizedReactions (Id=RXN-1, …); Left (Value=Glucose, Compartment=Cytoplasm, …); Chemicals (Id=glucose, …)

Mapping rules

  BioCyc (FRS)                   Cyclone (Java)
  Distinct Frame-Slot-Parents    Class
  Slot                           Association class
  Slot value                     Attribute "value"
  Annotation                     Attribute name
  Annotation value               Attribute value
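The mapping rules can be sketched as plain Java classes for the RXN-1 example. This is hand-written for illustration and is not the code Cyclone generates: the field names, the constructors, and the choice to model the slot value as a reference to a Chemicals object are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written illustration of the frame-to-class mapping rules. */
final class MappingSketch {

    /** Rule: a distinct value of the Parents slot becomes a class. */
    static final class GeneralizedReactions {
        final String id;                              // frame id, e.g. RXN-1
        final List<Left> left = new ArrayList<>();    // slot Left -> association
        GeneralizedReactions(String id) { this.id = id; }
    }

    /** Rule: a slot becomes an association class. */
    static final class Left {
        final Chemicals value;       // slot value -> attribute "value"
        final String compartment;    // annotation Compartment -> attribute
        Left(Chemicals value, String compartment) {
            this.value = value;
            this.compartment = compartment;
        }
    }

    /** The frame referenced by the slot value has its own class. */
    static final class Chemicals {
        final String id;
        Chemicals(String id) { this.id = id; }
    }

    /** Builds the example: RXN-1 with Left = glucose in the cytoplasm. */
    static GeneralizedReactions example() {
        GeneralizedReactions rxn = new GeneralizedReactions("RXN-1");
        rxn.left.add(new Left(new Chemicals("glucose"), "Cytoplasm"));
        return rxn;
    }

    static String describe(GeneralizedReactions rxn) {
        StringBuilder sb = new StringBuilder(rxn.id);
        for (Left l : rxn.left) {
            sb.append(" Left=").append(l.value.id)
              .append(" (Compartment=").append(l.compartment).append(")");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(example()));
    }
}
```

Making Left a class of its own, instead of a plain list of chemicals, is what gives annotations such as Compartment a place to live.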

Cyclone's pipeline - initial step: code and database generation

[Diagram: BioCyc frame classes → Schema Generator (mapping rules) → XML Schema → JAXB (XmlTools) → Java classes → Hibernate (O/R mapper) → RDBMS]

Cyclone's pipeline - daily step: data synchronization

[Diagram: BioCyc frame instances ↔ Oo-FRS Translator (Bio2Oo, Oo2Bio) ↔ Java objects ↔ Hibernate (O/R mapper) ↔ RDBMS]

Cyclone's architecture

[Diagram. BioCyc side: Ocelot (Oracle, MySQL); Lisp - Pathway Tools; JavaCyc. Cyclone components: Schema Generator, Oo-FRS Translator, Object Model, XmlTools. Object-XML mapping: JAXB (XML). Object-relational mapping: Hibernate (RDBMS: MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …). Legend: Genoscope; software tools; data files.]

Benefits of using Cyclone: a Java object API matching biological concepts

• in your favorite IDE, access all functions of Cyclone objects with one click
• develop your own business methods

Benefits of using Cyclone: query in OQL (Object Query Language)

OQL is a powerful and easy-to-use SQL-like query language with special features dealing with complex objects, values and methods.

Lisp: find all enzymes for which ATP is an inhibitor

    (defun atp-inhibits ()
      ;; We check every instance of the class
      (loop for x in (get-class-all-instances '|Enzymatic-Reactions|)
            ;; Test for whether the INHIBITORS-ALL slot contains the
            ;; compound frame ATP
            when (member-slot-value-p x 'INHIBITORS-ALL 'ATP)
            ;; Whenever the test is positive, we collect the value of the
            ;; slot ENZYME. The collected values are returned as a list once
            ;; the loop terminates
            collect (get-slot-value x 'ENZYME)))

    ;; invoking the query
    (select-organism :org-id 'ECOLI)
    (atp-inhibits)

PerlCyc: find all enzymes for which ATP is an inhibitor

    use perlcyc;
    my $cyc = perlcyc -> new("ECOLI");
    my @enzrxns = $cyc -> get_class_all_instances("|Enzymatic-Reactions|");
    # We check every instance of the class
    foreach my $er (@enzrxns) {
        # We test for whether the INHIBITORS-ALL slot contains the
        # compound frame ATP
        my $bool = $cyc -> member_slot_value_p($er, "Inhibitors-All", "Atp");
        if ($bool) {
            # Whenever the test is true, we collect the value of the
            # slot ENZYME. The results are printed in the terminal
            my $enz = $cyc -> get_slot_value($er, "Enzyme");
            print STDOUT "$enz\n";
        }
    }

JavaCyc: find all enzymes for which ATP is an inhibitor

    import java.util.*;

    public class JavacycSample {
        public static void main(String[] args) {
            Javacyc cyc = new Javacyc("ECOLI");
            ArrayList enzrxns = cyc.getClassAllInstances("|Enzymatic-Reactions|");
            for (int i = 0; i < enzrxns.size(); i++) {
                String er = (String) enzrxns.get(i);
                boolean bool = cyc.memberSlotValueP(er, "Inhibitors-All", "Atp");
                if (bool) {
                    String enz = cyc.getSlotValue(er, "Enzyme");
                    System.out.println(enz);
                }
            }
        }
    }

BioWarehouse: find all enzymes for which ATP is an inhibitor

    SELECT DISTINCT DBID.xid
    FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,
         EnzReactionInhibitorActivator
    WHERE DataSet.name = 'EcoCyc'
      AND DataSet.wid = EnzymaticReaction.datasetwid
      AND EnzymaticReaction.proteinwid = Protein.wid
      AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid
      AND EnzReactionInhibitorActivator.compoundwid = Chemical.wid
      AND EnzReactionInhibitorActivator.inhibitoractivate = 'I'
      AND Chemical.name = 'ATP'
      AND DBID.otherwid = Protein.wid

Cyclone: find all enzymes for which ATP is an inhibitor

    SELECT er FROM ENZYMATICREACTIONS er
    LEFT JOIN er.INHIBITORSALLInternal inhibitors
    WHERE er.ORGANISM.VALUE LIKE :organism
      AND 'Atp' = inhibitors.VALUE

HQL is fully object-oriented, understanding notions like inheritance, polymorphism and association.
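To show what the join over inhibitors expresses, here is the same question asked of an in-memory object model with Java streams. It is a sketch under assumptions, not Cyclone code: the EnzymaticReaction class, its fields, and the sample data are invented, and exact string equality stands in for the LIKE comparison of the HQL query.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** In-memory analogue of "find all enzymes for which ATP is an inhibitor". */
final class InhibitorQuerySketch {

    /** Hypothetical, simplified stand-in for a mapped enzymatic reaction. */
    static final class EnzymaticReaction {
        final String frame;
        final String organism;
        final String enzyme;
        final List<String> inhibitors;

        EnzymaticReaction(String frame, String organism, String enzyme,
                          List<String> inhibitors) {
            this.frame = frame;
            this.organism = organism;
            this.enzyme = enzyme;
            this.inhibitors = inhibitors;
        }
    }

    /** Filters by organism, then by membership of the inhibitor collection. */
    static List<String> enzymesInhibitedBy(List<EnzymaticReaction> all,
                                           String organism, String inhibitor) {
        return all.stream()
            .filter(er -> er.organism.equals(organism))
            .filter(er -> er.inhibitors.contains(inhibitor))
            .map(er -> er.enzyme)
            .distinct()
            .collect(Collectors.toList());
    }

    /** Invented sample data. */
    static List<EnzymaticReaction> sample() {
        return Arrays.asList(
            new EnzymaticReaction("ENZRXN-1", "ECOLI", "ENZYME-A", Arrays.asList("ATP", "ADP")),
            new EnzymaticReaction("ENZRXN-2", "ECOLI", "ENZYME-B", Arrays.asList("NADH")),
            new EnzymaticReaction("ENZRXN-3", "ECOLI", "ENZYME-C", Arrays.asList("ATP")),
            new EnzymaticReaction("ENZRXN-4", "OTHER", "ENZYME-D", Arrays.asList("ATP")));
    }

    public static void main(String[] args) {
        System.out.println(enzymesInhibitedBy(sample(), "ECOLI", "ATP"));
    }
}
```

The HQL version states the same two conditions, but leaves the traversal of the inhibitor association to the database instead of loading every reaction into memory.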

Comparative Matrix

                                BioCyc                     BioWarehouse       Cyclone
  Representation system         FRS                        Relational model   Object model
  Storage system                Ocelot (MySQL, Oracle)     MySQL              MySQL, Oracle, Sybase, PostgreSQL, …
  Query                         SQL, Lisp (Perl, Java)     SQL                OQL, SQL, Java
  API                           Lisp                       -                  Java
  BioCyc compatibility          -                          not full           full (- class)
  Import / Export to BioCyc     -                          Yes / No           Yes / Yes
  Independent of Pathway Tools  No                         Yes                Yes

What can you do with Cyclone? Bioinformatics applications

• Easy data extraction from BioCyc
  – to mix it with different types of data (clusters of transcription units, pathways, complexes, RegulonDB)
  – to build FBA models by
    • grouping pathways together for map design in Flux Analyser
    • assembling elementary bricks of the metabolic network from the reactions, genes, and compounds found in BioCyc

• Link BioCyc easily to new systems
  – example: Genoscope's LIMS uses the BioCyc compound repository as a language for representing media
  – algorithm for inferring protein complexes

• Modify or implement new housekeeping functions in a PGDB
  – add or update data from a given annotation database (e.g. MultiFun, cellular localization of gene products, PubMed links), which is currently not feasible with the PathoLogic format
  – delete or annotate pathways known not to exist, and keep that information unless a new BioCyc update supersedes it

20Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ongoing work - Roadmap

bull Extends the mapping rule between FRS and Java Object

bull Future work a software plateform dedicated to FBA models ndash Easy management of FBA models (define reactions objective

functionshellip)ndash Convenient handling of environmentsndash Extensible perturbation framework (systematic mutants)ndash Pluggable analysis modules (topological analyses phenotype

predictionshellip)

21Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Open questions

bull Are you convince of this interest of Cyclonendash access with java to biocyc objectndash store your PGDBs in your favorite rdbmsndash manipulate classes representing biological conceptndash use efficient mapping object -relational (Hibernate)ndash hellip

bull If yes to at least one how to deliver Cyclonendash jar filesndash eclipse projectndash eclipse pluginndash open a new sourceforge project

22Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Supplementary Slides

Cyclone

supplementaryslides

23Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Biowarehouse another solution to query biocyc data

supplementaryslides

24Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Language

ManipulatedConcept

ComplexeTypeElementAttribut

XSD

ClassAttributAssociation

JAVA

TableRowColumn

SQL

FrameSlotAnnotation

LISP

Pipeline for the creation of Cyclone

supplementaryslides

25Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application complexe inference

supplementaryslides

26Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction of graph

supplementaryslides

27Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction de clusters genes by pathway

pathway extracted from AcinetoCyc

zoom on genes involved in this pathway

supplementaryslides

28Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of xml schema to java world JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bindan XML schema to a representation in Java code This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself

httpjavasuncomwebservicesjaxb

supplementaryslides

29Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of java world to sql world HIBERNATE

httpwwwhibernateorg

Hibernate is a powerful ultra-high performance objectrelational persistence and query service for Java Hibernate lets you develop persistent classes following common Java idiom - including association inheritance polymorphism composition and the Java collections framework Hibernate allows you to express queries in its own portable SQL extension (HQL) as well as in native SQL or with Java-based Criteria and Example objects Unlike many other persistence solutions Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always

supplementaryslides

30Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone pipeline for the creation of hellip

Cyclone

FRSBioCyc

XML

XSD

Hibernate

SQL

JaxB

HiperJaxB

JavaCyc1

3 4

1a

0

Extension of JavaCyc getAnnot()Query of PGDB to build a xml schemaFrom xsd to java classesFrom java classes to hibernate mappingFrom hibernate mapping to sql design

X

Import all Frame of a PGDB

1b

1 Populate the mysql db and java classes

2a

2b

2 Upload

Save it in MySql

3 Querying amp modification of your PGDB

4 Upload your Cyclone into BioCyc

4a Extension of JavaCyc createFrame()

FLF Attention je confond cote pipeline et user

Made in GenoscopeSoftware tools

Data Files

supplementaryslides

31Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of frame to the world of xml schema

(FRAMEID NIL ((SLOT SValue)((SLOT SValue Annotation Avalue))

)

ltxscomplexType name=ldquoFRAME-CLASSgtltxssequencegt

ltxselement name=FRAMEID type=xsstringgtltxselement name=ldquoSLOT type=ldquoSLOT-TYPE maxOccurs=unboundedgt

ltxssequencegtltxscomplexTypegt

ltxscomplexType name=ldquoSLOT-TYPEgtltxssequencegt

ltxselement name=ldquoSValue type=xsstringrdquo gtltxselement name=ldquoAnnotation type=xsstring minOccurs=0gt

ltxssequencegtltxscomplexTypegt

supplementaryslides

32Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone Biocyc FRS into xml schema

supplementaryslides

33Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone GeneralizedReactions FRS into xml schema

supplementaryslides

34Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone Sample of a frame in xml

PGDB

Frame

Slot for Inheritance management

Slot-SValue

Annot-AValue

supplementaryslides

35Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone Open discussion - extension of biocyc ontology to capture mutant phenotype

How can we capture

bull growth mediumbull mutant strainbull growth phenotype

supplementaryslides

36Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCycrsquos existing database schema

char_slot_values

kbinstanceslotnameslotvaluesequence

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARYVARCHAR(100)INT(4)

ltpkfkgtltpkgtltpkfkgtltpkgt

numeric_slot_values

kbinstanceslotnameslotvaluesequence

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARYNUMERIC(2010)INT(4)

ltpkfkgtltpkgtltpkfkgtltpkgt

large_slot_values

kbinstanceslotnameslotvaluesequence

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARYTEXTINT(4)

ltpkfkgtltpkgtltpkfkgt

ltpkgt

frames

kbframetypedefinitionsequencenumparentstid

INT(4)VARCHAR(80) BINARYBITTEXTINT(4)INT(4)INT(4)

ltpkgtltpkgt

ltpkgt

class_supers

kbclasssuper

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARY

ltpkgtltpkgtltpkgt

instance_types

kbinstancetype

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARY

ltpkgtltpkgtltpkgt

kbs

kbidkbnamekbdatalock_sessionlock_timelock_user

INT(4)VARCHAR(20)VARCHAR(255)INT(4)DATETIMEVARCHAR(20)

FK_REFERENCE_4

FK_REFERENCE_5

FK_REFERENCE_6

logtable

kbidtidopnnoseqnoframelogrec

INT(4)INT(4)INT(4)INT(4)VARCHAR(80)VARCHAR(255)

ltpkfk1fk2gtltpkfk1fk2gtltpkgtltpkgtltfk2gt

txntable

kbidtiduseridnopnsbegin_timestampmerge_timestamp

INT(4)INT(4)VARCHAR(40)INT(4)INT(4) UNSIGNEDINT(4) UNSIGNED

ltpkgtltpkgt

deleted_frames

kbidframetid

INT(4)VARCHAR(80) BINARYINT(4)

ltpkfkgtltpkgtltpkfkgt

Managing multiuser concurrent editingFrame Model Representation Storage

History of edition

FK_REFERENCE_1FK_REFERENCE_2FK_REFERENCE_3

indexed_slots

kbslotname

INT(4)VARCHAR(80) BINARY

ltpkgtltpkgt

bull fast data access bull multi-user editing bull acess to a transaction log of all PGDB edits

bull Donrsquot allow to query biocyc outside pathway-tools softwarebull No tables linking to each biology concept-

+

supplementaryslides

37Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

From FRS to java classes

Generalized ReactionsBioCyc

PolymerSegments

Chemicals

hellip

Left

CompartmentCoefficient

PGDB FRAME SLOT

SValueAnnotation

FRAME

SLOT S VALUE

FACET ANNOTATION A VALUE

A Value

Cyclone

BioCyc

Value

hellip supplementaryslides

38Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc and Cyclonersquos architecture

Cyclone

MySQL

Flux Analyser

JungCytoscape Modulo

MetabolicModel

hellip

+

Hibernate

Xml

Jaxb

+MaGe

GraphAnalysis

ClusterAnalysis

Flux balanceAnalysisMade in Genoscope

Software tools

Data Files

Ocelot Oracle MySQL

Pathway-toolsGeneric Frame Protocole

Xserver

Webserver

PerlCyc

JavaCyc

Lisp

DB2 PostgreSQLOracle Sybase HSQLMicrosoft SQL server hellip

HyperJaxb

supplementaryslides

39Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclonersquos architecture

GenoscopeSoftware toolsData Files

Schema Generator

BioCyc(Un)marshaller

ObjectModel

XmlTools

Cyclone

Flux Analyser

JungCytoscape Modulo

MetabolicModel

hellip+ MaGe

GraphAnalysis

ClusterAnalysis

Flux balanceAnalysis

40Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

True query in OQL

public List getAllErInhibitedBy(String org String inhibitor) try

Query hql = HibernateUtilgetSession(keySession)createQuery(selectdistinct er from

frcnsgenoscopenemohypercycimplENZYMATICREACTIONSImpl er + join erINHIBITORSALLInternal inhibitors + where +ldquo erORGANISMVALUE like organism and +ldquo inhibitor = inhibitorsVALUE +ldquo )

hqlsetString(organism org)hqlsetString(inhibitor inhibitor)return hqllist()

catch (HibernateException e) eprintStackTrace()

return null

41Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Java APIpackage frcnsgenoscopenemohypercyccycsacineto

public class Acineto public static void main(String[] args)

ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO(hypercyc)List r = erdaogetAllErInhibitedBy(ecoliATP)ENZYMATICREACTIONS erfor(int i=0 i lt rsize() i++)

er = (ENZYMATICREACTIONS)rget(i)Systemoutprintln(

ergetFRAME()+t+((ENZYME)ergetENZYME()get(0))getVALUE())

Page 4: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

4Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ways to access biocyc data JavaCyc and PerlCyc Api

Ocelot

Oracle

MySQL

Lisp

pathway tools

server

JavaCyc

PerlCyc

bull No implementation of the GFP objects with Perl or Javabull Need to run a special socket serverbull Only one such connection can be openedbull Run on Unix only

-

+ bullA first approach to access Lisp fonction throw java or perl

5Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Biowarehouse Api

Ocelot

Oracle

MySQL

BioCyc

loader

Biowarehouse

MySQL

bull unidirectional no modification can be loaded back into biocycbull loss of information in regards of BioCyc model ( transcription unit citation compound-pathway classification no multifun hellip)bull querying is a bit tedious as you will see

bull relational schema corresponding to biological conceptsbull a solution to query BioCyc with the integration of external databull simple installation bull fast data upload

-

+

SQL

6Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Our Initial Motivation

1 Overcome some limitations of BioCycndash dependency on the OS (Unix)ndash no concurrent accessndash memory management (load all pgdbs in memory)

2 Manipulate objects corresponding to biological concepts

3 No loss of info with respect to Biocyc

4 Add import export features

5 Benefit from existing Java tools packages

before released 95

7Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What is Cyclone

bull Cyclone is an Object Oriented Cyc (api design to manipulate biocycbiological object )

bull Cyclone is a pipeline

bull Cyclone is a relational database in your favorite RDBMS

bull Cyclone is an import export of Biocyc in xml or in your database

8Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

From Frame hellip hellip to Java classes

CycloneBioCycFrame

SlotValue

Annotation1

Annotation1Value

RXN-1

Left

Glucose

Compartment

Cytoplasm

SlotValue

Parents

GeneralizedReactions

Left

Value=GlucoseCompartment=Cytoplasmhellip

GeneralizedReactions

Id=RXN-1hellip

Chemicals

Id=glucosehellip

Slot hellip

Distinct Frame-Slot-Parents ClassSlot Association classSlot Value Attribut valueAnnotation Attribut nameAnnotationvalueAttribut value

Mapping rules

9Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc Rdbms

Schema Generator

Mappingrules

JaxB

XmlTools

XmlSchema

Hibernate

ORmapper

FrameClassesFrame

ClassesFrame

Classes

JAVAClasses

Cyclonersquos pipeline - initial step code and database generation

10Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc RDBMSJAVAObject

Hibernate

ORmapper

Oo FRSTranslator

Bio2Oo

Oo2Bio

FrameInstanceFrame

InstanceFrame

Instance

Cyclonersquos pipeline - daily step Data Synchronization

Cyclone's architecture

Legend: Genoscope, software tools, data files

Pathway Tools side:
  Ocelot (Oracle, MySQL) -> Lisp - Pathway Tools -> JavaCyc

Cyclone (Genoscope):
  SchemaGenerator, Oo/FRS Translator, Object Model, XmlTools

Storage and exchange:
  Hibernate (object-relational mapping) -> RDBMS: MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …
  JAXB (object-XML mapping) -> XML

Benefits of using Cyclone: a Java object API matching biological concepts

In your favorite IDE, access all functions of Cyclone objects with one click and develop your own business methods.

Benefits of using Cyclone: query in OQL (Object Query Language)

Lisp: find all enzymes for which ATP is an inhibitor

    (defun atp-inhibits ()
      ;; We check every instance of the class
      (loop for x in (get-class-all-instances '|Enzymatic-Reactions|)
            ;; Test whether the INHIBITORS-ALL slot contains the compound frame ATP
            when (member-slot-value-p x 'INHIBITORS-ALL 'ATP)
            ;; Whenever the test is positive, we collect the value of the slot ENZYME.
            ;; The collected values are returned as a list once the loop terminates.
            collect (get-slot-value x 'ENZYME)))

    ;; invoking the query
    (select-organism :org-id 'ECOLI)
    (atp-inhibits)

OQL is a powerful and easy-to-use SQL-like query language with special features dealing with complex objects, values and methods.

Benefits of using Cyclone: query in OQL (Object Query Language)

PerlCyc: find all enzymes for which ATP is an inhibitor

    use perlcyc;
    my $cyc = perlcyc -> new("ECOLI");
    my @enzrxns = $cyc -> get_class_all_instances("|Enzymatic-Reactions|");
    # We check every instance of the class
    foreach my $er (@enzrxns) {
        # We test whether the INHIBITORS-ALL slot contains the compound frame ATP
        my $bool = $cyc -> member_slot_value_p($er, "Inhibitors-All", "Atp");
        if ($bool) {
            # Whenever the test is nonzero, we collect the value of the slot ENZYME.
            # The results are printed in the terminal.
            my $enz = $cyc -> get_slot_value($er, "Enzyme");
            print STDOUT "$enz\n";
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

JavaCyc: find all enzymes for which ATP is an inhibitor

    import java.util.*;

    public class JavacycSample {
        public static void main(String[] args) {
            Javacyc cyc = new Javacyc("ECOLI");
            // Fetch every instance of the Enzymatic-Reactions class
            ArrayList enzrxns = cyc.getClassAllInstances("|Enzymatic-Reactions|");
            for (int i = 0; i < enzrxns.size(); i++) {
                String er = (String) enzrxns.get(i);
                // Test whether the INHIBITORS-ALL slot contains the compound frame ATP
                boolean bool = cyc.memberSlotValueP(er, "Inhibitors-All", "Atp");
                if (bool) {
                    // Print the value of the ENZYME slot
                    String enz = cyc.getSlotValue(er, "Enzyme");
                    System.out.println(enz);
                }
            }
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

BioWarehouse: find all enzymes for which ATP is an inhibitor

    SELECT DISTINCT DBID.xid
    FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,
         EnzReactionInhibitorActivator
    WHERE DataSet.name = 'EcoCyc'
      AND DataSet.wid = EnzymaticReaction.datasetwid
      AND EnzymaticReaction.proteinwid = Protein.wid
      AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid
      AND EnzReactionInhibitorActivator.compoundwid = Chemical.wid
      AND EnzReactionInhibitorActivator.inhibitoractivate = 'I'
      AND Chemical.name = 'ATP'
      AND DBID.otherwid = Protein.wid

Benefits of using Cyclone: query in OQL (Object Query Language)

Cyclone: find all enzymes for which ATP is an inhibitor

    SELECT er FROM ENZYMATICREACTIONS er
    LEFT JOIN er.INHIBITORSALLInternal inhibitors
    WHERE er.ORGANISM.VALUE LIKE :organism
      AND 'Atp' = inhibitors.VALUE

HQL is fully object-oriented, understanding notions like inheritance, polymorphism and association.
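To make the semantics of the Cyclone query concrete, here is a minimal in-memory sketch in plain Java, with no Hibernate involved. The class `EnzymaticReaction`, the method `inhibitedBy` and the sample data are hypothetical stand-ins, not Cyclone's generated classes. The filter mirrors the join on the inhibitors collection and the two WHERE conditions, with the organism test simplified to an exact match rather than LIKE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory stand-in for the Cyclone query; no Hibernate involved.
final class InhibitorQuerySketch {

    /** Simplified enzymatic reaction: frame id, organism, enzyme, and inhibitor names. */
    static final class EnzymaticReaction {
        final String frame;
        final String organism;
        final String enzyme;
        final List<String> inhibitorsAll;

        EnzymaticReaction(String frame, String organism, String enzyme, List<String> inhibitorsAll) {
            this.frame = frame;
            this.organism = organism;
            this.enzyme = enzyme;
            this.inhibitorsAll = inhibitorsAll;
        }
    }

    /** Keeps the reactions of one organism whose inhibitor list contains the given compound. */
    static List<EnzymaticReaction> inhibitedBy(List<EnzymaticReaction> all,
                                               String organism, String inhibitor) {
        List<EnzymaticReaction> result = new ArrayList<>();
        for (EnzymaticReaction er : all) {
            if (er.organism.equals(organism) && er.inhibitorsAll.contains(inhibitor)) {
                result.add(er);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<EnzymaticReaction> all = Arrays.asList(
                new EnzymaticReaction("ENZRXN-1", "ecoli", "ENZYME-1", Arrays.asList("ATP", "ADP")),
                new EnzymaticReaction("ENZRXN-2", "ecoli", "ENZYME-2", Collections.<String>emptyList()),
                new EnzymaticReaction("ENZRXN-3", "acineto", "ENZYME-3", Arrays.asList("ATP")));
        for (EnzymaticReaction er : inhibitedBy(all, "ecoli", "ATP")) {
            System.out.println(er.frame + "\t" + er.enzyme);
        }
    }
}
```

With a persistence layer, the same filter would run in the database instead of over a Java list, which is the point of expressing it as a query.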

Comparative Matrix

                               BioCyc                    BioWarehouse        Cyclone
Representation system          FRS                       Relational model    Object model
Storing system                 Ocelot (MySQL, Oracle)    MySQL               MySQL, Oracle, Sybase, PostgreSQL, …
Query                          SQL, Lisp (Perl, Java)    SQL                 OQL, SQL, Java
API                            Lisp                      -                   Java
BioCyc compatibility           -                         not full            full (- class)
Import / Export to BioCyc      -                         Yes / No            Yes / Yes
Independent of Pathway Tools   No                        Yes                 Yes

What can you do with Cyclone? Bioinformatics applications

• Easy data extraction from BioCyc
  – to mix them with different types of data (clusters of transcription units, pathways, complexes, RegulonDB)
  – to build FBA models by
    • grouping pathways together for map design in Flux Analyser
    • assembling elementary bricks of the metabolic network based on reactions, genes and compounds found in BioCyc

• Easily link BioCyc to new systems
  – example: Genoscope's LIMS uses the BioCyc compounds repository as a language for media representation
  – algorithm for inferring protein complexes

• Modify or implement new housekeeping functions in a PGDB
  – add or update data from a given annotation database (e.g. MultiFun, cellular localization of gene products, PubMed links); currently not feasible with the PathoLogic format
  – delete or annotate pathways known not to exist, and keep that information unless new BioCyc updates change it

Ongoing work - Roadmap

• Extend the mapping rules between FRS and Java objects

• Future work: a software platform dedicated to FBA models
  – Easy management of FBA models (define reactions, objective functions, …)
  – Convenient handling of environments
  – Extensible perturbation framework (systematic mutants)
  – Pluggable analysis modules (topological analyses, phenotype predictions, …)

Open questions

• Are you convinced of the interest of Cyclone?
  – access BioCyc objects with Java
  – store your PGDBs in your favorite RDBMS
  – manipulate classes representing biological concepts
  – use efficient object-relational mapping (Hibernate)
  – …

• If yes to at least one, how should Cyclone be delivered?
  – jar files
  – Eclipse project
  – Eclipse plugin
  – open a new SourceForge project

Supplementary Slides

Cyclone

BioWarehouse: another solution to query BioCyc data

Pipeline for the creation of Cyclone

Language   Manipulated concepts
LISP       Frame, Slot, Annotation
XSD        ComplexType, Element, Attribute
JAVA       Class, Attribute, Association
SQL        Table, Row, Column

Application: complex inference

Application: extraction of graph

Application: extraction of gene clusters by pathway

pathway extracted from AcinetoCyc

zoom on genes involved in this pathway

Cyclone: rules for mapping the world of XML Schema to the Java world: JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb

Cyclone: rules for mapping the Java world to the SQL world: Hibernate

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom - including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.

Cyclone: pipeline for the creation of …

Legend: Made in Genoscope, software tools, data files

Components: FRS (BioCyc), JavaCyc, XML, XSD, JAXB, HyperJaxb, Hibernate, SQL, Cyclone

Creation steps:
  – Extension of JavaCyc: getAnnot()
  – Query of the PGDB to build an XML schema
  – From XSD to Java classes
  – From Java classes to Hibernate mapping
  – From Hibernate mapping to SQL design

User steps:
  – Import all frames of a PGDB
  1. Populate the MySQL database and Java classes
  2. Upload: save it in MySQL
  3. Querying and modification of your PGDB
  4. Upload your Cyclone into BioCyc
     4a. Extension of JavaCyc: createFrame()

Author's note (FLF): careful, I am mixing up the pipeline side and the user side.

Cyclone: rules for mapping the world of frames to the world of XML Schema

    (FRAMEID NIL
      ((SLOT SValue)
       ((SLOT SValue Annotation Avalue))
      )
    )

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
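As a way to check what this schema accepts, the sketch below validates a small frame document against the two complex types using only the JDK's `javax.xml.validation` API. The root element name `FRAME`, the class `FrameSchemaSketch`, the method `isValid` and the sample documents are assumptions made for illustration; the slide does not show how Cyclone names its root element or drives validation.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical sketch: validates a frame document against the FRAME-CLASS / SLOT-TYPE schema.
final class FrameSchemaSketch {

    // The root element FRAME is an assumption; the two complex types follow the slide.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"FRAME\" type=\"FRAME-CLASS\"/>"
        + "<xs:complexType name=\"FRAME-CLASS\"><xs:sequence>"
        + "<xs:element name=\"FRAMEID\" type=\"xs:string\"/>"
        + "<xs:element name=\"SLOT\" type=\"SLOT-TYPE\" maxOccurs=\"unbounded\"/>"
        + "</xs:sequence></xs:complexType>"
        + "<xs:complexType name=\"SLOT-TYPE\"><xs:sequence>"
        + "<xs:element name=\"SValue\" type=\"xs:string\"/>"
        + "<xs:element name=\"Annotation\" type=\"xs:string\" minOccurs=\"0\"/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:schema>";

    /** Compiles the schema; a broken schema is a programming error, so it is rethrown. */
    static Schema compileSchema() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema does not compile", e);
        }
    }

    /** Returns true when the XML document conforms to the schema. */
    static boolean isValid(String xml) {
        Validator validator = compileSchema().newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document violates the schema or is not well-formed
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<FRAME><FRAMEID>RXN-1</FRAMEID>"
                + "<SLOT><SValue>Glucose</SValue><Annotation>Cytoplasm</Annotation></SLOT></FRAME>";
        String missingId = "<FRAME><SLOT><SValue>Glucose</SValue></SLOT></FRAME>";
        System.out.println(isValid(good));
        System.out.println(isValid(missingId));
    }
}
```

The second document is rejected because `FRAMEID` is mandatory in the sequence, while `Annotation` may be omitted thanks to `minOccurs="0"`.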

Cyclone: BioCyc FRS into XML schema

Cyclone: GeneralizedReactions FRS into XML schema

Cyclone: sample of a frame in XML

Labels on the sample: PGDB, Frame, Slot for inheritance management, Slot-SValue, Annot-AValue

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture:

• growth medium
• mutant strain
• growth phenotype

BioCyc's existing database schema

char_slot_values
  kb          INT(4)               <pk,fk>
  instance    VARCHAR(80) BINARY   <pk>
  slotname    VARCHAR(80) BINARY   <pk,fk>
  slotvalue   VARCHAR(100)
  sequence    INT(4)               <pk>

numeric_slot_values
  kb          INT(4)               <pk,fk>
  instance    VARCHAR(80) BINARY   <pk>
  slotname    VARCHAR(80) BINARY   <pk,fk>
  slotvalue   NUMERIC(20,10)
  sequence    INT(4)               <pk>

large_slot_values
  kb          INT(4)               <pk,fk>
  instance    VARCHAR(80) BINARY   <pk>
  slotname    VARCHAR(80) BINARY   <pk,fk>
  slotvalue   TEXT
  sequence    INT(4)               <pk>

frames
  kb          INT(4)               <pk>
  frame       VARCHAR(80) BINARY   <pk>
  type        BIT
  definition  TEXT
  sequence    INT(4)               <pk>
  numparents  INT(4)
  tid         INT(4)

class_supers
  kb          INT(4)               <pk>
  class       VARCHAR(80) BINARY   <pk>
  super       VARCHAR(80) BINARY   <pk>

instance_types
  kb          INT(4)               <pk>
  instance    VARCHAR(80) BINARY   <pk>
  type        VARCHAR(80) BINARY   <pk>

indexed_slots
  kb          INT(4)               <pk>
  slotname    VARCHAR(80) BINARY   <pk>

kbs
  kbid          INT(4)
  kbname        VARCHAR(20)
  kbdata        VARCHAR(255)
  lock_session  INT(4)
  lock_time     DATETIME
  lock_user     VARCHAR(20)

logtable
  kbid     INT(4)         <pk,fk1,fk2>
  tid      INT(4)         <pk,fk1,fk2>
  opnno    INT(4)         <pk>
  seqno    INT(4)         <pk>
  frame    VARCHAR(80)    <fk2>
  logrec   VARCHAR(255)

txntable
  kbid             INT(4)            <pk>
  tid              INT(4)            <pk>
  userid           VARCHAR(40)
  nopns            INT(4)
  begin_timestamp  INT(4) UNSIGNED
  merge_timestamp  INT(4) UNSIGNED

deleted_frames
  kbid    INT(4)               <pk,fk>
  frame   VARCHAR(80) BINARY   <pk>
  tid     INT(4)               <pk,fk>

Foreign keys: FK_REFERENCE_1 to FK_REFERENCE_6

Diagram regions: frame model representation storage; managing multi-user concurrent editing; history of edits

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no tables corresponding to each biological concept
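Because this schema stores every slot of every frame as generic rows, answering a biological question means regrouping rows by instance and slot name first. The sketch below does that regrouping in plain Java over in-memory rows shaped like `char_slot_values` (columns `instance`, `slotname`, `slotvalue`, `sequence`). The `Row` class, the `group` method and the sample rows are made up for illustration, and the sketch assumes that `sequence` orders the values within a slot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: regroups generic slot-value rows into one map per frame.
final class SlotRowsSketch {

    /** One row shaped like char_slot_values (the kb column is left out for brevity). */
    static final class Row {
        final String instance;
        final String slotname;
        final String slotvalue;
        final int sequence;

        Row(String instance, String slotname, String slotvalue, int sequence) {
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Returns instance -> slot name -> values, with values ordered by the sequence column. */
    static Map<String, Map<String, List<String>>> group(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((Row r) -> r.sequence));
        Map<String, Map<String, List<String>>> frames = new TreeMap<>();
        for (Row r : sorted) {
            frames.computeIfAbsent(r.instance, k -> new TreeMap<>())
                  .computeIfAbsent(r.slotname, k -> new ArrayList<>())
                  .add(r.slotvalue);
        }
        return frames;
    }

    public static void main(String[] args) {
        // Made-up rows, deliberately out of order.
        List<Row> rows = Arrays.asList(
                new Row("RXN-1", "LEFT", "Compound-B", 1),
                new Row("RXN-1", "LEFT", "Compound-A", 0),
                new Row("RXN-1", "RIGHT", "Compound-C", 0));
        System.out.println(group(rows));
    }
}
```

A schema with one table per biological concept makes this regrouping step unnecessary, which is what the object model aims for.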

From FRS to Java classes

BioCyc (FRS): PGDB -> FRAME -> SLOT -> SValue, Annotation
  FRAME
  SLOT, S VALUE
  FACET, ANNOTATION, A VALUE

Cyclone (Java classes): BioCyc -> Generalized Reactions, PolymerSegments, Chemicals, …
  Left: Value, Compartment, Coefficient, …

BioCyc and Cyclone's architecture

Legend: Made in Genoscope, software tools, data files

BioCyc:
  Ocelot (Oracle, MySQL) -> Pathway Tools (Generic Frame Protocol, Lisp) -> X server, Web server, PerlCyc, JavaCyc

Cyclone:
  Hibernate -> MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …
  JAXB + HyperJaxb -> XML

Applications built on Cyclone:
  Flux Analyser, Jung, Cytoscape, Modulo, …
  Metabolic model + MaGe
  Graph analysis, cluster analysis, flux balance analysis

Cyclone's architecture

Legend: Genoscope, software tools, data files

Cyclone: Schema Generator, BioCyc (un)marshaller, Object Model, XmlTools

Applications built on Cyclone:
  Flux Analyser, Jung, Cytoscape, Modulo, …
  Metabolic model + MaGe
  Graph analysis, cluster analysis, flux balance analysis

True query in OQL

    // Needs java.util.List plus Hibernate's Query and HibernateException classes.
    // HibernateUtil and keySession belong to the surrounding DAO class (not shown).
    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where "
                + " er.ORGANISM.VALUE like :organism and "
                + " :inhibitor = inhibitors.VALUE ");
            // Bind the named parameters before running the query
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            // Enzymatic reactions of E. coli that ATP inhibits
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                // Print the reaction frame and the value of its first enzyme
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }

Page 5: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

5Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Biowarehouse Api

Ocelot

Oracle

MySQL

BioCyc

loader

Biowarehouse

MySQL

bull unidirectional no modification can be loaded back into biocycbull loss of information in regards of BioCyc model ( transcription unit citation compound-pathway classification no multifun hellip)bull querying is a bit tedious as you will see

bull relational schema corresponding to biological conceptsbull a solution to query BioCyc with the integration of external databull simple installation bull fast data upload

-

+

SQL

6Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Our Initial Motivation

1 Overcome some limitations of BioCycndash dependency on the OS (Unix)ndash no concurrent accessndash memory management (load all pgdbs in memory)

2 Manipulate objects corresponding to biological concepts

3 No loss of info with respect to Biocyc

4 Add import export features

5 Benefit from existing Java tools packages

before released 95

7Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What is Cyclone

bull Cyclone is an Object Oriented Cyc (api design to manipulate biocycbiological object )

bull Cyclone is a pipeline

bull Cyclone is a relational database in your favorite RDBMS

bull Cyclone is an import export of Biocyc in xml or in your database

8Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

From Frame hellip hellip to Java classes

CycloneBioCycFrame

SlotValue

Annotation1

Annotation1Value

RXN-1

Left

Glucose

Compartment

Cytoplasm

SlotValue

Parents

GeneralizedReactions

Left

Value=GlucoseCompartment=Cytoplasmhellip

GeneralizedReactions

Id=RXN-1hellip

Chemicals

Id=glucosehellip

Slot hellip

Distinct Frame-Slot-Parents ClassSlot Association classSlot Value Attribut valueAnnotation Attribut nameAnnotationvalueAttribut value

Mapping rules

9Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc Rdbms

Schema Generator

Mappingrules

JaxB

XmlTools

XmlSchema

Hibernate

ORmapper

FrameClassesFrame

ClassesFrame

Classes

JAVAClasses

Cyclonersquos pipeline - initial step code and database generation

10Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc RDBMSJAVAObject

Hibernate

ORmapper

Oo FRSTranslator

Bio2Oo

Oo2Bio

FrameInstanceFrame

InstanceFrame

Instance

Cyclonersquos pipeline - daily step Data Synchronization

11Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclonersquos architecture

RDBMS

Hibernate

Xml

Jaxb

GenoscopeSoftware toolsData Files

MySQL DB2 PostgreSQLOracle Sybase HSQLMicrosoft SQL server hellip

Ocelot Oracle MySQL

JavaCyc

Lisp - Pathway Tools

Object - relational mapping Object - xmlmapping

SchemaGenerator

Oo FRSTranslator

ObjectModel

XmlTools

Cyclone

12Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Java Object Api matching biological concept

in your favorite ide access with one click to all functions of Cyclone objectsdevelopp your own business methods

13Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 (defun atp-inhibits () We check every instance of the class

2 (loop for x in (get-class-all-instances |Enzymatic-Reactions|) Test for whether the INHIBITORS-ALL slot contains the

compound frame ATP3 when (member-slot-value-p x INHIBITORS-ALL ATP)

Whenever the test is positive we collect the value of theslot ENZYME The collected values are returned as a list once

the loop terminates4 collect (get-slot-value x rsquoENZYME))5 )

invoking the query6 (select-organism org-id rsquoECOLI)7 (atp-inhibits)

Lisp Find all enzym for which ATP is an inhibitor

OQL is a powerful and easy-to-use SQL-like query language with special featuresdealing with complex objects values and methods

14Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

1 use perlcyc2 my $cyc = perlcyc -gt new(ECOLI)3 my enzrxns = $cyc -gt get_class_all_instances(|Enzymatic-

Reactions|) We check every instance of the class

4 foreach my $er (enzrxns) We test for whether the INHIBITORS-ALL slot contains the

compound frame ATP5 my $bool = $cyc -gt member_slot_value_p($erInhibitors-

All Atp)6 if ($bool)

Whenever the test is =0 we collect the value of the slot ENZYME The results are printed in theterminal

7 my $enz = $cyc -gt get_slot_value($er Enzyme)8 print STDOUT $enzn9 10

Beneficits of using Cyclone Query in OQL (Object Query Language)

PerlCyc Find all enzym for which ATP is an inhibitor

15Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 import javautil 2 public class JavacycSample 3 public static void main(String[] args) 4 Javacyc cyc = new Javacyc(ECOLI) 5 ArrayList enzrxns = cycgetClassAllInstances(|Enzymatic-

Reactions|) 6 for (int i = 0 i lt enzrxnssize() i++) 7 String er = (String)enzrxnsget(i) 8 boolean bool = cycmemberSlotValueP(er Inhibitors-

AllAtp) 9 if (bool) 10 String enz = cycgetSlotValue(er Enzyme)9 Systemoutprintln(enz) 11 12 13 14

JavaCyc Find all enzym for which ATP is an inhibitor

16Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT DISTINCT DBIDxid2 FROM DBID Protein EnzymaticReaction Chemical DataSet

EnzReactionInhibitorActivator3 WHERE DataSetname=EcoCyc4 AND DataSetwid=EnzymaticReactiondatasetwid5 AND EnzymaticReactionproteinwid = Proteinwid6 AND EnzymaticReactionwid =

EnzReactionInhibitorActivatorenzymaticreactionwid7 AND EnzReactionInhibitorActivatorcompoundwid=Chemicalwid8 AND EnzReactionInhibitorActivatorinhibitoractivate=I9 AND Chemicalname=ATP10 AND DBIDotherwid = Proteinwid

Biowarehouse Find all enzym for which ATP is an inhibitor

17Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT er FROM ENZYMATICREACTIONS er2 LEFT JOIN erINHIBITORSALLInternal inhibitors3 WHERE erORGANISMVALUE LIKE organism

AND 4 Atp = inhibitorsVALUE

Cyclone Find all enzym for which ATP is an inhibitor

HQL is fully object-oriented understanding notions like inheritence polymorphism and association

18Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Comparative Matrix

YesYesNoIndependent of Pathway tools

Yes YesYes No-Import Export to BioCyc

full (- class)not full-BioCycCompatibilty

Java-LispAPI

OQL SQLJava

SQLSQLLisp

(perljava)Query

mySQL Oracle Sybase PostgreSQL hellip

mySQLmySQLoracle

OcelotStoring System

Object modelRelational modelFRSRepresentationSystem

CycloneBiowarehouseBioCyc

19Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What can you do with Cyclone Bioinformatics applications

bull Easy data extraction from BioCycndash to mix them with different types of data (clusters of transcription units pathways

complexes regulon db)ndash to build FBA models by

bull grouping together Pathways for map design in Flux Analyserbull assemble elementary bricks of the metabolic network based on reactions genes

compounds found in BioCyc

bull Link easily Biocyc to new systems ndash example Genoscopersquos LIMS use BioCyc compounds repository as a language

of media representationndash algorithm for inferrence protein complexes

bull Modify or implement new housekeeping functions in a PGDBndash add updates data from a given annotation database (eg multifun cellular

localization of genes products pubmed links) actualy not feasible with pathologic format

ndash delete annotate pathways known not to exist and keep those informationsunless new BioCyc updates

20Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ongoing work - Roadmap

bull Extends the mapping rule between FRS and Java Object

bull Future work a software plateform dedicated to FBA models ndash Easy management of FBA models (define reactions objective

functionshellip)ndash Convenient handling of environmentsndash Extensible perturbation framework (systematic mutants)ndash Pluggable analysis modules (topological analyses phenotype

predictionshellip)

21Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Open questions

bull Are you convince of this interest of Cyclonendash access with java to biocyc objectndash store your PGDBs in your favorite rdbmsndash manipulate classes representing biological conceptndash use efficient mapping object -relational (Hibernate)ndash hellip

bull If yes to at least one how to deliver Cyclonendash jar filesndash eclipse projectndash eclipse pluginndash open a new sourceforge project

22Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Supplementary Slides

Cyclone

supplementaryslides

23Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Biowarehouse another solution to query biocyc data

supplementaryslides

24Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Language

ManipulatedConcept

ComplexeTypeElementAttribut

XSD

ClassAttributAssociation

JAVA

TableRowColumn

SQL

FrameSlotAnnotation

LISP

Pipeline for the creation of Cyclone

supplementaryslides

25Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application complexe inference

supplementaryslides

26Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction of graph

supplementaryslides

27Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction de clusters genes by pathway

pathway extracted from AcinetoCyc

zoom on genes involved in this pathway

supplementaryslides

28Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of xml schema to java world JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bindan XML schema to a representation in Java code This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself

httpjavasuncomwebservicesjaxb

supplementaryslides

29Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of java world to sql world HIBERNATE

httpwwwhibernateorg

Hibernate is a powerful ultra-high performance objectrelational persistence and query service for Java Hibernate lets you develop persistent classes following common Java idiom - including association inheritance polymorphism composition and the Java collections framework Hibernate allows you to express queries in its own portable SQL extension (HQL) as well as in native SQL or with Java-based Criteria and Example objects Unlike many other persistence solutions Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always

supplementaryslides

30Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone pipeline for the creation of hellip

Cyclone

FRSBioCyc

XML

XSD

Hibernate

SQL

JaxB

HiperJaxB

JavaCyc1

3 4

1a

0

Extension of JavaCyc getAnnot()Query of PGDB to build a xml schemaFrom xsd to java classesFrom java classes to hibernate mappingFrom hibernate mapping to sql design

X

Import all Frame of a PGDB

1b

1 Populate the mysql db and java classes

2a

2b

2 Upload

Save it in MySql

3 Querying amp modification of your PGDB

4 Upload your Cyclone into BioCyc

4a Extension of JavaCyc createFrame()

FLF Attention je confond cote pipeline et user

Made in GenoscopeSoftware tools

Data Files

supplementaryslides

31Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of frame to the world of xml schema

(FRAMEID NIL ((SLOT SValue)((SLOT SValue Annotation Avalue))

)

ltxscomplexType name=ldquoFRAME-CLASSgtltxssequencegt

ltxselement name=FRAMEID type=xsstringgtltxselement name=ldquoSLOT type=ldquoSLOT-TYPE maxOccurs=unboundedgt

ltxssequencegtltxscomplexTypegt

ltxscomplexType name=ldquoSLOT-TYPEgtltxssequencegt

ltxselement name=ldquoSValue type=xsstringrdquo gtltxselement name=ldquoAnnotation type=xsstring minOccurs=0gt

ltxssequencegtltxscomplexTypegt

supplementaryslides

32Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone Biocyc FRS into xml schema

supplementaryslides

33Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone GeneralizedReactions FRS into xml schema

supplementaryslides

34Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone Sample of a frame in xml

PGDB

Frame

Slot for Inheritance management

Slot-SValue

Annot-AValue

supplementaryslides

35Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone Open discussion - extension of biocyc ontology to capture mutant phenotype

How can we capture

bull growth mediumbull mutant strainbull growth phenotype

supplementaryslides

BioCyc's existing database schema

[Schema diagram. Tables, columns and types are listed below as they appear on the slide. Key markers are given in slide order because they cannot all be matched to individual columns. The diagram groups the tables under three labels: frame model representation storage, managing multi-user concurrent editing, and edit history. Its foreign-key edges are labelled FK_REFERENCE_1 to FK_REFERENCE_6.]

char_slot_values
    kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue VARCHAR(100), sequence INT(4)
    keys: <pk,fk>, <pk>, <pk,fk>, <pk>

numeric_slot_values
    kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue NUMERIC(20,10), sequence INT(4)
    keys: <pk,fk>, <pk>, <pk,fk>, <pk>

large_slot_values
    kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue TEXT, sequence INT(4)
    keys: <pk,fk>, <pk>, <pk,fk>, <pk>

frames
    kb INT(4), frame VARCHAR(80) BINARY, type BIT, definition TEXT, sequence INT(4), numparents INT(4), tid INT(4)
    keys: <pk>, <pk>, <pk>

class_supers
    kb INT(4), class VARCHAR(80) BINARY, super VARCHAR(80) BINARY
    keys: <pk>, <pk>, <pk>

instance_types
    kb INT(4), instance VARCHAR(80) BINARY, type VARCHAR(80) BINARY
    keys: <pk>, <pk>, <pk>

indexed_slots
    kb INT(4), slotname VARCHAR(80) BINARY
    keys: <pk>, <pk>

kbs
    kbid INT(4), kbname VARCHAR(20), kbdata VARCHAR(255), lock_session INT(4), lock_time DATETIME, lock_user VARCHAR(20)

logtable
    kbid INT(4), tid INT(4), opnno INT(4), seqno INT(4), frame VARCHAR(80), logrec VARCHAR(255)
    keys: <pk,fk1,fk2>, <pk,fk1,fk2>, <pk>, <pk>, <fk2>

txntable
    kbid INT(4), tid INT(4), userid VARCHAR(40), nopns INT(4), begin_timestamp INT(4) UNSIGNED, merge_timestamp INT(4) UNSIGNED
    keys: <pk>, <pk>

deleted_frames
    kbid INT(4), frame VARCHAR(80) BINARY, tid INT(4)
    keys: <pk,fk>, <pk>, <pk,fk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no tables corresponding to individual biological concepts
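The last drawback of the schema slide can be made concrete with a minimal Java sketch. It is not taken from the slides: the Row class only mirrors the kb, instance, slotname, slotvalue and sequence columns listed for char_slot_values, and the class name, the frameOf helper and the sample rows are assumptions made for illustration. It shows that a biological object has to be reassembled from generic slot-value rows rather than read from a table of its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SlotValueRowsSketch {

    /** One row of a generic slot-value table (columns as listed for char_slot_values). */
    public static final class Row {
        public final int kb;
        public final String instance;
        public final String slotname;
        public final String slotvalue;
        public final int sequence;

        public Row(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Rebuilds one frame as "slot name -> ordered values" by scanning every row. */
    public static Map<String, List<String>> frameOf(List<Row> rows, int kb, String instance) {
        List<Row> mine = new ArrayList<>();
        for (Row r : rows) {
            if (r.kb == kb && r.instance.equals(instance)) {
                mine.add(r);
            }
        }
        // Values of a slot are ordered by their sequence number (stable sort).
        mine.sort((a, b) -> Integer.compare(a.sequence, b.sequence));
        Map<String, List<String>> slots = new LinkedHashMap<>();
        for (Row r : mine) {
            slots.computeIfAbsent(r.slotname, k -> new ArrayList<>()).add(r.slotvalue);
        }
        return slots;
    }

    /** Hypothetical sample rows; not real BioCyc content. */
    public static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(1, "RXN-1", "LEFT", "Glucose", 0),
            new Row(1, "RXN-1", "LEFT", "ATP", 1),
            new Row(1, "RXN-1", "ENZYMATIC-REACTION", "ENZRXN-1", 0),
            new Row(1, "ENZRXN-1", "ENZYME", "PROTEIN-1", 0));
    }

    public static void main(String[] args) {
        System.out.println(frameOf(sampleRows(), 1, "RXN-1"));
    }
}
```

Nothing in such rows says that RXN-1 is a reaction or that Glucose is a chemical; that knowledge lives in the frame system, which is presumably why a per-concept object model is attractive.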

From FRS to Java classes

[Diagram. BioCyc side: a PGDB contains FRAMEs; a FRAME has SLOTs with S-VALUEs, and FACETs / ANNOTATIONs with A-VALUEs. Cyclone side: classes such as Generalized Reactions, PolymerSegments, Chemicals, …, and a Left element carrying Value, Compartment and Coefficient.]

BioCyc and Cyclone's architecture

[Diagram. Legend: made in Genoscope, software tools, data files.
BioCyc side: Ocelot (Oracle, MySQL); Pathway Tools with the Generic Frame Protocol; X server; Web server; Lisp, PerlCyc, JavaCyc.
Cyclone side: Cyclone with Hibernate (MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …) and XML via JAXB / HyperJaxb.
Applications: Flux Analyser; Jung, Cytoscape, Modulo; MetabolicModel; MaGe; graph analysis; cluster analysis; flux balance analysis; …]

Cyclone's architecture

[Diagram. Legend: Genoscope, software tools, data files.
Cyclone components: Schema Generator; BioCyc (un)marshaller; ObjectModel; XmlTools.
Applications: Flux Analyser; Jung, Cytoscape, Modulo; MetabolicModel; MaGe; graph analysis; cluster analysis; flux balance analysis; …]

True query in OQL

    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            // HQL query with two named parameters, :organism and :inhibitor
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where "
                + " er.ORGANISM.VALUE like :organism and "
                + " :inhibitor = inhibitors.VALUE ");
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;
    // ENZYMATICREACTIONSDAO, ENZYMATICREACTIONS and ENZYME are Cyclone classes;
    // their imports are not shown on the slide.

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                // Print the reaction frame and its first enzyme, tab-separated
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }


Our Initial Motivation

1. Overcome some limitations of BioCyc (before release 9.5)
   – dependency on the OS (Unix)
   – no concurrent access
   – memory management (all PGDBs are loaded in memory)
2. Manipulate objects corresponding to biological concepts
3. No loss of information with respect to BioCyc
4. Add import / export features
5. Benefit from existing Java tools and packages

What is Cyclone?

• Cyclone is an Object-Oriented Cyc (an API designed to manipulate BioCyc biological objects)
• Cyclone is a pipeline
• Cyclone is a relational database in your favorite RDBMS
• Cyclone is an import / export of BioCyc in XML or in your database

From Frame … to Java classes

BioCyc (frame view):
    Frame RXN-1
        Slot Left, value Glucose
            Annotation Compartment, value Cytoplasm
        Slot Parents, value GeneralizedReactions
        Slot …

Cyclone (object view):
    GeneralizedReactions (Id=RXN-1, …)
        Left (Value=Glucose, Compartment=Cytoplasm, …)
    Chemicals (Id=glucose, …)

Mapping rules

• Distinct Frame-Slot-Parents → Class
• Slot → Association class
• Slot value → Attribute value
• Annotation → Attribute name
• Annotation value → Attribute value
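The mapping rules can be sketched in Java with the slide's own example (RXN-1, Left, Glucose, Compartment, Cytoplasm). This is only an illustration under assumptions: the class names Chemical, Left and GeneralizedReaction and their fields are hypothetical and are not the classes generated by Cyclone.

```java
import java.util.ArrayList;
import java.util.List;

public class FrameMappingSketch {

    /** Frame class "Chemicals" mapped to a Java class (hypothetical shape). */
    public static final class Chemical {
        public final String id;

        public Chemical(String id) {
            this.id = id;
        }
    }

    /** Slot "Left" mapped to an association class: the slot value plus its annotations. */
    public static final class Left {
        public final Chemical value;     // slot value -> attribute value
        public final String compartment; // annotation name -> attribute name, annotation value -> its value

        public Left(Chemical value, String compartment) {
            this.value = value;
            this.compartment = compartment;
        }
    }

    /** Frame class "GeneralizedReactions" mapped to a Java class. */
    public static final class GeneralizedReaction {
        public final String id;
        public final List<Left> left = new ArrayList<>();

        public GeneralizedReaction(String id) {
            this.id = id;
        }
    }

    /** Builds the slide's example: RXN-1 with Left = Glucose, annotated Compartment = Cytoplasm. */
    public static GeneralizedReaction example() {
        GeneralizedReaction rxn = new GeneralizedReaction("RXN-1");
        rxn.left.add(new Left(new Chemical("glucose"), "Cytoplasm"));
        return rxn;
    }

    public static void main(String[] args) {
        GeneralizedReaction rxn = example();
        for (Left l : rxn.left) {
            System.out.println(rxn.id + " Left=" + l.value.id + " Compartment=" + l.compartment);
        }
    }
}
```

Making Left a class of its own, rather than a plain list of chemicals, is what leaves room for the annotation values attached to each slot value.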

Cyclone's pipeline - initial step: code and database generation

[Diagram: BioCyc frame classes → Schema Generator (mapping rules) → XML Schema → JAXB (XmlTools) → Java classes → Hibernate (O/R mapper) → RDBMS]

Cyclone's pipeline - daily step: data synchronization

[Diagram: BioCyc frame instances ↔ Oo/FRS Translator (Bio2Oo, Oo2Bio) ↔ Java objects ↔ Hibernate (O/R mapper) ↔ RDBMS]

Cyclone's architecture

[Diagram. Legend: Genoscope, software tools, data files.
BioCyc side: Ocelot (Oracle, MySQL); Lisp - Pathway Tools; JavaCyc.
Cyclone: SchemaGenerator; Oo/FRS Translator; ObjectModel; XmlTools.
Object-relational mapping: Hibernate to an RDBMS (MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …).
Object-XML mapping: JAXB to XML.]

Benefits of using Cyclone: a Java object API matching biological concepts

In your favorite IDE:

• access all functions of Cyclone objects with one click
• develop your own business methods

Benefits of using Cyclone: Query in OQL (Object Query Language)

Lisp: find all enzymes for which ATP is an inhibitor

    (defun atp-inhibits ()
      ;; We check every instance of the class
      (loop for x in (get-class-all-instances '|Enzymatic-Reactions|)
            ;; Test whether the INHIBITORS-ALL slot contains the compound frame ATP
            when (member-slot-value-p x 'INHIBITORS-ALL 'ATP)
            ;; Whenever the test is positive, we collect the value of the slot ENZYME.
            ;; The collected values are returned as a list once the loop terminates.
            collect (get-slot-value x 'ENZYME)))

    ;; Invoking the query
    (select-organism :org-id 'ECOLI)
    (atp-inhibits)

OQL is a powerful and easy-to-use SQL-like query language with special features dealing with complex objects, values and methods.

Benefits of using Cyclone: Query in OQL (Object Query Language)

PerlCyc: find all enzymes for which ATP is an inhibitor

    use perlcyc;
    my $cyc = perlcyc -> new("ECOLI");
    my @enzrxns = $cyc -> get_class_all_instances("|Enzymatic-Reactions|");
    # We check every instance of the class
    foreach my $er (@enzrxns) {
        # We test whether the INHIBITORS-ALL slot contains the compound frame ATP
        my $bool = $cyc -> member_slot_value_p($er, "Inhibitors-All", "Atp");
        if ($bool) {
            # Whenever the test is true (non-zero), we collect the value of the
            # slot ENZYME. The results are printed in the terminal.
            my $enz = $cyc -> get_slot_value($er, "Enzyme");
            print STDOUT "$enz\n";
        }
    }

Benefits of using Cyclone: Query in OQL (Object Query Language)

JavaCyc: find all enzymes for which ATP is an inhibitor

    import java.util.*;

    public class JavacycSample {
        public static void main(String[] args) {
            Javacyc cyc = new Javacyc("ECOLI");
            ArrayList enzrxns = cyc.getClassAllInstances("|Enzymatic-Reactions|");
            // Check every enzymatic reaction for ATP among its inhibitors
            for (int i = 0; i < enzrxns.size(); i++) {
                String er = (String) enzrxns.get(i);
                boolean bool = cyc.memberSlotValueP(er, "Inhibitors-All", "Atp");
                if (bool) {
                    String enz = cyc.getSlotValue(er, "Enzyme");
                    System.out.println(enz);
                }
            }
        }
    }

Benefits of using Cyclone: Query in OQL (Object Query Language)

Biowarehouse: find all enzymes for which ATP is an inhibitor

    SELECT DISTINCT DBID.xid
    FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,
         EnzReactionInhibitorActivator
    WHERE DataSet.name = 'EcoCyc'
      AND DataSet.wid = EnzymaticReaction.datasetwid
      AND EnzymaticReaction.proteinwid = Protein.wid
      AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid
      AND EnzReactionInhibitorActivator.compoundwid = Chemical.wid
      AND EnzReactionInhibitorActivator.inhibitoractivate = 'I'
      AND Chemical.name = 'ATP'
      AND DBID.otherwid = Protein.wid

Benefits of using Cyclone: Query in OQL (Object Query Language)

Cyclone: find all enzymes for which ATP is an inhibitor

    SELECT er FROM ENZYMATICREACTIONS er
    LEFT JOIN er.INHIBITORSALLInternal inhibitors
    WHERE er.ORGANISM.VALUE LIKE :organism
      AND 'Atp' = inhibitors.VALUE

HQL is fully object-oriented, understanding notions like inheritance, polymorphism and association.
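To show what the join and filter in the Cyclone query select, here is a minimal plain-Java sketch over in-memory objects. It uses no Hibernate and is not Cyclone code: the EnzymaticReaction class, its fields and the sample data are hypothetical, and the LIKE test is simplified to a case-insensitive comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InhibitorQuerySketch {

    /** Hypothetical stand-in for a mapped enzymatic-reaction object. */
    public static final class EnzymaticReaction {
        public final String frame;
        public final String organism;
        public final List<String> inhibitors;

        public EnzymaticReaction(String frame, String organism, List<String> inhibitors) {
            this.frame = frame;
            this.organism = organism;
            this.inhibitors = inhibitors;
        }
    }

    /** Returns the distinct frames of reactions of the given organism that list the inhibitor. */
    public static List<String> inhibitedBy(List<EnzymaticReaction> all,
                                           String organism, String inhibitor) {
        List<String> frames = new ArrayList<>();
        for (EnzymaticReaction er : all) {
            boolean sameOrganism = er.organism.equalsIgnoreCase(organism);
            // "join ... inhibitors where inhibitor = inhibitors.VALUE" becomes a membership test
            if (sameOrganism && er.inhibitors.contains(inhibitor) && !frames.contains(er.frame)) {
                frames.add(er.frame);
            }
        }
        return frames;
    }

    /** Hypothetical sample data; not real BioCyc content. */
    public static List<EnzymaticReaction> sampleData() {
        return Arrays.asList(
            new EnzymaticReaction("ENZRXN-A", "ecoli", Arrays.asList("ATP", "ADP")),
            new EnzymaticReaction("ENZRXN-B", "ecoli", Arrays.asList("NADH")),
            new EnzymaticReaction("ENZRXN-C", "acineto", Arrays.asList("ATP")));
    }

    public static void main(String[] args) {
        System.out.println(inhibitedBy(sampleData(), "ecoli", "ATP"));
    }
}
```

With HQL the same selection is evaluated by the database rather than by looping over every object in memory.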

Comparative Matrix

                               BioCyc                    Biowarehouse      Cyclone
Representation system          FRS                       Relational model  Object model
Storing system                 Ocelot (MySQL, Oracle)    MySQL             MySQL, Oracle, Sybase, PostgreSQL, …
Query                          SQL, Lisp (Perl, Java)    SQL               OQL, SQL, Java
API                            Lisp                      -                 Java
BioCyc compatibility           -                         not full          full (- class)
Import / export to BioCyc      -                         Yes / No          Yes / Yes
Independent of Pathway Tools   No                        Yes               Yes

What can you do with Cyclone? Bioinformatics applications

• Easy data extraction from BioCyc
  – to mix them with different types of data (clusters of transcription units, pathways, complexes, RegulonDB)
  – to build FBA models by
    • grouping pathways together for map design in Flux Analyser
    • assembling elementary bricks of the metabolic network based on reactions, genes and compounds found in BioCyc

• Easily link BioCyc to new systems
  – example: Genoscope's LIMS uses the BioCyc compounds repository as a language for media representation
  – algorithm for inferring protein complexes

• Modify or implement new housekeeping functions in a PGDB
  – add or update data from a given annotation database (e.g. MultiFun, cellular localization of gene products, PubMed links), currently not feasible with the PathoLogic format
  – delete or annotate pathways known not to exist, and keep that information unless there are new BioCyc updates

Ongoing work - Roadmap

• Extend the mapping rules between FRS and Java objects

• Future work: a software platform dedicated to FBA models
  – Easy management of FBA models (define reactions, objective functions, …)
  – Convenient handling of environments
  – Extensible perturbation framework (systematic mutants)
  – Pluggable analysis modules (topological analyses, phenotype predictions, …)

Open questions

• Are you convinced of the interest of Cyclone?
  – access BioCyc objects with Java
  – store your PGDBs in your favorite RDBMS
  – manipulate classes representing biological concepts
  – use an efficient object-relational mapping (Hibernate)
  – …

• If yes to at least one, how should Cyclone be delivered?
  – jar files
  – Eclipse project
  – Eclipse plugin
  – open a new SourceForge project

Supplementary Slides

Biowarehouse: another solution to query BioCyc data

Pipeline for the creation of Cyclone

Language    Manipulated concepts
XSD         ComplexType, Element, Attribute
Java        Class, Attribute, Association
SQL         Table, Row, Column
Lisp        Frame, Slot, Annotation

Application: complex inference

Application: graph extraction

Application: extraction of gene clusters by pathway

[Figure: a pathway extracted from AcinetoCyc, with a zoom on the genes involved in this pathway]

Cyclone: rules for mapping the world of XML Schema to the Java world: JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb

Cyclone: rules for mapping the Java world to the SQL world: Hibernate

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom - including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.

Cyclone: pipeline for the creation of …

[Diagram: FRS (BioCyc), XML, XSD, Java, Hibernate, SQL and Cyclone, connected by numbered steps (0, 1, 1a, 1b, 2, 2a, 2b, 3, 4, 4a, X) that use JavaCyc, JaxB and HyperJaxB. Legend: made in Genoscope, software tools, data files.]

Pipeline steps listed on the slide:

• Extension of JavaCyc: getAnnot()
• Query of the PGDB to build an XML schema
• From XSD to Java classes
• From Java classes to Hibernate mapping
• From Hibernate mapping to SQL design
• Import all frames of a PGDB

User steps:

1. Populate the MySQL database and Java classes
2. Upload: save it in MySQL
3. Querying and modification of your PGDB
4. Upload your Cyclone into BioCyc
   4a. Extension of JavaCyc: createFrame()

(Note from FLF: careful, I am mixing up the pipeline side and the user side.)

Cyclone: rules for mapping the world of frames to the world of XML Schema

    (FRAMEID NIL ((SLOT SValue)
                  ((SLOT SValue Annotation Avalue))
    )

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
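As an illustration of the frame-to-XML-Schema rules, the following Java sketch writes one frame as an XML instance shaped like FRAME-CLASS and SLOT-TYPE. It is an assumption-laden sketch, not Cyclone's marshaller (which relies on JAXB): the root element name FRAME, the Slot helper class and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class FrameXmlSketch {

    /** One SLOT entry: an SValue and an optional Annotation (null when absent, cf. minOccurs=0). */
    public static final class Slot {
        public final String sValue;
        public final String annotation;

        public Slot(String sValue, String annotation) {
            this.sValue = sValue;
            this.annotation = annotation;
        }
    }

    /** Escapes the characters that are special in XML text content. */
    static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Serializes a frame: FRAMEID first, then one SLOT element per slot, in order. */
    public static String toXml(String frameId, List<Slot> slots) {
        StringBuilder sb = new StringBuilder();
        sb.append("<FRAME>"); // root element name is an assumption
        sb.append("<FRAMEID>").append(esc(frameId)).append("</FRAMEID>");
        for (Slot s : slots) {
            sb.append("<SLOT>");
            sb.append("<SValue>").append(esc(s.sValue)).append("</SValue>");
            if (s.annotation != null) {
                sb.append("<Annotation>").append(esc(s.annotation)).append("</Annotation>");
            }
            sb.append("</SLOT>");
        }
        sb.append("</FRAME>");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Slot> slots = Arrays.asList(new Slot("Glucose", "Cytoplasm"), new Slot("ATP", null));
        System.out.println(toXml("RXN-1", slots));
    }
}
```

The element order follows the xs:sequence declarations, and the Annotation element is simply omitted when a slot value has none.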

Cyclone: BioCyc FRS into XML schema

Cyclone: GeneralizedReactions FRS into XML schema

Cyclone: sample of a frame in XML

[Figure labels: PGDB; Frame; slot for inheritance management; Slot-SValue; Annot-AValue]


Page 7: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

7Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What is Cyclone

bull Cyclone is an Object Oriented Cyc (api design to manipulate biocycbiological object )

bull Cyclone is a pipeline

bull Cyclone is a relational database in your favorite RDBMS

bull Cyclone is an import export of Biocyc in xml or in your database

8Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

From Frame hellip hellip to Java classes

CycloneBioCycFrame

SlotValue

Annotation1

Annotation1Value

RXN-1

Left

Glucose

Compartment

Cytoplasm

SlotValue

Parents

GeneralizedReactions

Left

Value=GlucoseCompartment=Cytoplasmhellip

GeneralizedReactions

Id=RXN-1hellip

Chemicals

Id=glucosehellip

Slot hellip

Distinct Frame-Slot-Parents ClassSlot Association classSlot Value Attribut valueAnnotation Attribut nameAnnotationvalueAttribut value

Mapping rules

9Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc Rdbms

Schema Generator

Mappingrules

JaxB

XmlTools

XmlSchema

Hibernate

ORmapper

FrameClassesFrame

ClassesFrame

Classes

JAVAClasses

Cyclonersquos pipeline - initial step code and database generation

10Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc RDBMSJAVAObject

Hibernate

ORmapper

Oo FRSTranslator

Bio2Oo

Oo2Bio

FrameInstanceFrame

InstanceFrame

Instance

Cyclonersquos pipeline - daily step Data Synchronization

11Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclonersquos architecture

RDBMS

Hibernate

Xml

Jaxb

GenoscopeSoftware toolsData Files

MySQL DB2 PostgreSQLOracle Sybase HSQLMicrosoft SQL server hellip

Ocelot Oracle MySQL

JavaCyc

Lisp - Pathway Tools

Object - relational mapping Object - xmlmapping

SchemaGenerator

Oo FRSTranslator

ObjectModel

XmlTools

Cyclone

12Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Java Object Api matching biological concept

in your favorite ide access with one click to all functions of Cyclone objectsdevelopp your own business methods

13Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 (defun atp-inhibits () We check every instance of the class

2 (loop for x in (get-class-all-instances |Enzymatic-Reactions|) Test for whether the INHIBITORS-ALL slot contains the

compound frame ATP3 when (member-slot-value-p x INHIBITORS-ALL ATP)

Whenever the test is positive we collect the value of theslot ENZYME The collected values are returned as a list once

the loop terminates4 collect (get-slot-value x rsquoENZYME))5 )

invoking the query6 (select-organism org-id rsquoECOLI)7 (atp-inhibits)

Lisp Find all enzym for which ATP is an inhibitor

OQL is a powerful and easy-to-use SQL-like query language with special featuresdealing with complex objects values and methods

14Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

1 use perlcyc2 my $cyc = perlcyc -gt new(ECOLI)3 my enzrxns = $cyc -gt get_class_all_instances(|Enzymatic-

Reactions|) We check every instance of the class

4 foreach my $er (enzrxns) We test for whether the INHIBITORS-ALL slot contains the

compound frame ATP5 my $bool = $cyc -gt member_slot_value_p($erInhibitors-

All Atp)6 if ($bool)

Whenever the test is =0 we collect the value of the slot ENZYME The results are printed in theterminal

7 my $enz = $cyc -gt get_slot_value($er Enzyme)8 print STDOUT $enzn9 10

Beneficits of using Cyclone Query in OQL (Object Query Language)

PerlCyc Find all enzym for which ATP is an inhibitor

15Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 import javautil 2 public class JavacycSample 3 public static void main(String[] args) 4 Javacyc cyc = new Javacyc(ECOLI) 5 ArrayList enzrxns = cycgetClassAllInstances(|Enzymatic-

Reactions|) 6 for (int i = 0 i lt enzrxnssize() i++) 7 String er = (String)enzrxnsget(i) 8 boolean bool = cycmemberSlotValueP(er Inhibitors-

AllAtp) 9 if (bool) 10 String enz = cycgetSlotValue(er Enzyme)9 Systemoutprintln(enz) 11 12 13 14

JavaCyc Find all enzym for which ATP is an inhibitor

16Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT DISTINCT DBIDxid2 FROM DBID Protein EnzymaticReaction Chemical DataSet

EnzReactionInhibitorActivator3 WHERE DataSetname=EcoCyc4 AND DataSetwid=EnzymaticReactiondatasetwid5 AND EnzymaticReactionproteinwid = Proteinwid6 AND EnzymaticReactionwid =

EnzReactionInhibitorActivatorenzymaticreactionwid7 AND EnzReactionInhibitorActivatorcompoundwid=Chemicalwid8 AND EnzReactionInhibitorActivatorinhibitoractivate=I9 AND Chemicalname=ATP10 AND DBIDotherwid = Proteinwid

Biowarehouse Find all enzym for which ATP is an inhibitor

17Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT er FROM ENZYMATICREACTIONS er2 LEFT JOIN erINHIBITORSALLInternal inhibitors3 WHERE erORGANISMVALUE LIKE organism

AND 4 Atp = inhibitorsVALUE

Cyclone Find all enzym for which ATP is an inhibitor

HQL is fully object-oriented understanding notions like inheritence polymorphism and association

18Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Comparative Matrix

YesYesNoIndependent of Pathway tools

Yes YesYes No-Import Export to BioCyc

full (- class)not full-BioCycCompatibilty

Java-LispAPI

OQL SQLJava

SQLSQLLisp

(perljava)Query

mySQL Oracle Sybase PostgreSQL hellip

mySQLmySQLoracle

OcelotStoring System

Object modelRelational modelFRSRepresentationSystem

CycloneBiowarehouseBioCyc

19Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What can you do with Cyclone Bioinformatics applications

bull Easy data extraction from BioCycndash to mix them with different types of data (clusters of transcription units pathways

complexes regulon db)ndash to build FBA models by

bull grouping together Pathways for map design in Flux Analyserbull assemble elementary bricks of the metabolic network based on reactions genes

compounds found in BioCyc

bull Link easily Biocyc to new systems ndash example Genoscopersquos LIMS use BioCyc compounds repository as a language

of media representationndash algorithm for inferrence protein complexes

bull Modify or implement new housekeeping functions in a PGDBndash add updates data from a given annotation database (eg multifun cellular

localization of genes products pubmed links) actualy not feasible with pathologic format

ndash delete annotate pathways known not to exist and keep those informationsunless new BioCyc updates

20Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ongoing work - Roadmap

bull Extends the mapping rule between FRS and Java Object

bull Future work a software plateform dedicated to FBA models ndash Easy management of FBA models (define reactions objective

functionshellip)ndash Convenient handling of environmentsndash Extensible perturbation framework (systematic mutants)ndash Pluggable analysis modules (topological analyses phenotype

predictionshellip)

21Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Open questions

bull Are you convince of this interest of Cyclonendash access with java to biocyc objectndash store your PGDBs in your favorite rdbmsndash manipulate classes representing biological conceptndash use efficient mapping object -relational (Hibernate)ndash hellip

bull If yes to at least one how to deliver Cyclonendash jar filesndash eclipse projectndash eclipse pluginndash open a new sourceforge project

22Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Supplementary Slides

Cyclone

supplementaryslides

23Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Biowarehouse another solution to query biocyc data

supplementaryslides

24Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Language

ManipulatedConcept

ComplexeTypeElementAttribut

XSD

ClassAttributAssociation

JAVA

TableRowColumn

SQL

FrameSlotAnnotation

LISP

Pipeline for the creation of Cyclone

supplementaryslides

25Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application complexe inference

supplementaryslides

26Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction of graph

supplementaryslides

27Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction de clusters genes by pathway

pathway extracted from AcinetoCyc

zoom on genes involved in this pathway

supplementaryslides

28Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of xml schema to java world JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bindan XML schema to a representation in Java code This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself

httpjavasuncomwebservicesjaxb

supplementaryslides

29Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of java world to sql world HIBERNATE

httpwwwhibernateorg

Hibernate is a powerful ultra-high performance objectrelational persistence and query service for Java Hibernate lets you develop persistent classes following common Java idiom - including association inheritance polymorphism composition and the Java collections framework Hibernate allows you to express queries in its own portable SQL extension (HQL) as well as in native SQL or with Java-based Criteria and Example objects Unlike many other persistence solutions Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always

supplementaryslides

Cyclone: pipeline for the creation of … (supplementary slide)

[Diagram components: FRS (BioCyc), JavaCyc, XML, XSD, JAXB, HyperJaxb, Hibernate, SQL, Cyclone. Step labels on the diagram: 0, 1a, 1b, 2a, 2b, 3, 4, X. Legend: made in Genoscope; software tools; data files.]

Step descriptions shown on the diagram:
• Extension of JavaCyc: getAnnot()
• Query of the PGDB to build an XML schema
• From XSD to Java classes
• From Java classes to Hibernate mapping
• From Hibernate mapping to SQL design
• Import all frames of a PGDB
• Save it in MySQL

Numbered steps:
1. Populate the MySQL database and the Java classes
2. Upload
3. Querying and modification of your PGDB
4. Upload your Cyclone into BioCyc
   4a. Extension of JavaCyc: createFrame()

Author's note (FLF): careful, I am mixing up the pipeline side and the user side.

Cyclone: rules for mapping the frame world to the XML Schema world (supplementary slide)

    (FRAMEID NIL
      ((SLOT SValue)
       (SLOT SValue Annotation AValue))
    )

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
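To make the frame-to-XML rule concrete, here is a minimal sketch that renders one frame as an XML element shaped like the FRAME-CLASS type above. It is not Cyclone's generator: the class FrameXmlSketch, the helper type Slot, and the root element name FRAME are assumptions made for illustration only, and the sample values are borrowed from the RXN-1 example used elsewhere in the talk.

```java
import java.util.ArrayList;
import java.util.List;

public class FrameXmlSketch {

    /** One slot value, optionally carrying an annotation (hypothetical helper type). */
    static final class Slot {
        final String sValue;
        final String annotation; // null when the slot value has no annotation

        Slot(String sValue, String annotation) {
            this.sValue = sValue;
            this.annotation = annotation;
        }
    }

    /** Escapes the characters that cannot appear literally in XML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Renders a frame as FRAMEID followed by any number of SLOT elements. */
    static String toXml(String frameId, List<Slot> slots) {
        StringBuilder xml = new StringBuilder();
        xml.append("<FRAME>"); // element name is an assumption; the slide only names the type
        xml.append("<FRAMEID>").append(escape(frameId)).append("</FRAMEID>");
        for (Slot slot : slots) {
            xml.append("<SLOT>");
            xml.append("<SValue>").append(escape(slot.sValue)).append("</SValue>");
            if (slot.annotation != null) { // Annotation has minOccurs="0": emit only when present
                xml.append("<Annotation>").append(escape(slot.annotation)).append("</Annotation>");
            }
            xml.append("</SLOT>");
        }
        xml.append("</FRAME>");
        return xml.toString();
    }

    public static void main(String[] args) {
        List<Slot> slots = new ArrayList<>();
        slots.add(new Slot("Glucose", "Cytoplasm"));
        slots.add(new Slot("ATP", null));
        System.out.println(toXml("RXN-1", slots));
    }
}
```

In the pipeline described in the talk, the Java classes are generated from the schema by JAXB rather than written by hand; the sketch only shows the shape of the data the schema describes.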

Cyclone: BioCyc FRS into XML schema (supplementary slide)

Cyclone: GeneralizedReactions FRS into XML schema (supplementary slide)

Cyclone: sample of a frame in XML (supplementary slide)

[Figure: XML sample with callouts for PGDB, Frame, slot for inheritance management, Slot/SValue, and Annot/AValue]

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes (supplementary slide)

How can we capture:
• growth medium
• mutant strain
• growth phenotype

BioCyc's existing database schema (supplementary slide)

char_slot_values
    kb                INT(4)
    instance          VARCHAR(80) BINARY
    slotname          VARCHAR(80) BINARY
    slotvalue         VARCHAR(100)
    sequence          INT(4)
    key markers, in diagram order: <pk,fk> <pk> <pk,fk> <pk>

numeric_slot_values
    kb                INT(4)
    instance          VARCHAR(80) BINARY
    slotname          VARCHAR(80) BINARY
    slotvalue         NUMERIC(20,10)
    sequence          INT(4)
    key markers, in diagram order: <pk,fk> <pk> <pk,fk> <pk>

large_slot_values
    kb                INT(4)
    instance          VARCHAR(80) BINARY
    slotname          VARCHAR(80) BINARY
    slotvalue         TEXT
    sequence          INT(4)
    key markers, in diagram order: <pk,fk> <pk> <pk,fk> <pk>

frames
    kb                INT(4)
    frame             VARCHAR(80) BINARY
    type              BIT
    definition        TEXT
    sequence          INT(4)
    numparents        INT(4)
    tid               INT(4)
    key markers, in diagram order: <pk> <pk> <pk>

class_supers
    kb                INT(4)
    class             VARCHAR(80) BINARY
    super             VARCHAR(80) BINARY
    key markers, in diagram order: <pk> <pk> <pk>

instance_types
    kb                INT(4)
    instance          VARCHAR(80) BINARY
    type              VARCHAR(80) BINARY
    key markers, in diagram order: <pk> <pk> <pk>

indexed_slots
    kb                INT(4)
    slotname          VARCHAR(80) BINARY
    key markers, in diagram order: <pk> <pk>

kbs
    kbid              INT(4)
    kbname            VARCHAR(20)
    kbdata            VARCHAR(255)
    lock_session      INT(4)
    lock_time         DATETIME
    lock_user         VARCHAR(20)

logtable
    kbid              INT(4)
    tid               INT(4)
    opnno             INT(4)
    seqno             INT(4)
    frame             VARCHAR(80)
    logrec            VARCHAR(255)
    key markers, in diagram order: <pk,fk1,fk2> <pk,fk1,fk2> <pk> <pk> <fk2>

txntable
    kbid              INT(4)
    tid               INT(4)
    userid            VARCHAR(40)
    nopns             INT(4)
    begin_timestamp   INT(4) UNSIGNED
    merge_timestamp   INT(4) UNSIGNED
    key markers, in diagram order: <pk> <pk>

deleted_frames
    kbid              INT(4)
    frame             VARCHAR(80) BINARY
    tid               INT(4)
    key markers, in diagram order: <pk,fk> <pk> <pk,fk>

Foreign-key links on the diagram: FK_REFERENCE_1 to FK_REFERENCE_6.
Table groupings on the diagram: frame model representation storage; managing multi-user concurrent editing; edit history.

Pros (+):
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

Cons (-):
• Does not allow querying BioCyc outside the Pathway Tools software
• No tables corresponding to individual biological concepts
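The second drawback can be illustrated with a small sketch. Because every slot value sits in one generic table, a biological question such as "which enzymes are inhibited by ATP" becomes two passes over the same rows (a self-join in SQL terms). The code below models rows with the columns of char_slot_values in memory; the class names, the slot names stored as plain strings, and the sample rows are all assumptions made for illustration, not data or code from Pathway Tools.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class GenericSlotTableSketch {

    /** One row of a char_slot_values-like table (column names taken from the slide). */
    static final class SlotRow {
        final int kb;
        final String instance;
        final String slotname;
        final String slotvalue;
        final int sequence;

        SlotRow(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Invented sample rows, for illustration only. */
    static List<SlotRow> sampleRows() {
        List<SlotRow> rows = new ArrayList<>();
        rows.add(new SlotRow(1, "ENZRXN-1", "ENZYME", "ENZ-A", 0));
        rows.add(new SlotRow(1, "ENZRXN-1", "INHIBITORS-ALL", "ATP", 0));
        rows.add(new SlotRow(1, "ENZRXN-2", "ENZYME", "ENZ-B", 0));
        rows.add(new SlotRow(1, "ENZRXN-2", "INHIBITORS-ALL", "ADP", 0));
        return rows;
    }

    /** Finds ENZYME slot values of every instance whose INHIBITORS-ALL slot holds the inhibitor. */
    static List<String> enzymesInhibitedBy(List<SlotRow> rows, String inhibitor) {
        // First pass: instances whose INHIBITORS-ALL slot contains the inhibitor.
        Set<String> inhibited = new LinkedHashSet<>();
        for (SlotRow row : rows) {
            if (row.slotname.equals("INHIBITORS-ALL") && row.slotvalue.equals(inhibitor)) {
                inhibited.add(row.instance);
            }
        }
        // Second pass over the same table: ENZYME values of those instances.
        List<String> enzymes = new ArrayList<>();
        for (SlotRow row : rows) {
            if (row.slotname.equals("ENZYME") && inhibited.contains(row.instance)) {
                enzymes.add(row.slotvalue);
            }
        }
        return enzymes;
    }

    public static void main(String[] args) {
        System.out.println(enzymesInhibitedBy(sampleRows(), "ATP"));
    }
}
```

Nothing in the table layout says that ENZYME or INHIBITORS-ALL are meaningful; that knowledge lives in the query, which is the gap the Cyclone object model is meant to close.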

From FRS to Java classes (supplementary slide)

[Diagram. BioCyc side: the frame model, with PGDB, FRAME, SLOT, S-VALUE, FACET, ANNOTATION and A-VALUE. Cyclone side: classes such as Generalized Reactions, Polymer Segments, Chemicals, …, with the Left slot carrying Value, Compartment and Coefficient.]

BioCyc and Cyclone's architecture (supplementary slide)

[Diagram. BioCyc side: Ocelot storage (Oracle, MySQL); Pathway Tools (Lisp, Generic Frame Protocol) with its X server and web server; PerlCyc and JavaCyc. Cyclone side: JAXB and HyperJaxb (XML), Hibernate, and storage in MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …. Tools around Cyclone: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, MaGe, …, for graph analysis, cluster analysis and flux balance analysis. Legend: made in Genoscope; software tools; data files.]

Cyclone's architecture

[Diagram. Cyclone components: Schema Generator, BioCyc (un)marshaller, object model, XML tools. Tools around Cyclone: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, MaGe, …, for graph analysis, cluster analysis and flux balance analysis. Legend: Genoscope; software tools; data files.]

A real query in OQL

    // Needs org.hibernate.Query, org.hibernate.HibernateException and java.util.List;
    // HibernateUtil and keySession are defined elsewhere in the project.
    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where er.ORGANISM.VALUE like :organism "
                + "and :inhibitor = inhibitors.VALUE");
            // Bind the two named parameters of the query
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            // Print the frame and the first enzyme of each inhibited enzymatic reaction
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }

From frame … to Java classes

[Diagram. BioCyc side: a frame with its slots, slot values, annotations and annotation values. Example: frame RXN-1, parent GeneralizedReactions, slot Left with value Glucose, annotation Compartment with value Cytoplasm. Cyclone side: a GeneralizedReactions object (Id=RXN-1, …) linked through a Left object (Value=Glucose, Compartment=Cytoplasm, …) to a Chemicals object (Id=glucose, …).]

Mapping rules:
• Distinct frame-slot-parents → class
• Slot → association class
• Slot value → attribute "value"
• Annotation → attribute name
• Annotation value → attribute value
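The mapping rules can be sketched in plain Java using the RXN-1 example from the slide. This is a hand-written illustration under stated assumptions, not the code Cyclone generates: the outer class MappingRulesSketch, the field names and the describe helper are hypothetical, and only the names GeneralizedReactions, Left, Chemicals, Value and Compartment come from the slide.

```java
import java.util.ArrayList;
import java.util.List;

public class MappingRulesSketch {

    /** Class for frames whose parent is Chemicals (rule: distinct frame-slot-parents -> class). */
    static final class Chemicals {
        final String id;

        Chemicals(String id) {
            this.id = id;
        }
    }

    /** Association class for the slot Left (rule: slot -> association class). */
    static final class Left {
        final Chemicals value;    // rule: slot value -> attribute "value"
        final String compartment; // rule: annotation name -> attribute name, annotation value -> attribute value

        Left(Chemicals value, String compartment) {
            this.value = value;
            this.compartment = compartment;
        }
    }

    /** Class for frames whose parent is GeneralizedReactions. */
    static final class GeneralizedReactions {
        final String id;
        final List<Left> left = new ArrayList<>();

        GeneralizedReactions(String id) {
            this.id = id;
        }
    }

    /** Builds a one-line summary of a reaction and its left-hand side. */
    static String describe(GeneralizedReactions rxn) {
        StringBuilder text = new StringBuilder(rxn.id);
        for (Left l : rxn.left) {
            text.append(" left=").append(l.value.id).append("@").append(l.compartment);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        GeneralizedReactions rxn = new GeneralizedReactions("RXN-1");
        rxn.left.add(new Left(new Chemicals("glucose"), "Cytoplasm"));
        System.out.println(describe(rxn));
    }
}
```

Modelling Left as its own class rather than a plain list of Chemicals is what gives the annotation (here, the compartment) a place to live.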

Cyclone's pipeline - initial step: code and database generation

[Diagram: BioCyc frame classes → Schema Generator (mapping rules) → XML schema → JAXB (XML tools) → Java classes → Hibernate (O/R mapper) → RDBMS]

Cyclone's pipeline - daily step: data synchronization

[Diagram: BioCyc frame instances ↔ Oo FRS Translator (Bio2Oo, Oo2Bio) ↔ Java objects ↔ Hibernate (O/R mapper) ↔ RDBMS]

Cyclone's architecture

[Diagram. BioCyc side: Lisp - Pathway Tools over Ocelot (Oracle, MySQL), accessed through JavaCyc. Cyclone components: Schema Generator, Oo FRS Translator, object model, XML tools. Object-XML mapping through JAXB; object-relational mapping through Hibernate to an RDBMS (MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …). Legend: Genoscope; software tools; data files.]

Benefits of using Cyclone: a Java object API matching biological concepts

• In your favorite IDE, access all functions of Cyclone objects with one click
• Develop your own business methods

Benefits of using Cyclone: query in OQL (Object Query Language)

Lisp: find all enzymes for which ATP is an inhibitor

    (defun atp-inhibits ()
      ;; We check every instance of the class
      (loop for x in (get-class-all-instances '|Enzymatic-Reactions|)
            ;; Test for whether the INHIBITORS-ALL slot contains the compound frame ATP
            when (member-slot-value-p x 'INHIBITORS-ALL 'ATP)
            ;; Whenever the test is positive, we collect the value of the slot ENZYME.
            ;; The collected values are returned as a list once the loop terminates.
            collect (get-slot-value x 'ENZYME))
      )

    ;; invoking the query
    (select-organism :org-id 'ECOLI)
    (atp-inhibits)

OQL is a powerful and easy-to-use SQL-like query language with special features dealing with complex objects, values and methods.

Benefits of using Cyclone: query in OQL (Object Query Language)

PerlCyc: find all enzymes for which ATP is an inhibitor

    use perlcyc;
    my $cyc = perlcyc -> new("ECOLI");
    my @enzrxns = $cyc -> get_class_all_instances("|Enzymatic-Reactions|");
    # We check every instance of the class
    foreach my $er (@enzrxns) {
        # We test for whether the INHIBITORS-ALL slot contains the compound frame ATP
        my $bool = $cyc -> member_slot_value_p($er, "Inhibitors-All", "Atp");
        if ($bool) {
            # Whenever the test is non-zero, we collect the value of the slot ENZYME.
            # The results are printed in the terminal.
            my $enz = $cyc -> get_slot_value($er, "Enzyme");
            print STDOUT "$enz\n";
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

JavaCyc: find all enzymes for which ATP is an inhibitor

    import java.util.*;

    public class JavacycSample {
        public static void main(String[] args) {
            Javacyc cyc = new Javacyc("ECOLI");
            ArrayList enzrxns = cyc.getClassAllInstances("|Enzymatic-Reactions|");
            for (int i = 0; i < enzrxns.size(); i++) {
                String er = (String) enzrxns.get(i);
                boolean bool = cyc.memberSlotValueP(er, "Inhibitors-All", "Atp");
                if (bool) {
                    String enz = cyc.getSlotValue(er, "Enzyme");
                    System.out.println(enz);
                }
            }
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

Biowarehouse: find all enzymes for which ATP is an inhibitor

    SELECT DISTINCT DBID.xid
    FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,
         EnzReactionInhibitorActivator
    WHERE DataSet.name = 'EcoCyc'
      AND DataSet.wid = EnzymaticReaction.datasetwid
      AND EnzymaticReaction.proteinwid = Protein.wid
      AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid
      AND EnzReactionInhibitorActivator.compoundwid = Chemical.wid
      AND EnzReactionInhibitorActivator.inhibitoractivate = 'I'
      AND Chemical.name = 'ATP'
      AND DBID.otherwid = Protein.wid

Benefits of using Cyclone: query in OQL (Object Query Language)

Cyclone: find all enzymes for which ATP is an inhibitor

    SELECT er FROM ENZYMATICREACTIONS er
    LEFT JOIN er.INHIBITORSALLInternal inhibitors
    WHERE er.ORGANISM.VALUE LIKE :organism
      AND 'Atp' = inhibitors.VALUE

HQL is fully object-oriented, understanding notions like inheritance, polymorphism and association.
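For readers unfamiliar with HQL, the query navigates objects rather than tables: it keeps the enzymatic reactions of one organism whose inhibitor collection contains ATP. The sketch below expresses the same selection over an in-memory list. It is only an analogy under stated assumptions: the class EnzymaticReaction, its fields and the sample data are hypothetical, LIKE without wildcards is treated as plain equality, and each reaction is returned once (as with the "select distinct" form of the query).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InhibitorQuerySketch {

    /** Simplified stand-in for a Cyclone enzymatic-reaction object (hypothetical fields). */
    static final class EnzymaticReaction {
        final String frame;
        final String organism;
        final String enzyme;
        final List<String> inhibitorsAll;

        EnzymaticReaction(String frame, String organism, String enzyme, List<String> inhibitorsAll) {
            this.frame = frame;
            this.organism = organism;
            this.enzyme = enzyme;
            this.inhibitorsAll = inhibitorsAll;
        }
    }

    /** Invented sample objects, for illustration only. */
    static List<EnzymaticReaction> sample() {
        List<EnzymaticReaction> all = new ArrayList<>();
        all.add(new EnzymaticReaction("ENZRXN-1", "ecoli", "ENZ-A", Arrays.asList("ATP", "ADP")));
        all.add(new EnzymaticReaction("ENZRXN-2", "ecoli", "ENZ-B", Arrays.asList("ADP")));
        all.add(new EnzymaticReaction("ENZRXN-3", "acineto", "ENZ-C", Arrays.asList("ATP")));
        return all;
    }

    /** Keeps the reactions of the given organism whose inhibitors include the given compound. */
    static List<EnzymaticReaction> inhibitedBy(List<EnzymaticReaction> all, String organism, String inhibitor) {
        List<EnzymaticReaction> result = new ArrayList<>();
        for (EnzymaticReaction er : all) {
            if (er.organism.equals(organism) && er.inhibitorsAll.contains(inhibitor)) {
                result.add(er);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (EnzymaticReaction er : inhibitedBy(sample(), "ecoli", "ATP")) {
            System.out.println(er.frame + "\t" + er.enzyme);
        }
    }
}
```

With Hibernate the filtering runs in the database instead of in memory, but the query is written against the same object attributes.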

Comparative matrix

                              | Cyclone                              | Biowarehouse     | BioCyc
Representation system         | Object model                         | Relational model | FRS
Storing system                | MySQL, Oracle, Sybase, PostgreSQL, … | MySQL            | Ocelot (MySQL, Oracle)
Query                         | OQL, SQL, Java                       | SQL              | SQL, Lisp (Perl, Java)
API                           | Java                                 | -                | Lisp
BioCyc compatibility          | full (- class)                       | not full         | -
Import / export to BioCyc     | Yes / Yes                            | Yes / No         | -
Independent of Pathway Tools  | Yes                                  | Yes              | No

What can you do with Cyclone? Bioinformatics applications

• Easy data extraction from BioCyc
  – to mix it with different types of data (clusters of transcription units, pathways, complexes, RegulonDB)
  – to build FBA models by
    • grouping pathways together for map design in Flux Analyser
    • assembling elementary bricks of the metabolic network based on reactions, genes and compounds found in BioCyc
• Easily link BioCyc to new systems
  – example: Genoscope's LIMS uses the BioCyc compounds repository as a language for representing media
  – algorithm for inferring protein complexes
• Modify or implement new housekeeping functions in a PGDB
  – add or update data from a given annotation database (e.g. MultiFun, cellular localization of gene products, PubMed links), which is currently not feasible with the PathoLogic format
  – delete or annotate pathways known not to exist, and keep that information across new BioCyc updates

Ongoing work - roadmap

• Extend the mapping rules between FRS and Java objects
• Future work: a software platform dedicated to FBA models
  – Easy management of FBA models (define reactions, objective functions, …)
  – Convenient handling of environments
  – Extensible perturbation framework (systematic mutants)
  – Pluggable analysis modules (topological analyses, phenotype predictions, …)

Open questions

• Are you convinced of the interest of Cyclone?
  – access BioCyc objects with Java
  – store your PGDBs in your favorite RDBMS
  – manipulate classes representing biological concepts
  – use efficient object-relational mapping (Hibernate)
  – …
• If yes to at least one, how should Cyclone be delivered?
  – jar files
  – Eclipse project
  – Eclipse plugin
  – open a new SourceForge project

Cyclone: supplementary slides

Biowarehouse: another solution to query BioCyc data (supplementary slide)

Pipeline for the creation of Cyclone (supplementary slide)

Language | Manipulated concepts
XSD      | ComplexType, Element, Attribute
JAVA     | Class, Attribute, Association
SQL      | Table, Row, Column
LISP     | Frame, Slot, Annotation


Page 9: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

9Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc Rdbms

Schema Generator

Mappingrules

JaxB

XmlTools

XmlSchema

Hibernate

ORmapper

FrameClassesFrame

ClassesFrame

Classes

JAVAClasses

Cyclonersquos pipeline - initial step code and database generation

10Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc RDBMSJAVAObject

Hibernate

ORmapper

Oo FRSTranslator

Bio2Oo

Oo2Bio

FrameInstanceFrame

InstanceFrame

Instance

Cyclonersquos pipeline - daily step Data Synchronization

11Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclonersquos architecture

RDBMS

Hibernate

Xml

Jaxb

GenoscopeSoftware toolsData Files

MySQL DB2 PostgreSQLOracle Sybase HSQLMicrosoft SQL server hellip

Ocelot Oracle MySQL

JavaCyc

Lisp - Pathway Tools

Object - relational mapping Object - xmlmapping

SchemaGenerator

Oo FRSTranslator

ObjectModel

XmlTools

Cyclone

12Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Java Object Api matching biological concept

in your favorite ide access with one click to all functions of Cyclone objectsdevelopp your own business methods

13Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 (defun atp-inhibits () We check every instance of the class

2 (loop for x in (get-class-all-instances |Enzymatic-Reactions|) Test for whether the INHIBITORS-ALL slot contains the

compound frame ATP3 when (member-slot-value-p x INHIBITORS-ALL ATP)

Whenever the test is positive we collect the value of theslot ENZYME The collected values are returned as a list once

the loop terminates4 collect (get-slot-value x rsquoENZYME))5 )

invoking the query6 (select-organism org-id rsquoECOLI)7 (atp-inhibits)

Lisp Find all enzym for which ATP is an inhibitor

OQL is a powerful and easy-to-use SQL-like query language with special featuresdealing with complex objects values and methods

14Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

1 use perlcyc2 my $cyc = perlcyc -gt new(ECOLI)3 my enzrxns = $cyc -gt get_class_all_instances(|Enzymatic-

Reactions|) We check every instance of the class

4 foreach my $er (enzrxns) We test for whether the INHIBITORS-ALL slot contains the

compound frame ATP5 my $bool = $cyc -gt member_slot_value_p($erInhibitors-

All Atp)6 if ($bool)

Whenever the test is =0 we collect the value of the slot ENZYME The results are printed in theterminal

7 my $enz = $cyc -gt get_slot_value($er Enzyme)8 print STDOUT $enzn9 10

Beneficits of using Cyclone Query in OQL (Object Query Language)

PerlCyc Find all enzym for which ATP is an inhibitor

15Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 import javautil 2 public class JavacycSample 3 public static void main(String[] args) 4 Javacyc cyc = new Javacyc(ECOLI) 5 ArrayList enzrxns = cycgetClassAllInstances(|Enzymatic-

Reactions|) 6 for (int i = 0 i lt enzrxnssize() i++) 7 String er = (String)enzrxnsget(i) 8 boolean bool = cycmemberSlotValueP(er Inhibitors-

AllAtp) 9 if (bool) 10 String enz = cycgetSlotValue(er Enzyme)9 Systemoutprintln(enz) 11 12 13 14

JavaCyc Find all enzym for which ATP is an inhibitor

16Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT DISTINCT DBIDxid2 FROM DBID Protein EnzymaticReaction Chemical DataSet

EnzReactionInhibitorActivator3 WHERE DataSetname=EcoCyc4 AND DataSetwid=EnzymaticReactiondatasetwid5 AND EnzymaticReactionproteinwid = Proteinwid6 AND EnzymaticReactionwid =

EnzReactionInhibitorActivatorenzymaticreactionwid7 AND EnzReactionInhibitorActivatorcompoundwid=Chemicalwid8 AND EnzReactionInhibitorActivatorinhibitoractivate=I9 AND Chemicalname=ATP10 AND DBIDotherwid = Proteinwid

Biowarehouse Find all enzym for which ATP is an inhibitor

17Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT er FROM ENZYMATICREACTIONS er2 LEFT JOIN erINHIBITORSALLInternal inhibitors3 WHERE erORGANISMVALUE LIKE organism

AND 4 Atp = inhibitorsVALUE

Cyclone Find all enzym for which ATP is an inhibitor

HQL is fully object-oriented understanding notions like inheritence polymorphism and association

18Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Comparative Matrix

YesYesNoIndependent of Pathway tools

Yes YesYes No-Import Export to BioCyc

full (- class)not full-BioCycCompatibilty

Java-LispAPI

OQL SQLJava

SQLSQLLisp

(perljava)Query

mySQL Oracle Sybase PostgreSQL hellip

mySQLmySQLoracle

OcelotStoring System

Object modelRelational modelFRSRepresentationSystem

CycloneBiowarehouseBioCyc

19Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What can you do with Cyclone Bioinformatics applications

bull Easy data extraction from BioCycndash to mix them with different types of data (clusters of transcription units pathways

complexes regulon db)ndash to build FBA models by

bull grouping together Pathways for map design in Flux Analyserbull assemble elementary bricks of the metabolic network based on reactions genes

compounds found in BioCyc

bull Link easily Biocyc to new systems ndash example Genoscopersquos LIMS use BioCyc compounds repository as a language

of media representationndash algorithm for inferrence protein complexes

bull Modify or implement new housekeeping functions in a PGDBndash add updates data from a given annotation database (eg multifun cellular

localization of genes products pubmed links) actualy not feasible with pathologic format

ndash delete annotate pathways known not to exist and keep those informationsunless new BioCyc updates

20Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ongoing work - Roadmap

bull Extends the mapping rule between FRS and Java Object

bull Future work a software plateform dedicated to FBA models ndash Easy management of FBA models (define reactions objective

functionshellip)ndash Convenient handling of environmentsndash Extensible perturbation framework (systematic mutants)ndash Pluggable analysis modules (topological analyses phenotype

predictionshellip)

21Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Open questions

bull Are you convince of this interest of Cyclonendash access with java to biocyc objectndash store your PGDBs in your favorite rdbmsndash manipulate classes representing biological conceptndash use efficient mapping object -relational (Hibernate)ndash hellip

bull If yes to at least one how to deliver Cyclonendash jar filesndash eclipse projectndash eclipse pluginndash open a new sourceforge project

22Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Supplementary Slides

Cyclone

supplementaryslides

23Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Biowarehouse another solution to query biocyc data

supplementaryslides

24Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Language

ManipulatedConcept

ComplexeTypeElementAttribut

XSD

ClassAttributAssociation

JAVA

TableRowColumn

SQL

FrameSlotAnnotation

LISP

Pipeline for the creation of Cyclone

supplementaryslides

Application: complex inference

Application: graph extraction

Application: extraction of gene clusters by pathway

[Figure: pathway extracted from AcinetoCyc, with a zoom on the genes involved in this pathway]

Cyclone: rules for mapping the world of XML Schema to the Java world (JAXB)

"Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself."

http://java.sun.com/webservices/jaxb

Cyclone: rules for mapping the Java world to the SQL world (Hibernate)

http://www.hibernate.org

"Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom - including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always."

Cyclone: pipeline for the creation of …

[Diagram: FRS (BioCyc), JavaCyc, XML, XSD, JAXB, HyperJaxb, Hibernate, SQL and Cyclone, linked by numbered arrows (0, 1a, 1b, 2a, 2b, 3, 4, X). Legend: made in Genoscope; software tools; data files.]

Steps listed on the slide:
• Extension of JavaCyc: getAnnot()
• Query of the PGDB to build an XML schema
• From XSD to Java classes
• From Java classes to Hibernate mapping
• From Hibernate mapping to SQL design
• Import all frames of a PGDB
• 1. Populate the MySQL database and the Java classes
• 2. Upload; save it in MySQL
• 3. Querying and modification of your PGDB
• 4. Upload your Cyclone into BioCyc
• 4a. Extension of JavaCyc: createFrame()

Author's note (FLF): careful, I am mixing up the pipeline side and the user side.

Cyclone: rules for mapping the world of frames to the world of XML Schema

    (FRAMEID NIL
      ((SLOT SValue)
       ((SLOT SValue Annotation AValue))))

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
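As a rough illustration of what the two complex types above describe, the sketch below models a frame as an identifier plus a list of slots, each carrying a value and an optional annotation. The names `FrameSketch`, `Slot`, `addSlot` and `slotValues` are hypothetical, the explicit slot-name field is an assumption, and the sample data is invented; these are not the classes that JAXB generates for Cyclone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, hand-written mirror of the FRAME-CLASS / SLOT-TYPE schema;
// these are not the classes generated for Cyclone.
public class FrameSketch {

    // One SLOT element: an SValue plus an optional Annotation (minOccurs="0").
    // The explicit slot name is an assumption added to keep the sketch generic.
    public static final class Slot {
        public final String name;
        public final String sValue;
        public final String annotation; // null when the slot carries no annotation

        public Slot(String name, String sValue, String annotation) {
            this.name = name;
            this.sValue = sValue;
            this.annotation = annotation;
        }
    }

    private final String frameId;                           // FRAMEID element
    private final List<Slot> slots = new ArrayList<Slot>(); // SLOT, maxOccurs="unbounded"

    public FrameSketch(String frameId) {
        this.frameId = frameId;
    }

    // Appends a slot and returns this frame so that calls can be chained.
    public FrameSketch addSlot(String name, String sValue, String annotation) {
        slots.add(new Slot(name, sValue, annotation));
        return this;
    }

    public String getFrameId() {
        return frameId;
    }

    // Returns the SValue of every slot with the given name, in insertion order.
    public List<String> slotValues(String name) {
        List<String> values = new ArrayList<String>();
        for (Slot slot : slots) {
            if (slot.name.equals(name)) {
                values.add(slot.sValue);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Invented sample data, for illustration only.
        FrameSketch frame = new FrameSketch("EXAMPLE-FRAME")
                .addSlot("LEFT", "ATP", "COEFFICIENT 1")
                .addSlot("LEFT", "WATER", null);
        System.out.println(frame.getFrameId() + " LEFT = " + frame.slotValues("LEFT"));
    }
}
```

A multi-valued slot simply appears several times in the list, which is what `maxOccurs="unbounded"` allows in the schema.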

Cyclone: BioCyc FRS into XML schema

Cyclone: Generalized-Reactions FRS into XML schema

Cyclone: sample of a frame in XML

[Figure: XML sample annotated with the labels PGDB; Frame; Slot for inheritance management; Slot-SValue; Annot-AValue]

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture:

• growth medium
• mutant strain
• growth phenotype

BioCyc's existing database schema

Group labels on the diagram: frame model representation storage; managing multi-user concurrent editing; edit history. Foreign-key relations are labelled FK_REFERENCE_1 to FK_REFERENCE_6.

char_slot_values
  Columns: kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue VARCHAR(100), sequence INT(4)
  Key markers, in slide order: <pk,fk>, <pk>, <pk,fk>, <pk>

numeric_slot_values
  Columns: kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue NUMERIC(20,10), sequence INT(4)
  Key markers, in slide order: <pk,fk>, <pk>, <pk,fk>, <pk>

large_slot_values
  Columns: kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue TEXT, sequence INT(4)
  Key markers, in slide order: <pk,fk>, <pk>, <pk,fk>, <pk>

frames
  Columns: kb INT(4), frame VARCHAR(80) BINARY, type BIT, definition TEXT, sequence INT(4), numparents INT(4), tid INT(4)
  Key markers, in slide order: <pk>, <pk>, <pk>

class_supers
  Columns: kb INT(4), class VARCHAR(80) BINARY, super VARCHAR(80) BINARY
  Key markers, in slide order: <pk>, <pk>, <pk>

instance_types
  Columns: kb INT(4), instance VARCHAR(80) BINARY, type VARCHAR(80) BINARY
  Key markers, in slide order: <pk>, <pk>, <pk>

indexed_slots
  Columns: kb INT(4), slotname VARCHAR(80) BINARY
  Key markers, in slide order: <pk>, <pk>

kbs
  Columns: kbid INT(4), kbname VARCHAR(20), kbdata VARCHAR(255), lock_session INT(4), lock_time DATETIME, lock_user VARCHAR(20)

logtable
  Columns: kbid INT(4), tid INT(4), opnno INT(4), seqno INT(4), frame VARCHAR(80), logrec VARCHAR(255)
  Key markers, in slide order: <pk,fk1,fk2>, <pk,fk1,fk2>, <pk>, <pk>, <fk2>

txntable
  Columns: kbid INT(4), tid INT(4), userid VARCHAR(40), nopns INT(4), begin_timestamp INT(4) UNSIGNED, merge_timestamp INT(4) UNSIGNED
  Key markers, in slide order: <pk>, <pk>

deleted_frames
  Columns: kbid INT(4), frame VARCHAR(80) BINARY, tid INT(4)
  Key markers, in slide order: <pk,fk>, <pk>, <pk,fk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no tables linking to each biological concept
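The last drawback is easier to see with a small example. The sketch below holds a few rows shaped like `char_slot_values` (reduced to instance, slotname and slotvalue) in memory and answers "which enzymes are inhibited by ATP" with the equivalent of a self-join on generic slot names. The class `SlotTableSketch`, its helper `valuesWhere` and all sample rows are invented for illustration; nothing here reads a real PGDB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a generic slot-value table has no "enzyme" or "reaction"
// table, so biological questions are phrased in terms of slot names.
public class SlotTableSketch {

    // One row shaped like char_slot_values, reduced to three columns.
    public static final class Row {
        public final String instance;
        public final String slotname;
        public final String slotvalue;

        public Row(String instance, String slotname, String slotvalue) {
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
        }
    }

    // Equivalent of a self-join: find the instances whose filterSlot holds
    // filterValue, then return the wantedSlot values of those instances.
    public static List<String> valuesWhere(List<Row> rows, String filterSlot,
                                           String filterValue, String wantedSlot) {
        List<String> result = new ArrayList<String>();
        for (Row a : rows) {
            if (!a.slotname.equals(filterSlot) || !a.slotvalue.equals(filterValue)) {
                continue;
            }
            for (Row b : rows) {
                if (b.instance.equals(a.instance) && b.slotname.equals(wantedSlot)) {
                    result.add(b.slotvalue);
                }
            }
        }
        return result;
    }

    // Invented sample rows.
    public static List<Row> sampleRows() {
        return Arrays.asList(
                new Row("ENZRXN-1", "INHIBITORS-ALL", "ATP"),
                new Row("ENZRXN-1", "ENZYME", "ENZYME-A"),
                new Row("ENZRXN-2", "INHIBITORS-ALL", "ADP"),
                new Row("ENZRXN-2", "ENZYME", "ENZYME-B"));
    }

    public static void main(String[] args) {
        // Prints [ENZYME-A]
        System.out.println(valuesWhere(sampleRows(), "INHIBITORS-ALL", "ATP", "ENZYME"));
    }
}
```

Every additional condition adds another pass over the same generic table, which is the motivation for mapping frames to one class (and one table) per biological concept.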

From FRS to Java classes

[Diagram labels. BioCyc: Generalized-Reactions, Polymer-Segments, Chemicals, …; Left; Compartment; Coefficient; FRAME, SLOT, S-VALUE, FACET, ANNOTATION, A-VALUE. Cyclone: PGDB, FRAME, SLOT, SValue, Annotation, AValue.]

BioCyc and Cyclone's architecture

[Diagram labels. BioCyc: Ocelot, Oracle, MySQL; Pathway Tools (Lisp, Generic Frame Protocol); X server; web server; PerlCyc; JavaCyc. Cyclone: XML, JAXB, HyperJaxb; Hibernate; MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …. Applications: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, MaGe, …; graph analysis, cluster analysis, flux balance analysis. Legend: made in Genoscope; software tools; data files.]

Cyclone's architecture

[Diagram labels. Cyclone components: Schema Generator, BioCyc (Un)marshaller, ObjectModel, XmlTools. Applications: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, MaGe, …; graph analysis, cluster analysis, flux balance analysis. Legend: Genoscope; software tools; data files.]

True query in OQL

    // Requires java.util.List, org.hibernate.Query and org.hibernate.HibernateException.
    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where er.ORGANISM.VALUE like :organism "
                + "and :inhibitor = inhibitors.VALUE");
            // Bind the two named parameters of the query.
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            // Print each reaction frame with its first enzyme, tab-separated.
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }

Cyclone's pipeline - daily step: data synchronization

[Diagram labels: BioCyc (frames, instances); Oo-FRS Translator (Bio2Oo, Oo2Bio); Java objects; Hibernate O/R mapper; RDBMS.]

Cyclone's architecture

[Diagram labels. Lisp - Pathway Tools over Ocelot, Oracle, MySQL; JavaCyc. Cyclone components: SchemaGenerator, Oo-FRS Translator, ObjectModel, XmlTools. Object-XML mapping: JAXB, XML. Object-relational mapping: Hibernate, RDBMS (MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …). Legend: Genoscope; software tools; data files.]

Benefits of using Cyclone: a Java object API matching biological concepts

• In your favorite IDE, access all functions of Cyclone objects with one click
• Develop your own business methods
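To make "develop your own business methods" concrete, here is a minimal sketch of such a method written against a hand-made stand-in for an enzymatic-reaction object. The class `EnzymaticReactionSketch`, its fields and the sample data are assumptions for illustration only; the actual Cyclone classes are generated from the BioCyc schema and will differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a generated Cyclone class.
public class EnzymaticReactionSketch {
    private final String frame;
    private final String enzyme;
    private final List<String> inhibitors;

    public EnzymaticReactionSketch(String frame, String enzyme, String... inhibitors) {
        this.frame = frame;
        this.enzyme = enzyme;
        this.inhibitors = Arrays.asList(inhibitors);
    }

    public String getFrame() {
        return frame;
    }

    public String getEnzyme() {
        return enzyme;
    }

    // A "business method": a domain question expressed directly on the object.
    public boolean isInhibitedBy(String compound) {
        return inhibitors.contains(compound);
    }

    // Collects the enzymes of all reactions inhibited by the given compound.
    public static List<String> enzymesInhibitedBy(List<EnzymaticReactionSketch> reactions,
                                                  String compound) {
        List<String> enzymes = new ArrayList<String>();
        for (EnzymaticReactionSketch reaction : reactions) {
            if (reaction.isInhibitedBy(compound)) {
                enzymes.add(reaction.getEnzyme());
            }
        }
        return enzymes;
    }

    public static void main(String[] args) {
        // Invented sample data.
        List<EnzymaticReactionSketch> reactions = Arrays.asList(
                new EnzymaticReactionSketch("ENZRXN-1", "ENZYME-A", "ATP"),
                new EnzymaticReactionSketch("ENZRXN-2", "ENZYME-B", "ADP", "ATP"),
                new EnzymaticReactionSketch("ENZRXN-3", "ENZYME-C"));
        // Prints [ENZYME-A, ENZYME-B]
        System.out.println(enzymesInhibitedBy(reactions, "ATP"));
    }
}
```

Because the method lives on a typed object, an IDE can offer it through completion alongside the generated getters.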

Benefits of using Cyclone: query in OQL (Object Query Language)

Lisp: find all enzymes for which ATP is an inhibitor

    (defun atp-inhibits ()
      ;; We check every instance of the class
      (loop for x in (get-class-all-instances '|Enzymatic-Reactions|)
            ;; Test for whether the INHIBITORS-ALL slot contains the
            ;; compound frame ATP
            when (member-slot-value-p x 'INHIBITORS-ALL 'ATP)
            ;; Whenever the test is positive, we collect the value of the
            ;; slot ENZYME. The collected values are returned as a list once
            ;; the loop terminates.
            collect (get-slot-value x 'ENZYME))
      )

    ;; invoking the query
    (select-organism :org-id 'ECOLI)
    (atp-inhibits)

OQL is a powerful and easy-to-use SQL-like query language with special features dealing with complex objects, values and methods.

Benefits of using Cyclone: query in OQL (Object Query Language)

PerlCyc: find all enzymes for which ATP is an inhibitor

    use perlcyc;
    my $cyc = perlcyc -> new("ECOLI");
    my @enzrxns = $cyc -> get_class_all_instances("|Enzymatic-Reactions|");
    ## We check every instance of the class
    foreach my $er (@enzrxns) {
        ## We test for whether the INHIBITORS-ALL slot contains the
        ## compound frame ATP
        my $bool = $cyc -> member_slot_value_p($er, "Inhibitors-All", "Atp");
        if ($bool) {
            ## Whenever the test is true, we collect the value of the
            ## slot ENZYME. The results are printed in the terminal.
            my $enz = $cyc -> get_slot_value($er, "Enzyme");
            print STDOUT "$enz\n";
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

JavaCyc: find all enzymes for which ATP is an inhibitor

    import java.util.*;

    public class JavacycSample {
        public static void main(String[] args) {
            Javacyc cyc = new Javacyc("ECOLI");
            ArrayList enzrxns = cyc.getClassAllInstances("|Enzymatic-Reactions|");
            for (int i = 0; i < enzrxns.size(); i++) {
                String er = (String) enzrxns.get(i);
                boolean bool = cyc.memberSlotValueP(er, "Inhibitors-All", "Atp");
                if (bool) {
                    String enz = cyc.getSlotValue(er, "Enzyme");
                    System.out.println(enz);
                }
            }
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

BioWarehouse: find all enzymes for which ATP is an inhibitor

    SELECT DISTINCT DBID.xid
    FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,
         EnzReactionInhibitorActivator
    WHERE DataSet.name = 'EcoCyc'
      AND DataSet.wid = EnzymaticReaction.datasetwid
      AND EnzymaticReaction.proteinwid = Protein.wid
      AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid
      AND EnzReactionInhibitorActivator.compoundwid = Chemical.wid
      AND EnzReactionInhibitorActivator.inhibitoractivate = 'I'
      AND Chemical.name = 'ATP'
      AND DBID.otherwid = Protein.wid

Benefits of using Cyclone: query in OQL (Object Query Language)

Cyclone: find all enzymes for which ATP is an inhibitor

    SELECT er FROM ENZYMATICREACTIONS er
    LEFT JOIN er.INHIBITORSALLInternal inhibitors
    WHERE er.ORGANISM.VALUE LIKE :organism
      AND 'Atp' = inhibitors.VALUE

HQL is fully object-oriented, understanding notions like inheritance, polymorphism and association.

Comparative Matrix

                              BioCyc                  BioWarehouse      Cyclone
Representation system         FRS                     Relational model  Object model
Storage system                Ocelot, MySQL, Oracle   MySQL             MySQL, Oracle, Sybase, PostgreSQL, …
Query                         SQL, Lisp (Perl, Java)  SQL               OQL, SQL, Java
API                           Lisp                    -                 Java
BioCyc compatibility          -                       not full          full (- class)
Import / export to BioCyc     -                       Yes / No          Yes / Yes
Independent of Pathway Tools  No                      Yes               Yes

What can you do with Cyclone? Bioinformatics applications

• Easy data extraction from BioCyc
  – to mix it with different types of data (clusters of transcription units, pathways, complexes, RegulonDB)
  – to build FBA models by
    • grouping pathways together for map design in Flux Analyser
    • assembling elementary bricks of the metabolic network based on reactions, genes and compounds found in BioCyc
• Easily link BioCyc to new systems
  – example: Genoscope's LIMS uses the BioCyc compound repository as a language for media representation
  – algorithm for inferring protein complexes
• Modify or implement new housekeeping functions in a PGDB
  – add or update data from a given annotation database (e.g. MultiFun, cellular localization of gene products, PubMed links); currently not feasible with the PathoLogic format
  – delete or annotate pathways known not to exist, and keep that information unless a new BioCyc update overrides it


Page 11: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

11Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclonersquos architecture

RDBMS

Hibernate

Xml

Jaxb

GenoscopeSoftware toolsData Files

MySQL DB2 PostgreSQLOracle Sybase HSQLMicrosoft SQL server hellip

Ocelot Oracle MySQL

JavaCyc

Lisp - Pathway Tools

Object - relational mapping Object - xmlmapping

SchemaGenerator

Oo FRSTranslator

ObjectModel

XmlTools

Cyclone

12Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Java Object Api matching biological concept

in your favorite ide access with one click to all functions of Cyclone objectsdevelopp your own business methods

13Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 (defun atp-inhibits () We check every instance of the class

2 (loop for x in (get-class-all-instances |Enzymatic-Reactions|) Test for whether the INHIBITORS-ALL slot contains the

compound frame ATP3 when (member-slot-value-p x INHIBITORS-ALL ATP)

Whenever the test is positive we collect the value of theslot ENZYME The collected values are returned as a list once

the loop terminates4 collect (get-slot-value x rsquoENZYME))5 )

invoking the query6 (select-organism org-id rsquoECOLI)7 (atp-inhibits)

Lisp Find all enzym for which ATP is an inhibitor

OQL is a powerful and easy-to-use SQL-like query language with special featuresdealing with complex objects values and methods

14Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

1 use perlcyc2 my $cyc = perlcyc -gt new(ECOLI)3 my enzrxns = $cyc -gt get_class_all_instances(|Enzymatic-

Reactions|) We check every instance of the class

4 foreach my $er (enzrxns) We test for whether the INHIBITORS-ALL slot contains the

compound frame ATP5 my $bool = $cyc -gt member_slot_value_p($erInhibitors-

All Atp)6 if ($bool)

Whenever the test is =0 we collect the value of the slot ENZYME The results are printed in theterminal

7 my $enz = $cyc -gt get_slot_value($er Enzyme)8 print STDOUT $enzn9 10

Beneficits of using Cyclone Query in OQL (Object Query Language)

PerlCyc Find all enzym for which ATP is an inhibitor

15Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 import javautil 2 public class JavacycSample 3 public static void main(String[] args) 4 Javacyc cyc = new Javacyc(ECOLI) 5 ArrayList enzrxns = cycgetClassAllInstances(|Enzymatic-

Reactions|) 6 for (int i = 0 i lt enzrxnssize() i++) 7 String er = (String)enzrxnsget(i) 8 boolean bool = cycmemberSlotValueP(er Inhibitors-

AllAtp) 9 if (bool) 10 String enz = cycgetSlotValue(er Enzyme)9 Systemoutprintln(enz) 11 12 13 14

JavaCyc Find all enzym for which ATP is an inhibitor

16Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT DISTINCT DBIDxid2 FROM DBID Protein EnzymaticReaction Chemical DataSet

EnzReactionInhibitorActivator3 WHERE DataSetname=EcoCyc4 AND DataSetwid=EnzymaticReactiondatasetwid5 AND EnzymaticReactionproteinwid = Proteinwid6 AND EnzymaticReactionwid =

EnzReactionInhibitorActivatorenzymaticreactionwid7 AND EnzReactionInhibitorActivatorcompoundwid=Chemicalwid8 AND EnzReactionInhibitorActivatorinhibitoractivate=I9 AND Chemicalname=ATP10 AND DBIDotherwid = Proteinwid

Biowarehouse Find all enzym for which ATP is an inhibitor

17Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT er FROM ENZYMATICREACTIONS er2 LEFT JOIN erINHIBITORSALLInternal inhibitors3 WHERE erORGANISMVALUE LIKE organism

AND 4 Atp = inhibitorsVALUE

Cyclone Find all enzym for which ATP is an inhibitor

HQL is fully object-oriented understanding notions like inheritence polymorphism and association

18Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Comparative Matrix

YesYesNoIndependent of Pathway tools

Yes YesYes No-Import Export to BioCyc

full (- class)not full-BioCycCompatibilty

Java-LispAPI

OQL SQLJava

SQLSQLLisp

(perljava)Query

mySQL Oracle Sybase PostgreSQL hellip

mySQLmySQLoracle

OcelotStoring System

Object modelRelational modelFRSRepresentationSystem

CycloneBiowarehouseBioCyc

19Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What can you do with Cyclone Bioinformatics applications

bull Easy data extraction from BioCycndash to mix them with different types of data (clusters of transcription units pathways

complexes regulon db)ndash to build FBA models by

bull grouping together Pathways for map design in Flux Analyserbull assemble elementary bricks of the metabolic network based on reactions genes

compounds found in BioCyc

bull Link easily Biocyc to new systems ndash example Genoscopersquos LIMS use BioCyc compounds repository as a language

of media representationndash algorithm for inferrence protein complexes

bull Modify or implement new housekeeping functions in a PGDBndash add updates data from a given annotation database (eg multifun cellular

localization of genes products pubmed links) actualy not feasible with pathologic format

ndash delete annotate pathways known not to exist and keep those informationsunless new BioCyc updates

20Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ongoing work - Roadmap

bull Extends the mapping rule between FRS and Java Object

bull Future work a software plateform dedicated to FBA models ndash Easy management of FBA models (define reactions objective

functionshellip)ndash Convenient handling of environmentsndash Extensible perturbation framework (systematic mutants)ndash Pluggable analysis modules (topological analyses phenotype

predictionshellip)

21Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Open questions

bull Are you convince of this interest of Cyclonendash access with java to biocyc objectndash store your PGDBs in your favorite rdbmsndash manipulate classes representing biological conceptndash use efficient mapping object -relational (Hibernate)ndash hellip

bull If yes to at least one how to deliver Cyclonendash jar filesndash eclipse projectndash eclipse pluginndash open a new sourceforge project

Supplementary Slides

Cyclone

Biowarehouse: another solution to query BioCyc data

Pipeline for the creation of Cyclone

Language   Manipulated concepts
LISP       Frame, Slot, Annotation
XSD        ComplexType, Element, Attribute
JAVA       Class, Attribute, Association
SQL        Table, Row, Column

Application: complex inference

Application: graph extraction

Application: extraction of gene clusters by pathway

[Figure: pathway extracted from AcinetoCyc, with a zoom on the genes involved in this pathway]

Cyclone: rules for mapping the XML Schema world to the Java world (JAXB)

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb

Cyclone: rules for mapping the Java world to the SQL world (Hibernate)

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom - including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.

Cyclone: pipeline for the creation of …

[Diagram linking FRS (BioCyc), JavaCyc, XML, XSD, JAXB, HyperJaxb, Hibernate, SQL and Cyclone, with steps labeled 0, 1 (1a, 1b), 2 (2a, 2b), 3, 4 and X. Legend: made in Genoscope; software tools; data files.]

• Extension of JavaCyc: getAnnot()
• Query of the PGDB to build an XML schema
• From XSD to Java classes
• From Java classes to Hibernate mapping
• From Hibernate mapping to SQL design

• Import all frames of a PGDB
1. Populate the MySQL database and the Java classes
2. Upload: save it in MySQL
3. Query and modify your PGDB
4. Upload your Cyclone into BioCyc
4a. Extension of JavaCyc: createFrame()

Author's note (FLF): careful, I am mixing up the pipeline side and the user side.

Cyclone: rules for mapping the frame world to the XML Schema world

    (FRAMEID NIL
      ((SLOT SValue)
       ((SLOT SValue Annotation AValue))))

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
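To make the frame-to-schema rule more concrete, here is a minimal sketch of the kind of Java classes that the two complex types could correspond to. It is hand-written for illustration and assumes nothing about what JAXB or HyperJaxb actually generate for Cyclone: the class names `Frame` and `Slot`, the helper methods and the sample identifiers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class FrameBindingSketch {

    /** Mirrors SLOT-TYPE: a required SValue and an optional Annotation (minOccurs=0). */
    public static final class Slot {
        public final String sValue;
        public final String annotation; // null when the optional element is absent

        public Slot(String sValue, String annotation) {
            this.sValue = sValue;
            this.annotation = annotation;
        }
    }

    /** Mirrors FRAME-CLASS: a FRAMEID plus a list of SLOT elements. */
    public static final class Frame {
        public final String frameId;
        public final List<Slot> slots = new ArrayList<>();

        public Frame(String frameId) {
            this.frameId = frameId;
        }

        public Frame addSlot(String sValue, String annotation) {
            slots.add(new Slot(sValue, annotation));
            return this;
        }
    }

    /** Builds a small frame; the identifiers are made up for illustration. */
    public static Frame exampleFrame() {
        return new Frame("EXAMPLE-FRAME")
                .addSlot("value-1", null)
                .addSlot("value-2", "note-on-value-2");
    }

    /** Counts the slots that carry the optional annotation. */
    public static long annotatedSlotCount(Frame frame) {
        return frame.slots.stream().filter(s -> s.annotation != null).count();
    }

    public static void main(String[] args) {
        Frame frame = exampleFrame();
        System.out.println(frame.frameId + ": " + frame.slots.size()
                + " slots, " + annotatedSlotCount(frame) + " annotated");
    }
}
```

The point of the sketch is only the shape: an unbounded element becomes a list, and an optional element becomes a nullable field.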

Cyclone: BioCyc FRS into XML schema

Cyclone: Generalized Reactions FRS into XML schema

Cyclone: sample of a frame in XML

[Figure: XML sample of a frame, with callouts for PGDB, Frame, slot for inheritance management, Slot / SValue, and Annot / AValue]

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture:
• growth medium
• mutant strain
• growth phenotype

BioCyc's existing database schema

Table groups in the diagram: frame model representation storage; management of multi-user concurrent editing; edit history. Foreign keys: FK_REFERENCE_1 to FK_REFERENCE_6.

char_slot_values
  kb          INT(4)
  instance    VARCHAR(80) BINARY
  slotname    VARCHAR(80) BINARY
  slotvalue   VARCHAR(100)
  sequence    INT(4)
  key markers as drawn: <pk,fk> <pk> <pk,fk> <pk>

numeric_slot_values
  kb          INT(4)
  instance    VARCHAR(80) BINARY
  slotname    VARCHAR(80) BINARY
  slotvalue   NUMERIC(20,10)
  sequence    INT(4)
  key markers as drawn: <pk,fk> <pk> <pk,fk> <pk>

large_slot_values
  kb          INT(4)
  instance    VARCHAR(80) BINARY
  slotname    VARCHAR(80) BINARY
  slotvalue   TEXT
  sequence    INT(4)
  key markers as drawn: <pk,fk> <pk> <pk,fk> <pk>

frames
  kb          INT(4)
  frame       VARCHAR(80) BINARY
  type        BIT
  definition  TEXT
  sequence    INT(4)
  numparents  INT(4)
  tid         INT(4)
  key markers as drawn: <pk> <pk> <pk>

class_supers
  kb          INT(4)               <pk>
  class       VARCHAR(80) BINARY   <pk>
  super       VARCHAR(80) BINARY   <pk>

instance_types
  kb          INT(4)               <pk>
  instance    VARCHAR(80) BINARY   <pk>
  type        VARCHAR(80) BINARY   <pk>

indexed_slots
  kb          INT(4)               <pk>
  slotname    VARCHAR(80) BINARY   <pk>

kbs
  kbid          INT(4)
  kbname        VARCHAR(20)
  kbdata        VARCHAR(255)
  lock_session  INT(4)
  lock_time     DATETIME
  lock_user     VARCHAR(20)

logtable
  kbid        INT(4)
  tid         INT(4)
  opnno       INT(4)
  seqno       INT(4)
  frame       VARCHAR(80)
  logrec      VARCHAR(255)
  key markers as drawn: <pk,fk1,fk2> <pk,fk1,fk2> <pk> <pk> <fk2>

txntable
  kbid             INT(4)
  tid              INT(4)
  userid           VARCHAR(40)
  nopns            INT(4)
  begin_timestamp  INT(4) UNSIGNED
  merge_timestamp  INT(4) UNSIGNED
  key markers as drawn: <pk> <pk>

deleted_frames
  kbid        INT(4)               <pk,fk>
  frame       VARCHAR(80) BINARY   <pk>
  tid         INT(4)               <pk,fk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no tables linking to each biological concept
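The drawback noted for this schema (no table per biological concept) can be illustrated with a short sketch. Every slot value is one generic row, so a client has to regroup rows by instance and slot name before it can reason about a frame. The code below is an in-memory illustration only, not Pathway Tools or Cyclone code: the `Row` class, the sample rows, and the reading of `sequence` as the order of values within a slot are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SlotValueRowsSketch {

    /** One row of a generic slot-value table: (kb, instance, slotname, slotvalue, sequence). */
    public static final class Row {
        public final int kb;
        public final String instance;
        public final String slotname;
        public final String slotvalue;
        public final int sequence;

        public Row(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Regroups the rows of one kb into instance -> slot name -> values (in sequence order). */
    public static Map<String, Map<String, List<String>>> groupByFrame(List<Row> rows, int kb) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt(r -> r.sequence)); // stable sort
        Map<String, Map<String, List<String>>> frames = new LinkedHashMap<>();
        for (Row r : sorted) {
            if (r.kb != kb) {
                continue; // row belongs to another knowledge base
            }
            frames.computeIfAbsent(r.instance, k -> new LinkedHashMap<>())
                  .computeIfAbsent(r.slotname, k -> new ArrayList<>())
                  .add(r.slotvalue);
        }
        return frames;
    }

    /** Made-up rows for illustration. */
    public static List<Row> exampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, "ENZRXN-1", "INHIBITORS-ALL", "ADP", 1));
        rows.add(new Row(1, "ENZRXN-1", "INHIBITORS-ALL", "ATP", 0));
        rows.add(new Row(1, "ENZRXN-1", "ENZYME", "PROTEIN-1", 0));
        rows.add(new Row(1, "ENZRXN-2", "ENZYME", "PROTEIN-2", 0));
        rows.add(new Row(2, "ENZRXN-9", "ENZYME", "PROTEIN-9", 0));
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(groupByFrame(exampleRows(), 1));
    }
}
```

In an object model, the same information would already be reachable as typed attributes of a reaction object, which is the contrast the slide draws.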

From FRS to Java classes

[Diagram relating the BioCyc frame model (FRAME; SLOT and S VALUE; FACET, ANNOTATION and A VALUE) to the Cyclone classes (PGDB, FRAME, SLOT, SValue, Annotation). Example classes shown: Generalized Reactions, Polymer Segments, Chemicals, …, with Left, Compartment and Coefficient.]

BioCyc and Cyclone's architecture

[Diagram. BioCyc side: Ocelot on Oracle or MySQL; Pathway Tools (Lisp, Generic Frame Protocol) with X server and web server; PerlCyc and JavaCyc. Cyclone side: XML, JAXB, HyperJaxb and Hibernate over MySQL (or DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …). Applications: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, …, plus MaGe; graph analysis, cluster analysis, flux balance analysis. Legend: made in Genoscope; software tools; data files.]

Cyclone's architecture

[Diagram. Cyclone components: Schema Generator, BioCyc (Un)marshaller, Object Model, XmlTools. Applications: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, …, plus MaGe; graph analysis, cluster analysis, flux balance analysis. Legend: Genoscope; software tools; data files.]

A real query in OQL

    // HibernateUtil and keySession belong to the Cyclone code base; Query is org.hibernate.Query
    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where "
                + " er.ORGANISM.VALUE like :organism and "
                + " :inhibitor = inhibitors.VALUE ");
            // bind the two named parameters
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                // print the reaction frame and its first enzyme, tab-separated
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }


Benefits of using Cyclone: a Java object API matching biological concepts

In your favorite IDE:
• access all the functions of Cyclone objects with one click
• develop your own business methods

Benefits of using Cyclone: query in OQL (Object Query Language)

Lisp: find all enzymes for which ATP is an inhibitor

    (defun atp-inhibits ()
      ;; We check every instance of the class
      (loop for x in (get-class-all-instances '|Enzymatic-Reactions|)
            ;; Test for whether the INHIBITORS-ALL slot contains the
            ;; compound frame ATP
            when (member-slot-value-p x 'INHIBITORS-ALL 'ATP)
            ;; Whenever the test is positive, we collect the value of the
            ;; slot ENZYME. The collected values are returned as a list once
            ;; the loop terminates.
            collect (get-slot-value x 'ENZYME))
      )

    ;; invoking the query
    (select-organism :org-id 'ECOLI)
    (atp-inhibits)

OQL is a powerful and easy-to-use SQL-like query language with special features dealing with complex objects, values and methods.

Benefits of using Cyclone: query in OQL (Object Query Language)

PerlCyc: find all enzymes for which ATP is an inhibitor

    use perlcyc;
    my $cyc = perlcyc -> new("ECOLI");
    my @enzrxns = $cyc -> get_class_all_instances("|Enzymatic-Reactions|");
    # We check every instance of the class
    foreach my $er (@enzrxns) {
        # We test for whether the INHIBITORS-ALL slot contains the
        # compound frame ATP
        my $bool = $cyc -> member_slot_value_p($er, "Inhibitors-All", "Atp");
        if ($bool) {
            # Whenever the test is true (non-zero), we collect the value of the
            # slot ENZYME. The results are printed in the terminal.
            my $enz = $cyc -> get_slot_value($er, "Enzyme");
            print STDOUT "$enz\n";
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

JavaCyc: find all enzymes for which ATP is an inhibitor

    import java.util.*;

    public class JavacycSample {
        public static void main(String[] args) {
            Javacyc cyc = new Javacyc("ECOLI");
            ArrayList enzrxns = cyc.getClassAllInstances("|Enzymatic-Reactions|");
            for (int i = 0; i < enzrxns.size(); i++) {
                String er = (String) enzrxns.get(i);
                boolean bool = cyc.memberSlotValueP(er, "Inhibitors-All", "Atp");
                if (bool) {
                    String enz = cyc.getSlotValue(er, "Enzyme");
                    System.out.println(enz);
                }
            }
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

Biowarehouse: find all enzymes for which ATP is an inhibitor

    SELECT DISTINCT DBID.xid
    FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,
         EnzReactionInhibitorActivator
    WHERE DataSet.name = 'EcoCyc'
      AND DataSet.wid = EnzymaticReaction.datasetwid
      AND EnzymaticReaction.proteinwid = Protein.wid
      AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid
      AND EnzReactionInhibitorActivator.compoundwid = Chemical.wid
      AND EnzReactionInhibitorActivator.inhibitoractivate = 'I'
      AND Chemical.name = 'ATP'
      AND DBID.otherwid = Protein.wid

Benefits of using Cyclone: query in OQL (Object Query Language)

Cyclone: find all enzymes for which ATP is an inhibitor

    SELECT er FROM ENZYMATICREACTIONS er
    LEFT JOIN er.INHIBITORSALLInternal inhibitors
    WHERE er.ORGANISM.VALUE LIKE :organism
      AND 'Atp' = inhibitors.VALUE

HQL is fully object-oriented, understanding notions like inheritance, polymorphism and association.
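For readers less familiar with HQL, the following sketch shows what the query computes, expressed over plain in-memory Java objects. It does not use Hibernate or the Cyclone classes: `EnzymaticReaction`, its fields and the sample data are hypothetical simplifications, and an exact string match stands in for the `LIKE` comparison.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class InhibitorQuerySketch {

    /** Hypothetical, simplified stand-in for a generated enzymatic-reaction class. */
    public static final class EnzymaticReaction {
        public final String frame;
        public final String organism;
        public final String enzyme;
        public final List<String> inhibitors;

        public EnzymaticReaction(String frame, String organism, String enzyme,
                                 List<String> inhibitors) {
            this.frame = frame;
            this.organism = organism;
            this.enzyme = enzyme;
            this.inhibitors = inhibitors;
        }
    }

    /** Keeps the reactions of one organism whose inhibitor list contains the compound. */
    public static List<EnzymaticReaction> inhibitedBy(List<EnzymaticReaction> all,
                                                      String organism, String inhibitor) {
        return all.stream()
                .filter(er -> er.organism.equals(organism))     // WHERE organism matches
                .filter(er -> er.inhibitors.contains(inhibitor)) // JOIN on inhibitors
                .collect(Collectors.toList());
    }

    /** Made-up data for illustration. */
    public static List<EnzymaticReaction> sample() {
        return Arrays.asList(
                new EnzymaticReaction("ER-1", "ecoli", "PROTEIN-1", Arrays.asList("ATP", "ADP")),
                new EnzymaticReaction("ER-2", "ecoli", "PROTEIN-2", Arrays.asList("NADH")),
                new EnzymaticReaction("ER-3", "acineto", "PROTEIN-3", Arrays.asList("ATP")));
    }

    public static void main(String[] args) {
        for (EnzymaticReaction er : inhibitedBy(sample(), "ecoli", "ATP")) {
            System.out.println(er.frame + "\t" + er.enzyme);
        }
    }
}
```

The HQL version expresses the same navigation (reaction, its organism, its inhibitors) declaratively and lets the database do the filtering.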

Comparative Matrix

                              BioCyc                   Biowarehouse       Cyclone
Representation system         FRS                      Relational model   Object model
Storing system                Ocelot (MySQL, Oracle)   MySQL              MySQL, Oracle, Sybase, PostgreSQL, …
Query                         Lisp (Perl, Java)        SQL                OQL, SQL, Java
API                           Lisp                     -                  Java
BioCyc compatibility          -                        not full           full (- class)
Import / export to BioCyc     -                        Yes / No           Yes / Yes
Independent of Pathway Tools  No                       Yes                Yes


Page 13: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

13Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 (defun atp-inhibits () We check every instance of the class

2 (loop for x in (get-class-all-instances |Enzymatic-Reactions|) Test for whether the INHIBITORS-ALL slot contains the

compound frame ATP3 when (member-slot-value-p x INHIBITORS-ALL ATP)

Whenever the test is positive we collect the value of theslot ENZYME The collected values are returned as a list once

the loop terminates4 collect (get-slot-value x rsquoENZYME))5 )

invoking the query6 (select-organism org-id rsquoECOLI)7 (atp-inhibits)

Lisp Find all enzym for which ATP is an inhibitor

OQL is a powerful and easy-to-use SQL-like query language with special featuresdealing with complex objects values and methods

14Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

1 use perlcyc2 my $cyc = perlcyc -gt new(ECOLI)3 my enzrxns = $cyc -gt get_class_all_instances(|Enzymatic-

Reactions|) We check every instance of the class

4 foreach my $er (enzrxns) We test for whether the INHIBITORS-ALL slot contains the

compound frame ATP5 my $bool = $cyc -gt member_slot_value_p($erInhibitors-

All Atp)6 if ($bool)

Whenever the test is =0 we collect the value of the slot ENZYME The results are printed in theterminal

7 my $enz = $cyc -gt get_slot_value($er Enzyme)8 print STDOUT $enzn9 10

Beneficits of using Cyclone Query in OQL (Object Query Language)

PerlCyc Find all enzym for which ATP is an inhibitor

15Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 import javautil 2 public class JavacycSample 3 public static void main(String[] args) 4 Javacyc cyc = new Javacyc(ECOLI) 5 ArrayList enzrxns = cycgetClassAllInstances(|Enzymatic-

Reactions|) 6 for (int i = 0 i lt enzrxnssize() i++) 7 String er = (String)enzrxnsget(i) 8 boolean bool = cycmemberSlotValueP(er Inhibitors-

AllAtp) 9 if (bool) 10 String enz = cycgetSlotValue(er Enzyme)9 Systemoutprintln(enz) 11 12 13 14

JavaCyc Find all enzym for which ATP is an inhibitor

16Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT DISTINCT DBIDxid2 FROM DBID Protein EnzymaticReaction Chemical DataSet

EnzReactionInhibitorActivator3 WHERE DataSetname=EcoCyc4 AND DataSetwid=EnzymaticReactiondatasetwid5 AND EnzymaticReactionproteinwid = Proteinwid6 AND EnzymaticReactionwid =

EnzReactionInhibitorActivatorenzymaticreactionwid7 AND EnzReactionInhibitorActivatorcompoundwid=Chemicalwid8 AND EnzReactionInhibitorActivatorinhibitoractivate=I9 AND Chemicalname=ATP10 AND DBIDotherwid = Proteinwid

Biowarehouse Find all enzym for which ATP is an inhibitor

17Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT er FROM ENZYMATICREACTIONS er2 LEFT JOIN erINHIBITORSALLInternal inhibitors3 WHERE erORGANISMVALUE LIKE organism

AND 4 Atp = inhibitorsVALUE

Cyclone Find all enzym for which ATP is an inhibitor

HQL is fully object-oriented understanding notions like inheritence polymorphism and association

18Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Comparative Matrix

YesYesNoIndependent of Pathway tools

Yes YesYes No-Import Export to BioCyc

full (- class)not full-BioCycCompatibilty

Java-LispAPI

OQL SQLJava

SQLSQLLisp

(perljava)Query

mySQL Oracle Sybase PostgreSQL hellip

mySQLmySQLoracle

OcelotStoring System

Object modelRelational modelFRSRepresentationSystem

CycloneBiowarehouseBioCyc

19Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What can you do with Cyclone Bioinformatics applications

bull Easy data extraction from BioCycndash to mix them with different types of data (clusters of transcription units pathways

complexes regulon db)ndash to build FBA models by

bull grouping together Pathways for map design in Flux Analyserbull assemble elementary bricks of the metabolic network based on reactions genes

compounds found in BioCyc

bull Link easily Biocyc to new systems ndash example Genoscopersquos LIMS use BioCyc compounds repository as a language

of media representationndash algorithm for inferrence protein complexes

bull Modify or implement new housekeeping functions in a PGDBndash add updates data from a given annotation database (eg multifun cellular

localization of genes products pubmed links) actualy not feasible with pathologic format

ndash delete annotate pathways known not to exist and keep those informationsunless new BioCyc updates

20Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ongoing work - Roadmap

bull Extends the mapping rule between FRS and Java Object

bull Future work a software plateform dedicated to FBA models ndash Easy management of FBA models (define reactions objective

functionshellip)ndash Convenient handling of environmentsndash Extensible perturbation framework (systematic mutants)ndash Pluggable analysis modules (topological analyses phenotype

predictionshellip)

21Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Open questions

bull Are you convince of this interest of Cyclonendash access with java to biocyc objectndash store your PGDBs in your favorite rdbmsndash manipulate classes representing biological conceptndash use efficient mapping object -relational (Hibernate)ndash hellip

bull If yes to at least one how to deliver Cyclonendash jar filesndash eclipse projectndash eclipse pluginndash open a new sourceforge project

22Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Supplementary Slides

Cyclone

supplementaryslides

23Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Biowarehouse another solution to query biocyc data

supplementaryslides

24Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Language

ManipulatedConcept

ComplexeTypeElementAttribut

XSD

ClassAttributAssociation

JAVA

TableRowColumn

SQL

FrameSlotAnnotation

LISP

Pipeline for the creation of Cyclone

supplementaryslides

Application: complex inference

Application: extraction of graph

Application: extraction of gene clusters by pathway

[Figure: pathway extracted from AcinetoCyc; zoom on the genes involved in this pathway.]

Cyclone: rules for mapping the world of XML Schema to the Java world (JAXB)

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb

Cyclone: rules for mapping the Java world to the SQL world (Hibernate)

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom - including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.

Cyclone: pipeline for the creation of …

[Diagram: FRS (BioCyc), JavaCyc, XML, XSD, JAXB, HyperJaxb, Hibernate, SQL, Cyclone, with step labels 0, 1, 1a, 1b, 2, 2a, 2b, 3, 4, 4a and X. Legend: made in Genoscope; software tools; data files.]

• Extension of JavaCyc: getAnnot()
• Query of the PGDB to build an XML schema
• From XSD to Java classes
• From Java classes to Hibernate mapping
• From Hibernate mapping to SQL design

• Import all frames of a PGDB
1. Populate the MySQL database and the Java classes
2. Upload: save it in MySQL
3. Querying and modification of your PGDB
4. Upload your Cyclone into BioCyc
4a. Extension of JavaCyc: createFrame()

Note (FLF): careful, I am mixing up the pipeline side and the user side.

Cyclone: rules for mapping the world of frames to the world of XML Schema

    (FRAMEID NIL ((SLOT SValue)
                  ((SLOT SValue Annotation AValue))))

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
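As a rough illustration of the frame-to-XML rule above, the following sketch builds one frame as a DOM tree with the standard `javax.xml.parsers` API. It is not Cyclone code: the root element name `FRAME`, the `name` attribute on `SLOT`, the helper class `FrameXmlSketch` and the sample values are all assumptions made for the example; only `FRAMEID`, `SLOT`, `SValue` and `Annotation` come from the slide.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of Cyclone: one frame as a DOM tree
// shaped like the FRAME-CLASS / SLOT-TYPE schema fragment.
class FrameXmlSketch {

    // Creates the root element with its FRAMEID child.
    // The element name "FRAME" is an assumption; the slide only names the type.
    static Element newFrame(Document doc, String frameId) {
        Element frame = doc.createElement("FRAME");
        Element id = doc.createElement("FRAMEID");
        id.setTextContent(frameId);
        frame.appendChild(id);
        doc.appendChild(frame);
        return frame;
    }

    // Appends one SLOT with an SValue and, if given, an Annotation.
    // The "name" attribute is an assumption used to tell slots apart.
    static Element addSlot(Document doc, Element frame, String slotName,
                           String sValue, String annotation) {
        Element slot = doc.createElement("SLOT");
        slot.setAttribute("name", slotName);
        Element value = doc.createElement("SValue");
        value.setTextContent(sValue);
        slot.appendChild(value);
        if (annotation != null) { // Annotation has minOccurs="0", so it is optional
            Element annot = doc.createElement("Annotation");
            annot.setTextContent(annotation);
            slot.appendChild(annot);
        }
        frame.appendChild(slot);
        return slot;
    }

    // Builds a small made-up frame with two slots, one of them annotated.
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element frame = newFrame(doc, "RXN-EXAMPLE");
            addSlot(doc, frame, "LEFT", "ATP", "COEFFICIENT 1");
            addSlot(doc, frame, "RIGHT", "ADP", null);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = sample();
        System.out.println(doc.getElementsByTagName("SLOT").getLength() + " slots");
    }
}
```

A schema-driven binding such as JAXB would generate equivalent classes from the XSD instead of building the tree by hand; the manual DOM version is used here only to keep the example free of external libraries.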

Cyclone: BioCyc FRS into XML schema

Cyclone: GeneralizedReactions FRS into XML schema

Cyclone: sample of a frame in XML

[Figure: XML sample of a frame, with callouts for PGDB, Frame, slot for inheritance management, Slot / SValue, and Annot / AValue.]

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture:

• growth medium
• mutant strain
• growth phenotype

BioCyc's existing database schema

Diagram groupings: frame model representation storage; managing multi-user concurrent editing; history of edits.
Foreign-key relations on the diagram: FK_REFERENCE_1 to FK_REFERENCE_6.

char_slot_values
  kb           INT(4)
  instance     VARCHAR(80) BINARY
  slotname     VARCHAR(80) BINARY
  slotvalue    VARCHAR(100)
  sequence     INT(4)
  key markers, in diagram order: <pk,fk> <pk> <pk,fk> <pk>

numeric_slot_values
  kb           INT(4)
  instance     VARCHAR(80) BINARY
  slotname     VARCHAR(80) BINARY
  slotvalue    NUMERIC(20,10)
  sequence     INT(4)
  key markers, in diagram order: <pk,fk> <pk> <pk,fk> <pk>

large_slot_values
  kb           INT(4)
  instance     VARCHAR(80) BINARY
  slotname     VARCHAR(80) BINARY
  slotvalue    TEXT
  sequence     INT(4)
  key markers, in diagram order: <pk,fk> <pk> <pk,fk> <pk>

frames
  kb           INT(4)
  frame        VARCHAR(80) BINARY
  type         BIT
  definition   TEXT
  sequence     INT(4)
  numparents   INT(4)
  tid          INT(4)
  key markers, in diagram order: <pk> <pk> <pk>

class_supers
  kb           INT(4)               <pk>
  class        VARCHAR(80) BINARY   <pk>
  super        VARCHAR(80) BINARY   <pk>

instance_types
  kb           INT(4)               <pk>
  instance     VARCHAR(80) BINARY   <pk>
  type         VARCHAR(80) BINARY   <pk>

kbs
  kbid         INT(4)
  kbname       VARCHAR(20)
  kbdata       VARCHAR(255)
  lock_session INT(4)
  lock_time    DATETIME
  lock_user    VARCHAR(20)

logtable
  kbid         INT(4)
  tid          INT(4)
  opnno        INT(4)
  seqno        INT(4)
  frame        VARCHAR(80)
  logrec       VARCHAR(255)
  key markers, in diagram order: <pk,fk1,fk2> <pk,fk1,fk2> <pk> <pk> <fk2>

txntable
  kbid             INT(4)
  tid              INT(4)
  userid           VARCHAR(40)
  nopns            INT(4)
  begin_timestamp  INT(4) UNSIGNED
  merge_timestamp  INT(4) UNSIGNED
  key markers, in diagram order: <pk> <pk>

deleted_frames
  kbid         INT(4)               <pk,fk>
  frame        VARCHAR(80) BINARY   <pk>
  tid          INT(4)               <pk,fk>

indexed_slots
  kb           INT(4)               <pk>
  slotname     VARCHAR(80) BINARY   <pk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no tables linking to each biological concept
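To make the last drawback concrete, here is a minimal sketch of what "find the enzymes inhibited by ATP" looks like when every fact is a generic (instance, slotname, slotvalue) row, as in char_slot_values. The classes `SlotRow` and `GenericSchemaSketch` and the sample rows are hypothetical; the slot names INHIBITORS-ALL and ENZYME are the ones used in the PerlCyc and JavaCyc examples. The point is that the same table must be scanned twice (a self-join in SQL terms), because nothing in the schema says what an enzymatic reaction is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory mirror of one char_slot_values row.
class SlotRow {
    final int kb;
    final String instance;
    final String slotname;
    final String slotvalue;
    final int sequence;

    SlotRow(int kb, String instance, String slotname, String slotvalue, int sequence) {
        this.kb = kb;
        this.instance = instance;
        this.slotname = slotname;
        this.slotvalue = slotvalue;
        this.sequence = sequence;
    }
}

class GenericSchemaSketch {

    // Pass 1 collects the instances whose INHIBITORS-ALL slot holds the compound;
    // pass 2 reads the ENZYME slot of those same instances.
    static List<String> enzymesInhibitedBy(List<SlotRow> rows, String compound) {
        Set<String> inhibited = new HashSet<>();
        for (SlotRow r : rows) {
            if (r.slotname.equals("INHIBITORS-ALL") && r.slotvalue.equals(compound)) {
                inhibited.add(r.instance);
            }
        }
        List<String> enzymes = new ArrayList<>();
        for (SlotRow r : rows) {
            if (r.slotname.equals("ENZYME") && inhibited.contains(r.instance)) {
                enzymes.add(r.slotvalue);
            }
        }
        return enzymes;
    }

    // Made-up rows for three enzymatic reactions.
    static List<SlotRow> sampleRows() {
        return Arrays.asList(
                new SlotRow(1, "ENZRXN-1", "ENZYME", "PROT-A", 0),
                new SlotRow(1, "ENZRXN-1", "INHIBITORS-ALL", "ATP", 0),
                new SlotRow(1, "ENZRXN-2", "ENZYME", "PROT-B", 0),
                new SlotRow(1, "ENZRXN-2", "INHIBITORS-ALL", "ADP", 0),
                new SlotRow(1, "ENZRXN-3", "ENZYME", "PROT-C", 0),
                new SlotRow(1, "ENZRXN-3", "INHIBITORS-ALL", "ATP", 0));
    }

    public static void main(String[] args) {
        System.out.println(enzymesInhibitedBy(sampleRows(), "ATP"));
    }
}
```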

From FRS to Java classes

[Diagram. BioCyc side: the generic FRS model (PGDB, FRAME, SLOT, S-VALUE, FACET, ANNOTATION, A-VALUE). Cyclone side: classes such as BioCyc, Generalized Reactions, PolymerSegments, Chemicals, …, with a Left slot whose Value carries the annotations Compartment and Coefficient.]
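One way to read the diagram is that a slot value together with its annotations becomes a small typed object. The sketch below shows that idea for the Left slot with its Compartment and Coefficient annotations. The class names `LeftValue` and `GeneralizedReactionSketch`, the field types and the sample data are assumptions for illustration; the classes actually generated by Cyclone may look different.

```java
import java.util.ArrayList;
import java.util.List;

// One value of the Left slot; the two annotations become plain fields.
class LeftValue {
    final String value;        // the slot value, e.g. a compound frame id
    final String compartment;  // Compartment annotation
    final int coefficient;     // Coefficient annotation

    LeftValue(String value, String compartment, int coefficient) {
        this.value = value;
        this.compartment = compartment;
        this.coefficient = coefficient;
    }
}

// Hypothetical typed view of a reaction frame.
class GeneralizedReactionSketch {
    final String frame;
    final List<LeftValue> left = new ArrayList<>();

    GeneralizedReactionSketch(String frame) {
        this.frame = frame;
    }

    // Adds one Left value and returns this reaction so calls can be chained.
    GeneralizedReactionSketch addLeft(String value, String compartment, int coefficient) {
        left.add(new LeftValue(value, compartment, coefficient));
        return this;
    }

    // Sum of the stoichiometric coefficients on the left-hand side.
    int totalLeftCoefficient() {
        int total = 0;
        for (LeftValue v : left) {
            total += v.coefficient;
        }
        return total;
    }

    public static void main(String[] args) {
        GeneralizedReactionSketch rxn = new GeneralizedReactionSketch("RXN-EXAMPLE")
                .addLeft("ATP", "cytosol", 1)
                .addLeft("WATER", "cytosol", 2);
        System.out.println(rxn.frame + " left coefficient total: " + rxn.totalLeftCoefficient());
    }
}
```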

BioCyc and Cyclone's architecture

[Diagram. Labels on the BioCyc side: Ocelot, Oracle, MySQL; Pathway Tools, Generic Frame Protocol, Lisp; X server, web server; PerlCyc, JavaCyc. Labels on the Cyclone side: XML, JAXB, HyperJaxb, Hibernate, MySQL and other databases (DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …). Applications: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, MaGe, …; graph analysis, cluster analysis, flux balance analysis. Legend: made in Genoscope; software tools; data files.]

Cyclone's architecture

[Diagram. Cyclone components: Schema Generator, BioCyc (Un)marshaller, ObjectModel, XmlTools. Applications: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, MaGe, …; graph analysis, cluster analysis, flux balance analysis. Legend: Genoscope; software tools; data files.]

True query in OQL

    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            // Named parameters :organism and :inhibitor are bound below.
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where er.ORGANISM.VALUE like :organism "
                + "and :inhibitor = inhibitors.VALUE");
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            // Print the frame and the first enzyme of each matching reaction.
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }


Benefits of using Cyclone: query in OQL (Object Query Language)

PerlCyc: find all enzymes for which ATP is an inhibitor

    use perlcyc;
    my $cyc = perlcyc -> new("ECOLI");
    my @enzrxns = $cyc -> get_class_all_instances("|Enzymatic-Reactions|");
    # We check every instance of the class
    foreach my $er (@enzrxns) {
        # We test whether the INHIBITORS-ALL slot contains the compound frame ATP
        my $bool = $cyc -> member_slot_value_p($er, "Inhibitors-All", "Atp");
        if ($bool) {
            # Whenever the test succeeds, we collect the value of the slot ENZYME.
            # The results are printed in the terminal.
            my $enz = $cyc -> get_slot_value($er, "Enzyme");
            print STDOUT "$enz\n";
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

JavaCyc: find all enzymes for which ATP is an inhibitor

    import java.util.*;

    public class JavacycSample {
        public static void main(String[] args) {
            Javacyc cyc = new Javacyc("ECOLI");
            ArrayList enzrxns = cyc.getClassAllInstances("|Enzymatic-Reactions|");
            // Check every instance of the class
            for (int i = 0; i < enzrxns.size(); i++) {
                String er = (String) enzrxns.get(i);
                boolean bool = cyc.memberSlotValueP(er, "Inhibitors-All", "Atp");
                if (bool) {
                    String enz = cyc.getSlotValue(er, "Enzyme");
                    System.out.println(enz);
                }
            }
        }
    }

Benefits of using Cyclone: query in OQL (Object Query Language)

BioWarehouse: find all enzymes for which ATP is an inhibitor

    SELECT DISTINCT DBID.xid
    FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,
         EnzReactionInhibitorActivator
    WHERE DataSet.name = 'EcoCyc'
      AND DataSet.wid = EnzymaticReaction.datasetwid
      AND EnzymaticReaction.proteinwid = Protein.wid
      AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid
      AND EnzReactionInhibitorActivator.compoundwid = Chemical.wid
      AND EnzReactionInhibitorActivator.inhibitoractivate = 'I'
      AND Chemical.name = 'ATP'
      AND DBID.otherwid = Protein.wid

Benefits of using Cyclone: query in OQL (Object Query Language)

Cyclone: find all enzymes for which ATP is an inhibitor

    SELECT er FROM ENZYMATICREACTIONS er
    LEFT JOIN er.INHIBITORSALLInternal inhibitors
    WHERE er.ORGANISM.VALUE LIKE :organism
      AND 'Atp' = inhibitors.VALUE

HQL is fully object-oriented, understanding notions like inheritance, polymorphism and association.
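The object navigation that the Cyclone query expresses (filter on the organism, join on the inhibitors, return the matching reactions) can be mimicked on plain Java objects. The sketch below is only an in-memory analogy, not Hibernate code: `EnzymaticReactionSketch`, `ObjectQuerySketch` and the sample data are hypothetical, and the LIKE test is simplified to an exact match.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a mapped class; the real Cyclone classes are generated.
class EnzymaticReactionSketch {
    final String frame;
    final String organism;
    final String enzyme;
    final List<String> inhibitorsAll;

    EnzymaticReactionSketch(String frame, String organism, String enzyme, String... inhibitors) {
        this.frame = frame;
        this.organism = organism;
        this.enzyme = enzyme;
        this.inhibitorsAll = Arrays.asList(inhibitors);
    }
}

class ObjectQuerySketch {

    // Returns the distinct enzymes of the reactions that belong to the organism
    // and list the given compound among their inhibitors.
    static List<String> enzymesInhibitedBy(List<EnzymaticReactionSketch> reactions,
                                           String organism, String inhibitor) {
        return reactions.stream()
                .filter(er -> er.organism.equals(organism))         // organism condition (exact match here)
                .filter(er -> er.inhibitorsAll.contains(inhibitor)) // join on the inhibitors association
                .map(er -> er.enzyme)
                .distinct()
                .collect(Collectors.toList());
    }

    // Made-up reactions from two organisms.
    static List<EnzymaticReactionSketch> sample() {
        return Arrays.asList(
                new EnzymaticReactionSketch("ENZRXN-1", "ecoli", "PROT-A", "ATP", "ADP"),
                new EnzymaticReactionSketch("ENZRXN-2", "ecoli", "PROT-B", "ADP"),
                new EnzymaticReactionSketch("ENZRXN-3", "acineto", "PROT-C", "ATP"));
    }

    public static void main(String[] args) {
        System.out.println(enzymesInhibitedBy(sample(), "ecoli", "ATP"));
    }
}
```

With an object-relational mapping, the same navigation is written once in the query language and translated to SQL joins by the mapping layer, which is the benefit the slide is pointing at.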

Comparative Matrix

                               BioCyc                   BioWarehouse       Cyclone
Representation system          FRS                      Relational model   Object model
Storing system                 Ocelot, Oracle, MySQL    MySQL              MySQL, Oracle, Sybase, PostgreSQL, …
Query                          SQL, Lisp (Perl, Java)   SQL                OQL, SQL, Java
API                            Lisp                     -                  Java
BioCyc compatibility           -                        not full           full (- class)
Import / Export to BioCyc      -                        Yes / No           Yes / Yes
Independent of Pathway Tools   No                       Yes                Yes

What can you do with Cyclone? Bioinformatics applications

• Easy data extraction from BioCyc
  – to mix them with different types of data (clusters of transcription units, pathways, complexes, RegulonDB)
  – to build FBA models by
    • grouping together pathways for map design in Flux Analyser
    • assembling elementary bricks of the metabolic network based on reactions, genes and compounds found in BioCyc

• Link BioCyc easily to new systems
  – example: Genoscope's LIMS uses the BioCyc compounds repository as a language of media representation
  – algorithm for inference of protein complexes

• Modify or implement new housekeeping functions in a PGDB
  – add or update data from a given annotation database (e.g. MultiFun, cellular localization of gene products, PubMed links); currently not feasible with the PathoLogic format
  – delete or annotate pathways known not to exist, and keep that information across new BioCyc updates

Ongoing work - Roadmap

• Extend the mapping rules between FRS and Java objects

• Future work: a software platform dedicated to FBA models
  – Easy management of FBA models (define reactions, objective functions, …)
  – Convenient handling of environments
  – Extensible perturbation framework (systematic mutants)
  – Pluggable analysis modules (topological analyses, phenotype predictions, …)



FRAME

SLOT S VALUE

FACET ANNOTATION A VALUE

A Value

Cyclone

BioCyc

Value

hellip supplementaryslides

38Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc and Cyclonersquos architecture

Cyclone

MySQL

Flux Analyser

JungCytoscape Modulo

MetabolicModel

hellip

+

Hibernate

Xml

Jaxb

+MaGe

GraphAnalysis

ClusterAnalysis

Flux balanceAnalysisMade in Genoscope

Software tools

Data Files

Ocelot Oracle MySQL

Pathway-toolsGeneric Frame Protocole

Xserver

Webserver

PerlCyc

JavaCyc

Lisp

DB2 PostgreSQLOracle Sybase HSQLMicrosoft SQL server hellip

HyperJaxb

supplementaryslides

39Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclonersquos architecture

GenoscopeSoftware toolsData Files

Schema Generator

BioCyc(Un)marshaller

ObjectModel

XmlTools

Cyclone

Flux Analyser

JungCytoscape Modulo

MetabolicModel

hellip+ MaGe

GraphAnalysis

ClusterAnalysis

Flux balanceAnalysis

40Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

True query in OQL

public List getAllErInhibitedBy(String org String inhibitor) try

Query hql = HibernateUtilgetSession(keySession)createQuery(selectdistinct er from

frcnsgenoscopenemohypercycimplENZYMATICREACTIONSImpl er + join erINHIBITORSALLInternal inhibitors + where +ldquo erORGANISMVALUE like organism and +ldquo inhibitor = inhibitorsVALUE +ldquo )

hqlsetString(organism org)hqlsetString(inhibitor inhibitor)return hqllist()

catch (HibernateException e) eprintStackTrace()

return null

41Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Java APIpackage frcnsgenoscopenemohypercyccycsacineto

public class Acineto public static void main(String[] args)

ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO(hypercyc)List r = erdaogetAllErInhibitedBy(ecoliATP)ENZYMATICREACTIONS erfor(int i=0 i lt rsize() i++)

er = (ENZYMATICREACTIONS)rget(i)Systemoutprintln(

ergetFRAME()+t+((ENZYME)ergetENZYME()get(0))getVALUE())

Page 16: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

Benefits of using Cyclone: query in OQL (Object Query Language)

    SELECT DISTINCT DBID.xid
    FROM DBID, Protein, EnzymaticReaction, Chemical, DataSet,
         EnzReactionInhibitorActivator
    WHERE DataSet.name = 'EcoCyc'
      AND DataSet.wid = EnzymaticReaction.datasetwid
      AND EnzymaticReaction.proteinwid = Protein.wid
      AND EnzymaticReaction.wid = EnzReactionInhibitorActivator.enzymaticreactionwid
      AND EnzReactionInhibitorActivator.compoundwid = Chemical.wid
      AND EnzReactionInhibitorActivator.inhibitoractivate = 'I'
      AND Chemical.name = 'ATP'
      AND DBID.otherwid = Protein.wid

Biowarehouse: find all enzymes for which ATP is an inhibitor

Benefits of using Cyclone: query in OQL (Object Query Language)

    SELECT er FROM ENZYMATICREACTIONS er
    LEFT JOIN er.INHIBITORSALLInternal inhibitors
    WHERE er.ORGANISM.VALUE LIKE :organism
      AND 'Atp' = inhibitors.VALUE

Cyclone: find all enzymes for which ATP is an inhibitor

HQL is fully object-oriented, understanding notions like inheritance, polymorphism and association.
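To make the object-oriented reading of this query concrete, here is a minimal sketch of what the join amounts to when the same objects are navigated in memory. The class `EnzymaticReaction`, its fields and the sample frame ids are hypothetical stand-ins, not Cyclone's generated classes, and a plain equality test stands in for `LIKE`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class InhibitorQuerySketch {

    /** Hypothetical stand-in for a generated enzymatic-reaction class. */
    static final class EnzymaticReaction {
        final String frame;
        final String organism;
        final List<String> inhibitors;

        EnzymaticReaction(String frame, String organism, String... inhibitors) {
            this.frame = frame;
            this.organism = organism;
            this.inhibitors = Arrays.asList(inhibitors);
        }
    }

    /** Returns the frame ids of the organism's reactions that list the inhibitor. */
    static List<String> inhibitedBy(List<EnzymaticReaction> all, String organism, String inhibitor) {
        List<String> result = new ArrayList<>();
        for (EnzymaticReaction er : all) {
            // Following the association replaces the explicit join on foreign keys.
            if (er.organism.equals(organism) && er.inhibitors.contains(inhibitor)) {
                result.add(er.frame);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<EnzymaticReaction> all = Arrays.asList(
            new EnzymaticReaction("ENZRXN-1", "ecoli", "ATP", "CTP"),
            new EnzymaticReaction("ENZRXN-2", "ecoli", "NADH"),
            new EnzymaticReaction("ENZRXN-3", "acineto", "ATP"));
        System.out.println(inhibitedBy(all, "ecoli", "ATP"));
    }
}
```

In HQL the same navigation is written declaratively and executed by the database, so the objects do not all have to be loaded first.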

Comparative Matrix

                              Cyclone                               Biowarehouse      BioCyc
Representation system         Object model                          Relational model  FRS
Storing system                MySQL, Oracle, Sybase, PostgreSQL, …  MySQL             MySQL, Oracle, Ocelot
Query                         OQL, SQL, Java                        SQL               SQL, Lisp (Perl, Java)
API                           Java                                  -                 Lisp
BioCyc compatibility          full (- class)                        not full          -
Import / Export to BioCyc     Yes / Yes                             Yes / No          -
Independent of Pathway Tools  Yes                                   Yes               No

What can you do with Cyclone? Bioinformatics applications

• Easy data extraction from BioCyc
  – to mix them with different types of data (clusters of transcription units, pathways, complexes, RegulonDB)
  – to build FBA models by
    • grouping pathways together for map design in Flux Analyser
    • assembling elementary bricks of the metabolic network based on reactions, genes and compounds found in BioCyc

• Link BioCyc easily to new systems
  – example: Genoscope's LIMS uses the BioCyc compounds repository as a language for media representation
  – algorithm for inferring protein complexes

• Modify or implement new housekeeping functions in a PGDB
  – add or update data from a given annotation database (e.g. MultiFun, cellular localization of gene products, PubMed links); currently not feasible with the PathoLogic format
  – delete or annotate pathways known not to exist, and keep that information unless a new BioCyc update changes it

Ongoing work - Roadmap

• Extend the mapping rules between FRS and Java objects

• Future work: a software platform dedicated to FBA models
  – Easy management of FBA models (define reactions, objective functions…)
  – Convenient handling of environments
  – Extensible perturbation framework (systematic mutants)
  – Pluggable analysis modules (topological analyses, phenotype predictions…)

Open questions

• Are you convinced of the interest of Cyclone?
  – access BioCyc objects with Java
  – store your PGDBs in your favorite RDBMS
  – manipulate classes representing biological concepts
  – use an efficient object-relational mapping (Hibernate)
  – …

• If yes to at least one: how should Cyclone be delivered?
  – jar files
  – Eclipse project
  – Eclipse plugin
  – open a new SourceForge project

Supplementary Slides

Cyclone

Biowarehouse: another solution to query BioCyc data

Pipeline for the creation of Cyclone

Language  Manipulated concepts
XSD       ComplexType, Element, Attribute
JAVA      Class, Attribute, Association
SQL       Table, Row, Column
LISP      Frame, Slot, Annotation

Application: complex inference

Application: extraction of graphs

Application: extraction of gene clusters by pathway

[Figure: pathway extracted from AcinetoCyc; zoom on the genes involved in this pathway]

Cyclone: rules for mapping the world of XML Schema to the Java world: JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb

Cyclone: rules for mapping the Java world to the SQL world: HIBERNATE

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom - including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.

Cyclone: pipeline for the creation of …

[Diagram components: FRS (BioCyc), JavaCyc, XML, XSD, JAXB, HyperJaxb, Hibernate, SQL, Cyclone. Legend: made in Genoscope; software tools; data files. The step markers 0, 1a, 1b, 2a, 2b and X appear on the diagram, but their placement is not recoverable.]

Steps named on the diagram:
  – Extension of JavaCyc: getAnnot()
  – Query of the PGDB to build an XML schema
  – From XSD to Java classes
  – From Java classes to Hibernate mapping
  – From Hibernate mapping to SQL design
  – Import all frames of a PGDB
  – 1. Populate the MySQL database and Java classes
  – 2. Upload: save it in MySQL
  – 3. Querying & modification of your PGDB
  – 4. Upload your Cyclone into BioCyc
  – 4a. Extension of JavaCyc: createFrame()

Author's note (FLF): caution, I am mixing up the pipeline side and the user side.

Cyclone: rules for mapping the world of frames to the world of XML Schema

    (FRAMEID NIL ((SLOT SValue)
                  ((SLOT SValue Annotation AValue))))

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
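As an illustration of this mapping rule, the sketch below serializes one frame into XML of the shape the two complex types describe: a FRAMEID element followed by one element per slot, each holding an SValue and an optional Annotation. The class `FrameXmlSketch`, the choice of using the slot name as the element name, and the sample frame are assumptions made for the example; this is not Cyclone's actual marshaller.

```java
import java.util.Arrays;
import java.util.List;

final class FrameXmlSketch {

    /** One slot value; the annotation is optional, mirroring minOccurs="0". */
    static final class Slot {
        final String name;
        final String value;
        final String annotation; // null when the slot value carries no annotation

        Slot(String name, String value, String annotation) {
            this.name = name;
            this.value = value;
            this.annotation = annotation;
        }
    }

    /** Escapes the characters that may not appear literally in XML text. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Writes FRAMEID first, then one element per slot, in list order. */
    static String toXml(String frameClass, String frameId, List<Slot> slots) {
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(frameClass).append('>');
        sb.append("<FRAMEID>").append(escape(frameId)).append("</FRAMEID>");
        for (Slot s : slots) {
            sb.append('<').append(s.name).append('>');
            sb.append("<SValue>").append(escape(s.value)).append("</SValue>");
            if (s.annotation != null) {
                sb.append("<Annotation>").append(escape(s.annotation)).append("</Annotation>");
            }
            sb.append("</").append(s.name).append('>');
        }
        sb.append("</").append(frameClass).append('>');
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up frame with one annotated and one plain slot value.
        List<Slot> slots = Arrays.asList(
            new Slot("LEFT", "ATP", "COEFFICIENT=1"),
            new Slot("RIGHT", "ADP", null));
        System.out.println(toXml("FRAME-CLASS", "RXN-1", slots));
    }
}
```

Because the annotation is optional in the schema, the serializer simply omits the element when no annotation is present.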

Cyclone: BioCyc FRS into XML Schema

Cyclone: GeneralizedReactions FRS into XML Schema

Cyclone: sample of a frame in XML

[Figure labels: PGDB; Frame; Slot for inheritance management; Slot-SValue; Annot-AValue]

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture
• growth medium
• mutant strain
• growth phenotype

BioCyc's existing database schema

[Diagram groups: Frame model representation storage; Managing multi-user concurrent editing; History of edits. Relationship labels: FK_REFERENCE_1 to FK_REFERENCE_6. Key markers are listed in extraction order; where there are fewer markers than columns, the column assignment is not recoverable.]

char_slot_values
  kb         INT(4)
  instance   VARCHAR(80) BINARY
  slotname   VARCHAR(80) BINARY
  slotvalue  VARCHAR(100)
  sequence   INT(4)
  keys: <pk,fk> <pk> <pk,fk> <pk>

numeric_slot_values
  kb         INT(4)
  instance   VARCHAR(80) BINARY
  slotname   VARCHAR(80) BINARY
  slotvalue  NUMERIC(20,10)
  sequence   INT(4)
  keys: <pk,fk> <pk> <pk,fk> <pk>

large_slot_values
  kb         INT(4)
  instance   VARCHAR(80) BINARY
  slotname   VARCHAR(80) BINARY
  slotvalue  TEXT
  sequence   INT(4)
  keys: <pk,fk> <pk> <pk,fk> <pk>

frames
  kb          INT(4)
  frame       VARCHAR(80) BINARY
  type        BIT
  definition  TEXT
  sequence    INT(4)
  numparents  INT(4)
  tid         INT(4)
  keys: <pk> <pk> <pk>

class_supers
  kb     INT(4)
  class  VARCHAR(80) BINARY
  super  VARCHAR(80) BINARY
  keys: <pk> <pk> <pk>

instance_types
  kb        INT(4)
  instance  VARCHAR(80) BINARY
  type      VARCHAR(80) BINARY
  keys: <pk> <pk> <pk>

indexed_slots
  kb        INT(4)
  slotname  VARCHAR(80) BINARY
  keys: <pk> <pk>

kbs
  kbid          INT(4)
  kbname        VARCHAR(20)
  kbdata        VARCHAR(255)
  lock_session  INT(4)
  lock_time     DATETIME
  lock_user     VARCHAR(20)

logtable
  kbid    INT(4)
  tid     INT(4)
  opnno   INT(4)
  seqno   INT(4)
  frame   VARCHAR(80)
  logrec  VARCHAR(255)
  keys: <pk,fk1,fk2> <pk,fk1,fk2> <pk> <pk> <fk2>

txntable
  kbid             INT(4)
  tid              INT(4)
  userid           VARCHAR(40)
  nopns            INT(4)
  begin_timestamp  INT(4) UNSIGNED
  merge_timestamp  INT(4) UNSIGNED
  keys: <pk> <pk>

deleted_frames
  kbid   INT(4)
  frame  VARCHAR(80) BINARY
  tid    INT(4)
  keys: <pk,fk> <pk> <pk,fk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no tables linking to each biological concept
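The last drawback can be illustrated with a small sketch. With only generic slot-value rows, a biological question has to be phrased as string matching on slot names, and following a relation takes a second pass over the same table (a self-join in SQL). The class `SlotTableSketch`, the slot names `INHIBITORS-ALL` and `ENZYME`, and the rows are hypothetical; the `kb` and `sequence` columns are left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SlotTableSketch {

    /** One row of a char_slot_values-like table. */
    static final class SlotRow {
        final String instance;
        final String slotname;
        final String slotvalue;

        SlotRow(String instance, String slotname, String slotvalue) {
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
        }
    }

    /** Instances whose given slot holds the given value. */
    static List<String> instancesWith(List<SlotRow> rows, String slotname, String slotvalue) {
        List<String> result = new ArrayList<>();
        for (SlotRow r : rows) {
            if (r.slotname.equals(slotname) && r.slotvalue.equals(slotvalue)) {
                result.add(r.instance);
            }
        }
        return result;
    }

    /** Second pass over the same rows to follow the ENZYME slot of each match. */
    static List<String> enzymesInhibitedBy(List<SlotRow> rows, String inhibitor) {
        List<String> enzymes = new ArrayList<>();
        for (String er : instancesWith(rows, "INHIBITORS-ALL", inhibitor)) {
            for (SlotRow r : rows) {
                if (r.instance.equals(er) && r.slotname.equals("ENZYME")) {
                    enzymes.add(r.slotvalue);
                }
            }
        }
        return enzymes;
    }

    public static void main(String[] args) {
        // Made-up rows, for illustration only.
        List<SlotRow> rows = Arrays.asList(
            new SlotRow("ENZRXN-1", "INHIBITORS-ALL", "ATP"),
            new SlotRow("ENZRXN-1", "ENZYME", "PROT-A"),
            new SlotRow("ENZRXN-2", "INHIBITORS-ALL", "NADH"),
            new SlotRow("ENZRXN-2", "ENZYME", "PROT-B"));
        System.out.println(enzymesInhibitedBy(rows, "ATP"));
    }
}
```

Nothing in the table structure says that a row describes an enzymatic reaction or an enzyme; that knowledge lives only in the slot-name strings.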

From FRS to Java classes

[Diagram labels: BioCyc; Cyclone; Generalized Reactions; PolymerSegments; Chemicals; …; Left; Compartment; Coefficient; PGDB, FRAME, SLOT; SValue, Annotation; FRAME; SLOT, S VALUE; FACET, ANNOTATION, A VALUE; A Value; Value; …]

BioCyc and Cyclone's architecture

[Diagram labels.
 BioCyc: Ocelot, Oracle, MySQL; Pathway Tools, Generic Frame Protocol; X server; Web server; Lisp; PerlCyc; JavaCyc.
 Cyclone: XML; JAXB; HyperJaxb; Hibernate; MySQL, DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …
 Applications: MaGe; Flux Analyser; Jung; Cytoscape; Modulo; MetabolicModel; graph analysis; cluster analysis; flux balance analysis; …
 Legend: made in Genoscope; software tools; data files.]

Cyclone's architecture

[Diagram labels.
 Cyclone: Schema Generator; BioCyc (Un)marshaller; ObjectModel; XmlTools.
 Applications: MaGe; Flux Analyser; Jung; Cytoscape; Modulo; MetabolicModel; graph analysis; cluster analysis; flux balance analysis; …
 Legend: Genoscope; software tools; data files.]

True query in OQL

    import java.util.List;
    import org.hibernate.HibernateException;
    import org.hibernate.Query;

    // Returns the enzymatic reactions of an organism that are inhibited by a given compound.
    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er"
                + " join er.INHIBITORSALLInternal inhibitors"
                + " where"
                + " er.ORGANISM.VALUE like :organism and"
                + " :inhibitor = inhibitors.VALUE");
            // Bind the named parameters instead of concatenating user input.
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            // Print each reaction frame with the first enzyme attached to it.
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }

Page 17: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

17Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Beneficits of using Cyclone Query in OQL (Object Query Language)

1 SELECT er FROM ENZYMATICREACTIONS er2 LEFT JOIN erINHIBITORSALLInternal inhibitors3 WHERE erORGANISMVALUE LIKE organism

AND 4 Atp = inhibitorsVALUE

Cyclone Find all enzym for which ATP is an inhibitor

HQL is fully object-oriented understanding notions like inheritence polymorphism and association

18Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Comparative Matrix

YesYesNoIndependent of Pathway tools

Yes YesYes No-Import Export to BioCyc

full (- class)not full-BioCycCompatibilty

Java-LispAPI

OQL SQLJava

SQLSQLLisp

(perljava)Query

mySQL Oracle Sybase PostgreSQL hellip

mySQLmySQLoracle

OcelotStoring System

Object modelRelational modelFRSRepresentationSystem

CycloneBiowarehouseBioCyc

19Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What can you do with Cyclone Bioinformatics applications

bull Easy data extraction from BioCycndash to mix them with different types of data (clusters of transcription units pathways

complexes regulon db)ndash to build FBA models by

bull grouping together Pathways for map design in Flux Analyserbull assemble elementary bricks of the metabolic network based on reactions genes

compounds found in BioCyc

bull Link easily Biocyc to new systems ndash example Genoscopersquos LIMS use BioCyc compounds repository as a language

of media representationndash algorithm for inferrence protein complexes

bull Modify or implement new housekeeping functions in a PGDBndash add updates data from a given annotation database (eg multifun cellular

localization of genes products pubmed links) actualy not feasible with pathologic format

ndash delete annotate pathways known not to exist and keep those informationsunless new BioCyc updates

20Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ongoing work - Roadmap

bull Extends the mapping rule between FRS and Java Object

bull Future work a software plateform dedicated to FBA models ndash Easy management of FBA models (define reactions objective

functionshellip)ndash Convenient handling of environmentsndash Extensible perturbation framework (systematic mutants)ndash Pluggable analysis modules (topological analyses phenotype

predictionshellip)

21Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Open questions

bull Are you convince of this interest of Cyclonendash access with java to biocyc objectndash store your PGDBs in your favorite rdbmsndash manipulate classes representing biological conceptndash use efficient mapping object -relational (Hibernate)ndash hellip

bull If yes to at least one how to deliver Cyclonendash jar filesndash eclipse projectndash eclipse pluginndash open a new sourceforge project

22Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Supplementary Slides

Cyclone

supplementaryslides

23Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Biowarehouse another solution to query biocyc data

supplementaryslides

24Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Language

ManipulatedConcept

ComplexeTypeElementAttribut

XSD

ClassAttributAssociation

JAVA

TableRowColumn

SQL

FrameSlotAnnotation

LISP

Pipeline for the creation of Cyclone

supplementaryslides

25Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application complexe inference

supplementaryslides

26Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction of graph

supplementaryslides

27Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction de clusters genes by pathway

pathway extracted from AcinetoCyc

zoom on genes involved in this pathway

supplementaryslides

28Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of xml schema to java world JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bindan XML schema to a representation in Java code This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself

httpjavasuncomwebservicesjaxb

supplementaryslides

29Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of java world to sql world HIBERNATE

httpwwwhibernateorg

Hibernate is a powerful ultra-high performance objectrelational persistence and query service for Java Hibernate lets you develop persistent classes following common Java idiom - including association inheritance polymorphism composition and the Java collections framework Hibernate allows you to express queries in its own portable SQL extension (HQL) as well as in native SQL or with Java-based Criteria and Example objects Unlike many other persistence solutions Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always

supplementaryslides

30Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone pipeline for the creation of hellip

Cyclone

FRSBioCyc

XML

XSD

Hibernate

SQL

JaxB

HiperJaxB

JavaCyc1

3 4

1a

0

Extension of JavaCyc getAnnot()Query of PGDB to build a xml schemaFrom xsd to java classesFrom java classes to hibernate mappingFrom hibernate mapping to sql design

X

Import all Frame of a PGDB

1b

1 Populate the mysql db and java classes

2a

2b

2 Upload

Save it in MySql

3 Querying amp modification of your PGDB

4 Upload your Cyclone into BioCyc

4a Extension of JavaCyc createFrame()

FLF Attention je confond cote pipeline et user

Made in GenoscopeSoftware tools

Data Files

supplementaryslides

31Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of frame to the world of xml schema

(FRAMEID NIL ((SLOT SValue)((SLOT SValue Annotation Avalue))

)

ltxscomplexType name=ldquoFRAME-CLASSgtltxssequencegt

ltxselement name=FRAMEID type=xsstringgtltxselement name=ldquoSLOT type=ldquoSLOT-TYPE maxOccurs=unboundedgt

ltxssequencegtltxscomplexTypegt

ltxscomplexType name=ldquoSLOT-TYPEgtltxssequencegt

ltxselement name=ldquoSValue type=xsstringrdquo gtltxselement name=ldquoAnnotation type=xsstring minOccurs=0gt

ltxssequencegtltxscomplexTypegt

supplementaryslides

32Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone Biocyc FRS into xml schema

supplementaryslides

33Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone GeneralizedReactions FRS into xml schema

supplementaryslides

34Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone Sample of a frame in xml

PGDB

Frame

Slot for Inheritance management

Slot-SValue

Annot-AValue

supplementaryslides

35Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone Open discussion - extension of biocyc ontology to capture mutant phenotype

How can we capture

bull growth mediumbull mutant strainbull growth phenotype

supplementaryslides

36Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCycrsquos existing database schema

char_slot_values

kbinstanceslotnameslotvaluesequence

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARYVARCHAR(100)INT(4)

ltpkfkgtltpkgtltpkfkgtltpkgt

numeric_slot_values

kbinstanceslotnameslotvaluesequence

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARYNUMERIC(2010)INT(4)

ltpkfkgtltpkgtltpkfkgtltpkgt

large_slot_values

kbinstanceslotnameslotvaluesequence

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARYTEXTINT(4)

ltpkfkgtltpkgtltpkfkgt

ltpkgt

frames

kbframetypedefinitionsequencenumparentstid

INT(4)VARCHAR(80) BINARYBITTEXTINT(4)INT(4)INT(4)

ltpkgtltpkgt

ltpkgt

class_supers

kbclasssuper

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARY

ltpkgtltpkgtltpkgt

instance_types

kbinstancetype

INT(4)VARCHAR(80) BINARYVARCHAR(80) BINARY

ltpkgtltpkgtltpkgt

kbs

kbidkbnamekbdatalock_sessionlock_timelock_user

INT(4)VARCHAR(20)VARCHAR(255)INT(4)DATETIMEVARCHAR(20)

FK_REFERENCE_4

FK_REFERENCE_5

FK_REFERENCE_6

logtable

kbidtidopnnoseqnoframelogrec

INT(4)INT(4)INT(4)INT(4)VARCHAR(80)VARCHAR(255)

ltpkfk1fk2gtltpkfk1fk2gtltpkgtltpkgtltfk2gt

txntable

kbidtiduseridnopnsbegin_timestampmerge_timestamp

INT(4)INT(4)VARCHAR(40)INT(4)INT(4) UNSIGNEDINT(4) UNSIGNED

ltpkgtltpkgt

deleted_frames

kbidframetid

INT(4)VARCHAR(80) BINARYINT(4)

ltpkfkgtltpkgtltpkfkgt

Managing multiuser concurrent editingFrame Model Representation Storage

History of edition

FK_REFERENCE_1FK_REFERENCE_2FK_REFERENCE_3

indexed_slots

kbslotname

INT(4)VARCHAR(80) BINARY

ltpkgtltpkgt

bull fast data access bull multi-user editing bull acess to a transaction log of all PGDB edits

bull Donrsquot allow to query biocyc outside pathway-tools softwarebull No tables linking to each biology concept-

+

supplementaryslides

37Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

From FRS to java classes

Generalized ReactionsBioCyc

PolymerSegments

Chemicals

hellip

Left

CompartmentCoefficient

PGDB FRAME SLOT

SValueAnnotation

FRAME

SLOT S VALUE

FACET ANNOTATION A VALUE

A Value

Cyclone

BioCyc

Value

hellip supplementaryslides

38Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

BioCyc and Cyclonersquos architecture

Cyclone

MySQL

Flux Analyser

JungCytoscape Modulo

MetabolicModel

hellip

+

Hibernate

Xml

Jaxb

+MaGe

GraphAnalysis

ClusterAnalysis

Flux balanceAnalysisMade in Genoscope

Software tools

Data Files

Ocelot Oracle MySQL

Pathway-toolsGeneric Frame Protocole

Xserver

Webserver

PerlCyc

JavaCyc

Lisp

DB2 PostgreSQLOracle Sybase HSQLMicrosoft SQL server hellip

HyperJaxb

supplementaryslides

39Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclonersquos architecture

GenoscopeSoftware toolsData Files

Schema Generator

BioCyc(Un)marshaller

ObjectModel

XmlTools

Cyclone

Flux Analyser

JungCytoscape Modulo

MetabolicModel

hellip+ MaGe

GraphAnalysis

ClusterAnalysis

Flux balanceAnalysis

40Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

True query in OQL

public List getAllErInhibitedBy(String org String inhibitor) try

Query hql = HibernateUtilgetSession(keySession)createQuery(selectdistinct er from

frcnsgenoscopenemohypercycimplENZYMATICREACTIONSImpl er + join erINHIBITORSALLInternal inhibitors + where +ldquo erORGANISMVALUE like organism and +ldquo inhibitor = inhibitorsVALUE +ldquo )

hqlsetString(organism org)hqlsetString(inhibitor inhibitor)return hqllist()

catch (HibernateException e) eprintStackTrace()

return null

41Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Java APIpackage frcnsgenoscopenemohypercyccycsacineto

public class Acineto public static void main(String[] args)

ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO(hypercyc)List r = erdaogetAllErInhibitedBy(ecoliATP)ENZYMATICREACTIONS erfor(int i=0 i lt rsize() i++)

er = (ENZYMATICREACTIONS)rget(i)Systemoutprintln(

ergetFRAME()+t+((ENZYME)ergetENZYME()get(0))getVALUE())

Page 18: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

18Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Comparative Matrix

YesYesNoIndependent of Pathway tools

Yes YesYes No-Import Export to BioCyc

full (- class)not full-BioCycCompatibilty

Java-LispAPI

OQL SQLJava

SQLSQLLisp

(perljava)Query

mySQL Oracle Sybase PostgreSQL hellip

mySQLmySQLoracle

OcelotStoring System

Object modelRelational modelFRSRepresentationSystem

CycloneBiowarehouseBioCyc

19Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

What can you do with Cyclone Bioinformatics applications

bull Easy data extraction from BioCycndash to mix them with different types of data (clusters of transcription units pathways

complexes regulon db)ndash to build FBA models by

bull grouping together Pathways for map design in Flux Analyserbull assemble elementary bricks of the metabolic network based on reactions genes

compounds found in BioCyc

bull Link easily Biocyc to new systems ndash example Genoscopersquos LIMS use BioCyc compounds repository as a language

of media representationndash algorithm for inferrence protein complexes

bull Modify or implement new housekeeping functions in a PGDBndash add updates data from a given annotation database (eg multifun cellular

localization of genes products pubmed links) actualy not feasible with pathologic format

ndash delete annotate pathways known not to exist and keep those informationsunless new BioCyc updates

20Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Ongoing work - Roadmap

bull Extends the mapping rule between FRS and Java Object

bull Future work a software plateform dedicated to FBA models ndash Easy management of FBA models (define reactions objective

functionshellip)ndash Convenient handling of environmentsndash Extensible perturbation framework (systematic mutants)ndash Pluggable analysis modules (topological analyses phenotype

predictionshellip)

21Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Open questions

• Are you convinced of the interest of Cyclone?
  - access BioCyc objects with Java
  - store your PGDBs in your favorite RDBMS
  - manipulate classes representing biological concepts
  - use an efficient object-relational mapping (Hibernate)
  - …

• If yes to at least one, how should Cyclone be delivered?
  - jar files
  - Eclipse project
  - Eclipse plugin
  - open a new SourceForge project


Supplementary Slides

Cyclone


Biowarehouse: another solution to query BioCyc data


Pipeline for the creation of Cyclone

Language   Manipulated concepts
XSD        ComplexType, Element, Attribute
JAVA       Class, Attribute, Association
SQL        Table, Row, Column
LISP       Frame, Slot, Annotation


Application: complex inference


Application: graph extraction


Application: extraction of gene clusters by pathway

[Figure: a pathway extracted from AcinetoCyc, with a zoom on the genes involved in this pathway]


Cyclone: rules for mapping the XML Schema world to the Java world: JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb


Cyclone: rules for mapping the Java world to the SQL world: Hibernate

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom, including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.


Cyclone: pipeline for the creation of …

[Pipeline diagram. Components: FRS (BioCyc), JavaCyc, XML, XSD, JAXB, HyperJaxb, Hibernate, SQL, Cyclone. Step markers on the diagram: 0, 1, 1a, 1b, 2, 2a, 2b, 3, 4, 4a, X. Legend: made in Genoscope; software tools; data files.]

0. Extension of JavaCyc: getAnnot()
   - Query of the PGDB to build an XML schema
   - From XSD to Java classes
   - From Java classes to Hibernate mapping
   - From Hibernate mapping to SQL design
1. Populate the MySQL db and Java classes
   - Import all frames of a PGDB
2. Upload
   - Save it in MySQL
3. Querying & modification of your PGDB
4. Upload your Cyclone into BioCyc
   - 4a. Extension of JavaCyc: createFrame()

Note (FLF): careful, I am mixing up the pipeline side and the user side.


Cyclone: rules for mapping the world of frames to the world of XML Schema

    (FRAMEID NIL ((SLOT SValue)((SLOT SValue Annotation Avalue))
    )

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
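
To make the frame-to-XML rule more tangible, here is a minimal sketch that writes one frame as an XML instance shaped like FRAME-CLASS and SLOT-TYPE. It is an illustration only: the `FrameSketch` class, the root element name `FRAME` and the hand-written serialization are assumptions, whereas Cyclone itself relies on JAXB-generated classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory frame: an id plus slots, mirroring FRAME-CLASS / SLOT-TYPE. */
class FrameSketch {

    /** One slot value with an optional annotation (null when absent, as with minOccurs="0"). */
    static class Slot {
        final String value;
        final String annotation;

        Slot(String value, String annotation) {
            this.value = value;
            this.annotation = annotation;
        }
    }

    final String frameId;
    final List<Slot> slots = new ArrayList<Slot>();

    FrameSketch(String frameId) {
        this.frameId = frameId;
    }

    FrameSketch addSlot(String value, String annotation) {
        slots.add(new Slot(value, annotation));
        return this;
    }

    /** Escapes the characters that are special in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Serializes the frame; the root element name FRAME is an assumption. */
    String toXml() {
        StringBuilder sb = new StringBuilder();
        sb.append("<FRAME>\n");
        sb.append("  <FRAMEID>").append(escape(frameId)).append("</FRAMEID>\n");
        for (Slot s : slots) {
            sb.append("  <SLOT>\n");
            sb.append("    <SValue>").append(escape(s.value)).append("</SValue>\n");
            if (s.annotation != null) {
                sb.append("    <Annotation>").append(escape(s.annotation)).append("</Annotation>\n");
            }
            sb.append("  </SLOT>\n");
        }
        sb.append("</FRAME>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        FrameSketch frame = new FrameSketch("RXN-1")
                .addSlot("ATP", "cytosol")
                .addSlot("WATER", null);
        System.out.print(frame.toXml());
    }
}
```

The slide's schema fragment reads like a template, so in a real PGDB export each SLOT element would presumably carry the actual slot name.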


Cyclone: BioCyc FRS into XML schema


Cyclone: GeneralizedReactions FRS into XML schema


Cyclone: sample of a frame in XML

[Figure: XML sample of a frame, with callouts for PGDB, Frame, Slot for inheritance management, Slot - SValue, and Annot - AValue]


Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture:

• growth medium
• mutant strain
• growth phenotype


BioCyc's existing database schema

[Schema diagram. Group labels: managing multi-user concurrent editing; frame model representation storage; edit history. Relationship labels: FK_REFERENCE_1 to FK_REFERENCE_6. Key markers are listed in the order they appear on the slide.]

char_slot_values
  columns: kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue VARCHAR(100), sequence INT(4)
  key markers: <pk,fk>, <pk>, <pk,fk>, <pk>

numeric_slot_values
  columns: kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue NUMERIC(20,10), sequence INT(4)
  key markers: <pk,fk>, <pk>, <pk,fk>, <pk>

large_slot_values
  columns: kb INT(4), instance VARCHAR(80) BINARY, slotname VARCHAR(80) BINARY, slotvalue TEXT, sequence INT(4)
  key markers: <pk,fk>, <pk>, <pk,fk>, <pk>

frames
  columns: kb INT(4), frame VARCHAR(80) BINARY, type BIT, definition TEXT, sequencenum INT(4), parents INT(4), tid INT(4)
  key markers: <pk>, <pk>, <pk>

class_supers
  columns: kb INT(4), class VARCHAR(80) BINARY, super VARCHAR(80) BINARY
  key markers: <pk>, <pk>, <pk>

instance_types
  columns: kb INT(4), instance VARCHAR(80) BINARY, type VARCHAR(80) BINARY
  key markers: <pk>, <pk>, <pk>

indexed_slots
  columns: kb INT(4), slotname VARCHAR(80) BINARY
  key markers: <pk>, <pk>

kbs
  columns: kbid INT(4), kbname VARCHAR(20), kbdata VARCHAR(255), lock_session INT(4), lock_time DATETIME, lock_user VARCHAR(20)

logtable
  columns: kbid INT(4), tid INT(4), opnno INT(4), seqno INT(4), frame VARCHAR(80), logrec VARCHAR(255)
  key markers: <pk,fk1,fk2>, <pk,fk1,fk2>, <pk>, <pk>, <fk2>

txntable
  columns: kbid INT(4), tid INT(4), userid VARCHAR(40), nopns INT(4), begin_timestamp INT(4) UNSIGNED, merge_timestamp INT(4) UNSIGNED
  key markers: <pk>, <pk>

deleted_frames
  columns: kbid INT(4), frame VARCHAR(80) BINARY, tid INT(4)
  key markers: <pk,fk>, <pk>, <pk,fk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no tables corresponding to each biological concept
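
The absence of one table per biological concept means that a client has to regroup generic slot rows before it can reason about a frame. The sketch below illustrates that regrouping in memory. The `Row` class only mimics the column list of char_slot_values, and the class and method names are hypothetical; no database access is shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: regrouping generic slot-value rows into per-frame slot maps. */
class SlotRowsSketch {

    /** One row shaped like char_slot_values: (kb, instance, slotname, slotvalue, sequence). */
    static class Row {
        final int kb;
        final String instance;
        final String slotname;
        final String slotvalue;
        final int sequence;

        Row(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Groups rows as frame id -> slot name -> values, keeping the order in which rows arrive. */
    static Map<String, Map<String, List<String>>> group(List<Row> rows) {
        Map<String, Map<String, List<String>>> frames =
                new LinkedHashMap<String, Map<String, List<String>>>();
        for (Row r : rows) {
            Map<String, List<String>> slots = frames.get(r.instance);
            if (slots == null) {
                slots = new LinkedHashMap<String, List<String>>();
                frames.put(r.instance, slots);
            }
            List<String> values = slots.get(r.slotname);
            if (values == null) {
                values = new ArrayList<String>();
                slots.put(r.slotname, values);
            }
            values.add(r.slotvalue);
        }
        return frames;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<Row>();
        rows.add(new Row(1, "RXN-1", "LEFT", "ATP", 0));
        rows.add(new Row(1, "RXN-1", "LEFT", "WATER", 1));
        rows.add(new Row(1, "RXN-1", "RIGHT", "ADP", 0));
        System.out.println(group(rows));
    }
}
```

Every question about, say, a reaction's substrates goes through this kind of generic grouping, which is the gap a typed object model such as Cyclone's is meant to close.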


From FRS to Java classes

[Diagram comparing the BioCyc and Cyclone representations. Labels: FRAME, SLOT, S VALUE, FACET, ANNOTATION, A VALUE; PGDB, FRAME, SLOT, SValue, Annotation, A Value, Value. Example shown: Generalized Reactions, PolymerSegments, Chemicals, …; slot Left with Compartment and Coefficient.]


BioCyc and Cyclone's architecture

[Architecture diagram. Legend: made in Genoscope; software tools; data files.
BioCyc: Ocelot (Oracle, MySQL); Pathway Tools; Generic Frame Protocol; X server; Web server; Lisp; PerlCyc; JavaCyc.
Cyclone: XML; JAXB; HyperJaxb; Hibernate; MySQL (also DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …).
Applications: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, …, plus MaGe; graph analysis, cluster analysis, flux balance analysis.]


Cyclone's architecture

[Architecture diagram. Legend: Genoscope; software tools; data files.
Cyclone components: Schema Generator; BioCyc (Un)marshaller; Object Model; XML Tools.
Applications: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, …, plus MaGe; graph analysis, cluster analysis, flux balance analysis.]


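True query in OQL

    // Needs java.util.List, org.hibernate.Query and org.hibernate.HibernateException.
    // Returns the enzymatic reactions of an organism that are inhibited by a given compound.
    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where "
                + " er.ORGANISM.VALUE like :organism and "
                + " :inhibitor = inhibitors.VALUE ");
            // Bind the two named parameters of the query.
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }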


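Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;
    // ENZYMATICREACTIONSDAO, ENZYMATICREACTIONS and ENZYME are Cyclone classes;
    // their imports are not shown on the slide.

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            // Print each reaction frame with the value of its first enzyme.
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }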



Open questions

• Are you convinced of the interest of Cyclone?
  - access BioCyc objects with Java
  - store your PGDBs in your favorite RDBMS
  - manipulate classes representing biological concepts
  - use an efficient object-relational mapping (Hibernate)
  - ...

• If yes to at least one: how should Cyclone be delivered?
  - jar files
  - Eclipse project
  - Eclipse plugin
  - a new SourceForge project

Supplementary slides

BioWarehouse: another solution to query BioCyc data

Pipeline for the creation of Cyclone

Language   Manipulated concepts
XSD        ComplexType, Element, Attribute
JAVA       Class, Attribute, Association
SQL        Table, Row, Column
LISP       Frame, Slot, Annotation

Application: complex inference

Application: graph extraction

Application: extraction of gene clusters by pathway

[Figure: pathway extracted from AcinetoCyc, with a zoom on the genes involved in this pathway]

Cyclone: rules for mapping the world of XML Schema to the Java world: JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb

Cyclone: rules for mapping the Java world to the SQL world: Hibernate

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom, including association, inheritance, polymorphism, composition, and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.

Cyclone: pipeline for the creation of ...

[Diagram. Components: FRS (BioCyc), JavaCyc, XML, XSD, JAXB, HyperJaxb, Hibernate, SQL, Cyclone. Legend: made in Genoscope, software tools, data files.]

Pipeline steps:
• Extension of JavaCyc: getAnnot()
• Query of the PGDB to build an XML schema
• From XSD to Java classes
• From Java classes to Hibernate mapping
• From Hibernate mapping to SQL design

Numbered steps:
1. Populate the MySQL database and Java classes
2. Upload
3. Querying and modification of your PGDB
4. Upload your Cyclone into BioCyc
4a. Extension of JavaCyc: createFrame()

Other labels on the diagram: import all frames of a PGDB; save it in MySQL.

Note (FLF): careful, I am mixing up the pipeline side and the user side.

Cyclone: rules for mapping the world of frames to the world of XML Schema

(FRAMEID NIL
  ((SLOT SValue)
   ((SLOT SValue Annotation Avalue))
  )

<xs:complexType name="FRAME-CLASS">
  <xs:sequence>
    <xs:element name="FRAMEID" type="xs:string"/>
    <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="SLOT-TYPE">
  <xs:sequence>
    <xs:element name="SValue" type="xs:string"/>
    <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
  </xs:sequence>
</xs:complexType>
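
As an illustration of the mapping above, the following sketch serializes a frame into XML that follows the element order of FRAME-CLASS and SLOT-TYPE. It is only a sketch under stated assumptions: the root element name `FRAME`, the class `FrameXmlSketch`, and the sample values are hypothetical, and Cyclone itself relies on JAXB rather than hand-written string building.

```java
import java.util.ArrayList;
import java.util.List;

final class FrameXmlSketch {

    /** One slot value with an optional annotation (null when absent). */
    static final class Slot {
        final String sValue;
        final String annotation;

        Slot(String sValue, String annotation) {
            this.sValue = sValue;
            this.annotation = annotation;
        }
    }

    /** Escapes the five XML special characters. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&apos;");
    }

    /** Writes FRAMEID first, then one SLOT element per slot, as in the schema. */
    static String toXml(String frameId, List<Slot> slots) {
        StringBuilder sb = new StringBuilder();
        sb.append("<FRAME>"); // root element name is an assumption
        sb.append("<FRAMEID>").append(escape(frameId)).append("</FRAMEID>");
        for (Slot slot : slots) {
            sb.append("<SLOT>");
            sb.append("<SValue>").append(escape(slot.sValue)).append("</SValue>");
            if (slot.annotation != null) { // minOccurs="0": omitted when absent
                sb.append("<Annotation>").append(escape(slot.annotation))
                  .append("</Annotation>");
            }
            sb.append("</SLOT>");
        }
        sb.append("</FRAME>");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Slot> slots = new ArrayList<>();
        slots.add(new Slot("ATP", null));            // hypothetical values
        slots.add(new Slot("WATER", "CCO-CYTOSOL"));
        System.out.println(toXml("RXN-EXAMPLE", slots));
    }
}
```

The optional `Annotation` element is the only place where the output varies in shape, which matches the `minOccurs="0"` declaration; `maxOccurs="unbounded"` on `SLOT` corresponds to the loop over the slot list.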

Cyclone: BioCyc FRS into XML schema

Cyclone: GeneralizedReactions FRS into XML schema

Cyclone: sample of a frame in XML

[Figure: XML sample with callouts for PGDB, Frame, slot for inheritance management, Slot / SValue, Annot / AValue]

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture:

• growth medium
• mutant strain
• growth phenotype

BioCyc's existing database schema

Table groups labelled on the diagram: frame model representation storage; managing multi-user concurrent editing; history of edits. Relationships: FK_REFERENCE_1 to FK_REFERENCE_6.

char_slot_values
  kb               INT(4)
  instance         VARCHAR(80) BINARY
  slotname         VARCHAR(80) BINARY
  slotvalue        VARCHAR(100)
  sequence         INT(4)
  Key markers (slide order): <pk,fk>, <pk>, <pk,fk>, <pk>

numeric_slot_values
  kb               INT(4)
  instance         VARCHAR(80) BINARY
  slotname         VARCHAR(80) BINARY
  slotvalue        NUMERIC(20,10)
  sequence         INT(4)
  Key markers (slide order): <pk,fk>, <pk>, <pk,fk>, <pk>

large_slot_values
  kb               INT(4)
  instance         VARCHAR(80) BINARY
  slotname         VARCHAR(80) BINARY
  slotvalue        TEXT
  sequence         INT(4)
  Key markers (slide order): <pk,fk>, <pk>, <pk,fk>, <pk>

frames
  kb               INT(4)
  frame            VARCHAR(80) BINARY
  type             BIT
  definition       TEXT
  sequence         INT(4)
  numparents       INT(4)
  tid              INT(4)
  Key markers (slide order): <pk>, <pk>, <pk>

class_supers
  kb               INT(4)
  class            VARCHAR(80) BINARY
  super            VARCHAR(80) BINARY
  Key markers (slide order): <pk>, <pk>, <pk>

instance_types
  kb               INT(4)
  instance         VARCHAR(80) BINARY
  type             VARCHAR(80) BINARY
  Key markers (slide order): <pk>, <pk>, <pk>

kbs
  kbid             INT(4)
  kbname           VARCHAR(20)
  kbdata           VARCHAR(255)
  lock_session     INT(4)
  lock_time        DATETIME
  lock_user        VARCHAR(20)

logtable
  kbid             INT(4)
  tid              INT(4)
  opnno            INT(4)
  seqno            INT(4)
  frame            VARCHAR(80)
  logrec           VARCHAR(255)
  Key markers (slide order): <pk,fk1,fk2>, <pk,fk1,fk2>, <pk>, <pk>, <fk2>

txntable
  kbid             INT(4)
  tid              INT(4)
  userid           VARCHAR(40)
  nopns            INT(4)
  begin_timestamp  INT(4) UNSIGNED
  merge_timestamp  INT(4) UNSIGNED
  Key markers (slide order): <pk>, <pk>

deleted_frames
  kbid             INT(4)
  frame            VARCHAR(80) BINARY
  tid              INT(4)
  Key markers (slide order): <pk,fk>, <pk>, <pk,fk>

indexed_slots
  kb               INT(4)
  slotname         VARCHAR(80) BINARY
  Key markers (slide order): <pk>, <pk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no tables tied to individual biological concepts
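
To illustrate why this generic layout is awkward to query directly, the sketch below regroups flat rows shaped like `char_slot_values` into per-frame slot lists. It is a hedged illustration only: the class `SlotValueRowsSketch`, the sample rows, and the reading of `sequence` as the order of values within a slot are assumptions, and all rows are assumed to belong to a single knowledge base (`kb` is ignored).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SlotValueRowsSketch {

    /** One row of a generic slot-value table such as char_slot_values. */
    static final class Row {
        final int kb;
        final String instance;
        final String slotname;
        final String slotvalue;
        final int sequence;

        Row(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Regroups flat rows into instance -> slotname -> values ordered by sequence. */
    static Map<String, Map<String, List<String>>> toFrames(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        // Assumption: "sequence" gives the position of a value within its slot.
        sorted.sort(Comparator.comparingInt((Row r) -> r.sequence));
        Map<String, Map<String, List<String>>> frames = new LinkedHashMap<>();
        for (Row r : sorted) {
            frames.computeIfAbsent(r.instance, k -> new LinkedHashMap<>())
                  .computeIfAbsent(r.slotname, k -> new ArrayList<>())
                  .add(r.slotvalue);
        }
        return frames;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, "RXN-EXAMPLE", "LEFT", "WATER", 2)); // hypothetical data
        rows.add(new Row(1, "RXN-EXAMPLE", "LEFT", "ATP", 1));
        rows.add(new Row(1, "RXN-EXAMPLE", "RIGHT", "ADP", 1));
        System.out.println(toFrames(rows));
    }
}
```

Every biological concept ends up in the same few tables, so any question about a reaction or an enzyme first requires this kind of regrouping, whereas a schema with one table per concept can answer it with an ordinary join.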

From FRS to Java classes

[Diagram. BioCyc labels: Generalized Reactions, Polymer Segments, Chemicals, ...; FRAME, SLOT, S VALUE, FACET, ANNOTATION, A VALUE. Cyclone labels: PGDB, FRAME, SLOT, SValue, Annotation, A Value, Value. Example shown: Left, with Compartment and Coefficient.]

BioCyc and Cyclone's architecture

[Diagram. BioCyc labels: Ocelot, Oracle, MySQL; Pathway Tools, Generic Frame Protocol; X server, Web server; Lisp, PerlCyc, JavaCyc. Cyclone labels: XML, JAXB, HyperJaxb, Hibernate, MySQL, and other databases (DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, ...). Tools and analyses shown around Cyclone: Flux Analyser, Jung, Cytoscape, Modulo, MetabolicModel, ..., plus MaGe; graph analysis, cluster analysis, flux balance analysis. Legend: made in Genoscope, software tools, data files.]

Page 24: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

24Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Language

ManipulatedConcept

ComplexeTypeElementAttribut

XSD

ClassAttributAssociation

JAVA

TableRowColumn

SQL

FrameSlotAnnotation

LISP

Pipeline for the creation of Cyclone

supplementaryslides

25Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application complexe inference

supplementaryslides

26Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction of graph

supplementaryslides

27Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Application extraction de clusters genes by pathway

pathway extracted from AcinetoCyc

zoom on genes involved in this pathway

supplementaryslides

28Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of xml schema to java world JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bindan XML schema to a representation in Java code This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself

httpjavasuncomwebservicesjaxb

supplementaryslides

29Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of java world to sql world HIBERNATE

httpwwwhibernateorg

Hibernate is a powerful ultra-high performance objectrelational persistence and query service for Java Hibernate lets you develop persistent classes following common Java idiom - including association inheritance polymorphism composition and the Java collections framework Hibernate allows you to express queries in its own portable SQL extension (HQL) as well as in native SQL or with Java-based Criteria and Example objects Unlike many other persistence solutions Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always

supplementaryslides

30Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone pipeline for the creation of hellip

Cyclone

FRSBioCyc

XML

XSD

Hibernate

SQL

JaxB

HiperJaxB

JavaCyc1

3 4

1a

0

Extension of JavaCyc getAnnot()Query of PGDB to build a xml schemaFrom xsd to java classesFrom java classes to hibernate mappingFrom hibernate mapping to sql design

X

Import all Frame of a PGDB

1b

1 Populate the mysql db and java classes

2a

2b

2 Upload

Save it in MySql

3 Querying amp modification of your PGDB

4 Upload your Cyclone into BioCyc

4a Extension of JavaCyc createFrame()

FLF Attention je confond cote pipeline et user

Made in GenoscopeSoftware tools

Data Files

supplementaryslides

31Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of frame to the world of xml schema

(FRAMEID NIL ((SLOT SValue)((SLOT SValue Annotation Avalue))

)

ltxscomplexType name=ldquoFRAME-CLASSgtltxssequencegt

ltxselement name=FRAMEID type=xsstringgtltxselement name=ldquoSLOT type=ldquoSLOT-TYPE maxOccurs=unboundedgt

ltxssequencegtltxscomplexTypegt

ltxscomplexType name=ldquoSLOT-TYPEgtltxssequencegt

ltxselement name=ldquoSValue type=xsstringrdquo gtltxselement name=ldquoAnnotation type=xsstring minOccurs=0gt

ltxssequencegtltxscomplexTypegt

supplementaryslides

32 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Cyclone: BioCyc FRS into XML schema

supplementary slides

33 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Cyclone: Generalized Reactions FRS into XML schema

supplementary slides

34 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Cyclone: sample of a frame in XML

[Figure callouts: PGDB; Frame; Slot for inheritance management; Slot-SValue; Annot-AValue]

supplementary slides

35 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture:

• growth medium
• mutant strain
• growth phenotype

supplementary slides

36 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

BioCyc's existing database schema

[Entity-relationship diagram. Group labels: managing multi-user concurrent editing; frame model representation storage; edit history. Relationship labels: FK_REFERENCE_1 to FK_REFERENCE_6.]

char_slot_values
    kb           INT(4)
    instance     VARCHAR(80) BINARY
    slotname     VARCHAR(80) BINARY
    slotvalue    VARCHAR(100)
    sequence     INT(4)
    Key markers, in slide order: <pk,fk> <pk> <pk,fk> <pk>

numeric_slot_values
    kb           INT(4)
    instance     VARCHAR(80) BINARY
    slotname     VARCHAR(80) BINARY
    slotvalue    NUMERIC(20,10)
    sequence     INT(4)
    Key markers, in slide order: <pk,fk> <pk> <pk,fk> <pk>

large_slot_values
    kb           INT(4)
    instance     VARCHAR(80) BINARY
    slotname     VARCHAR(80) BINARY
    slotvalue    TEXT
    sequence     INT(4)
    Key markers, in slide order: <pk,fk> <pk> <pk,fk> <pk>

frames
    kb           INT(4)
    frame        VARCHAR(80) BINARY
    type         BIT
    definition   TEXT
    sequence     INT(4)
    numparents   INT(4)
    tid          INT(4)
    Key markers, in slide order: <pk> <pk> <pk>

class_supers
    kb           INT(4)                <pk>
    class        VARCHAR(80) BINARY    <pk>
    super        VARCHAR(80) BINARY    <pk>

instance_types
    kb           INT(4)                <pk>
    instance     VARCHAR(80) BINARY    <pk>
    type         VARCHAR(80) BINARY    <pk>

kbs
    kbid         INT(4)
    kbname       VARCHAR(20)
    kbdata       VARCHAR(255)
    lock_session INT(4)
    lock_time    DATETIME
    lock_user    VARCHAR(20)

logtable
    kbid         INT(4)
    tid          INT(4)
    opnno        INT(4)
    seqno        INT(4)
    frame        VARCHAR(80)
    logrec       VARCHAR(255)
    Key markers, in slide order: <pk,fk1,fk2> <pk,fk1,fk2> <pk> <pk> <fk2>

txntable
    kbid            INT(4)
    tid             INT(4)
    userid          VARCHAR(40)
    nopns           INT(4)
    begin_timestamp INT(4) UNSIGNED
    merge_timestamp INT(4) UNSIGNED
    Key markers, in slide order: <pk> <pk>

deleted_frames
    kbid         INT(4)                <pk,fk>
    frame        VARCHAR(80) BINARY    <pk>
    tid          INT(4)                <pk,fk>

indexed_slots
    kb           INT(4)                <pk>
    slotname     VARCHAR(80) BINARY    <pk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no tables linking to each biological concept

supplementary slides
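
To make the last drawback concrete, here is a minimal sketch of what reading one biological fact out of a generic slot-value table involves. It models rows of char_slot_values in memory rather than opening a database connection. The Row class, the sample frame and slot names, and the reading of the sequence column as the order of values are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class SlotRowsSketch {
    /** One row of char_slot_values: kb, instance, slotname, slotvalue, sequence. */
    static final class Row {
        final int kb;
        final String instance;
        final String slotname;
        final String slotvalue;
        final int sequence;

        Row(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Collects the values of one slot of one frame, ordered by sequence (assumed meaning). */
    static List<String> slotValues(List<Row> rows, int kb, String instance, String slotname) {
        return rows.stream()
            .filter(r -> r.kb == kb
                && r.instance.equals(instance)
                && r.slotname.equals(slotname))
            .sorted(Comparator.comparingInt((Row r) -> r.sequence))
            .map(r -> r.slotvalue)
            .collect(Collectors.toList());
    }

    /** Hypothetical rows: every fact is a (frame, slot, value) triple, whatever the biology. */
    static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(1, "RXN-A", "LEFT", "WATER", 1),
            new Row(1, "RXN-A", "LEFT", "ATP", 0),
            new Row(1, "RXN-A", "RIGHT", "ADP", 0),
            new Row(1, "RXN-B", "LEFT", "GLC", 0));
    }

    public static void main(String[] args) {
        // "Substrates of reaction RXN-A" has no table of its own; it is a filter on slot rows.
        System.out.println(slotValues(sampleRows(), 1, "RXN-A", "LEFT"));
    }
}
```

Because reactions, enzymes and compounds all share the same rows, any question about one concept becomes a filter on frame and slot names, which is the motivation for a schema with one table per concept.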

37 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

From FRS to Java classes

[Diagram labels, in extraction order: Generalized Reactions; BioCyc; Polymer Segments; Chemicals; …; Left; Compartment; Coefficient; PGDB, FRAME, SLOT, SValue, Annotation; FRAME, SLOT, S VALUE, FACET, ANNOTATION, A VALUE; A Value; Cyclone; BioCyc; Value; …]

supplementary slides
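
The labels above suggest a nesting of PGDB, frame, slot, slot value and annotation. The sketch below is one possible plain-Java rendering of that nesting, not the classes Cyclone actually generates. All class and field names are hypothetical, and attaching annotations to individual slot values is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class FrameModelSketch {
    /** An annotation on a slot value: a label and its value (the "A Value"). */
    static final class Annotation {
        final String name;
        final String aValue;
        Annotation(String name, String aValue) { this.name = name; this.aValue = aValue; }
    }

    /** One value of a slot (the "S Value"), with optional annotations. */
    static final class SlotValue {
        final String sValue;
        final List<Annotation> annotations = new ArrayList<>();
        SlotValue(String sValue) { this.sValue = sValue; }
    }

    static final class Slot {
        final String name;
        final List<SlotValue> values = new ArrayList<>();
        Slot(String name) { this.name = name; }
    }

    static final class Frame {
        final String frameId;
        final List<Slot> slots = new ArrayList<>();
        Frame(String frameId) { this.frameId = frameId; }
    }

    /** Walks frame, slots, values and annotations; returns one text line per slot value. */
    static List<String> flatten(Frame frame) {
        List<String> lines = new ArrayList<>();
        for (Slot slot : frame.slots) {
            for (SlotValue value : slot.values) {
                StringBuilder line = new StringBuilder();
                line.append(frame.frameId).append(' ')
                    .append(slot.name).append(' ')
                    .append(value.sValue);
                for (Annotation a : value.annotations) {
                    line.append(' ').append(a.name).append('=').append(a.aValue);
                }
                lines.add(line.toString());
            }
        }
        return lines;
    }

    /** Hypothetical reaction frame with one annotated and one plain slot value. */
    static Frame sampleFrame() {
        Frame frame = new Frame("RXN-A");
        Slot left = new Slot("LEFT");
        SlotValue atp = new SlotValue("ATP");
        atp.annotations.add(new Annotation("COEFFICIENT", "2"));
        left.values.add(atp);
        left.values.add(new SlotValue("WATER"));
        frame.slots.add(left);
        return frame;
    }

    public static void main(String[] args) {
        for (String line : flatten(sampleFrame())) {
            System.out.println(line);
        }
    }
}
```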

38 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

BioCyc and Cyclone's architecture

[Architecture diagram, labels only.
BioCyc: Ocelot, Oracle, MySQL; Pathway Tools, Generic Frame Protocol; Xserver; Webserver; PerlCyc; JavaCyc; Lisp.
Cyclone: XML; JAXB; HyperJaxb; Hibernate; MySQL + DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …
Applications: Flux Analyser; Jung, Cytoscape, Modulo; Metabolic Model; … + MaGe; graph analysis; cluster analysis; flux balance analysis.
Legend: made in Genoscope; software tools; data files.]

supplementary slides

39 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Cyclone's architecture

[Architecture diagram, labels only.
Cyclone: Schema Generator; BioCyc (Un)marshaller; Object Model; XML Tools.
Applications: Flux Analyser; Jung, Cytoscape, Modulo; Metabolic Model; … + MaGe; graph analysis; cluster analysis; flux balance analysis.
Legend: Genoscope; software tools; data files.]

40 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

A real query in OQL

    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            // :organism and :inhibitor are named parameters, bound below.
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where "
                + " er.ORGANISM.VALUE like :organism and "
                + " :inhibitor = inhibitors.VALUE "
            );
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }

41 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;
    // Imports of the Cyclone classes used below are not shown on the slide.

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                // Print the reaction frame and the value of its first enzyme, tab-separated.
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }
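
For readers without a Hibernate setup, the selection that the query and the DAO call express can be sketched in plain Java over in-memory objects. This is only an approximation under stated assumptions. The Reaction class and the sample data are hypothetical, and the HQL `like` on the organism is simplified to exact string equality.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class InhibitedBySketch {
    /** Hypothetical stand-in for an enzymatic reaction with its organism and inhibitors. */
    static final class Reaction {
        final String frame;
        final String organism;
        final List<String> inhibitors;

        Reaction(String frame, String organism, List<String> inhibitors) {
            this.frame = frame;
            this.organism = organism;
            this.inhibitors = inhibitors;
        }
    }

    /**
     * Frames of the reactions of one organism that list the given inhibitor.
     * The set plays the role of "select distinct"; insertion order is kept.
     */
    static List<String> framesInhibitedBy(List<Reaction> all, String organism, String inhibitor) {
        Set<String> frames = new LinkedHashSet<>();
        for (Reaction er : all) {
            if (er.organism.equals(organism) && er.inhibitors.contains(inhibitor)) {
                frames.add(er.frame);
            }
        }
        return new ArrayList<>(frames);
    }

    /** Hypothetical data; ER-1 appears twice to show the effect of "distinct". */
    static List<Reaction> sampleReactions() {
        return Arrays.asList(
            new Reaction("ER-1", "ecoli", Arrays.asList("ATP", "ADP")),
            new Reaction("ER-2", "ecoli", Arrays.asList("NADH")),
            new Reaction("ER-3", "bsub", Arrays.asList("ATP")),
            new Reaction("ER-1", "ecoli", Arrays.asList("ATP", "ADP")),
            new Reaction("ER-4", "ecoli", Arrays.asList("ATP")));
    }

    public static void main(String[] args) {
        System.out.println(framesInhibitedBy(sampleReactions(), "ecoli", "ATP"));
    }
}
```

In the HQL version the join over the inhibitors collection does the work of `contains`, and the database evaluates the filter instead of a Java loop.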


25 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Application: complex inference

supplementary slides

26 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Application: extraction of a graph

supplementary slides

27 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Application: extraction of gene clusters by pathway

[Figure callouts: pathway extracted from AcinetoCyc; zoom on the genes involved in this pathway]

supplementary slides

28 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Cyclone: rules for mapping the world of XML schema to the Java world: JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bind an XML schema to a representation in Java code. This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself.

http://java.sun.com/webservices/jaxb

supplementary slides

29 Genoscope Network Modeling Team (nemo@genoscope.cns.fr) - Pathway Tools User's Group meeting, Geneva 12/2005

Cyclone: rules for mapping the Java world to the SQL world: HIBERNATE

http://www.hibernate.org

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets you develop persistent classes following common Java idiom - including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL, or with Java-based Criteria and Example objects. Unlike many other persistence solutions, Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always.

supplementary slides




Page 28: Cyclone - SourceForgenemo-cyclone.sourceforge.net/Cyclone_Genova.pdf · • Cyclone is a relational database in your favorite RDBMS • Cyclone is an import / export of Biocyc in

28Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of xml schema to java world JAXB

Java Architecture for XML Binding (JAXB) provides a convenient way to bindan XML schema to a representation in Java code This makes it easy for you to incorporate XML data and processing functions in applications based on Java technology without having to know much about XML itself

httpjavasuncomwebservicesjaxb

supplementaryslides

29Genoscope Network Modeling Team (nemogenoscopecnsfr) - Pathway Tools Userrsquos Group meeting Geneva 122005

Cyclone rules for mapping the world of java world to sql world HIBERNATE

httpwwwhibernateorg

Hibernate is a powerful ultra-high performance objectrelational persistence and query service for Java Hibernate lets you develop persistent classes following common Java idiom - including association inheritance polymorphism composition and the Java collections framework Hibernate allows you to express queries in its own portable SQL extension (HQL) as well as in native SQL or with Java-based Criteria and Example objects Unlike many other persistence solutions Hibernate does not hide the power of SQL or JDBC from you and guarantees that your investment in relational technology and knowledge is as valid as always

supplementaryslides

Cyclone: pipeline for the creation of …

[Figure: pipeline diagram. Components shown: FRS BioCyc, JavaCyc, XML, XSD, JAXB, HyperJaxB, Hibernate, SQL, Cyclone. Step markers on the diagram: 0, 1, 1a, 1b, 2a, 2b, 3, 4, X. Legend: Made in Genoscope; Software tools; Data files.]

Captions on the diagram:

• Extension of JavaCyc: getAnnot()
• Query of PGDB to build an XML schema
• From XSD to Java classes
• From Java classes to Hibernate mapping
• From Hibernate mapping to SQL design
• Import all frames of a PGDB
• 1. Populate the MySQL db and Java classes
• 2. Upload
• Save it in MySQL
• 3. Querying & modification of your PGDB
• 4. Upload your Cyclone into BioCyc
• 4a. Extension of JavaCyc: createFrame()

FLF: Careful, I am confusing the pipeline side and the user side.

supplementary slides

Cyclone: rules for mapping the world of frames to the world of XML schema

    (FRAMEID NIL ((SLOT SValue)((SLOT SValue Annotation Avalue))
    )

    <xs:complexType name="FRAME-CLASS">
      <xs:sequence>
        <xs:element name="FRAMEID" type="xs:string"/>
        <xs:element name="SLOT" type="SLOT-TYPE" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

    <xs:complexType name="SLOT-TYPE">
      <xs:sequence>
        <xs:element name="SValue" type="xs:string"/>
        <xs:element name="Annotation" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>

supplementary slides
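The frame-to-schema rule above can be illustrated with a minimal sketch that writes one frame as XML in the order the schema prescribes: FRAMEID first, then each SLOT with its SValue and an optional Annotation. This is not Cyclone's code (Cyclone relies on JAXB for this step). The class name FrameToXml, the root element name FRAME, and the sample frame id and values are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hand-written serializer for the FRAME-CLASS / SLOT-TYPE layout. */
final class FrameToXml {

    /** One SLOT-TYPE entry: SValue is required, annotation may be null (minOccurs="0"). */
    static final class Slot {
        final String sValue;
        final String annotation;

        Slot(String sValue, String annotation) {
            this.sValue = sValue;
            this.annotation = annotation;
        }
    }

    /** Escape the XML special characters so values are safe as element text. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&apos;");
    }

    /** Serialize a frame: FRAMEID first, then every SLOT in list order. */
    static String toXml(String frameId, List<Slot> slots) {
        StringBuilder sb = new StringBuilder();
        sb.append("<FRAME>\n"); // root element name is an assumption
        sb.append("  <FRAMEID>").append(escape(frameId)).append("</FRAMEID>\n");
        for (Slot slot : slots) {
            sb.append("  <SLOT>\n");
            sb.append("    <SValue>").append(escape(slot.sValue)).append("</SValue>\n");
            if (slot.annotation != null) {
                // Annotation is optional in SLOT-TYPE, so it is written only when present.
                sb.append("    <Annotation>").append(escape(slot.annotation))
                  .append("</Annotation>\n");
            }
            sb.append("  </SLOT>\n");
        }
        sb.append("</FRAME>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Slot> slots = new ArrayList<>();
        slots.add(new Slot("ATP", null));               // made-up sample values
        slots.add(new Slot("WATER", "COEFFICIENT 2"));
        System.out.print(toXml("EXAMPLE-RXN", slots));
    }
}
```

With a JAXB binding the same structure would be produced by generated classes rather than by string building; the sketch only makes the element order and the optional Annotation visible.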

Cyclone: BioCyc FRS into XML schema

[Figure: XML schema generated from the BioCyc FRS.]

supplementary slides

Cyclone: GeneralizedReactions FRS into XML schema

[Figure: XML schema generated for GeneralizedReactions.]

supplementary slides

Cyclone: sample of a frame in XML

[Figure: XML sample of a frame, annotated with the labels: PGDB; Frame; Slot for inheritance management; Slot-SValue; Annot-AValue.]

supplementary slides

Cyclone: open discussion - extension of the BioCyc ontology to capture mutant phenotype

How can we capture:

• growth medium
• mutant strain
• growth phenotype

supplementary slides

BioCyc's existing database schema

[Figure: entity-relationship diagram. Group labels on the diagram: Managing multiuser concurrent editing; Frame Model Representation Storage; History of edits. Relationship labels: FK_REFERENCE_1 to FK_REFERENCE_6. Key markers are listed per table in the order they appear on the diagram.]

char_slot_values
    kb               INT(4)
    instance         VARCHAR(80) BINARY
    slotname         VARCHAR(80) BINARY
    slotvalue        VARCHAR(100)
    sequence         INT(4)
    keys: <pk,fk> <pk> <pk,fk> <pk>

numeric_slot_values
    kb               INT(4)
    instance         VARCHAR(80) BINARY
    slotname         VARCHAR(80) BINARY
    slotvalue        NUMERIC(20,10)
    sequence         INT(4)
    keys: <pk,fk> <pk> <pk,fk> <pk>

large_slot_values
    kb               INT(4)
    instance         VARCHAR(80) BINARY
    slotname         VARCHAR(80) BINARY
    slotvalue        TEXT
    sequence         INT(4)
    keys: <pk,fk> <pk> <pk,fk> <pk>

frames
    kb               INT(4)
    frame            VARCHAR(80) BINARY
    type             BIT
    definition       TEXT
    sequence         INT(4)
    numparents       INT(4)
    tid              INT(4)
    keys: <pk> <pk> <pk>

class_supers
    kb               INT(4)
    class            VARCHAR(80) BINARY
    super            VARCHAR(80) BINARY
    keys: <pk> <pk> <pk>

instance_types
    kb               INT(4)
    instance         VARCHAR(80) BINARY
    type             VARCHAR(80) BINARY
    keys: <pk> <pk> <pk>

kbs
    kbid             INT(4)
    kbname           VARCHAR(20)
    kbdata           VARCHAR(255)
    lock_session     INT(4)
    lock_time        DATETIME
    lock_user        VARCHAR(20)

logtable
    kbid             INT(4)
    tid              INT(4)
    opnno            INT(4)
    seqno            INT(4)
    frame            VARCHAR(80)
    logrec           VARCHAR(255)
    keys: <pk,fk1,fk2> <pk,fk1,fk2> <pk> <pk> <fk2>

txntable
    kbid             INT(4)
    tid              INT(4)
    userid           VARCHAR(40)
    nopns            INT(4)
    begin_timestamp  INT(4) UNSIGNED
    merge_timestamp  INT(4) UNSIGNED
    keys: <pk> <pk>

deleted_frames
    kbid             INT(4)
    frame            VARCHAR(80) BINARY
    tid              INT(4)
    keys: <pk,fk> <pk> <pk,fk>

indexed_slots
    kb               INT(4)
    slotname         VARCHAR(80) BINARY
    keys: <pk> <pk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• Does not allow querying BioCyc outside the Pathway Tools software
• No tables linking to each biology concept

supplementary slides
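To see why the schema above offers no table per biology concept, here is a minimal in-memory sketch of the generic slot-value layout: every fact is a (kb, instance, slotname, slotvalue, sequence) row, so a biological question becomes a filter on slot names and values. The class SlotValueLookup, its method, and the sample frame ids and values are hypothetical; only the column names come from the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: in-memory stand-in for rows of a table such as char_slot_values. */
final class SlotValueLookup {

    /** One row with the columns listed on the slide. */
    static final class Row {
        final int kb;
        final String instance;
        final String slotname;
        final String slotvalue;
        final int sequence;

        Row(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    /** Distinct instances in knowledge base kb whose slot slotname holds slotvalue. */
    static List<String> instancesWith(List<Row> rows, int kb, String slotname, String slotvalue) {
        List<String> result = new ArrayList<>();
        for (Row r : rows) {
            boolean matches = r.kb == kb
                    && r.slotname.equals(slotname)
                    && r.slotvalue.equals(slotvalue);
            if (matches && !result.contains(r.instance)) {
                result.add(r.instance);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up rows: nothing in the layout says that ENZRXN-1 is an enzymatic reaction;
        // that knowledge lives in other generic tables (instance_types, class_supers).
        List<Row> rows = Arrays.asList(
                new Row(1, "ENZRXN-1", "INHIBITORS-ALL", "ATP", 0),
                new Row(1, "ENZRXN-1", "ENZYME", "PROTEIN-A", 0),
                new Row(1, "ENZRXN-2", "INHIBITORS-ALL", "ADP", 0));
        System.out.println(instancesWith(rows, 1, "INHIBITORS-ALL", "ATP"));
    }
}
```

Each additional condition (organism, enzyme, class membership) would need another pass over the same generic rows, which is the difficulty the object model of Cyclone is meant to hide.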

From FRS to Java classes

[Figure: mapping between the BioCyc frame representation and the Cyclone Java classes. Labels on the diagram: BioCyc; Cyclone; Generalized Reactions; PolymerSegments; Chemicals; …; Left; Compartment; Coefficient; PGDB; FRAME; SLOT; SValue; S VALUE; Annotation; FACET; ANNOTATION; A VALUE; A Value; Value.]

supplementary slides

BioCyc and Cyclone's architecture

[Figure: architecture diagram. Labels on the diagram:
 BioCyc side: Ocelot, Oracle, MySQL; Pathway-tools; Generic Frame Protocol; X server; Web server; PerlCyc; JavaCyc; Lisp.
 Cyclone side: Cyclone; MySQL; DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, …; Hibernate; XML; JAXB; HyperJaxb.
 Tools: Flux Analyser; Jung; Cytoscape; Modulo; MetabolicModel; …; MaGe; Graph Analysis; Cluster Analysis; Flux Balance Analysis.
 Legend: Made in Genoscope; Software tools; Data files.]

supplementary slides

Cyclone's architecture

[Figure: architecture diagram. Labels on the diagram: Schema Generator; BioCyc; (Un)marshaller; ObjectModel; XmlTools; Cyclone; Flux Analyser; Jung; Cytoscape; Modulo; MetabolicModel; …; MaGe; Graph Analysis; Cluster Analysis; Flux Balance Analysis. Legend: Genoscope; Software tools; Data files.]

True query in OQL

    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            // Select each enzymatic reaction once, joined to its inhibitors.
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + " join er.INHIBITORSALLInternal inhibitors "
                + " where "
                + " er.ORGANISM.VALUE like :organism and "
                + " :inhibitor = inhibitors.VALUE ");
            // Bind the named parameters instead of concatenating user input.
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }
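The meaning of the query above can be sketched in plain Java over an in-memory list, without Hibernate: keep each enzymatic reaction whose organism matches the LIKE pattern and whose inhibitor list contains the requested value. The class InhibitedReactions, its fields, and the sample data are hypothetical stand-ins, not Cyclone classes; the like helper is a simplified, case-sensitive reading of SQL LIKE that supports only the % and _ wildcards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: in-memory equivalent of the "inhibited by" query. */
final class InhibitedReactions {

    /** Hypothetical stand-in for an enzymatic reaction object. */
    static final class EnzymaticReaction {
        final String frame;
        final String organism;
        final List<String> inhibitors;

        EnzymaticReaction(String frame, String organism, List<String> inhibitors) {
            this.frame = frame;
            this.organism = organism;
            this.inhibitors = inhibitors;
        }
    }

    /** Simplified SQL LIKE: '%' matches any run of characters, '_' exactly one. */
    static boolean like(String value, String pattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }

    /** Reactions of a matching organism that list the given inhibitor. */
    static List<EnzymaticReaction> getAllErInhibitedBy(
            List<EnzymaticReaction> all, String org, String inhibitor) {
        List<EnzymaticReaction> result = new ArrayList<>();
        for (EnzymaticReaction er : all) {
            // Each reaction is tested once, so it appears at most once ("distinct").
            if (like(er.organism, org) && er.inhibitors.contains(inhibitor)) {
                result.add(er);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<EnzymaticReaction> all = Arrays.asList(   // made-up sample data
                new EnzymaticReaction("ENZRXN-1", "ecoli", Arrays.asList("ATP", "ADP")),
                new EnzymaticReaction("ENZRXN-2", "ecoli", Arrays.asList("NADH")),
                new EnzymaticReaction("ENZRXN-3", "acineto", Arrays.asList("ATP")));
        for (EnzymaticReaction er : getAllErInhibitedBy(all, "ecoli", "ATP")) {
            System.out.println(er.frame);
        }
    }
}
```

In Cyclone the filtering is delegated to the database through Hibernate; the sketch only spells out the join and the two conditions of the where clause.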

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;
    // Imports for the Cyclone classes (ENZYMATICREACTIONSDAO, ENZYMATICREACTIONS, ENZYME)
    // are not shown on the slide.

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            // All enzymatic reactions of "ecoli" inhibited by ATP.
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                // Print the frame and the value of its first enzyme, tab-separated.
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }






Slide 33 - Genoscope Network Modeling Team (nemo@genoscope.cns.fr), Pathway Tools User's Group meeting, Geneva, 12/2005

Cyclone: Generalized Reactions FRS into XML schema

(supplementary slides)

Slide 34

Cyclone: Sample of a frame in XML

[Figure: XML sample of a frame. Callouts: PGDB; Frame; Slot for inheritance management; Slot-SValue; Annot-AValue]

(supplementary slides)

Slide 35

Cyclone: Open discussion - extension of the BioCyc ontology to capture mutant phenotypes

How can we capture:

• growth medium
• mutant strain
• growth phenotype

(supplementary slides)

Slide 36

BioCyc's existing database schema

[Diagram: relational schema. Columns and types are listed per table. Key markers are given in the order they appear in the diagram, because the extraction does not show which column each marker belongs to.]

Diagram group labels: Frame Model Representation Storage; Managing multiuser concurrent editing; Edit history
Relationship labels: FK_REFERENCE_1, FK_REFERENCE_2, FK_REFERENCE_3, FK_REFERENCE_4, FK_REFERENCE_5, FK_REFERENCE_6

char_slot_values
    kb                INT(4)
    instance          VARCHAR(80) BINARY
    slotname          VARCHAR(80) BINARY
    slotvalue         VARCHAR(100)
    sequence          INT(4)
    key markers:      <pk,fk> <pk> <pk,fk> <pk>

numeric_slot_values
    kb                INT(4)
    instance          VARCHAR(80) BINARY
    slotname          VARCHAR(80) BINARY
    slotvalue         NUMERIC(20,10)
    sequence          INT(4)
    key markers:      <pk,fk> <pk> <pk,fk> <pk>

large_slot_values
    kb                INT(4)
    instance          VARCHAR(80) BINARY
    slotname          VARCHAR(80) BINARY
    slotvalue         TEXT
    sequence          INT(4)
    key markers:      <pk,fk> <pk> <pk,fk> <pk>

frames
    kb                INT(4)
    frame             VARCHAR(80) BINARY
    type              BIT
    definition        TEXT
    sequence          INT(4)
    numparents        INT(4)
    tid               INT(4)
    key markers:      <pk> <pk> <pk>

class_supers
    kb                INT(4)
    class             VARCHAR(80) BINARY
    super             VARCHAR(80) BINARY
    key markers:      <pk> <pk> <pk>

instance_types
    kb                INT(4)
    instance          VARCHAR(80) BINARY
    type              VARCHAR(80) BINARY
    key markers:      <pk> <pk> <pk>

indexed_slots
    kb                INT(4)
    slotname          VARCHAR(80) BINARY
    key markers:      <pk> <pk>

kbs
    kbid              INT(4)
    kbname            VARCHAR(20)
    kbdata            VARCHAR(255)
    lock_session      INT(4)
    lock_time         DATETIME
    lock_user         VARCHAR(20)

txntable
    kbid              INT(4)
    tid               INT(4)
    userid            VARCHAR(40)
    nopns             INT(4)
    begin_timestamp   INT(4) UNSIGNED
    merge_timestamp   INT(4) UNSIGNED
    key markers:      <pk> <pk>

logtable
    kbid              INT(4)
    tid               INT(4)
    opnno             INT(4)
    seqno             INT(4)
    frame             VARCHAR(80)
    logrec            VARCHAR(255)
    key markers:      <pk,fk1,fk2> <pk,fk1,fk2> <pk> <pk> <fk2>

deleted_frames
    kbid              INT(4)
    frame             VARCHAR(80) BINARY
    tid               INT(4)
    key markers:      <pk,fk> <pk> <pk,fk>

+
• fast data access
• multi-user editing
• access to a transaction log of all PGDB edits

-
• does not allow querying BioCyc outside the Pathway Tools software
• no table for each biological concept

(supplementary slides)
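To make the drawback of a generic slot-value layout concrete, here is a minimal sketch in plain Java (no database involved). It assumes rows shaped like char_slot_values (kb, instance, slotname, slotvalue, sequence). The class name, the slot names INHIBITORS-ALL and ENZYME, and every row value are made up for illustration and are not taken from an actual PGDB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: rows mimic the (kb, instance, slotname, slotvalue, sequence)
// layout of char_slot_values; all names and values below are hypothetical.
final class SlotRowsSketch {

    static final class SlotRow {
        final int kb;
        final String instance;
        final String slotname;
        final String slotvalue;
        final int sequence;

        SlotRow(int kb, String instance, String slotname, String slotvalue, int sequence) {
            this.kb = kb;
            this.instance = instance;
            this.slotname = slotname;
            this.slotvalue = slotvalue;
            this.sequence = sequence;
        }
    }

    // Instances of knowledge base `kb` that hold `value` in slot `slotname`.
    static List<String> instancesWithSlotValue(List<SlotRow> rows, int kb,
                                               String slotname, String value) {
        List<String> result = new ArrayList<>();
        for (SlotRow row : rows) {
            if (row.kb == kb && row.slotname.equals(slotname)
                    && row.slotvalue.equals(value) && !result.contains(row.instance)) {
                result.add(row.instance);
            }
        }
        return result;
    }

    // Values of one slot of one instance, ordered by the sequence column.
    static List<String> slotValues(List<SlotRow> rows, int kb,
                                   String instance, String slotname) {
        List<SlotRow> matches = new ArrayList<>();
        for (SlotRow row : rows) {
            if (row.kb == kb && row.instance.equals(instance)
                    && row.slotname.equals(slotname)) {
                matches.add(row);
            }
        }
        matches.sort((a, b) -> Integer.compare(a.sequence, b.sequence));
        List<String> values = new ArrayList<>();
        for (SlotRow row : matches) {
            values.add(row.slotvalue);
        }
        return values;
    }

    static List<SlotRow> sampleRows() {
        return Arrays.asList(
            new SlotRow(1, "ER-1", "INHIBITORS-ALL", "ATP", 0),
            new SlotRow(1, "ER-1", "ENZYME", "ENZ-A", 0),
            new SlotRow(1, "ER-2", "INHIBITORS-ALL", "NADH", 0),
            new SlotRow(1, "ER-2", "ENZYME", "ENZ-B", 0),
            new SlotRow(1, "ER-3", "INHIBITORS-ALL", "ATP", 1),
            new SlotRow(1, "ER-3", "INHIBITORS-ALL", "ADP", 0),
            new SlotRow(1, "ER-3", "ENZYME", "ENZ-C", 0),
            new SlotRow(2, "ER-9", "INHIBITORS-ALL", "ATP", 0));
    }

    public static void main(String[] args) {
        List<SlotRow> rows = sampleRows();
        // Every biological question becomes a string match on slot names,
        // followed by a second lookup per instance.
        for (String er : instancesWithSlotValue(rows, 1, "INHIBITORS-ALL", "ATP")) {
            System.out.println(er + "\t" + slotValues(rows, 1, er, "ENZYME"));
        }
    }
}
```

Because the concept (reaction, enzyme, inhibitor) only exists as a string in the slotname column, nothing in the layout itself says which slots a given kind of frame may carry.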

Slide 37

From FRS to Java classes

[Figure: mapping between the BioCyc FRS and the Cyclone Java classes. Labels: BioCyc; Cyclone; Generalized Reactions; Polymer Segments; Chemicals; ...; Left; Compartment; Coefficient; Value; PGDB; FRAME; SLOT; S VALUE (SValue); FACET; ANNOTATION (Annotation); A VALUE (A Value)]

(supplementary slides)
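The containment suggested by the slide labels (a frame holds slots, a slot holds S values, an S value holds annotations, an annotation holds A values) can be sketched in plain Java as follows. This is only an illustration under that reading of the figure: the class names, fields and the sample identifiers RXN-EXAMPLE and CPD-EXAMPLE are hypothetical and are not Cyclone's generated classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a frame/slot/value/annotation containment hierarchy.
final class FrameModelSketch {

    static final class Annotation {
        final String label;
        final List<String> aValues = new ArrayList<>();

        Annotation(String label) {
            this.label = label;
        }
    }

    static final class SValue {
        final String value;
        final List<Annotation> annotations = new ArrayList<>();

        SValue(String value) {
            this.value = value;
        }
    }

    static final class Slot {
        final String name;
        final List<SValue> sValues = new ArrayList<>();

        Slot(String name) {
            this.name = name;
        }
    }

    static final class Frame {
        final String id;
        final List<Slot> slots = new ArrayList<>();

        Frame(String id) {
            this.id = id;
        }

        // Returns the slot with this name, creating it on first use.
        Slot slot(String name) {
            for (Slot s : slots) {
                if (s.name.equals(name)) {
                    return s;
                }
            }
            Slot created = new Slot(name);
            slots.add(created);
            return created;
        }

        // Plain values of a slot; empty when the frame has no such slot.
        List<String> values(String slotName) {
            List<String> result = new ArrayList<>();
            for (Slot s : slots) {
                if (s.name.equals(slotName)) {
                    for (SValue v : s.sValues) {
                        result.add(v.value);
                    }
                }
            }
            return result;
        }
    }

    // A reaction whose LEFT slot holds one compound annotated with a coefficient.
    static Frame sampleReaction() {
        Frame rxn = new Frame("RXN-EXAMPLE");
        SValue left = new SValue("CPD-EXAMPLE");
        Annotation coefficient = new Annotation("COEFFICIENT");
        coefficient.aValues.add("2");
        left.annotations.add(coefficient);
        rxn.slot("LEFT").sValues.add(left);
        return rxn;
    }

    public static void main(String[] args) {
        Frame rxn = sampleReaction();
        SValue first = rxn.slot("LEFT").sValues.get(0);
        System.out.println(rxn.id + " LEFT=" + rxn.values("LEFT")
            + " " + first.annotations.get(0).label
            + "=" + first.annotations.get(0).aValues);
    }
}
```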

Slide 38

BioCyc and Cyclone's architecture

[Figure: architecture diagram. Legend: Made in Genoscope; Software tools; Data files. Labels: Ocelot (Oracle, MySQL); Pathway Tools; Generic Frame Protocol; Xserver; Webserver; PerlCyc; JavaCyc; Lisp; + ; Cyclone; Hibernate; XML; JAXB; HyperJaxb; MySQL; DB2, PostgreSQL, Oracle, Sybase, HSQL, Microsoft SQL Server, ...; Flux Analyser; Jung; Cytoscape; Modulo; MetabolicModel; ...; + MaGe; Graph analysis; Cluster analysis; Flux balance analysis]

(supplementary slides)

Slide 39

Cyclone's architecture

[Figure: architecture diagram. Legend: Genoscope; Software tools; Data files. Labels: Schema Generator; BioCyc (Un)marshaller; ObjectModel; XmlTools; Cyclone; Flux Analyser; Jung; Cytoscape; Modulo; MetabolicModel; ...; + MaGe; Graph analysis; Cluster analysis; Flux balance analysis]

Slide 40

True query in OQL

    // Returns the enzymatic reactions of an organism that are inhibited by a compound.
    public List getAllErInhibitedBy(String org, String inhibitor) {
        try {
            Query hql = HibernateUtil.getSession(keySession).createQuery(
                "select distinct er from "
                + "fr.cns.genoscope.nemo.hypercyc.impl.ENZYMATICREACTIONSImpl er "
                + "join er.INHIBITORSALLInternal inhibitors "
                + "where "
                + " er.ORGANISM.VALUE like :organism and "
                + " :inhibitor = inhibitors.VALUE ");
            // Bind the named parameters instead of concatenating user input.
            hql.setString("organism", org);
            hql.setString("inhibitor", inhibitor);
            return hql.list();
        } catch (HibernateException e) {
            e.printStackTrace();
        }
        return null;
    }
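For readers unfamiliar with HQL, the following plain-Java sketch spells out what such a query asks for: keep each enzymatic reaction once if its organism matches a LIKE pattern and at least one of its inhibitors equals the given compound. It does not use Hibernate. The class, its fields and the sample data are hypothetical, and the LIKE emulation here is case-sensitive, whereas real databases may differ depending on collation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical in-memory stand-in for a "distinct ... join ... where" query.
final class HqlSemanticsSketch {

    static final class EnzymaticReaction {
        final String frame;
        final String organism;
        final List<String> inhibitors;

        EnzymaticReaction(String frame, String organism, String... inhibitors) {
            this.frame = frame;
            this.organism = organism;
            this.inhibitors = Arrays.asList(inhibitors);
        }
    }

    // Minimal SQL LIKE emulation: '%' matches any run of characters, '_' exactly one.
    static boolean like(String value, String pattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }

    // Frame ids of reactions whose organism matches and that list the inhibitor.
    // Each reaction is tested once, so the result is distinct even when the
    // inhibitor appears several times (a join would yield one row per match).
    static List<String> inhibitedBy(List<EnzymaticReaction> all,
                                    String organismPattern, String inhibitor) {
        List<String> frames = new ArrayList<>();
        for (EnzymaticReaction er : all) {
            if (like(er.organism, organismPattern) && er.inhibitors.contains(inhibitor)) {
                frames.add(er.frame);
            }
        }
        return frames;
    }

    static List<EnzymaticReaction> sample() {
        return Arrays.asList(
            new EnzymaticReaction("ER-1", "ecoli", "ATP", "ADP"),
            new EnzymaticReaction("ER-2", "ecoli", "NADH"),
            new EnzymaticReaction("ER-3", "acineto", "ATP"),
            new EnzymaticReaction("ER-4", "ecoli", "ATP", "ATP"));
    }

    public static void main(String[] args) {
        System.out.println(inhibitedBy(sample(), "ecoli", "ATP"));
        System.out.println(inhibitedBy(sample(), "%", "ATP"));
    }
}
```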

Slide 41

Java API

    package fr.cns.genoscope.nemo.hypercyc.cycs.acineto;

    import java.util.List;
    // Imports of the Cyclone classes (ENZYMATICREACTIONSDAO, ENZYMATICREACTIONS,
    // ENZYME) are not shown on the slide.

    public class Acineto {
        public static void main(String[] args) {
            ENZYMATICREACTIONSDAO erdao = new ENZYMATICREACTIONSDAO("hypercyc");
            List r = erdao.getAllErInhibitedBy("ecoli", "ATP");
            ENZYMATICREACTIONS er;
            // Print each reaction frame with the value of its first enzyme.
            for (int i = 0; i < r.size(); i++) {
                er = (ENZYMATICREACTIONS) r.get(i);
                System.out.println(
                    er.getFRAME() + "\t" + ((ENZYME) er.getENZYME().get(0)).getVALUE());
            }
        }
    }







