C-DEM: A Multi-Modal Query System for Drosophila Embryo Databases
Fan Guo, Lei Li, Eric Xing, Christos Faloutsos
Carnegie Mellon University
{fanguo, leili, epxing, [email protected]}
http://www.db.cs.cmu.edu:8080/cdem/demo.html
Background
• Fruit-fly development in genetic study:
  – Genes controlling the body plan and patterning organs are similar to those of higher animals, including humans.
• Objective: a framework for applying data mining techniques to assist biological research.
The Graph Representation
[Figure: graph with three kinds of nodes: Images, Genes, Keywords (example keyword: "embryonic hindgut")]
• Image-layer edges: nearest neighbors in feature space
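The image-layer edges described above can be sketched as a k-nearest-neighbor search over image feature vectors. This is a minimal illustration, not the system's actual code: the function name `knn_edges`, the Euclidean distance, the value of k, and the toy feature vectors are all assumptions.

```python
import numpy as np

def knn_edges(features, k):
    """Return undirected edges (i, j), i < j, linking each row to its k nearest rows.

    Assumption: Euclidean distance in feature space; the slides do not
    say which distance or which k is used.
    """
    features = np.asarray(features, dtype=float)
    # Pairwise squared Euclidean distances between all feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a node is not its own neighbor
    edges = set()
    for i in range(len(features)):
        for j in np.argsort(dist[i])[:k]:
            edges.add((min(i, int(j)), max(i, int(j))))
    return edges

# Toy 2-D feature vectors for four hypothetical images.
toy_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.2, 5.0]]
image_edges = knn_edges(toy_features, k=1)
```

Storing each edge as a sorted pair keeps the image layer undirected even when the neighbor relation is not symmetric.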
Proximity Measure
• Random Walk with Restart
  – Start from a node s;
  – Randomly walk to a neighbor, with probability 1-c;
  – Restart at s, with probability c;
  – Compute the steady-state probability vector.
  – Complexity: O(E), but faster methods exist (Tong et al., ICDM’06)
• Computing the Steady-State Probability
  – p = (1 - c) W p + c e
  – p: desired probability vector
  – W: adjacency matrix (column-normalized)
  – e: vector w/ non-zero entry for restart nodes
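A minimal sketch of the steady-state computation by plain power iteration, assuming a column-normalized adjacency matrix and a restart vector spread uniformly over the restart nodes. The function name, the default c = 0.15, the tolerance, and the toy graph are illustrative assumptions. This is not the faster method of Tong et al.

```python
import numpy as np

def random_walk_with_restart(adjacency, restart_nodes, c=0.15, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - c) * W p + c * e until the update is below tol.

    adjacency: square, symmetric 0/1 matrix (dense here for brevity).
    restart_nodes: indices of the query node(s) s.
    c: restart probability (0.15 is an assumed default, not from the slides).
    """
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # avoid dividing by zero for isolated nodes
    W = A / col_sums               # column j holds the transition probabilities out of node j
    e = np.zeros(A.shape[0])
    e[list(restart_nodes)] = 1.0 / len(restart_nodes)
    p = e.copy()
    for _ in range(max_iter):
        p_next = (1 - c) * W @ p + c * e
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy path graph gene(2) - image(0) - image(1) - keyword(3); node roles are hypothetical.
toy_adjacency = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
scores = random_walk_with_restart(toy_adjacency, restart_nodes=[2])
```

Nodes closer to the restart node receive higher scores, which is what makes the vector usable as a proximity ranking. With a sparse matrix, each iteration touches every edge once.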
Multi-Modal Query Results
[Figure: query results in three modalities: 2D Expression Images, Genes, Annotation Terms]
More Mining Tasks
• Image Auto-Caption
• Gene function identification
Related Work
• Berkeley Drosophila Genome Project (www.fruitfly.org)
• FlyExpress (www.flyexpress.net)
• Berkeley Drosophila Transcription Network Project (bdtnp.lbl.gov)
System Architecture
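[Figure: three components: Browser-based UI, Tomcat Web Server (JSP Application), Computing Engine]
• Browser-based UI and Tomcat Web Server: queries and result pages, over HTTP
• Tomcat Web Server and Computing Engine: remote function calls and results, over RMI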