Genomes On Rails

Genomes on Rails has_many :sequences

description

Originally given at RailsConf, this talk outlines how the Wellcome Trust Sanger Institute is using Ruby and Rails as part of their new sequencing platform.

Transcript of Genomes On Rails

Page 1: Genomes On Rails

Genomes on Rails has_many :sequences

Page 2: Genomes On Rails

Hello

Page 3: Genomes On Rails

➊ Previously

➋ Production

➌ Process

Page 4: Genomes On Rails

➊ Previously

Page 5: Genomes On Rails

The human genome

15 years to decode

3 billion letters

Page 6: Genomes On Rails

$3 billion

Page 7: Genomes On Rails

$3 billion ++

Page 8: Genomes On Rails

Race for the prize

Page 9: Genomes On Rails
Page 10: Genomes On Rails
Page 11: Genomes On Rails

Open data

Page 12: Genomes On Rails

Open source

Page 13: Genomes On Rails

Perl

Page 14: Genomes On Rails

Lots of Perl

Page 15: Genomes On Rails

Lots of Perl

~4500 modules

Page 16: Genomes On Rails

Onwards!

Page 17: Genomes On Rails

40 species

Page 18: Genomes On Rails
Page 19: Genomes On Rails
Page 20: Genomes On Rails
Page 21: Genomes On Rails

Map evolutionary space

Page 22: Genomes On Rails

Compare genomes

Page 23: Genomes On Rails

Compare genomes

compare species

Page 24: Genomes On Rails

Compare genomes

compare species

compare individuals

Page 25: Genomes On Rails

More Perl

~1500 modules

Page 26: Genomes On Rails
Page 27: Genomes On Rails
Page 28: Genomes On Rails
Page 29: Genomes On Rails

Quantum leap!

Page 30: Genomes On Rails

1000 personal genomes

Page 31: Genomes On Rails

1000 personal genomes

beyond 23andme

Page 32: Genomes On Rails

Hypertension

Page 33: Genomes On Rails

Diabetes

Page 34: Genomes On Rails

Coronary heart disease

Page 35: Genomes On Rails

Bipolar disorder

Page 36: Genomes On Rails

Malaria

Page 37: Genomes On Rails

➋ Production

Page 38: Genomes On Rails

Register projects

Register samples

Sample prep

Sequencing

Analysis

Page 39: Genomes On Rails
Page 40: Genomes On Rails
Page 41: Genomes On Rails
Page 42: Genomes On Rails

Change!

Page 43: Genomes On Rails

Flexible data capture

Page 44: Genomes On Rails

Virtual fields

Page 45: Genomes On Rails

Sample

Name

Organism

Concentration

Page 46: Genomes On Rails

class Sample < ActiveRecord::Base
  has_many :descriptors
  has_many :descriptor_values
end

Page 47: Genomes On Rails

Key value pairs
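
The key-value idea behind the virtual fields can be sketched in plain Ruby, without a database. This is an illustration only, not the Sanger implementation: the Descriptor and DescriptorValue structs and the set_field and field helpers are hypothetical names, chosen to mirror the two has_many associations on the Sample class.

```ruby
# Hypothetical sketch of key-value "virtual fields" (not the production code).
# A Descriptor names a field; a DescriptorValue pairs it with a value.
Descriptor = Struct.new(:name)
DescriptorValue = Struct.new(:descriptor, :value)

class Sample
  attr_reader :name, :descriptor_values

  def initialize(name)
    @name = name
    @descriptor_values = []
  end

  # Set (or overwrite) a virtual field without changing the class itself.
  def set_field(field_name, value)
    existing = @descriptor_values.find { |dv| dv.descriptor.name == field_name }
    if existing
      existing.value = value
    else
      @descriptor_values << DescriptorValue.new(Descriptor.new(field_name), value)
    end
    value
  end

  # Look up a virtual field; returns nil when the field was never captured.
  def field(field_name)
    pair = @descriptor_values.find { |dv| dv.descriptor.name == field_name }
    pair && pair.value
  end

  # The descriptors currently attached to this sample.
  def descriptors
    @descriptor_values.map(&:descriptor)
  end
end

sample = Sample.new("sample-1")
sample.set_field("Organism", "Homo sapiens")
sample.set_field("Concentration", 12.5)
```

Adding a new field such as Origin then needs only another set_field call rather than a schema change, which is presumably the appeal of the approach when requirements keep changing.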

Page 48: Genomes On Rails

Faster than you’d think

Page 49: Genomes On Rails
Page 50: Genomes On Rails
Page 51: Genomes On Rails

Change!

Page 52: Genomes On Rails

Sample

Name

Organism

Concentration

Sample

Name

Organism

Concentration

Origin

Quality metric

V1 V2

Page 53: Genomes On Rails
Page 54: Genomes On Rails
Page 55: Genomes On Rails

Rationalize!

Page 56: Genomes On Rails

Sample

Name

Organism

Concentration

Sample

Name

Organism

Concentration

Origin

Quality metric

V1 V2

Page 57: Genomes On Rails

Mapping!

Page 58: Genomes On Rails

Sample

Name

Organism

Concentration

Sample

Name

Species

Concentration

Origin

Quality metric

V1 V3

Origin
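
One way to read the mapping slide: a V1 record is carried forward to V3 by renaming Organism to Species and leaving the fields that V1 never captured empty. The sketch below assumes exactly that and nothing more. V1_TO_V3, V3_FIELDS and upgrade_to_v3 are hypothetical names, and records are plain hashes rather than ActiveRecord objects.

```ruby
# Hypothetical field mapping between sample versions (illustration only).
# Only the rename shown on the slide is included.
V1_TO_V3 = { "Organism" => "Species" }.freeze

# Field list for V3 as shown on the slide.
V3_FIELDS = ["Name", "Species", "Concentration", "Origin", "Quality metric"].freeze

# Convert a V1 record (a Hash of field name => value) into a V3 record.
def upgrade_to_v3(v1_record)
  # Rename mapped fields; unmapped fields keep their original name.
  renamed = v1_record.each_with_object({}) do |(field, value), out|
    out[V1_TO_V3.fetch(field, field)] = value
  end
  # Fields that are new in V3 have no V1 source, so they stay nil.
  V3_FIELDS.each_with_object({}) { |field, out| out[field] = renamed[field] }
end

v1 = { "Name" => "sample-1", "Organism" => "Homo sapiens", "Concentration" => 12.5 }
v3 = upgrade_to_v3(v1)
```

Keeping the rename in a lookup table means old records remain readable after the field list is rationalized, at the cost of maintaining one table per version pair.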

Page 59: Genomes On Rails

Pipeline management

Page 60: Genomes On Rails

Task 1 Task 2 Task 3

Workflow

Name

Operator

Instrument

Name

Serial number

Kit

Name

Passed
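
The pipeline slide reads as a workflow made of ordered tasks, each recording an operator, an instrument, a kit and whether the step passed. A minimal sketch under that reading follows. The grouping of attributes is an assumption drawn from the slide layout, and next_task and complete? are hypothetical helpers.

```ruby
# Hypothetical sketch of a workflow of ordered tasks (illustration only).
Instrument = Struct.new(:name, :serial_number)
Kit = Struct.new(:name)
Task = Struct.new(:name, :operator, :instrument, :kit, :passed)

class Workflow
  attr_reader :name, :tasks

  def initialize(name, tasks)
    @name = name
    @tasks = tasks
  end

  # The first task that has not passed yet, or nil when every task has passed.
  def next_task
    @tasks.find { |task| !task.passed }
  end

  def complete?
    next_task.nil?
  end
end

sequencer = Instrument.new("sequencer-1", "SN-0001")
workflow = Workflow.new("Sample prep", [
  Task.new("Task 1", "operator-a", sequencer, Kit.new("kit-1"), true),
  Task.new("Task 2", "operator-a", sequencer, Kit.new("kit-1"), false),
  Task.new("Task 3", nil, nil, nil, false)
])
```

Here workflow.next_task returns Task 2, since Task 1 has already passed.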

Page 61: Genomes On Rails
Page 62: Genomes On Rails
Page 63: Genomes On Rails
Page 64: Genomes On Rails

Throughput!

Page 65: Genomes On Rails
Page 66: Genomes On Rails

320Tb 450 CPU

Page 67: Genomes On Rails

320Tb 450 CPU Archive

Page 68: Genomes On Rails

75Tb

Page 69: Genomes On Rails
Page 70: Genomes On Rails
Page 71: Genomes On Rails
Page 72: Genomes On Rails
Page 73: Genomes On Rails

pilot study!

Page 74: Genomes On Rails

Multiple apps

Page 75: Genomes On Rails

Multiple instances

Page 76: Genomes On Rails

Loosely coupled

Page 77: Genomes On Rails

Loose coupling is hard

Page 78: Genomes On Rails

Deployment

Page 79: Genomes On Rails

Maintenance

Page 80: Genomes On Rails

Monitoring

Page 81: Genomes On Rails

Hard to maintain separation

Page 82: Genomes On Rails

Support novel science

Page 83: Genomes On Rails

Single code base

Page 84: Genomes On Rails

nginx reverse proxy

Page 85: Genomes On Rails

fair nginx

Page 86: Genomes On Rails

Mongrel

Page 87: Genomes On Rails

Fast deployment

Page 88: Genomes On Rails

Automate everything

Page 89: Genomes On Rails
Page 90: Genomes On Rails

Interoperability!

Play well with others!

Page 91: Genomes On Rails

Legacy databases

Page 92: Genomes On Rails

RESTful services

Page 93: Genomes On Rails

Generate API stubs

Page 94: Genomes On Rails
Page 95: Genomes On Rails

SCALE!

Page 96: Genomes On Rails

Trillionics

Page 97: Genomes On Rails

2X

Page 98: Genomes On Rails

150Tb per week

Page 99: Genomes On Rails

Over 6 months

Page 100: Genomes On Rails

More hardware

Page 101: Genomes On Rails

400 additional nodes

Page 102: Genomes On Rails

additional 360 Tb

Page 103: Genomes On Rails

Towards a Virtual Institute

Page 104: Genomes On Rails

Lots of data

Page 105: Genomes On Rails

Lots of data, lots of people

Page 106: Genomes On Rails

Lots of data, lots of people, lots of compute

Page 107: Genomes On Rails

Lots of data, lots of people, lots of compute,

lots of uses

Page 108: Genomes On Rails

Lots of data, lots of people, lots of compute, lots of uses, lots and lots

and lots and lots...

Page 109: Genomes On Rails

➌ Process

Page 110: Genomes On Rails

Concept Requirements Development Product

Page 111: Genomes On Rails

Concept Requirements Development Product

takes too long

Page 112: Genomes On Rails

Concept Requirements Development Product

these change

takes too long

Page 113: Genomes On Rails

Concept

What we need Get ready

Development Plan

REVIEW

Page 114: Genomes On Rails

Focused

Page 115: Genomes On Rails

Project owner is key

Page 116: Genomes On Rails

Weekly releases

Page 117: Genomes On Rails

More flexible

Page 118: Genomes On Rails

Less time

Page 119: Genomes On Rails

Better transparency

Page 120: Genomes On Rails

Less software

Page 121: Genomes On Rails

Sequencing informatics

Page 122: Genomes On Rails

Thank you

Page 123: Genomes On Rails

GREENISGOOD.CO.UK