Software Sustainability Institute - Fellowship Selection Day 2014 - Joe Parker



Phylogenomics matters

• Interface of genetics, evolution, statistics, computation

• Disease, adaptation, climate change, medicine

• Distributed, parallel & high-performance computation

Massively parallel phylogenomics

Joe Parker

Queen Mary University of London

spectrum.ieee.org, Illumina.com, flickr/stephenjjohnson

Challenges

• Size of data sets

• Heterogeneous architecture

• Use of non-standard workflows

• Limited testing

• Time-limited projects

• Versioning / legacy

• Collaboration / silos

• Compatibility

Languages

Python, Perl, C/C++, Java, JavaScript, Ruby, R, MATLAB

Distribution

Source / binaries

Plugins, Git / SVN

Commercial

Open

Objective

Portal for phylo tools (e.g. wiki / blog / searchable metadata feed / Stack Exchange)

“What are we all doing?”

“Surely this has been done before?”

Solutions

Projects

Binaries

Source

Discovery

Tag clouds / wiki

Metadata

Catalogue

Inputs / outputs

Versioning

Architecture

“How do you do that?”
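One way to picture the metadata catalogue named on this slide is as a list of small records that can be searched by tag or by file format. The sketch below is illustrative only: the `ToolRecord` fields, the helper functions and the two example tools are all hypothetical, chosen to mirror the slide's headings (inputs / outputs, versioning, architecture, tag clouds) rather than any existing portal.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToolRecord:
    """One catalogue entry. All field names are assumptions, not a real schema."""
    name: str
    version: str          # versioning
    language: str
    architecture: str     # e.g. "desktop" or "cluster"
    inputs: List[str] = field(default_factory=list)   # accepted file formats
    outputs: List[str] = field(default_factory=list)  # produced file formats
    tags: List[str] = field(default_factory=list)     # free-text discovery tags


def find_by_tag(catalogue, tag):
    """Discovery: return every tool carrying the given tag."""
    return [tool for tool in catalogue if tag in tool.tags]


def tools_accepting(catalogue, file_format):
    """Return every tool that lists file_format among its inputs."""
    return [tool for tool in catalogue if file_format in tool.inputs]


def tag_cloud(catalogue):
    """Count how often each tag occurs; the counts could size a tag cloud."""
    return Counter(tag for tool in catalogue for tag in tool.tags)


# Invented example entries, for illustration only.
CATALOGUE = [
    ToolRecord(
        name="example-aligner",
        version="1.2.0",
        language="Python",
        architecture="desktop",
        inputs=["fasta"],
        outputs=["fasta"],
        tags=["alignment", "phylogenetics"],
    ),
    ToolRecord(
        name="example-tree-builder",
        version="0.9.1",
        language="C++",
        architecture="cluster",
        inputs=["fasta"],
        outputs=["newick"],
        tags=["tree-inference", "phylogenetics"],
    ),
]

if __name__ == "__main__":
    print([tool.name for tool in find_by_tag(CATALOGUE, "alignment")])
    print(tag_cloud(CATALOGUE).most_common(1))
```

Matching one tool's outputs against another's inputs is the kind of query that would help answer "Surely this has been done before?" when chaining tools into a workflow.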

Approach

• Build awareness and community, develop consensus on best practice, share hacks and develop a phylo tools portal.

• Research software engineers from academia, industry (including SMEs and startups) and hackers.

• London’s Silicon Roundabout; space hire, refreshments and dissemination.

(L-R): Joe Parker; Georgia Tsagkogeorga; Kalina Davies; Steve Rossiter; Xiuguang Mao; Seb Bailey

Six Hackspace / networking events

“Live stack exchange for phylogenetics”

Portal

Share

Hack

Learn

Discovery

Existing tools

Best practice

Problems

Community

Build network

Develop consensus

Engage

Outputs

• Live-blog & documentation

• Live stack exchange for phylogenetics, disseminate

• Resource for tools metadata

• Community-specified, developed and maintained

Supporting SSI Aims

Sustainability through community

Repeatability, communication and continuity

Legacy of best practice

Peer-based dissemination of skills, problems, solutions