Wageningen phenotype meeting

Phenotype Foundation: Why? What? How? Kees van Bochove, 6 September 2011


Page 1: Wageningen phenotype meeting

Phenotype Foundation

Why? What? How?

Kees van Bochove, 6 September 2011

Page 2: Wageningen phenotype meeting

History

Page 3: Wageningen phenotype meeting

Multiple instances: NMC, NuGO, TNO…

Page 4: Wageningen phenotype meeting

A few screenshots…

Page 5: Wageningen phenotype meeting

Example of a study timeline

Page 6: Wageningen phenotype meeting

Next Generation Sequencing module

Page 7: Wageningen phenotype meeting

Transcriptomics module

Page 8: Wageningen phenotype meeting

Simple Assay Module

Page 9: Wageningen phenotype meeting

Visualization – PPS1 paper figure

Page 10: Wageningen phenotype meeting

Galaxy (toolbox / visualization)

Page 11: Wageningen phenotype meeting

Core reasons

• Governance of dbNP (now dbXP)

– Communication: many projects, many partners

– Professionalize open source development

– Guarantee sustainability of software and data

– Represent dbNP to the outside world

Page 12: Wageningen phenotype meeting

Mission statement

• The Phenotype Foundation mission is to empower scientists, within various communities of practice, with standardized software and knowledge stores to store, manage, and retrieve information and data on genotypes and phenotypes.

Page 13: Wageningen phenotype meeting

My reasons

• To find more testers! Engage more biologists.

• One stop place for our efforts – currently all over the web

• Professionalize open source development – it should be easy, not hard, to join in

• Deployment issues – configuring servers, databases… who is responsible?

• Sustainability!

Page 14: Wageningen phenotype meeting

Testing time

• Estimates of testing time for moderately complex GUI applications, as a percentage of project time, range from 30% to 60%

• Which means that just hiring developers is not nearly enough to accomplish a successful software development project

• And… testing is the most complex part of the project to manage!

Page 15: Wageningen phenotype meeting

Current state of dbXP documentation

Page 16: Wageningen phenotype meeting

Current state of documentation

• Source code: http://trac.nbic.nl

• Infrastructure: http://nmcdsp.org (Nexus server)

• Website: http://dbnp.org, http://trac.nbic.nl, http://wiki.nbic.nl

Page 17: Wageningen phenotype meeting

Advantages of open source - in theory

• Lowers the adoption barrier, thereby improving collaboration and standardization

• Transparency and clarity presumably lead to higher quality of code

• Actually, open source is the only legally feasible way of sharing program code across multiple short-lived projects, organizations and institutions

Page 18: Wageningen phenotype meeting

Advantages of open source - the reality in bioinformatics

• The 'open source hype' has led to countless bioinformatics open source tools, some of them mere 'code dumps', others highly active collaborative projects, with or without a (large) user base, in many different programming languages

• Many projects are interesting, but have no documentation, or the source code is in bad shape (lack of software engineering skills among bioinformaticians)

• The abundance of non-interoperable web-enabled databases further complicates this picture

Page 19: Wageningen phenotype meeting

Bikeshed argument

Page 20: Wageningen phenotype meeting

Bikeshed argument

Page 21: Wageningen phenotype meeting

Bikeshed argument

Page 22: Wageningen phenotype meeting

Bikeshed argument

Page 23: Wageningen phenotype meeting

Case study: Plone

• An open source CMS (Content Management System) written in Python, nowadays backing thousands of production-grade websites

• Started by 2 developers in 2000, now an active open source project with hundreds of active developers

• In 2004, the Plone Foundation was formed to formalize IP and secure the future of Plone

Page 24: Wageningen phenotype meeting

Plone - how it works, legally

• The Plone Foundation is the owner of the code and serves to protect and promote Plone. It guarantees that Plone will always be available under an OSI-approved license

• Developers willing to contribute to the Plone core need to sign a contributors agreement, transferring IP of their contributions to the Foundation. In return, they get irrevocable rights to use and distribute their contributions

• The Foundation is governed by a Board of Directors, which is elected yearly by the members of the Foundation

• Only 1 person is employed by the Foundation: Release Manager

Page 25: Wageningen phenotype meeting

Geir Baekholt, past president of the Plone Foundation

• "The Plone Foundation is a meritocracy"

Page 26: Wageningen phenotype meeting

Plone, in practice

• The active community of developers is the basis of the success of Plone. Some are hobbyists, but many also work for companies

• There is a large Plone Collective (marketplace for Plone add-ons). Add-ons are not subject to the contributors agreement (although they should of course comply with the Plone license if they import Plone code)

• Contrary to popular belief, developers in open source projects such as Apache or Plone meet regularly face to face: ApacheCon, Plone Developer Days, sprints...

Page 27: Wageningen phenotype meeting

Plone?

• Plone Foundation, Apache Software Foundation, Open Bioinformatics Foundation etc. all share one crucial detail: all members and the board are programmers

• In the Phenotype Foundation, we have 2 or even 3 different types of people who are all stakeholders: programmers and scientists (and senior management)

Page 28: Wageningen phenotype meeting

And how does Galaxy do it?

• Penn State University + Emory University have the lead and are the only core committers!

• If you have suggestions for improvement of the code, you can suggest them using a ‘pull request’

• Creation of a ‘tool shed’ where bioinformaticians can upload their tools

Page 29: Wageningen phenotype meeting

Beware the Empty Chair!

• Especially if we start integrating different projects, we have to make sure someone is responsible for maintaining a consistent end user experience.

The Empty Chair is a metaphor from the book "Adrenaline Junkies and Template Zombies"

Page 30: Wageningen phenotype meeting

Beware mañana

• Any goal further away than 90 days is not urgent enough to do anything about today.

• So what is our next step?

From "Adrenaline Junkies and Template Zombies"

Page 31: Wageningen phenotype meeting

caBIG

Page 32: Wageningen phenotype meeting

caBIG – WG recommendations

• Create a Scientific Advisory Group of scientific, technology and informatics experts

• Focus on scientific mission, not on creating a ‘software brand’

• Semantic data integration (interoperability) is very important

• Create a legal framework for data sharing

Page 33: Wageningen phenotype meeting

Conclusions – programmer perspective

• At this moment, communication between programmers goes reasonably well, because the same people are involved

• But we are not ready for the future

• For outsiders, it is hard to find our community and at this moment impossible to join without a lot of inside knowledge

• We need to improve this if we don’t want to end up on the ‘graveyard’!

Page 34: Wageningen phenotype meeting

Foundation outline

• Board of Directors

• Technical Community

– Board member(s) (liaison to board)

– Release manager

– Database managers

– Transfer of code ownership to the Foundation?

• Scientific Community

– Board member(s)

– Product Liaisons

– Frequent meetings