Standards / models / mappings registry
Morris Swertz
BioMedBridges WP3 workshop
June 24, 2014, VUmc, Amsterdam
Outline
• Background
• User stories
• Implementation pointers
• Goals of the meeting
• Open the discussion
Background
Objective
The BMS standards registry aims to facilitate syntactic interoperability across research infrastructures so that samples and data can be integrated and analysed across ESFRI BMS domains.
Standard?
• What do we mean by “standard”?
• So far we have limited ourselves to ‘models’ that are used to describe life science data. These may or may not be formal standards. Examples:
• Formats like VCF and MAGE-TAB
• Models / guidelines like MIABIS
• (Partial) dictionaries like those used in clinical and biobank studies
• I.e. a ‘standard’ can be any model/format that is used by multiple parties to facilitate data sharing / integration.
Who?
• Data consumer?
• Data producer?
• Software developer?
• Architect?
History
• A few meetings within BMB
• Three ‘idea labs’ @ HandsOn biobanks
• Private communications
Proposed goal of today?
• Scope
• what types of standards to include
• Yes: models, formats, guidelines?
• Discussion: identifiers, value sets, ontologies?
• priorities of user stories (MoSCoW)
• priorities on content - what standards to catalogue first and why
• priorities of meta-data to be captured about each standard
• overview of stakeholders/users
• Next steps
• ideas about user interfaces needed
• pointers to useful existing solutions/contents we can reuse
• identify tasks / contributors to the deliverable
• summary of the ‘how to demo’ for first public release
User story
Find standard
Synopsis:
As a researcher / data provider I want to find a relevant standard
How to demo:
There is a search box where users can type in a topic. The portal then returns a list of available standards, including contextual information to rapidly assess the relevance and value of each micro standard [what parameters / tags?]. For example, new biobanks starting up could use this to quickly find existing questionnaire modules, and existing biobanks could rapidly assess to which standards they could harmonize their variables, e.g. when translating from a local language to English.
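A minimal sketch of how such a search box could rank registry entries, assuming each standard is stored with a name, a type, a short description and free-text tags. The `Standard` record, its field names, the toy entries and the `search_standards` helper are all hypothetical; they illustrate the idea and are not the registry's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Standard:
    """Hypothetical registry entry; field names are illustrative only."""
    name: str
    kind: str  # e.g. "format", "guideline", "dictionary"
    description: str
    tags: set[str] = field(default_factory=set)


# Toy content for illustration, not real registry data.
REGISTRY = [
    Standard("VCF", "format", "Text format for genetic variant calls",
             {"genomics", "variants"}),
    Standard("MAGE-TAB", "format", "Tabular format for microarray experiments",
             {"transcriptomics", "microarray"}),
    Standard("MIABIS", "guideline", "Minimum information about biobank data sharing",
             {"biobank", "samples"}),
]


def search_standards(topic: str, registry: list[Standard]) -> list[Standard]:
    """Return entries matching any query word, best match first."""
    words = topic.lower().split()
    scored = []
    for std in registry:
        text = " ".join([std.name, std.kind, std.description, *std.tags])
        tokens = set(text.lower().split())
        # Score = number of query words found as whole tokens.
        score = sum(1 for word in words if word in tokens)
        if score > 0:
            scored.append((score, std))
    # Stable sort: entries with equal scores keep their registry order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [std for _, std in scored]


if __name__ == "__main__":
    for hit in search_standards("biobank samples", REGISTRY):
        print(f"{hit.name} ({hit.kind}): {hit.description}")
```

Which parameters / tags should feed the ranking is exactly the open question on this slide; the sketch only shows where they would plug in.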
Biosharing.org
http://www.biosharing.org
EDAM
http://bioportal.bioontology.org/ontologies/EDAM/?p=classes&conceptid=http%3A%2F%2Fedamontology.org%2Fformat_3162&jump_to_nav=true
Drill down on schema to evaluate fit for purpose
Synopsis:
As a researcher I want suitable details on the data elements of the standard so I can assess if the standard is fit for my purpose.
How to demo:
The focus group wanted to see essential data and metadata that ease searching and assessing whether a biobank / sample collection / study is fit for the research or experiments under consideration. I.e. the portal should ideally tightly integrate the model definitions and annotations with external uses of the standard, such as other repositories containing information that defines how the models are used.
Proof of concept
http://www.molgenis08.target.rug.nl
Mapping between standards
Synopsis:
As a user I want to easily find mappings between standards
How to demo:
I can easily generate / curate mappings between standards and their elements so I can rapidly evaluate whether standards are related and how I could move between them. This is motivated in particular by the fact that the ESFRIs have been developing in parallel, resulting in some overlap. Moreover, within individual studies and biobanks, data was collected before standardization took place. Hence, there is a need to rapidly assess the mapping of data elements between models and formats. In collaboration with the BioSHaRE project an ontology-based method has been developed to facilitate this process, which will be integrated in the repository. Figure 2 shows an example with ‘target data elements’ on the left and proposed mappings for these elements across three data sources.
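Proposing mappings for target data elements across several sources could be sketched as follows. Note that this is a plain token-overlap (Jaccard) matcher, swapped in as a simple stand-in for the ontology-based method mentioned above, which is not specified here. All element labels, source names and the threshold value are made up for illustration.

```python
import re


def tokens(label: str) -> set[str]:
    """Lower-case a label and split it into alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", label.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Fraction of tokens shared by two labels (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def propose_mappings(targets, sources, threshold=0.2):
    """For each target element, list candidate elements per data source."""
    proposals = {}
    for target in targets:
        target_tokens = tokens(target)
        per_source = {}
        for source_name, elements in sources.items():
            scored = [(jaccard(target_tokens, tokens(e)), e) for e in elements]
            # Keep candidates at or above the threshold, best score first.
            kept = sorted((p for p in scored if p[0] >= threshold),
                          key=lambda p: p[0], reverse=True)
            per_source[source_name] = [element for _, element in kept]
        proposals[target] = per_source
    return proposals


# Hypothetical target elements and three hypothetical data sources.
TARGETS = ["Systolic blood pressure", "Body weight"]
SOURCES = {
    "study_a": ["Blood pressure systolic (mmHg)", "Weight in kg", "Date of birth"],
    "study_b": ["SBP measurement", "Body weight at baseline"],
    "study_c": ["Diastolic blood pressure", "Height"],
}

if __name__ == "__main__":
    for target, per_source in propose_mappings(TARGETS, SOURCES).items():
        print(target, "->", per_source)
```

The toy data shows why the proposals still need curation: "Diastolic blood pressure" is suggested for the systolic target, while "SBP measurement" is missed because an abbreviation shares no tokens with the target. Handling synonyms of that kind is where an ontology-based approach would be expected to help.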
Proof of concept (presentation of Chao)
http://biobankconnect.org/
Using standards as tool for integration
Synopsis:
As a user I want to integrate my data with other data, using the standards registry as the means
How to demo:
Given data in hand, I can use the mapping tool to map it onto standards and then discover data sources / annotation services that I can integrate with. WP8 (personalized medicine) has provided a good example use case around leukemia, where mutation data needs to be enhanced with knowledge on existing cancer cases (integration with COSMIC) and on whether gene expression can be influenced using drugs (integration with ChEMBL).
Proof of concept (WP8)
Discover annotation services available based on attribute meta data
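A minimal sketch of discovering annotation services from attribute meta data, assuming every dataset column is tagged with a semantic type and every service declares the types it needs as input. The type names, the column names, the service requirements and the `discover_services` helper are assumptions made for illustration; only the COSMIC and ChEMBL names come from the WP8 use case.

```python
# Hypothetical attribute meta data for a dataset: column name -> semantic type.
DATASET_ATTRIBUTES = {
    "sample_id": "identifier",
    "gene_symbol": "gene",
    "mutation": "variant",
    "expression_level": "measurement",
}

# Hypothetical service catalogue: attribute types each service needs as input.
SERVICES = {
    "COSMIC": {"gene", "variant"},
    "ChEMBL": {"gene"},
    "pathway-annotator (made up)": {"pathway"},
}


def discover_services(attributes: dict[str, str],
                      services: dict[str, set[str]]) -> list[str]:
    """Return the services whose required types are all present in the data."""
    available_types = set(attributes.values())
    return sorted(
        name for name, required in services.items()
        if required <= available_types  # subset test: every input is available
    )


if __name__ == "__main__":
    print(discover_services(DATASET_ATTRIBUTES, SERVICES))
```

Because the match is made on attribute types rather than column names, the same catalogue would work for any dataset whose columns have been mapped onto a registered standard.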
Evaluate ‘maturity’
Synopsis:
As a user I want to evaluate the maturity / quality / support of the standard as a basis for deciding whether to use it
How to demo:
I want to see to what extent the standard is used in databases, software tools and institutes, and what their experiences are. I also want to know whether the standard is actively supported, in terms of software tools and expert groups that can help with using it.
Could not find a good example?
Access to expert knowledge
Synopsis:
As a researcher I want access to expert knowledge about the standard
How to demo:
I can easily drill down on the models to discover background information that supports the models. For example, for questionnaire modules it would be discoverable whether there are pitfalls when changing the order of the questions, or whether anything is known about their stability when used in a longitudinal setting or in repetition. Moreover, it should be visible which persons or institutes have provided the information, so that users can assess to what extent they want to trust it.
Could not find a good source for this:
E.g. http://biostars.org
Find collaborators
Synopsis:
As a user I want to get into contact with colleagues having similar questions / objectives
How to demo:
I can easily find colleagues who are or have been dealing with questions similar to the ones I am struggling with. For example, a new biobanker developing a new laboratory protocol may wonder what information should be captured to ensure research use and may want to learn more than is currently available in the portal. The focus group expected that in these cases a forum application would not be enough and that the portal would provide a perfect platform to even enable drill-down to experts who have indicated they are willing to be found.
Could not find an example ... Out of scope?
Other stories
• As a group I want to define the data elements of a new standard (e.g. MIABIS working group)
• As a user I want to see the version history of the standard (why?)
• As a user I want to convert between standards (e.g. ETL)
• As a database owner I want to move my data to a sustainable resource
• As a user I want to upload my data to a public repository (e.g. EGA)
• As a user I want to find protocols related to standards (http://www.molmeth.org/)
• And associated data items? E.g. blood pressure
• As a user I want to choose between overlapping formats
• PLEASE EXTEND
Proof of concept implementation
Proof of concept
• Code in http://github.com/molgenis/molgenis
• Early demo on http://molgenis08.target.rug.nl
Basic data capabilities (fine grained meta data / schema)
Basic capabilities (any data)
REST interface (metadata/data, used in WP4 federation)
Excel upload details
https://www.dropbox.com/s/1r26jfh8lmupvqj/MAGE-TAB.xlsx
Now simplifying and detailing a bit more
https://github.com/molgenis/molgenis/wiki/EMX-upload-format
Looking forward
https://docs.google.com/spreadsheets/d/1rDqQ4hz4uWs4JcVKmJTZza__ztBivExVTbkGOx3ev_0
Integrations
• Tool registry
• To link to tools that can implement a mapping / consume / produce a standard
• Identifiers.org
• For merging datasets across identifier spaces (which you need in combination with format / model mapping)
• Biosharing.org
• To not duplicate meta data about standards and formats and their usage?
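Merging datasets across identifier spaces, as mentioned for Identifiers.org, could look roughly like the sketch below. It assumes two datasets that refer to the same proteins in different local styles and rewrites both to one shared URI form before joining. The two input styles, the `http://identifiers.org/uniprot/<accession>` target pattern as used here, the helper names and the toy records are assumptions for illustration, not a description of the Identifiers.org service itself.

```python
def to_shared_uri(raw: str) -> str:
    """Rewrite a few assumed local identifier styles to one shared URI form."""
    raw = raw.strip()
    if raw.lower().startswith("uniprot:"):
        # Style 1 (assumed): "UniProt:<accession>"
        accession = raw.split(":", 1)[1]
    elif "uniprot.org/uniprot/" in raw:
        # Style 2 (assumed): a provider URL ending in the accession
        accession = raw.rstrip("/").rsplit("/", 1)[1]
    else:
        raise ValueError(f"unrecognised identifier style: {raw!r}")
    return f"http://identifiers.org/uniprot/{accession}"


def merge_by_identifier(left: dict, right: dict) -> dict:
    """Join two {identifier: record} datasets on the normalised identifier."""
    merged: dict[str, dict] = {}
    for dataset in (left, right):
        for raw_id, record in dataset.items():
            # Records that resolve to the same URI are combined into one.
            merged.setdefault(to_shared_uri(raw_id), {}).update(record)
    return merged


# Toy records; the values are invented for illustration.
DATASET_A = {"UniProt:P04637": {"expression": 2.4}}
DATASET_B = {
    "http://www.uniprot.org/uniprot/P04637": {"mutated": True},
    "http://www.uniprot.org/uniprot/P38398": {"mutated": False},
}

if __name__ == "__main__":
    for uri, record in merge_by_identifier(DATASET_A, DATASET_B).items():
        print(uri, record)
```

Identifier normalisation only lines up the rows; the columns still have to be aligned through the format / model mappings, which is why the two are needed in combination.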
Goal of this meeting
Goals of the meeting
• Scope
• what types of standards to include
• Yes: models, formats, MIAx guidelines? (syntax?)
• Discussion: identifiers, value sets, ontologies? (semantics?)
• Tools registry: services, query interfaces, tools, ...
• priorities of user stories (MoSCoW)
• priorities on content - what standards to catalogue first and why
• priorities of meta-data to be captured about each standard
• overview of stakeholders/users
• Next steps
• existing solutions/contents and gap analysis
• ideas about user interfaces needed
• identify tasks / contributors to the deliverable
• summary of the ‘how to demo’ for first public release
• Case studies
Notes
• License?