HVP Country Node Development Workshop

Post on 10-May-2015

Human Variome Project Country Node Development Workshop

Timothy D. Smith

tim@variome.org

@tim_d_smith

Purpose of Today

• An interactive discussion about HVP Country Nodes

– What they are

– What they do

– How they do it

• Get some feedback from you on what Country Nodes need to be

– I’ll be putting you on the spot at certain points

• Hopefully inspire some of you to start a Node in your own country

Outline

• How to initiate an HVP Country Node

• The role of the Node within the country

• Genetics Capacity within the country or region

• Data Collection: who, what, where and how

• Ethical issues, Access policies and Data Ownership

• Data Model

Here’s my problem

• There is no “official” definition of what a Country Node is

– We’re not quite there yet

• No “instruction manual” for how to build one

– Not yet, anyway

Working Definition

An HVP Country Node is an electronic repository of information on genetic variations discovered in patients residing in a specific country or belonging to a specific population. Ideally the repository contains information on genetic variations discovered during both routine diagnostic testing and during research. The repository is managed locally by a committee or organisation that has sufficient representation of stakeholder groups and the backing or support of the country’s human genetics society or similar professional body. This committee would be responsible for ensuring the sustainability of the repository and compliance with local laws, regulations and ethics requirements, as well as determining policies for the repository (e.g. data access policy, data retention policy, curation policy, etc.). The government need not be directly involved in the operation or financing of the repository, although such involvement is desirable. At the very least, the Ministry of Health should be aware of the Node, its operations, benefits to the local health system and its relationship to the international Human Variome Project. The minimum level of information that should be collected on each variant is that described in AlAama et al. (2011) and the repository includes information on variants considered to be non-pathogenic as well as those affecting function (or pathogenic). In addition, the repository should capture all instances of each variation that are reported, not just the first case.

• From R05-2012: HVP Country Nodes: a partial definition - A report of the Country Node Development Workshop, Paris, 2012, available at http://www.humanvariomeproject.org/
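The requirement to capture every reported instance of a variation, not just the first case, can be illustrated with a minimal Python sketch. The accession version, variant names and lab IDs below are illustrative placeholders, not data from any Node, and the tuple layout is an assumption made for this example only.

```python
from collections import Counter

# Hypothetical submissions: (gene accession, HGVS variant name, lab ID).
# All values are placeholders for illustration.
reports = [
    ("NM_002386.3", "c.451C>T", "LAB-A"),
    ("NM_002386.3", "c.451C>T", "LAB-B"),
    ("NM_002386.3", "c.478C>T", "LAB-A"),
]

# Every report is kept; the tally counts instances per (gene, variant)
# instead of collapsing repeat observations into a single entry.
instances = Counter((gene, variant) for gene, variant, _lab in reports)

for (gene, variant), count in instances.most_common():
    print(f"{gene}:{variant} reported {count} time(s)")
```

Keeping the full list of reports alongside the tally is what lets a Node serve as a source of statistics as well as a variant catalogue.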

It’s actually pretty simple

“An HVP Country Node is an electronic repository of information on genetic variations discovered in patients residing in a specific country or belonging to a specific population.”

“Ideally the repository contains information on genetic variations discovered during both routine diagnostic testing and during research.”

HVP Country Node

• Repositories of variation within a country

• Service with in-country benefits

– Diagnostic labs

– Clinics

– Policy making and healthcare delivery planning

– Registries?

• Plays a part in global collection efforts

It’s actually pretty simple

HVP Country Node

Global Collection Architecture

An Organisation

“The repository is managed locally by a committee or organisation that has sufficient representation of stakeholder groups and the backing or support of the country’s human genetics society or similar professional body.”

• The first step in establishing a Country Node is not to build the database, but to build the organisation

An Organisation

• Need to decide what that organisation looks like

– A local issue

• Must be suitably representative

– All stakeholder groups

• What does this organisation do?

– Will determine what the organisation looks like.

“This [organisation] would be responsible for ensuring the sustainability of the repository and compliance with local laws, regulations and ethics requirements, as well as determining policies for the repository (e.g. data access policy, data retention policy, curation policy, etc.).”

Decisions to make - establishment

• How the development and operation of the Node will be funded;

• Where the Node will collect data from and how collection will be achieved;

• What information the Node will collect on each variant; and

• How the collected data will be made available and to whom.

Decisions to make - running

• How operational and managerial decisions will be handled during the development and continuing operations of the Node.

A robust and representative organisation is required to make these decisions.

Organisation Types

• Consortium of interested individuals

• University Department

• Human Genetics Society

• Government

• Other

Stakeholders

• Diagnostic laboratories

• Medical geneticists

• Genetic Counsellors

• Researchers

• Ethicists

• Officials from the Ministry of Health and Ministry of Science and Technology

• Genetics societies

• Professional Bodies in charge of certifying labs and medical geneticists

• Patients

Funding & Sustainability

• The funding requirements of the Node fall into three categories:

– supporting the organisational framework in an ongoing fashion;

– maintaining the operations of the technical systems and infrastructure; and

– developing new technical systems and infrastructure in response to user needs.

Funding & Sustainability

• Organisational component most likely to be the highest recurring cost

– Recurring technical costs are generally low

• Most likely sources of funds to support organisational framework:

– Government

– University

– Professional society

Human Variome Project ICO

• We can help…

– Human Variome Project/China Country Development Programme

– Access to the economic and public health evidence for the importance of HVP Country Nodes;

– Provide assistance to local organisations in their efforts to generate funds; and

– Work with UNESCO and WHO to improve the knowledge of genetic disorders and their economic and public health impacts within the science and health ministries of member states.

[Diagram: Node development roadmap. Initiation (identify stakeholders, hold a meeting, build an organisation); Make Initial Decisions (role of the Node, funding, data collection, data model).]

Role of the Node

• Service for diagnostic laboratories

• Service for clinicians/genetic counsellors

• Service for medical researchers

• Source of data for the Human Variome Project

• Source of statistics for health service delivery planning

Role of the Node

• Driving activity around medical genetics and genomics

– Education – public, CME, etc.

– ELSI

– Public funding of genetic testing and genetic health care services

– Submission to databases as part of the licensing requirements of diagnostic laboratories

• Act as a “spokesperson” for the Human Variome Project within the country on issues of importance to the international Project.

• Decide on the role of the Node early, as every choice that needs to be made will be informed by this decision.

[Diagram: Node development roadmap. Initiation (identify stakeholders, hold a meeting, build an organisation); Make Initial Decisions (role of the Node, funding, data collection, data model).]

Genetics Capacity

• What tests are available?

• Where are these tests performed?

• What data is generated during these tests?

• What data is shared?

– Locally, nationally, internationally

• Who pays for the tests?

• How will things change in 1, 5, 10 years?

Genetics Capacity

• The Node will need to know all this

– Or, at the very least, be able to find the answers

• The Human Variome Project would like to know this information as well

– Currently compiling the HVP Country Node Baseline Report

• Capacity will shape what role the Node performs

[Diagram: Node development roadmap. Initiation (identify stakeholders, hold a meeting, build an organisation); Appreciate Capacity (survey labs & clinics); Make Initial Decisions (role of the Node, funding, data collection, data model).]

Where data comes from

• Molecular data

– Labs – Research and/or Diagnostic

– Clinics/Genetic Counsellors

– Literature

• Clinical Data

– Labs

– Clinics/Genetic Counsellors

– Patients (Registries)

Data Types

• Classified variants

• Unclassified variants

• Benign variants

• NGS/Incidental findings

• Negative results

Depends on…

• Ethical, legal and social restrictions (difficult to change)

– Government/regulatory bodies

– Professional codes of practice

• What data sources are willing to share (political)

• What data is able to be collected (technical)

• How useful the data will be to the end-users of the system (non-negotiable)

How will collection be achieved

• Technical

– Electronic or Paper

– Manual or automatic

• Process

– At what point in the pipeline

• Do collection activities differ by data source category?

[Diagram: Node development roadmap. Initiation (identify stakeholders, hold a meeting, build an organisation); Appreciate Capacity (survey labs & clinics); Make Initial Decisions (role of the Node, funding, data collection, data model).]

ELSI

The Organisation operating the Node will be responsible for ensuring the data is collected, stored and used in a manner that complies with all local ethical, legal and social concerns.

Local organisations are best placed to determine how to do this.

Need to develop

• Collection policy

• Collection Agreement

• Access policy

• Data ownership/IP policy

– Public/private labs

Collection Policy/Agreement

• What is collected

– Identifiability issues

– Phenotype

• How is it collected

• How is it stored

• When is it collected

• Why is it collected

Access Policy

• Depends on the role

• Who can access data locally and for what purpose

– Controlled Access?

• What use classes

• Who can access data internationally

• What data can be shared and who with

– This is very important

– Specific data elements that must be shared are yet to be specified by the International Confederation of Countries Advisory Council
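One way to think about an access policy is as a table mapping each use class to the data elements it may see and to whether international users are included. The sketch below is only an illustration under stated assumptions: the use classes, field names and rules are hypothetical, since the slides leave these decisions to each Node and the shared data elements have not yet been specified.

```python
from dataclasses import dataclass
from enum import Enum


class UseClass(Enum):
    # Hypothetical use classes; each Node would define its own.
    DIAGNOSTIC = "diagnostic"
    RESEARCH = "research"
    PLANNING = "planning"


@dataclass(frozen=True)
class AccessRule:
    local_only: bool    # True if international users are excluded
    fields: frozenset   # data elements visible to this use class


# Hypothetical policy table. A use class with no entry gets no access.
POLICY = {
    UseClass.DIAGNOSTIC: AccessRule(
        local_only=True,
        fields=frozenset({"gene", "variant", "pathogenicity", "disease"}),
    ),
    UseClass.RESEARCH: AccessRule(
        local_only=False,
        fields=frozenset({"gene", "variant", "pathogenicity"}),
    ),
}


def visible_fields(record: dict, use_class: UseClass, international: bool) -> dict:
    """Return only the elements of `record` that the policy allows."""
    rule = POLICY.get(use_class)
    if rule is None or (international and rule.local_only):
        return {}
    return {key: value for key, value in record.items() if key in rule.fields}


# Placeholder record for illustration only.
record = {
    "gene": "MC1R",
    "variant": "c.451C>T",
    "pathogenicity": 3,
    "disease": "example disease",
    "patient_id": "P-0001",
}
print(visible_fields(record, UseClass.RESEARCH, international=True))
```

Denying by default (an empty result for any use class without a rule) keeps the policy table as the single place where access is granted.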

Ownership/IP

• Who ‘owns’ the data submitted to a Node?

– The lab or the Node

• Private/commercial labs

– License terms

– Withdrawal from the agreement

• Data already shared internationally

Remember

• The issues of what data sources to collect from, ownership of data and access rights are interconnected

• Permission to collect from certain sources may only be granted if the Node agrees to certain access rights and ownership provisions

• Need to be determined before technical development begins

[Diagram: Node development roadmap. Initiation (identify stakeholders, hold a meeting, build an organisation); Appreciate Capacity (survey labs & clinics); Make Initial Decisions (role of the Node, funding, data collection, data model); Mature as an Organisation (data access policy, ownership/IP policy, sign up collection sources).]

Data Model

• what data is stored in the database

• how each data element is related to each other element

• what format each element can be stored in

• which elements are mandatory

• how much intervention will be made by human beings during and after submission to:

– standardise format

– assess data quality

– correct errors

– add additional information or combine records

Curation

• Australia

– No curation

– Relies on a combination of automated processes, data model and data submitters

– Only allows submission by registered diagnostic laboratories via software tools that the Australian Node provides

– Only accepts submissions of data that has been recorded in a pathology report

this change

• Protein, DNA, RNA level

• Coding DNA or genomic DNA

• rs#

• HGVS Nomenclature

• Old school methods

• Bespoke methods (BIC)
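Because the same change can arrive in several notations, a Node's collection tools will probably need some automated check on variant names. The sketch below is a deliberately simplified illustration, not an HGVS validator: it only recognises single-nucleotide substitutions at the coding ("c.") or genomic ("g.") DNA level, and the function name is hypothetical. Full HGVS nomenclature covers many more forms (deletions, duplications, intronic positions, protein-level descriptions and so on).

```python
import re

# Simplified pattern: coding or genomic single-nucleotide substitutions only,
# e.g. "c.451C>T". Anything else is reported as unrecognised.
SUBSTITUTION = re.compile(
    r"^(?P<level>[cg])\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_substitution(name: str):
    """Return the parts of a simple substitution, or None if not recognised."""
    match = SUBSTITUTION.match(name.strip())
    if match is None:
        return None
    return {
        "level": match["level"],
        "position": int(match["pos"]),
        "ref": match["ref"],
        "alt": match["alt"],
    }


for candidate in ("c.451C>T", "rs123", "c.451_452del"):
    print(candidate, "->", parse_substitution(candidate))
```

A name that returns None is not necessarily wrong; in this sketch it only means the description falls outside the narrow pattern and would need a fuller parser or a human check.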

this gene

• HGNC name: MC1R, melanocortin 1 receptor, HGNC:6929

• Synonyms: MSH-R

• OrphaNet ID: ORPHA139778

• OMIM: 155555

• Entrez Gene: 4157

• Sequence Accession: NM_002386

• Chromosomal Location: 16q24.3

• Coordinates: 16:89,984,286 - 89,987,384
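The identifiers on this slide all point at the same gene, so a data model can store them together in one record. The following is a minimal sketch using the MC1R values shown above; the class name, field names and the coordinate parser are assumptions made for illustration, and the slide does not say which genome build the coordinates refer to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    # Hypothetical field names; values below are taken from the slide.
    hgnc_symbol: str
    hgnc_id: str
    entrez_gene: int
    omim: int
    accession: str
    cytoband: str
    chromosome: str
    start: int
    end: int


def parse_coordinates(text: str):
    """Parse 'chrom:start - end' with thousands separators into (chrom, start, end)."""
    chromosome, span = text.split(":", 1)
    start, end = (int(part.strip().replace(",", "")) for part in span.split("-"))
    return chromosome.strip(), start, end


chrom, start, end = parse_coordinates("16:89,984,286 - 89,987,384")
mc1r = GeneRecord(
    hgnc_symbol="MC1R",
    hgnc_id="HGNC:6929",
    entrez_gene=4157,
    omim=155555,
    accession="NM_002386",
    cytoband="16q24.3",
    chromosome=chrom,
    start=start,
    end=end,
)
print(mc1r)
```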

means this

• Simple

• Free text

• Coding

– ICD-10 (9)

– SNOMED

• Ontologies

– HPO

– GO

– PATO

• 20,000 genes = 20,000 disorders

• All different

patient X

• Ethical, legal and social issues

• Global context

• Patient ID

• Assigned by LSDB

• Useful?

• Human Variome Project ICCAC will set standard for minimum content

• Each Node represented on the Council will have input to this process

Data Model

Quality

• Once standards exist you can measure

• Transition from research grade to clinical grade

• These resources ultimately need to be useful as clinical decision support tools

AlAama et al. (2011)

• Gene Name: described in the form of both the HUGO Nomenclature Committee approved gene name and a sequence accession number and version number.

• Variant Name: written as HGVS nomenclature (http://www.hgvs.org/mutnomen/).

• Pathogenicity: classified as five levels of pathogenicity (see Plon et al., 2008).

• Test date: the date that the results were produced.

• Patient ID: a deidentified code which is unique to a patient.

• Patient Age: the age of the patient when tested.

AlAama et al. (2011)

• Patient Gender.

• Submission date.

• Disease associated with the mutation: if diagnosed.

• Lab Operator ID: a code that identifies the operator who uploaded the data.

• Laboratory Name/ID.

• Country/Region Name/ID: if a regional repository is used.

• Level of consent obtained.

• Can the patient be recontacted for other studies?

• Can clinical and/or molecular data be used for statistical analyses (with options for local laboratory, country, and/or international)?
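The minimum content listed above maps fairly directly onto a record type. The sketch below is one possible rendering, not the data model of any existing Node: the class and field names are hypothetical, the pathogenicity labels paraphrase the five-class scheme of Plon et al. (2008), and the example values are placeholders.

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum
from typing import Optional


class Pathogenicity(IntEnum):
    # Five classes after Plon et al. (2008); labels paraphrased.
    NOT_PATHOGENIC = 1
    LIKELY_NOT_PATHOGENIC = 2
    UNCERTAIN = 3
    LIKELY_PATHOGENIC = 4
    PATHOGENIC = 5


@dataclass
class VariantReport:
    gene_name: str
    gene_accession: str            # sequence accession with version number
    variant_name: str              # HGVS nomenclature
    pathogenicity: Pathogenicity
    test_date: date
    patient_id: str                # deidentified code, unique per patient
    patient_age: int               # age when tested
    patient_gender: str
    submission_date: date
    lab_operator_id: str
    laboratory_id: str
    consent_level: str
    recontact_allowed: bool
    statistics_use: frozenset      # subset of local laboratory / country / international
    disease: Optional[str] = None      # only if diagnosed
    region_id: Optional[str] = None    # only if a regional repository is used


# Placeholder values for illustration only.
report = VariantReport(
    gene_name="MC1R",
    gene_accession="NM_002386.3",
    variant_name="c.451C>T",
    pathogenicity=Pathogenicity.UNCERTAIN,
    test_date=date(2012, 5, 1),
    patient_id="P-0001",
    patient_age=42,
    patient_gender="F",
    submission_date=date(2012, 5, 8),
    lab_operator_id="OP-17",
    laboratory_id="LAB-A",
    consent_level="clinical use only",
    recontact_allowed=False,
    statistics_use=frozenset({"local laboratory", "country"}),
)
print(report.variant_name, report.pathogenicity.name)
```

Making the two conditional elements optional mirrors the "if diagnosed" and "if a regional repository is used" qualifiers in the list.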

Australian Node – Phase 1

• Mandatory

– Gene Accession # and version

– Variant Name (as HGVS nomenclature)

– Pathogenicity Classification (Plon)

– Date of classification

• Optional

– Test Details: Date of test, Method, Sample Type, Start, Stop

– Age (at test)

– Disease

– Misc.: Sample Stored, Pedigree Available

Australian Node – Phase 2

• Mandatory

– Gene Accession # and version

– Variant Name (as HGVS nomenclature)

– Pathogenicity Classification (Plon) + justification

– Date of classification

– Disease

– Reason for Test: Predictive, Carrier, Diagnostic

• Optional

– Test Details: Date of test, Method (inc. NGS Platform), Sample Type, Sampling Date, Start, Stop

– Interpretation Method (deviation from standard)

– Age (at test)

– Misc.: Sample Stored
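A split between mandatory and optional elements is something a submission tool can enforce automatically. The sketch below checks a submission against the Phase 2 mandatory list; it is an illustration under assumptions, not the Australian Node's actual software, and the dictionary keys and function name are hypothetical.

```python
# Hypothetical keys for the Phase 2 mandatory elements.
PHASE2_MANDATORY = (
    "gene_accession",
    "variant_name",
    "pathogenicity",
    "pathogenicity_justification",
    "classification_date",
    "disease",
    "reason_for_test",
)
REASONS_FOR_TEST = {"predictive", "carrier", "diagnostic"}


def validate_phase2(submission: dict) -> list:
    """Return a list of problems; an empty list means the submission passes."""
    errors = [
        f"missing mandatory field: {name}"
        for name in PHASE2_MANDATORY
        if not submission.get(name)
    ]
    reason = submission.get("reason_for_test")
    if reason and reason.lower() not in REASONS_FOR_TEST:
        errors.append(f"unknown reason for test: {reason}")
    return errors


# Placeholder submission for illustration only.
complete = {
    "gene_accession": "NM_002386.3",
    "variant_name": "c.451C>T",
    "pathogenicity": 3,
    "pathogenicity_justification": "insufficient evidence",
    "classification_date": "2012-05-01",
    "disease": "example disease",
    "reason_for_test": "Diagnostic",
}
print(validate_phase2(complete))
print(validate_phase2({**complete, "disease": "", "reason_for_test": "screening"}))
```

Returning every problem at once, rather than stopping at the first, suits a setting with no human curation, where the submitting laboratory has to correct the record itself.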

Software & Deployed Infrastructure

• DBMS

– Implements data model

– Allows querying of the data

• User interface

– What the user sees

– Access control

• Collection tools

A Node is a living thing

• All technical and organisational elements will require updating

– Needs change

– Laws change

– Technology changes

– HVP Standards & Guidelines

• Role of the Node Organisation is to manage this change

[Diagram: Node development roadmap. Initiation (identify stakeholders, hold a meeting, build an organisation); Appreciate Capacity (survey labs & clinics); Make Initial Decisions (role of the Node, funding, data collection, data model); Mature as an Organisation (data access policy, ownership/IP policy, sign up collection sources); Technical Development (data model, DBMS, UI, collection tools).]

Help is available

• AlAama, J., Smith, T. D., Lo, A., Howard, H., Kline, A. A., Lange, M., Kaput, J., et al. (2011). Initiating a Human Variome Project Country Node. Human Mutation, 32(5), 501–6. doi:10.1002/humu.21463

• Cotton, R. G. H., Al Aqeel, A. I., Al-Mulla, F., Carrera, P., Claustres, M., Ekong, R., Hyland, V. J., et al. (2009). Capturing all disease-causing mutations for clinical and research use: toward an effortless system for the Human Variome Project. Genetics in Medicine, 11(12), 843–9. doi:10.1097/GIM.0b013e3181c371c5

• Patrinos, G. P. (2006). National and ethnic mutation databases: recording populations’ genography. Human Mutation, 27(9), 879–87. doi:10.1002/humu.20376

• Patrinos, G. P., Al Aama, J., Al Aqeel, A., Al-Mulla, F., Borg, J., Devereux, A., Felice, A. E., et al. (2011). Recommendations for genetic variation data capture in developing countries to ensure a comprehensive worldwide data collection. Human Mutation, 32(1), 2–9. doi:10.1002/humu.21397

Solutions

• HGVS nomenclature

• Open Source DBMS products

– LOVD, UMD, MutBase

– Ethnos, Australian Node

• Minimum Content - recommendations

• Ethics

• Submission Forms

• Forum for discussion, debate, consensus

Australian Node

• HVP Portal (v1.0, r512) - A web application which features the basic interface for browsing and querying an HVP node.

– Open source – MIT License

– Python/Django

• HVP Exporter (v1.0, r512) - Basic HVP exporting tool for laboratories. Features simple GUI and error checking interface, plug-in architecture for customisation between sites and common libraries for working with MS Access and MS Excel data sources

– Open source – MIT License

– .NET C#, Python/IronPython

• HVP Importer (v1.0, r512) - A series of tools and web services that receive, decrypt and process information sent by submitting laboratories using the standard transaction XML format

– Open source – MIT License

– Python

Still work to do

• Minimum content

• Infrastructure to enable sharing

• Attribution

– ELSI

– Incentive for submission

• Phenotype description

– Useful – humans and computers

– Language differences

• Describing ethnicity of patients
