E-Science Institute Neuro- workshop, 28th November 2006

Virtual Organisations for Trials and Epidemiological Studies (VOTES) – Experiences & Prototypes after 1 year

Prof Richard Sinnott
Technical Director, National e-Science Centre
University of Glasgow

Clinical Trials 101

Need to answer questions such as: How many men in Scotland between the ages of 45-65 had a heart attack in the last 5 years? Of those that did, would they be interested in trialling a new drug to prevent possible further serious major events?

Recruitment!

For recruited men, are they regularly taking the new drug (or placebo)? Do they visit their GP/hospital regularly for the drug/placebo, to give samples, for monitoring purposes? Did they have any further major events (or side-effects) in taking the drug?

Data collection!

Who can see the information associated with this trial? Can a hospital doctor or nurse see all of a given patient's data? Only their GP? A clinical trials researcher? Who ensures that a study is in the patient's interest? Can we simplify the ethical review process? Who checks the validity of trial results?

Study management!

VOTES

Virtual Organisations for Trials and Epidemiological Studies
– 3 year (£2.8M) MRC funded project, started October 2005
– Plans to develop framework for producing Grid infrastructures to address key components of clinical trial/observational study
  » Recruitment of potentially eligible participants
  » Data collection during the study
  » Study administration and coordination
– Involves Glasgow, Oxford, Leicester/Nottingham, Manchester, Imperial
  » Strong links with UK Biobank

Clinical Virtual Organisation Framework

[Figure: Clinical Virtual Organisation Framework, used to realise CVO-1 (e.g. for data collection) and CVO-2 (e.g. for recruitment). Labels: Transfer Grid; sites GLA, OX, Lei-Nott, IMP; data sources GPs, disease registries, hospital databases, clinical trial data sets.]

Grid Background

What is a Grid?
– Data Grid vs Compute Grid vs Information Grid vs Campus Grid vs Enterprise Grid vs …

Technologies for Grids
– Web services
– Globus
– OMII-UK
– EGEE/gLite
– …

E-Health Grids…

Essential that they offer
– Fine grained security (AAAA)
– Access/integration of rich variety of clinical data sets
– Ease of use for end users
– Single sign-on to various remote resources
– Site autonomy/manageability for local admins
– Scalability for large scale virtual organisations
– Controlled dynamicity of users, resources, policies
– …

– HYPOTHESIS: Shibboleth + Grid + advanced authorisation infrastructures can address these issues

Usability

Grid Security

AAAA

Users like usernames/passwords
– Provide them (once!)

Users don’t like/understand X.509 based PKI
– Forget training, education for most users!
  $> openssl pkcs12 -in cert.p12 -clcerts -nokeys -out usercert.pem

The vast majority most certainly won’t jump through hoops to get on the Grid

“me-Science” culture

“A”AAA

Identity management issues
– Certificate Revocation Lists
  » When revoked? By whom? How timely?
– Strong passwords for private keys
  » Users write them down, share them, forget them

Privilege Management
– Numerous domains where users never get access to a local account to “do stuff”
  » I need to access your NHS DB to run queries, change tables, run arbitrary code…

At NeSC Glasgow we have focused on improving “A”AAA and A“A”AA (authentication and authorisation)

Improving “A”AAA

Best to exploit local authentication
– Sites know best if users are still at the institution and are best placed to state what their privileges are/should be

Introducing Shibboleth

Shibboleth (http://shibboleth.internet2.edu)

Definition
– Shibboleth [Hebrew for an ear of corn, or a stream or flood]
  1. A word which was made the criterion by which to distinguish the Ephraimites from the Gileadites. The Ephraimites, not being able to pronounce sh, called the word sibboleth. See Judges xii.
  2. Hence, the criterion, test, or watchword of a party; a party cry or pet phrase.

Shibboleth will replace Athens as access management system across UK academia
– i.e. this is mainstream and not (weird) Grid solutions!

Federations based on trust, or more accurately trust but verify
– Numerous international federations exist: MAMS, SWITCH, HAKA, SDSS…

Typical Shibboleth Scenario

[Figure: User, Federation W.A.Y.F. service, Identity Provider (Home Institution, with AuthN and LDAP) and Service Provider (Grid resource / portal)]

1. User points browser at Grid resource/portal (or non-Grid resource)
2. Shibboleth redirects user to W.A.Y.F. service
3. User selects their home institution
4. Home site authenticates user
5. User accesses resource
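The five numbered steps of this scenario can be sketched as a toy simulation. This is only an illustration of the message flow, not Shibboleth's actual API or its SAML exchange: the class names, the in-memory password table standing in for LDAP, and the dictionary used as an "assertion" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IdentityProvider:
    """Home institution; the password table is a stand-in for LDAP."""
    name: str
    passwords: dict

    def authenticate(self, username: str, password: str) -> Optional[dict]:
        # Step 4: the home site checks the credentials and vouches for the user.
        if self.passwords.get(username) == password:
            return {"issuer": self.name, "subject": username}
        return None


@dataclass
class Wayf:
    """'Where Are You From' service: maps an institution name to its IdP."""
    providers: dict = field(default_factory=dict)

    def register(self, idp: IdentityProvider) -> None:
        self.providers[idp.name] = idp

    def select(self, institution: str) -> IdentityProvider:
        return self.providers[institution]


@dataclass
class ServiceProvider:
    """Grid resource or portal that trusts the federation's IdPs."""
    wayf: Wayf

    def access(self, assertion: Optional[dict]) -> str:
        # Step 5: access is granted only if a federation member vouches.
        if assertion is None or assertion["issuer"] not in self.wayf.providers:
            return "access denied"
        return f"welcome {assertion['subject']}@{assertion['issuer']}"


def browser_session(sp: ServiceProvider, institution: str,
                    username: str, password: str) -> str:
    # Steps 1-2: the browser hits the resource and is redirected to the WAYF.
    wayf = sp.wayf
    # Step 3: the user selects a home institution.
    idp = wayf.select(institution)
    # Step 4: credentials go to the home site, never to the service provider.
    assertion = idp.authenticate(username, password)
    # Step 5: the browser returns to the resource carrying the assertion.
    return sp.access(assertion)
```

The point of the design is that the service provider never handles the password; it only sees whether a trusted home site has vouched for the user.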

It’s a start, but…

Benefit from local authentication but really want finer grained control…

I know you have authenticated, but I need to know that you have sufficient/correct privileges to access my VO resources

Shibboleth can also return various other information (attributes) needed to support authorisation decisions

At NeSC we have been working extensively with PERMIS

Role Based Access Controls

Basic idea is to define:
– roles applicable to specific VO
  » roles often hierarchical: Role X ≥ Role Y ≥ Role Z
  » Manager can do everything an employee can do (and more), who in turn can do everything a trainee can do (and more)
– actions allowed/not allowed for VO members
– resources comprising VO infrastructure (computers, data resources etc)

A policy then consists of sets of these rules { Role x Action x Target }
– Can user with VO role X invoke service Y on resource Z?

Policy itself can be represented in many ways, e.g. XML, XACML, …
– Tools available for policy editing, associating users with roles, signing policies etc

Policies stored as attribute certificates in LDAP server
– Digitally signed/tamper proof!
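A minimal sketch of the { Role x Action x Target } idea with a role hierarchy follows. It is not PERMIS itself: the role names follow the manager/employee/trainee example on the slide, while the actions, targets, and function names are invented for illustration, and the policy is a plain Python set rather than a signed attribute certificate.

```python
# Role hierarchy: each role inherits everything its junior roles may do.
JUNIOR_ROLES = {
    "manager": ["employee"],
    "employee": ["trainee"],
    "trainee": [],
}

# Policy as a set of (role, action, target) rules; all entries are illustrative.
POLICY = {
    ("trainee", "read", "trial_summary"),
    ("employee", "query", "trial_db"),
    ("manager", "update", "trial_db"),
}


def effective_roles(role: str) -> set:
    """Return the role plus every role below it in the hierarchy."""
    roles = {role}
    for junior in JUNIOR_ROLES.get(role, []):
        roles |= effective_roles(junior)
    return roles


def is_permitted(role: str, action: str, target: str) -> bool:
    """Can a user holding `role` perform `action` on `target`?"""
    return any((r, action, target) in POLICY for r in effective_roles(role))
```

Anything not listed in the policy is denied, so a trainee cannot update the trial database while a manager can also do whatever a trainee can.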

Finer Grained Shibboleth Scenario

[Figure: User, Federation W.A.Y.F. service, Identity Provider (Home Institution, with AuthN and LDAP) and Service Provider (Shib frontend, Grid portal, Grid application)]

1. User points browser at Grid resource/portal
2. Shibboleth redirects user to W.A.Y.F. service
3. User selects their home institution
4. Home site authenticates user and pushes attributes to the service provider
5. Pass authentication info and attributes to authZ function
6. Make final AuthZ decision

Ok, but…

I can do authorisation but I want single sign-on to lots of distributed resources
– Browser keeps session information, so the user can access other resources without signing in again
– Provided authorisation information is valid for different service providers
  » Each service provider completely autonomous
  » Can configure attribute release/attribute acceptance policies per identity provider/service provider
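The pairing of attribute release and attribute acceptance policies can be sketched as two filters applied in sequence. This is a hedged illustration only: real Shibboleth deployments express these policies in configuration files, and the provider names, attribute names, and the `attributes_seen_by` helper below are hypothetical.

```python
# Hypothetical policies: what an identity provider releases to each service
# provider, and what each service provider accepts from that identity provider.
RELEASE_POLICY = {
    ("glasgow-idp", "trials-portal"): {"affiliation", "role"},
    ("glasgow-idp", "imaging-portal"): {"affiliation"},
}
ACCEPT_POLICY = {
    ("trials-portal", "glasgow-idp"): {"role"},
    ("imaging-portal", "glasgow-idp"): {"affiliation"},
}


def attributes_seen_by(sp: str, idp: str, user_attributes: dict) -> dict:
    """Keep only attributes that pass both the release and acceptance filters."""
    released = RELEASE_POLICY.get((idp, sp), set())
    accepted = ACCEPT_POLICY.get((sp, idp), set())
    allowed = released & accepted
    return {k: v for k, v in user_attributes.items() if k in allowed}
```

Because each side holds its own table, a service provider stays autonomous: it can ignore attributes an identity provider is willing to release, and an unknown pairing yields no attributes at all.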

Trials & Tribulations of Scottish Clinical Data Space

Scottish Data Space…
– Scottish Care Information (SCI) Store
– Scottish Morbidity Records (SMR)
– General Practice Administration System for Scotland (GPASS)
– Data dictionary
– …
– Consent database

SCI Store

Batch-type system that regional health authorities use

Includes lab results, biochemical, haematology, pathology, microbiology, radiology …

Front end web based tools for data input and querying

SCI Store…ctd

16 SCI stores across Scotland
– Atos Origin commercial supplier of technology
– each have their own schemas collecting different data sets

NeSC has been given SCI store software
– Includes training data sets
  » These data sets are partial at best right now
  » ~100 tables in schema, but only 10 tables used in data provided
– SQLServer back-end database

A Quick Tour of SCI Store

Scottish Morbidity Records

Good quality data sets put together by ISD
– Historic SMR1 Discharges January 1981 - March 1997
– COPPISH SMR01 Discharges April 1997 onwards
– Historic SMR4 Discharges 1981 - March 1997
– COPPISH SMR04 Admissions April 1996 onwards
– GRO Death Records January 1980 - December 1995
– GRO Death Records January 1996 onwards
– SOCRATES (Cancer Registrations) 1980 onwards

(Still) negotiating access to anonymised SMR data sets

GPASS

General Practice Administration System for Scotland (GPASS)
– used by over 85% of GPs in Scotland
– links from SCI Store to GPASS
– access to GPASS software with training data sets
– XML API available for querying
  » www.gpass.co.uk

Data Dictionary

Includes vocabulary for
– SMR data
– Clinical data
– Social care data

Negotiating access to DB back end or web service front end to this

Will link to data federation framework / tools

Consent…

Data Linkage

Achieved through Community Health Index (CHI) number
– 10-character code consisting of
  » 6-digit date of birth (DDMMYY)
  » two digits
  » 9th digit, which is always even for females and odd for males
  » arithmetical check digit
– Was scheduled for complete roll-out by 6-6-6
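The CHI layout described above can be unpacked with a short parser. The field positions follow the slide; the check-digit rule is an assumption, since the slide only says "arithmetical check digit": the sketch uses the Modulus 11 scheme (weights 10 down to 2) commonly described for CHI numbers. The function and class names are hypothetical, and the sample numbers in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass
class ChiFields:
    day: int
    month: int
    year2: int      # two-digit year exactly as encoded; the century is not stored
    sex: str        # "F" if the 9th digit is even, "M" if it is odd
    check_ok: bool  # True if the 10th digit matches the assumed Modulus 11 check


def chi_check_digit(first_nine: str) -> int:
    """Assumed Modulus 11 check digit; returns -1 when no valid digit exists."""
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        return 0
    if check == 10:
        return -1
    return check


def parse_chi(chi: str) -> ChiFields:
    """Split a 10-digit CHI number into date of birth, sex, and check status."""
    if len(chi) != 10 or not (chi.isascii() and chi.isdigit()):
        raise ValueError("CHI number must be exactly 10 digits")
    sex = "F" if int(chi[8]) % 2 == 0 else "M"
    return ChiFields(
        day=int(chi[0:2]),
        month=int(chi[2:4]),
        year2=int(chi[4:6]),
        sex=sex,
        check_ok=chi_check_digit(chi[:9]) == int(chi[9]),
    )
```

For example, `parse_chi("0101501234")` describes a male born on 1 January of a year ending in 50, and under the assumed scheme its check digit is consistent.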

Distributed Data Framework

[Figure: Distributed Data Framework. Labels: Portal with User Authentication; Grid Server running an OGSA-DAI Service in a Globus Container, with a Driving DB; Data Servers on Transfer Grid Nodes (Glasgow, Other): SCI Store 1 (SQL Server), SCI Store 2 (SQL Server), Consent DB (Oracle 10g), RCB Test Trials DB (SQL Server); Authorisation Access Matrix, Access Security Policies, Remote Trust Policies, and Local Trust Policies at each data source.]
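The idea of answering one query from several autonomous databases can be sketched with two in-memory SQLite databases standing in for a SCI Store and the Consent DB. This is an assumption-laden toy: the real framework uses OGSA-DAI over SQL Server and Oracle, and the table names, columns, CHI values, and the `consented_results` helper are all invented for the example.

```python
import sqlite3

# Stand-in for a SCI Store: lab results keyed by CHI (all values invented).
sci_store = sqlite3.connect(":memory:")
sci_store.execute("CREATE TABLE results (chi TEXT, test TEXT, value REAL)")
sci_store.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [("0101501234", "cholesterol", 6.1), ("0202601111", "cholesterol", 5.2)],
)

# Stand-in for the consent database, held on a separate server.
consent_db = sqlite3.connect(":memory:")
consent_db.execute("CREATE TABLE consent (chi TEXT, consented INTEGER)")
consent_db.executemany(
    "INSERT INTO consent VALUES (?, ?)",
    [("0101501234", 1), ("0202601111", 0)],
)


def consented_results(test: str) -> list:
    """Link the two sources on CHI, keeping only consenting patients."""
    consenting = {
        chi
        for (chi,) in consent_db.execute(
            "SELECT chi FROM consent WHERE consented = 1"
        )
    }
    rows = sci_store.execute(
        "SELECT chi, value FROM results WHERE test = ? ORDER BY chi", (test,)
    )
    return [(chi, value) for chi, value in rows if chi in consenting]
```

Each source is queried separately and the join happens in the middle tier, which mirrors the point that no single database holds both the clinical data and the consent status.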

VOTES Demonstrator(s)

Various proof of concept clinical trials linking SCI Store, GPASS, Consent DBs

Brain Trauma network (www.brainit.org)
– Collecting various data sets from brain trauma patients across Europe
– Centrally maintained repository in Glasgow Southern General Hospital
  » MRI images
  » Physiological data sets
– We have been given anonymised versions of these data sets

Dynamicity, Scalability…?

UK Shibboleth federation based around small set of pre-agreed attributes based on eduPerson schema
– eduPersonScopedAffiliation: indicates the user’s relationship (e.g. staff, student, etc) within the institution
– eduPersonTargetedID: needed when an SP is presented with an anonymous assertion only, e.g. eduPersonScopedAffiliation. This attribute provides a persistent user pseudonym
– eduPersonPrincipalName: used where a persistent user identifier consistent across different services is needed
– eduPersonEntitlement: enables an institution to assert that a user satisfies an additional set of specific conditions that apply for access to a particular resource

Grid vision for dynamic virtual organisations
– Add, remove, change people, institutes, their privileges on the fly for changing sets of resources as required by the VO
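A service provider's use of these attributes might look like the sketch below. It assumes the scoped affiliation arrives as a string of the form `affiliation@scope` and that entitlements arrive as a list of strings; the entitlement URN, the scope, and the `may_access` helper are hypothetical and not part of any federation's published policy.

```python
def may_access(attributes: dict, required_scope: str,
               required_entitlement: str) -> bool:
    """Decide access from eduPerson attributes released by the home site."""
    affiliation = attributes.get("eduPersonScopedAffiliation", "")
    # Split "staff@gla.ac.uk" into the relationship and the institution scope.
    relationship, _, scope = affiliation.partition("@")
    entitlements = attributes.get("eduPersonEntitlement", [])
    return (
        relationship == "staff"
        and scope == required_scope
        and required_entitlement in entitlements
    )
```

With only a handful of pre-agreed attributes, per-VO privileges end up being carried in values such as the entitlement, which is where the tension with dynamic virtual organisations shows.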

Dynamicity, Scalability…?

Dynamic Virtual Organisations for e-Science Education (DyVOSE) project
– Delegation issuing service
  » Remote Source of Authority trusts me to assign their roles to my users
  » Also allows me to delegate to someone else, potentially at a remote site
  » I trust them to assign roles to my users directly
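The delegation idea can be sketched as a chain check: a role assignment is honoured only if its issuer is the role's source of authority or can trace a chain of delegations back to it. This is not the DyVOSE or PERMIS delegation issuing service; the site names, role name, and data structures are hypothetical.

```python
# Each entry is (delegator, delegate, role); all names are illustrative.
DELEGATIONS = {
    ("oxford-soa", "glasgow-admin", "investigator"),
    ("glasgow-admin", "glasgow-deputy", "investigator"),
}

# The source of authority that ultimately owns each role.
SOURCES_OF_AUTHORITY = {"investigator": "oxford-soa"}


def can_assign(issuer: str, role: str) -> bool:
    """True if `issuer` is the role's source of authority, or holds a chain
    of delegations for that role leading back to it."""
    soa = SOURCES_OF_AUTHORITY.get(role)
    seen = set()
    frontier = [issuer]
    while frontier:
        current = frontier.pop()
        if current == soa:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Walk one step up the chain: who delegated this role to `current`?
        frontier.extend(
            delegator
            for delegator, delegate, r in DELEGATIONS
            if delegate == current and r == role
        )
    return False
```

Here `glasgow-deputy` may assign the role because the chain runs deputy, admin, source of authority, while an issuer with no chain is refused.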

Future Plans

Several other projects looking to exploit these kinds of things
– Major EPSRC pilot project (£5.3M) on “Meeting the Design Challenges of nanoCMOS Electronics” (project just started)
  » Security essential in this domain including support for IP of data, simulations, processes, licenses, …
– Many other life science projects
  » Grid Enabled Microarray Expression Profile Search
  » Scottish Bioinformatics Research Network
  » Biochemical Pathway Simulator

Further proposals building on these solutions
– Scottish Grid Service

Questions?