Big Data and Knowledge Engineering for Health
Eduserv Symposium 2012: Big Data, Big Deal? (May 2012, London)
Prof. Anthony J Brookes, University of Leicester, UK
Different or No Big Data problems?
- Changing or stable rate of data generation / availability
- Changing or stable complexity of data
- Changing or stable requirement to use the data
- Changing or stable tooling to use the data
- Changing or stable mass of ‘useless’ data (vs knowledge)
Knowledge engineering was first defined in 1983 as “an engineering discipline that involves integrating knowledge into computer systems in order to solve complex problems normally requiring a high level of human expertise” (Feigenbaum and McCorduck, 1983).
‘KNOWLEDGE ENGINEERING’ for HEALTH
Knowledge Engineering
Building and engaging with the community:
- presentation & discussion at many international meetings and forums
- half-day workshop as a satellite to ESHG (6 invited speakers)
- workshop session at MIE2011 (3 invited speakers, audience discussion)
- I-Health 2011 workshop in Brussels, 3-4 Oct 2011
- growing community, currently >150 academics, companies, healthcare providers
Integration and Interpretation of Information for Individualised Healthcare http://www.i4health.eu/
150,000 published vs. <100 routinely used; mostly unknown to healthcare
[Slide diagram: I4HEALTH bridges the RESEARCH world (bio-informatics, academics, data, biobanks, registries) and the HEALTHCARE world (med-informatics, companies).]
‘KNOWLEDGE ENGINEERING’ for HEALTH
- Research world: ‘knowledge generation’ ...make sense of these entities
- Clinical world: ‘knowledge engineering’ ...identify & use the bits you understand
STANDARDS
• Semantic Standards (to allow unambiguous understanding of the data)
  – Terminologies, Ontologies, Vocabularies, Coding systems
  – Need cross-mapping between semantic standards, and across languages
• Syntactic Standards (to make data structures interoperable)
  – Data and Metadata object models, and Exchange formats
  – Minimal content specifications, harmonised across domains
  – Robust core requirements, with general principles that bring flexibility
• Technical Standards (to build a system that works efficiently)
  – Database models, Search systems, and User interfaces (e.g., browsers)
  – Web-service specifications, Web 2.0 technologies
  – ID solutions for data, databases, publications, biobanks, researchers
  – Technologies for controlling data access and user permissions
  – Ethical and Legal policies, implementation, and recognition-reward structures
• Quality Standards (to match data to needs)
  – Measuring and representing quality in a meaningful way
  – Important role here for metadata
  – Recording and standardising SOPs
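The cross-mapping requirement above can be pictured in code. A minimal sketch, assuming two invented coding systems ("SystemA", "SystemB") with made-up codes; real mappings (e.g., between terminologies such as ICD and SNOMED CT) are far larger and often many-to-many:

```python
from typing import Optional

# Minimal sketch of cross-mapping between two hypothetical coding systems.
# The codes and mappings below are illustrative only, not real terminology content.
CROSS_MAP = {
    "A-001": "B-9001",  # the same concept coded in SystemA and SystemB
    "A-002": "B-9002",
}

def translate(code: str, default: Optional[str] = None) -> Optional[str]:
    """Translate a SystemA code to its SystemB equivalent, if a mapping exists."""
    return CROSS_MAP.get(code, default)

record = {"patient_id": "P1", "diagnosis": "A-001"}
record["diagnosis_b"] = translate(record["diagnosis"])
print(record["diagnosis_b"])  # B-9001
```

In practice a default such as "unmapped" makes gaps in the cross-map visible rather than silently dropping codes.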
2012-02-07 DCC roadshow East Midlands - CC-BY-SA
..personal data
[Slide diagram: Electronic Healthcare Records (EHR). Building blocks: Terminology, Information Models, Communication Models, Collection Models, Search and Retrieval Models, Registration and Location Models, Classifications. Qualities and uses: expressiveness, precision/rigour, searchability, comparability, best practice, structure, detail, search, storage, interoperability, utility, categorisation, secondary use, decision making, recording, notify, find.]
Data sharing
- Incentive/reward systems
- 3 categories of risk, with ‘speed pass’ access control
- Compulsion/sanctions
- Researcher IDs (ORCID)
- Open data discovery (e.g., Café Variome)
- Remote pooled analysis (e.g., DataSHIELD, EU-ADR/EMIF)
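The remote pooled analysis idea (as in DataSHIELD) can be sketched as follows: each site computes local, non-disclosive aggregates, and only those aggregates travel to the coordinator, never the row-level data. The site names and measurements below are invented for illustration:

```python
# Toy sketch of remote pooled analysis: each site discloses only
# aggregate statistics (count, sum); individual records never leave the site.
site_data = {
    "site_1": [5.1, 6.3, 5.8],           # e.g., one biomarker value per patient
    "site_2": [6.0, 5.5, 6.2, 5.9],
}

def local_summary(values):
    """Run at the data source: return only non-identifying aggregates."""
    return {"n": len(values), "total": sum(values)}

# The coordinator pools the aggregates to obtain the overall mean.
summaries = [local_summary(v) for v in site_data.values()]
n = sum(s["n"] for s in summaries)
mean = sum(s["total"] for s in summaries) / n
print(round(mean, 3))  # 5.829
```

The same pattern extends to regression and other analyses, as long as each exchanged quantity is an aggregate that cannot identify an individual.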
Modelling:
‘Patient Avatars’ / ‘Virtual Patients’
Personalised medicine
Stratified medicine
BIG DATA:
The answer is not a data warehouse!
ARCHITECTURE:
[Slide diagram: an emerging, self-optimising architectural concept. Sources (Biosensors, EHR, Modalities, Systems data, Text & Web pages, Computer Models, Decision Support Systems, BioScience & Omics Databases) are connected through a Feedback/Optimisation loop.]
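The feedback/optimisation loop behind this self-optimising concept can be sketched in toy form: a decision-support parameter is repeatedly adjusted using the error between an observed aggregate and a target. The proportional update rule and the alert-rate example are assumptions for illustration, not from the talk:

```python
# Generic feedback/optimisation loop: measure, compare to target, adjust.
# The update rule and the alert-threshold example are illustrative only.
def feedback_loop(initial, target, measure, adjust, cycles=50):
    """Repeatedly feed the measured error back into the system state."""
    state = initial
    for _ in range(cycles):
        error = target - measure(state)
        state = adjust(state, error)
    return state

# Example: tune an alert threshold so decision support flags ~10% of cases.
scores = [i / 1000 for i in range(1000)]                     # synthetic risk scores
def alert_rate(t):
    return sum(s > t for s in scores) / len(scores)
def nudge(t, err):
    return t - 0.5 * err                                     # too many alerts -> raise t

threshold = feedback_loop(0.5, 0.10, alert_rate, nudge)
print(round(alert_rate(threshold), 2))  # converges close to the 0.10 target
```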
[Slide diagram: Disorganised digital information relevant to personalised healthcare (Personal, Imaging, Instrumentation, Omics, Clinical, Population Models) is refined as Data + Information + Knowledge and delivered through Knowledge Portals, providing healthcare utility and Optimised Healthcare.]
Big Data can mainly stay at ‘source’, feeding the Knowledge Extraction process
Knowledge Extraction/Distillation filters therefore need to be created
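One way to picture such a distillation filter in toy form: the bulk records stay in their sources, and only high-confidence, question-relevant fragments are extracted as knowledge. All source names, records, and evidence scores below are invented:

```python
# Sketch of a knowledge extraction/distillation filter: raw (Big) Data stays
# at its source; only small, high-confidence knowledge items are distilled out.
# Sources, findings, and evidence scores are illustrative only.
SOURCES = {
    "omics_db": [
        {"finding": "variant X raises drug-Y toxicity risk", "evidence": 0.92},
        {"finding": "variant Z of unknown significance", "evidence": 0.10},
    ],
    "literature": [
        {"finding": "drug Y contraindicated with condition Q", "evidence": 0.88},
    ],
}

def distil(sources, min_evidence=0.8):
    """Keep only high-confidence items; the underlying data never moves."""
    knowledge = []
    for name, records in sources.items():
        for rec in records:
            if rec["evidence"] >= min_evidence:
                knowledge.append({"source": name, **rec})
    return knowledge

for item in distil(SOURCES):
    print(item["source"], "->", item["finding"])
```

The `min_evidence` threshold stands in for whatever clinical-grade validation criterion a real filter would apply before knowledge reaches the point of care.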
Policy and Strategy
- To kick-start the field: Put money into research, development, and application projects based upon the Knowledge Engineering concept
- To create the needed expertise: Cross-train people who have a talent for engineering in computer science + bioscience + healthcare
- To ensure interoperability across the total system: Organise activities on a middle-out basis, rather than the usual top-down or bottom-up approaches
- To ensure innovation and sustainability: Explore ways to get academic and commercial players working together
- To start bringing the system to life: Emphasise knowledge ‘filtration’, ‘distillation’, and ‘provision’ from sources of (Big) Data
Knowledge Engineering
Acknowledgments
• GEN2PHEN Partners
• My team: Robert Free, Rob Hastings, Adam Webb, Tim Beck, Sirisha Gollapudi, Gudmundur Thorisson, Owen Lancaster
• Some key discussants: Søren Brunak, Debasis Dash, Carlos Diaz, Norbert Graf, Johan van der Lei, Heinz Lemke, Ferran Sanz
This work received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project.
“Data-to-Knowledge-for-Practice” (D2K4P) Center