BIG DATA, BIOBANKS AND
PREDICTIVE ANALYTICS FOR
A BETTER CLINICAL OUTCOME
Π. Ε. Βάρδας MD, PhD (London)
DISCLOSURES
My great love of innovative ideas
BIG DATA
It is a broad term for data sets so large or
complex that traditional data processing
applications are inadequate.
BIG DATA
To qualify as “big” in the sense that
information scientists use the term, a dataset
must reach a level of size and complexity at which
it becomes a challenge to store, process and
analyze by standard computational methods.
It is estimated that per capita computing
capacity has been doubling every 40 months
since the 1980s.
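As a rough illustration of what "doubling every 40 months" implies, the growth factor can be computed directly. This is only a sketch: the function name and the 30-year horizon are assumptions chosen for the example, not figures from the talk.

```python
def growth_factor(months: float, doubling_period_months: float = 40.0) -> float:
    """Multiplicative growth after `months`, given a fixed doubling period."""
    return 2.0 ** (months / doubling_period_months)


# Example horizon (an assumption): three decades, i.e. 360 months.
# At one doubling every 40 months this is 9 doublings.
thirty_years = growth_factor(360)
print(thirty_years)  # 2 ** 9 = 512.0
```

Under this assumption, capacity grows roughly 500-fold over thirty years.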
BIG DATA
• Challenges include analysis, capture, data curation,
search, sharing, storage, transfer, visualization and
information privacy.
• Data sets grow in size in part because they are
increasingly gathered by cheap and numerous
information-sensing mobile devices, remote sensing,
cameras, microphones, radio-frequency identification
readers and wireless sensors.
LARGE DATA SIZES
• BYTE
1 byte: A single character
10 bytes: A single word
100 bytes: A telegram
• KILOBYTE (1,000 bytes)
1 kilobyte: A very short story
10 kilobytes: An encyclopedic page
50 kilobytes: A compressed document image page
LARGE DATA SIZES
• MEGABYTE (1,000,000 bytes)
1 megabyte: A small novel
10 megabytes: A minute of high-fidelity sound
100 megabytes: One meter of shelved books
500 megabytes: A CD-ROM
• GIGABYTE (1,000,000,000 bytes)
1 gigabyte: A pickup truck filled with paper, or a movie
at TV quality
100 gigabytes: A floor of academic journals
LARGE DATA SIZES
• TERABYTE (1,000,000,000,000 bytes)
1 terabyte: All the X-ray films in a large hospital
10 terabytes: The printed collection of the US
Library of Congress
50 terabytes: The contents of a large mass storage
system
LARGE DATA SETS
• PETABYTE (1,000,000,000,000,000 bytes)
2 petabytes: All US academic research libraries
20 petabytes: All production of hard-disk drives in
1995
200 petabytes: All printed material
LARGE DATA SETS
• EXABYTE (1,000,000,000,000,000,000 bytes)
5 exabytes: All words ever spoken by human beings
• ZETTABYTE (1,000,000,000,000,000,000,000 bytes)
• YOTTABYTE …
• XENOTTABYTE …
• SHILENTNOBYTE…
• DOMEGEMEGROTTEBYTE…
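The decimal ladder on these slides, where each unit is 1,000 times the previous one, can be sketched as a small conversion helper. The function name `humanize` and the unit abbreviations are assumptions made for this example. The sketch stops at the yottabyte, the largest unit on the slide that corresponds to a standard SI prefix.

```python
# Decimal units as used on the slides: each step is a factor of 1,000.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def humanize(n_bytes: float) -> str:
    """Express a byte count in the largest decimal unit that keeps the value >= 1."""
    value = float(n_bytes)
    index = 0
    while value >= 1000 and index < len(UNITS) - 1:
        value /= 1000
        index += 1
    return f"{value:g} {UNITS[index]}"


print(humanize(500 * 10**6))  # 500 MB (the CD-ROM example)
print(humanize(5 * 10**18))   # 5 EB (the "all words ever spoken" example)
```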
LARGE DATA SETS
• According to International Data Corporation, the
total amount of global data was expected to grow to
2.7 zettabytes during 2012, 48% up from 2011.
• By 2020, it is estimated there will be 44 times more
data than in 2009.
• That means 35 zettabytes compared to 800,000
petabytes.
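The figures above can be checked with a few lines of arithmetic. This is a sketch using only the numbers quoted on the slide; the variable names are my own, and the 2011 value is merely what the "48% up" statement implies, not a separately sourced figure.

```python
PETABYTE = 10**15
ZETTABYTE = 10**21

# Figures quoted on the slide.
data_2009_bytes = 800_000 * PETABYTE  # 800,000 petabytes in 2009
growth_2009_to_2020 = 44              # "44 times more data"

data_2020_zb = data_2009_bytes * growth_2009_to_2020 / ZETTABYTE
print(data_2020_zb)  # 35.2

# 2.7 zettabytes in 2012, described as 48% up from 2011.
data_2011_zb = 2.7 / 1.48
print(round(data_2011_zb, 2))  # 1.82
```

So 44 times 800,000 petabytes is about 35 zettabytes, consistent with the slide.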
PLEASE IMAGINE...
BIG DATA BASICS
VOLUME: In 2020, it is estimated there will be 44 times more data than in 2009: thirty-five zettabytes compared to 800,000 petabytes.
VELOCITY: Represents the increasing frequency with which data is delivered.
VARIETY: Signifies the many forms in which data exists.
THE NEED FOR ELECTRONIC MEDICAL RECORDS (EMR)
Development of EMR will permit integration of biological data, clinical information, patient information and clinical outcomes.
Large populations or specific groups of patients with selected characteristics could be easily identified with the availability of electronic medical records.
In genomic research, EMR facilitate analysis of genetic and molecular information from large subject populations, allowing studies to be more powerful than small cohort studies.
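Identifying a group of patients with selected characteristics amounts to filtering structured records. A minimal sketch follows, assuming a hypothetical record layout; the `PatientRecord` fields, the `select_cohort` helper and the sample records are all invented for illustration and do not reflect any real EMR schema.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # Hypothetical fields; a real EMR schema will differ.
    patient_id: str
    age: int
    diagnosis: str
    on_anticoagulant: bool


def select_cohort(records, diagnosis, min_age):
    """Return the records matching a diagnosis and a minimum age."""
    return [r for r in records if r.diagnosis == diagnosis and r.age >= min_age]


# Made-up records, for illustration only.
records = [
    PatientRecord("p1", 72, "atrial fibrillation", True),
    PatientRecord("p2", 45, "hypertension", False),
    PatientRecord("p3", 66, "atrial fibrillation", False),
    PatientRecord("p4", 58, "atrial fibrillation", True),
]

cohort = select_cohort(records, "atrial fibrillation", min_age=65)
print([r.patient_id for r in cohort])  # ['p1', 'p3']
```

With paper records the same selection would require a manual chart review, which is the practical point the slide makes.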
PREDICTIVE ANALYTICS
It is the practice of extracting information from existing data sets in order to determine patterns and predict future outcomes and trends.
Predictive analytics does not tell you what will happen in the future.
PREDICTIVE ANALYTICS
Predictive Analytics (PA) uses technology and
statistical methods to search through massive amounts
of information, analyzing it to predict outcomes for
individual patients.
That information can include data from past treatment
outcomes as well as the latest medical research
published in peer-reviewed journals and databases.
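To make the idea concrete, here is a deliberately tiny sketch of estimating an outcome probability from past outcomes. It is not a method described in the talk: the grouping into "high" and "low" risk, the add-one (Laplace) smoothing and the sample history are all assumptions made for the example. Note that the output is a probability, not a statement of what will happen.

```python
from collections import defaultdict


def fit_outcome_rates(history):
    """Estimate P(event | group) from (group, event) pairs with add-one smoothing."""
    counts = defaultdict(lambda: [0, 0])  # group -> [events, total]
    for group, event in history:
        counts[group][0] += int(event)
        counts[group][1] += 1
    return {g: (e + 1) / (n + 2) for g, (e, n) in counts.items()}


# Invented past outcomes: (risk group, whether the event occurred).
history = [
    ("high", True), ("high", True), ("high", False), ("high", True),
    ("low", False), ("low", False), ("low", True), ("low", False),
]

rates = fit_outcome_rates(history)
print(rates["high"])  # (3 + 1) / (4 + 2), about 0.67
print(rates["low"])   # (1 + 1) / (4 + 2), about 0.33
```

A new patient in the "high" group would be assigned an estimated risk of about two in three under this toy model.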
…IN HEALTHCARE
BIG DATA AND PREDICTIVE ANALYTICS
They will study whether we can predict an individual's disease course as early as possible by inferring their subtype.
Many diseases are not one condition but many different subtypes.
Analyzing patterns allows us to ask why different individuals show different disease trajectories.
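One simple way to picture subtype inference is to compare a patient's early measurements with a typical trajectory for each subtype and pick the closest. The sketch below uses nearest-prototype assignment with Euclidean distance. That technique, the subtype names and all marker values are assumptions for illustration, not the approach used in any study the talk refers to.

```python
import math

# Hypothetical subtype prototypes: typical marker values at three early visits.
PROTOTYPES = {
    "slow progression": [1.0, 1.1, 1.2],
    "rapid progression": [1.0, 1.8, 2.9],
}


def infer_subtype(early_values):
    """Assign the subtype whose prototype is closest in Euclidean distance."""
    return min(PROTOTYPES, key=lambda name: math.dist(PROTOTYPES[name], early_values))


# An invented patient whose marker rises quickly over the first three visits.
print(infer_subtype([1.1, 1.7, 2.5]))  # rapid progression
```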
BIG DATA AND CLINICAL OUTCOMES
HOW BIG DATA HELPS HEALTHCARE
• Big Data has tremendous potential to add value in
all healthcare settings.
• Big Data solutions can help organizations
personalize care, engage patients, reduce variability
and cost, and improve quality.
• Personalization, whether based on genomic data,
standard test data, or a combination of the two,
requires the integration and analysis of much larger
volumes of data.
BIG DATA IN DIGITAL HEALTH
Usually addresses the following six categories of information:
1. Web & social media data (smartphone apps, health plan websites)
2. Machine-to-machine data (sensors, meters, different devices)
3. Transaction data (health care claims, billing records, in both semi-structured and unstructured formats)
4. Biometric data
5. Human-generated data
6. Pharmaceutical & medtech R&D data
FACTORS DRIVING THE BIG DATA MARKET IN THE HEALTHCARE SECTOR
• The need for improved clinical outcomes
• The need for increased efficiency in managing healthcare data
• The presence of Federal healthcare mandates in some segments
• The double-digit growth in EHR
• The increased focus on value-based medicine
• The need for personalized medicine that is based on analytics
• The need for improved decision support
• The need to reduce pharmaceutical costs
• The need to reduce clinical testing costs
FACTORS INHIBITING THE GROWTH OF BIG DATA
• A resistance to a systems approach by the medical community
• The operational gap between payer & provider front office
• The acute IT staff shortage in healthcare
• A lack of comparable & transparent data in healthcare
• Financial constraints
• Concerns regarding ensuring patient confidentiality
• The low costs of traditional analytics techniques
• The lack of interoperability between healthcare systems
BIOBANKS
A collection of biological material (e.g. animal, plant, human: skin, blood, organs, hair, saliva, etc.) with corresponding documentation that can be used for research purposes.
Personalized medicine is a new model of healthcare treatment.
It delivers targeted diagnostics, treatment and advice on nutrition, which are tailored to an individual.
Personalized or precision medicine will be effective for the majority of people once genetic and molecular information derived from their samples has been systematically understood.
CAN BIOBANKS BE USED FOR PERSONALISED MEDICINE?
To achieve these targets we need to study genetic and molecular information.
We need to predict the risk of disease, identify new targets for treatments and also identify markers predicting positive or negative reactions to treatment options.
Therefore biobanks are needed, as they contain a large pool of resources in the form of coded genetic materials.
“WHAT WE’RE TALKING ABOUT HERE IS THE TRANSFORMATION OF MEDICINE”
Scott Zeger, Vice Provost for Research, Johns Hopkins University, USA
The biomedical sciences have been the pillar of the health care system for a long time.
The new system will have two equal pillars: the biomedical sciences and the data sciences.
To physicians of the time, the appropriate treatment
for “apparent death” was warmth and stimulation.
For this purpose, artificial respiration and the blowing
of smoke into the lungs or the rectum were thought to
be interchangeably useful. The smoke enema was
considered the most potent method, however, due to
the warming and stimulating properties associated
with tobacco in the pharmacopoeia of the period. At
the turn of the 19th century, tobacco smoke enemas
had become an established practice in Western
medicine, considered by Humane Societies to be as
important as artificial respiration. In the 1780s, the
Royal Humane Society installed resuscitation kits,
including smoke enemas, at various points along the
Thames.