Learning from medical data streams
an introduction
Pedro Pereira Rodrigues, Mykola Pechenizkiy, Mohamed Medhat Gaber, João Gama
6th July 2011
LEMEDS @ AIME 2011, Pedro Pereira Rodrigues
Roadmap
Data stream paradigm
● Medical data streams
Machine learning
● Learning from data streams
Learning from medical data streams
● Recent trends
LEMEDS Workshop
● Objectives
● Program
Data Stream Paradigm
Many sources produce data continuously:
● web clicks logs
● telephone records
● sensors and RFIDs
● information systems
● …
These continuous flows of data are called data streams¹.
After processing, a data point is either discarded or archived².
1. Muthukrishnan, S. (2005). Data Streams: Algorithms and Applications. […]
2. Gama, J. and Rodrigues, P. P. (2007). “Data stream processing”. [...]
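A minimal sketch of the "process once, then discard" idea, assuming Welford's method for a running mean and variance. The class name and the heart-rate readings are hypothetical, not taken from the cited works.

```python
class RunningStats:
    """Single-pass mean and variance (Welford's method); keeps no past samples."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; reported as 0.0 for fewer than two points.
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for heart_rate in [72, 75, 71, 78, 74]:  # hypothetical readings
    stats.update(heart_rate)  # the reading itself can be discarded after this call
```

Memory use stays constant however long the stream runs, which is the property the stream paradigm asks for.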
Data Stream Paradigm
              Set           Stream
Size          finite        infinite
Distribution  i.i.d.        non-i.i.d.
Ordering      independent   dependent
Evolution     static        nonstationary
Ubiquitous Data Streams
Multiple sources producing streams of data:
● web sites and users
● smart phones and other mobile devices
● integrated health information systems
● sensor networks
● …
Most devices are now able to sense and process information¹.
The sources are distributed, each one producing a high-speed data stream².
1. Rodrigues, P. P., Gama, J., and Lopes, L. (2010). “Knowledge discovery for sensor network comprehension”. [...]
2. Rodrigues, P. P. and Gama, J. (2009). “A system for analysis and prediction of electricity load streams”. [...]
Medical Data Streams
Anatomical sensors
● movement
● position
● location
Physiological sensors
● blood pressure
● heartbeat rate
● ECG
● EEG
● ...
Biomedical signals (high rate)
Medical Data Streams
Hospital data
● episode descriptions
● consultation records
● diagnosis
● executed procedures
Incidence records
● personal health records
● collaborative disease surveillance
● bacterial surveillance
● pharmacologic surveillance
Health records (large size)
Medical Data Streams
Medical settings use sensors and networks of health information systems to integrate patient data, from which knowledge must then be extracted.
New services such as Google Health allow users to store and track information about their medical history, and to connect to and stream data from medical devices.
Medical data streams are becoming widespread and call for the development of intelligent tools that make use of these data.
Medical Data Streams
Possible uses:
● Decision support
● Alerting services
● Ambient intelligence
● Assisted living
● Personalization services
All of them produce huge amounts of data, and often require fast and accurate information retrieval and analysis that can effectively support clinical decisions.
Medical Data Streams
Medical domains introduce extra peculiarities to the learning problem:
● Structured and uncertain data
● Health information systems deal with heterogeneous data sources
● Sources are possibly distributed across healthcare institutions
● Data integration requirements raise privacy-preservation issues
Integration forces the system to manage:
● time,
● resources, and
● costs.
Medical Data Streams
Particular issues to address include:
● Summarization of infinite data
● Incremental and decremental learning
● Resource-awareness
● Real-time monitoring of changes
● Novelty detection
Learning from these streams is an incremental task: it requires incremental learning algorithms that bring artificial intelligence into medical domains.
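Incremental and decremental learning can be illustrated with a toy model that learns each new value and forgets the oldest one. This is only a sketch under assumed names (SlidingWindowMean, learn, forget are hypothetical), with a fixed-size window standing in for a summary of an unbounded stream.

```python
from collections import deque


class SlidingWindowMean:
    """Mean over the most recent `size` values, updated in O(1) per value."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.total = 0.0

    def learn(self, x):
        # Incremental step: absorb the new value.
        self.window.append(x)
        self.total += x
        if len(self.window) > self.size:
            self.forget()

    def forget(self):
        # Decremental step: remove the influence of the oldest value.
        self.total -= self.window.popleft()

    @property
    def mean(self):
        return self.total / len(self.window) if self.window else 0.0


model = SlidingWindowMean(size=3)
for value in [10, 20, 30, 40]:  # hypothetical stream
    model.learn(value)
```

The bounded window is also a simple form of resource-awareness: memory is fixed by `size`, not by the length of the stream.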
Learning from Data Streams
Learning in the presence of drift requires monitoring of the learning process¹.
1. Gama, J. and Rodrigues, P.P. (2007). “Data stream processing”. [...]
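One way to monitor a learning process is to watch the learner's error stream with a sequential change test. The sketch below assumes the Page-Hinkley test, which the slide does not name; the parameter values and the 0/1 error sequence are hypothetical.

```python
class PageHinkley:
    """Page-Hinkley test for an increase in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level (often written lambda)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation from the running mean
        self.min_cum = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one observation; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


detector = PageHinkley()
# Hypothetical 0/1 losses: the model is right 50 times, then starts failing.
errors = [0] * 50 + [1] * 50
alarm_at = next((i for i, e in enumerate(errors) if detector.update(e)), None)
```

An alarm would typically trigger some adaptation, such as retraining or resetting the model; which reaction is appropriate depends on the application.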
Medical Data Streams
● Intelligent analysis and learning from streaming data are widely studied in the machine learning research community.
● In medical domains, technological issues of data collection and storage, access, integration and fusion are also widely studied in the health informatics research community.
● However, adoption and development of tailored techniques for medical stream mining and clinical decision support are still to come.
Knowledge Discovery
1. Patel, V., Shortliffe, E., Stefanelli, M., Szolovits, P., Berthold, M., et al. (2009) “The coming of age of artificial intelligence in medicine”. [...]
Learning from Medical Data Streams
1. Zhou, X., et al. (2006) “Monitoring Abnormal Patterns with Complex Semantics over ICU Data Streams” [...]
Learning from Medical Data Streams
Recent trends
● Carolyn McGregor et al.: Event correlation in neonatal ICU using SOA for integration of clinical and physiological streams
● Daby Sow et al.: Body sensor stream processing
● Mykola Pechenizkiy et al.: Concept drift in antibiotic resistance in nosocomial infections
● Raquel Sebastião et al.: Concept drift detection for CTG and BIS
● Xinbiao Zhou et al.: Mining abnormal patterns in ICU
SOA for Integration and Event Correlation
Service-oriented architecture (SOA)
● integrates physiological and clinical data
● based on temporal abstractions
● aims to correlate temporal events
1. Kamaleswaran, R., et al. (2009) “Service oriented architecture for the integration of clinical and physiological data for realtime event stream processing” [...]
Body Sensor Stream Processing
Language to support processing of body sensor streaming data
● extracts medically significant events
● interacts with external data mining systems
● allows streaming applications to use historical data to improve their detection and prediction performance
1. Sow, D., et al. (2010) “Body Sensor Data Processing using Stream Computing” [...]
Antibiotic Resistance in Nosocomial Infections
Concept drift detection on laboratory data
● ensemble learning
● multiple learning models
● output value is a weighted average of the single predictions
● dynamic integration of classifiers
● weights managing local drift detection
● improves classification results
● captures seasonal trends in infections and resistance effects
1. Tsymbal, A. et al. (2006) “Handling Local Concept Drift with Dynamic Integration of Classifiers: Domain of Antibiotic Resistance in Nosocomial Infections” [...]
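The weighted-average idea can be sketched as follows. This is a deliberate simplification, not the dynamic integration method of the cited paper: weights here are assumed to be inversely proportional to each model's exponentially faded error, and the two constant "models" are hypothetical stand-ins for trained classifiers.

```python
class WeightedEnsemble:
    """Weighted average of base predictions; weights track recent accuracy."""

    def __init__(self, models, decay=0.9, eps=1e-3):
        self.models = models
        self.decay = decay  # how quickly old errors fade (assumed value)
        self.eps = eps      # avoids division by zero for error-free models
        self.errors = [0.0] * len(models)

    def weights(self):
        raw = [1.0 / (self.eps + e) for e in self.errors]
        total = sum(raw)
        return [r / total for r in raw]

    def predict(self, x):
        return sum(w * m(x) for w, m in zip(self.weights(), self.models))

    def update(self, x, y):
        # Fade old errors so that weights follow the current concept.
        for i, m in enumerate(self.models):
            err = abs(m(x) - y)
            self.errors[i] = self.decay * self.errors[i] + (1 - self.decay) * err


# Hypothetical base models: constant estimates of resistance probability.
ensemble = WeightedEnsemble([lambda x: 0.2, lambda x: 0.8])
before = ensemble.predict(None)   # equal weights before any feedback
for _ in range(20):
    ensemble.update(None, 1.0)    # recent isolates all turn out resistant
after = ensemble.predict(None)    # weight has shifted to the second model
```

Because old errors fade, a model that was poor under a previous concept can regain weight once it starts predicting well again, which is one way an ensemble can follow seasonal effects.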
Concept Drift Detection in CTG
Cardiotocography (CTG)
● fetal heart rate (FHR)
● uterine contractions (UC)
SisPorto®
● 'Normal' – green
● 'Suspicious' – yellow/orange
● 'Problematic' – red
1. Sebastião, R., et al. (2010) “Monitoring Incremental Histogram Distribution for Change Detection in Data Streams” [...]
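Histogram-based change detection can be sketched by comparing the histogram of a reference window with that of the current window. The bin layout, the total variation distance, the threshold and the FHR values below are all assumptions for illustration; the slide does not state how the cited paper builds or compares its histograms.

```python
def histogram(values, lo, hi, bins):
    """Relative frequencies over `bins` equal-width bins on [lo, hi).

    Assumes `values` is non-empty; out-of-range values go to the edge bins.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    return [c / len(values) for c in counts]


def total_variation(p, q):
    """Distance in [0, 1] between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical fetal heart rate windows (beats per minute).
reference = [140, 142, 138, 145, 141, 139, 143, 140]
current = [110, 112, 108, 115, 111, 109, 113, 110]

ref_hist = histogram(reference, lo=60, hi=200, bins=14)
cur_hist = histogram(current, lo=60, hi=200, bins=14)
distance = total_variation(ref_hist, cur_hist)
changed = distance > 0.5  # hypothetical alarm threshold
```

In a streaming setting the current histogram would be updated one sample at a time rather than rebuilt, so each reading is still processed only once.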
Concept Drift Detection in BIS
Bispectral Index (BIS)
● Monitor anesthesia depth
● Detect sudden drift of the BIS signal
1. Sebastião, R., et al. (2009) “Contributions to an Advisory System to Enhanced TCI in SedationAnalgesia Procedures” [...]
Abnormal Patterns in ICU Streams
Monitoring abnormal patterns over ICU data streams
● complex repetitive phenomena
● defines the deviation of the process from true periodicity
● efficiently identifies anomalies in a single pass
1. Zhou, X., et al. (2006) “Monitoring Abnormal Patterns with Complex Semantics over ICU Data Streams” [...]
AIME 2011
Several papers at AIME 2011 addressed data streams (mainly biomedical signals) and related fields. Two of them were distinguished as the two best student papers:
● Fernando García-García, Gema García-Sáez, Paloma Chausa, Iñaki Martínez-Sarriegui, Pedro José Benito, Enrique J. Gómez and M. Elena Hernando
Statistical Machine Learning for Automatic Assessment of Physical Activity Intensity Using Multi-axial Accelerometry and Heart Rate
● Catherine G. Enright, Michael G. Madden and Niall Madden
Clinical Time Series Data Analysis Using Mathematical Models and DBNs
LEMEDS Workshop
Objectives:
● Bring together researchers from the machine learning and the clinical data streams communities
● Promote discussion of requirements and recommendations for learning from medical data streams
Program:
● 1 invited talk
● 5 contributed papers
● 1 expert panel discussion
LEMEDS Workshop
The fact that more and more medical data is being produced by sensors that measure physiological parameters [6] poses new challenges to artificial intelligence in general, and machine learning in particular.
Peter Lucas's talk, “Disease Monitoring and Clinical Decision Support”, will focus on the properties of biomedical data streams (e.g. from sensors), the constraints on how collected data can be exploited, and the new opportunities to monitor the progress of diseases in patients.
LEMEDS Workshop
Most papers address biomedical signals.
Jones et al. [3] propose to interpret biosignals to improve mobile health monitoring for clinical decision support, using body sensor networks.
The paper presents two possible applications and discusses the possibilities of applying machine learning in this ubiquitous streaming scenario.
LEMEDS Workshop
Rodrigues et al. [10] propose to improve cardiotocography monitoring using streaming statistics of both the fetal heart rate and the uterine contractions signals.
The statistics will then be used for early detection of changes in the monitored signals, and to help predict birth outcome.
It is a position paper with no experiments yet, but it should foster a sound discussion on the subject.
LEMEDS Workshop
Sebastião et al. [11] developed a learning-based advisory system for detecting changes in depth of anesthesia signals.
The paper addresses an important problem in operational settings, and its solution can help adapt the doses administered to patients.
The problem is formulated as one of handling concept drift in online settings; the experimental results, based on real data collected at one of the hospitals, are very promising.
LEMEDS Workshop
McGregor et al. [7] present a process mining framework to improve clinical guidelines in clinical care.
The paper presents a very interesting approach to knowledge discovery in a challenging scenario where data is produced as several heterogeneous streams.
In current healthcare services, intensive care units are undoubtedly the main clinical setting where data streams are being produced.
LEMEDS Workshop
But other medical streams exist that differ from biomedical signals.
Rodrigues et al. [9] describe a setting of integrated electronic health records, and try to improve the visualization mechanism for the increasing number of clinical documents available in a central hospital.
As a position paper, it clearly presents the problem space and addresses a valid, realistic problem in health care today.
LEMEDS Workshop
Three experts will guide us through a discussion on:
● main domains where medical data is produced as a stream;
● applications for LEMEDS;
● related fields of research;
● issues that differentiate this research area from other fields;
● best forums for researchers to publish and discuss LEMEDS.
Panel:
Carlo Combi
Carolyn McGregor
João Gama
Acknowledgments
AIME 2011 organization.
The authors of contributed papers.
Invited speaker: Peter Lucas.
Experts panel: Carlo Combi, Carolyn McGregor, João Gama.
PC members:
Miguel Coimbra, Antoine Cornuéjols, Matjaz Kukar, Mark Last, Florent Masseglia, Ernestina Menasalvas, Josep Roure Alcobé, Cristina Santos, Alexey Tsymbal and Indre Zliobaite.
Learning from Medical Data Streams
Thank you and Welcome!