ICTP-Sir jamia millia
-
8/14/2019 ICTP-Sir jamia millia
WEB BASED DATA MINING SYSTEM FOR HEALTHCARE
Harleen Kaur, Jamia Millia Islamia, New Delhi, India.
Email: [email protected] Tel No: +91-9891174111
Special thanks to Prof. Siri Krishan Wasan (Dept. of Mathematics, Jamia Millia Islamia, New Delhi, India) and Dr. Vasudha Bhatnagar (Dept. of Computer Science, University of Delhi, New Delhi, India).
-
Outline
Why and What is Data Mining
Data Mining and Healthcare
Web Databases
Proposed System
Summary
References
-
DBMS and Data Mining
A Database Management System (DBMS) is a software package designed to store and manage databases
A database is a very large, integrated collection of data
DB data are retrieved as stored
DB results are a subset of the data
A database models a real-world enterprise: entities and relationships
Data mining is the extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data
As databases grow larger, decision-making directly from the data becomes impractical; knowledge must be derived from the stored data
Data to be mined needs to be cleaned (somewhat) before producing the results
Data mining results are an analysis of the data
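The contrast between retrieval and analysis can be sketched in a few lines. This is only an illustration: the table name, columns and rows are invented, and the grouped count is a crude stand-in for a real mining step.

```python
import sqlite3

# Hypothetical in-memory table; names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient TEXT, age INTEGER, diagnosis TEXT)")
conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", [
    ("p1", 67, "hypertension"),
    ("p2", 34, "asthma"),
    ("p3", 71, "hypertension"),
    ("p4", 45, "asthma"),
    ("p5", 62, "diabetes"),
])

# DBMS-style retrieval: the result is a subset of the rows, exactly as stored.
stored_rows = conn.execute(
    "SELECT patient, age, diagnosis FROM visits WHERE age > 60 ORDER BY patient"
).fetchall()

def most_common_diagnosis_over(connection, min_age):
    """Analysis-style result: a derived summary that is not stored anywhere."""
    return connection.execute(
        "SELECT diagnosis, COUNT(*) AS n FROM visits WHERE age > ? "
        "GROUP BY diagnosis ORDER BY n DESC, diagnosis LIMIT 1",
        (min_age,),
    ).fetchone()

summary = most_common_diagnosis_over(conn, 60)
print(stored_rows)
print(summary)
```

The first result contains only values that were inserted; the second is a statement about the data as a whole.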
-
Why data mining?
Large volume of data (voluminous)
Dimensionality of data
High data growth rate
There is a need to discover valid, useful, structural and understandable patterns
Alternative names:
Knowledge discovery in databases (KDD)
Knowledge extraction
Data/pattern analysis
Process of discovering knowledge/patterns in data
-
Knowledge Discovery in Databases (KDD) and Data Mining
Knowledge Discovery in Databases
Knowledge discovery in databases is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (Fayyad, 1996)
Process of searching for trends and valuable anomalies in large datasets
Data Mining
Data mining is the non-trivial extraction of implicit, previously unknown and potentially useful information from data
Core step of KDD that results in the discovery of knowledge
-
Knowledge discovery process
Knowledge discovery in databases (KDD) process
- Data selection: identify target datasets and relevant fields
- Data cleaning and transformation (preprocessing)
  - Remove noise and outliers
  - Create common units (common data repository from all sources)
  - Generate new fields
- Data mining model construction
- Model evaluation and visualization of the generated results
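The steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the records, field names, plausibility limits and the body-mass-index field are all hypothetical, and the "model" is only a count per band. Unit conversion runs before outlier removal here so that the range check can work in one unit.

```python
# Hypothetical raw records from several sources; all values are invented.
RAW = [
    {"id": 1, "weight_kg": 70.0, "height_cm": 175.0, "ward": "A"},
    {"id": 2, "weight_lb": 198.0, "height_cm": 180.0, "ward": "B"},
    {"id": 3, "weight_kg": None, "height_cm": 160.0, "ward": "A"},   # missing value
    {"id": 4, "weight_kg": 900.0, "height_cm": 170.0, "ward": "C"},  # implausible value
    {"id": 5, "weight_kg": 60.0, "height_cm": 165.0, "ward": "B"},
]
FIELDS = ["id", "weight_kg", "weight_lb", "height_cm"]
LB_PER_KG = 2.20462

def select(records, fields):
    """Data selection: keep only the relevant fields."""
    return [{f: r.get(f) for f in fields} for r in records]

def to_common_units(records):
    """Create common units: express every weight in kilograms."""
    out = []
    for r in records:
        r = dict(r)
        if r.get("weight_kg") is None and r.get("weight_lb") is not None:
            r["weight_kg"] = r["weight_lb"] / LB_PER_KG
        r.pop("weight_lb", None)
        out.append(r)
    return out

def clean(records):
    """Remove noise and outliers: drop missing or implausible values."""
    return [
        r for r in records
        if r["weight_kg"] is not None and r["height_cm"] is not None
        and 20 <= r["weight_kg"] <= 300 and 50 <= r["height_cm"] <= 250
    ]

def add_bmi(records):
    """Generate a new field from existing ones (body-mass index)."""
    return [dict(r, bmi=r["weight_kg"] / (r["height_cm"] / 100) ** 2) for r in records]

def mine(records):
    """Stand-in for model construction: count records per BMI band."""
    counts = {}
    for r in records:
        band = "under 25" if r["bmi"] < 25 else "25 or over"
        counts[band] = counts.get(band, 0) + 1
    return counts

result = mine(add_bmi(clean(to_common_units(select(RAW, FIELDS)))))
print(result)  # evaluation/visualization step, reduced to a printout
```

Each function maps to one bullet of the process, so a step can be replaced without touching the others.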
-
Knowledge Discovery in Databases: Process
[Figure: Data -> Selection -> Target Data -> Preprocessing and Transformation -> Preprocessed/Transformed Data -> Data Mining -> Patterns -> Model Evaluation -> Knowledge]
Adapted from: Usama M. Fayyad, et al. (1996), From Data Mining to Knowledge Discovery: An Overview, Advances in Knowledge Discovery and Data Mining, U. Fayyad et al. (Eds.), AAAI/MIT Press
-
Data Mining: Types of Data
- Relational data
- Spatial / Temporal data
- Numeric data
- Categorical data
- Time-series data
- Text
- Images / Video / Multimedia
- Web data
-
Integration of Multiple Technologies
[Figure: Data Mining at the centre, drawing on Machine Learning, Database Management, Artificial Intelligence, Statistics, Visualization, Algorithms and High-Performance Computing]
-
Applications
Retail marketing
Telecommunication
Banking
Fraud analysis
Bio-data mining
Stock market analysis
Web mining
-
Data Mining Tasks
There are two main tasks of data mining:
Predictive data mining
Predicts future or unknown values
Examples: classification - rule induction, decision trees, neural networks, Bayesian networks, regression, genetic algorithms, support vector machines
Descriptive data mining
Produces a model that describes the observed data
Examples: association rules, clustering
-
Data Mining Techniques
Common Mining Techniques
Classification
Clustering
Associations
Other techniques are
Sequential Patterns
Regression
Deviation Detection
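Classification, clustering and associations are covered on the following slides. For deviation detection, one simple approach is to flag values that lie far from the mean. The sketch below assumes a z-score rule with a threshold of 2 population standard deviations; the readings and the threshold are illustrative choices, not part of the original material.

```python
from statistics import mean, pstdev

def deviations(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical measurements with one obvious deviation.
readings = [98, 102, 100, 101, 99, 100, 160]
flagged = deviations(readings)
print(flagged)
```

A z-score rule is sensitive to the outliers it is trying to find, so robust alternatives (for example, median-based rules) are often preferred on real data.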
-
Classification
Given a collection of records (training set)
Each record contains a set of attributes; one of the attributes is the class
Find a model for the class attribute as a function of the values of the other attributes
Goal: previously unseen records should be assigned a class as accurately as possible
A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.
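The train/test procedure can be sketched with a very small classifier. The example below uses a 1-nearest-neighbour rule, chosen only because it is short; the records, class labels, split fraction and seed are hypothetical.

```python
import random

# Hypothetical records: ((attribute_1, attribute_2), class_label)
DATA = [
    ((1.0, 1.2), "low"), ((0.8, 0.9), "low"), ((1.5, 1.1), "low"),
    ((1.2, 1.6), "low"), ((0.9, 1.4), "low"), ((1.3, 0.7), "low"),
    ((8.0, 8.2), "high"), ((7.5, 8.8), "high"), ((8.4, 7.9), "high"),
    ((9.0, 8.5), "high"), ((7.8, 7.6), "high"), ((8.6, 9.1), "high"),
]

def split(records, test_fraction=0.3, seed=0):
    """Divide the data set into a training set and a test set."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def predict(train, attributes):
    """Assign the class of the closest training record (1-nearest neighbour)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(train, key=lambda rec: sq_dist(rec[0], attributes))
    return nearest[1]

def accuracy(train, test):
    """Fraction of test records whose predicted class matches the true class."""
    correct = sum(predict(train, attrs) == label for attrs, label in test)
    return correct / len(test)

train_set, test_set = split(DATA)
acc = accuracy(train_set, test_set)
print(len(train_set), len(test_set), acc)
```

The two toy classes are far apart, so every held-out record is classified correctly; on real data the test-set accuracy is the number of interest.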
-
Clustering
Clustering is unsupervised
Unlike classification, clustering uses no pre-classified data
Search for groups or clusters of data points (records) that are similar to one another
Data points in a cluster have high intra-cluster similarity and low inter-cluster similarity
Distance is used as a measure of similarity
Applications
As a stand-alone tool to get insight into data distribution
As a preprocessing step for other algorithms
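A distance-based clustering can be sketched with k-means, used here as one common example of the idea rather than as the method the slides prescribe. The points are hypothetical, and the initial centroids are simply taken at evenly spaced positions in the input, which is a simplification.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Group points into k clusters using Euclidean distance."""
    # Simplified, deterministic initialisation: evenly spaced input points.
    centroids = [points[i * len(points) // k] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical records with two numeric attributes each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
print(clusters)
```

No class labels are supplied; the grouping comes only from the distances between records.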
-
Association Mining
Association rule mining:
Finding frequent patterns, associations, correlations among sets of items or objects in transaction databases, relational databases, and other information repositories.
Frequent pattern: pattern (set of items, sequence, etc.) that occurs frequently in a database
Motivation: finding regularities in data
What products were often purchased together?
What are the subsequent purchases after buying a PC?
What kinds of DNA are sensitive to this new drug?
Broad applications
Market basket data analysis, cross-marketing, catalog design, sales campaign analysis
Web log analysis, DNA sequence analysis, etc.
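Frequent patterns and rules are usually scored by support (how often an itemset occurs) and confidence (how often the consequent appears when the antecedent does). The sketch below enumerates itemsets by brute force, which is adequate only for tiny examples and is not the Apriori algorithm; the transactions and the minimum support are invented.

```python
from itertools import combinations

# Hypothetical market-basket transactions.
TRANSACTIONS = [
    {"pc", "printer"},
    {"pc", "printer", "paper"},
    {"pc", "paper"},
    {"printer", "paper"},
    {"pc", "printer"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets whose support reaches min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            s = support(frozenset(combo), transactions)
            if s >= min_support:
                result[frozenset(combo)] = s
    return result

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

frequent = frequent_itemsets(TRANSACTIONS, min_support=0.6)
rule_conf = confidence(frozenset({"pc"}), frozenset({"printer"}), TRANSACTIONS)
print(sorted(sorted(s) for s in frequent), round(rule_conf, 2))
```

Here the pair {pc, printer} is frequent, and the rule pc -> printer holds in three of the four transactions that contain a PC.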
-
Commercial/Research Data Mining Tools
Some commercial and research data mining tools are
WEKA (The University of Waikato) http://www.cs.waikato.ac.nz/ml/weka/
Clementine (SPSS Inc., Integral Solutions) http://www.spss.com/clementine/
BayesiaLab (Bayesia SA) http://www.bayesia.com/
MineSet (Silicon Graphics Inc. - SGI) http://www.sgi.com/products/
Intelligent Miner (IBM Corp.) http://www.ibm.com/legal/copytrade.shtml
Web Analyst (Megaputer Intelligence Inc.) http://www.megaputer.com/products
SurfAid Analysis (IBM Corp.) http://www.nwc.com/
-
Need for mining healthcare data
Extraction of knowledge for diagnostic, screening, prognostic, monitoring and overall patient management tasks
Hospital Administration
Strategic decision making
Control cost
Quality of service
Reduce adverse drug events
Analysis of epidemiological data
Predicting patterns of disease
Need to develop a system that can support the sharing and reuse of medical knowledge
-
Disciplines of Healthcare System Where Data Mining Tools Can Be Applied
[Figure: Data Mining Tools at the centre, linked to Treatment, Hospital Information System, Clinical Modeling, Medical Imaging, Diagnosis and Drug Development]
-
Mining Issues in Healthcare Data
Issues to be addressed
Handling heterogeneous data
Distributed data
High dimensional data
Visual data mining
Privacy-preserving mining
-
Mining of Web Databases
The Web is a collection of inter-related files on one or more Web servers.
Application of data mining techniques: Web mining
- Web content mining
  Process of discovering and extracting information from online sources
- Web usage mining
  Process of discovering/mining structure information from user-browsing and access patterns
Some web medical databases are
TRIP Database, one of the Internet's leading medical resources. The TRIP Database allows users to rapidly and easily identify high quality medical literature from a wide range of sources, at http://tripdatabase.com
Ovid database provides information to tackle scientific or medical questions, at http://ovid.com
MEDLINE on BioMedNet: MEDLINE on Scirus is the search engine for science, at http://www.scirus.com
Pharmacological Targets Database (PTBase) is no longer available. MDL Elsevier has now released xPharm, a fully interactive pharmacological database, with 800% more target data than any other online source, at http://bmn.com
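Web usage mining, as described earlier on this slide, typically starts from server access logs. The sketch below assumes log lines in the Common Log Format, already in chronological order, and counts page-to-page transitions per client; the log lines, addresses and paths are invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumed Common Log Format: host ident user [time] "GET path protocol" status size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) [^"]*" (\d{3})')

# Hypothetical access-log lines.
LOG = [
    '192.0.2.1 - - [14/Aug/2019:10:00:00 +0000] "GET /home HTTP/1.1" 200 512',
    '192.0.2.1 - - [14/Aug/2019:10:00:05 +0000] "GET /symptoms HTTP/1.1" 200 1024',
    '192.0.2.2 - - [14/Aug/2019:10:01:00 +0000] "GET /home HTTP/1.1" 200 512',
    '192.0.2.2 - - [14/Aug/2019:10:01:09 +0000] "GET /symptoms HTTP/1.1" 200 1024',
    '192.0.2.2 - - [14/Aug/2019:10:01:20 +0000] "GET /missing HTTP/1.1" 404 0',
    '192.0.2.1 - - [14/Aug/2019:10:02:00 +0000] "GET /appointments HTTP/1.1" 200 2048',
]

def paths_by_client(lines):
    """Collect the successfully served pages per client, in log order."""
    paths = defaultdict(list)
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group(3) == "200":
            paths[m.group(1)].append(m.group(2))
    return paths

def transition_counts(paths):
    """Count how often one page is followed by another within a client's path."""
    counts = Counter()
    for pages in paths.values():
        counts.update(zip(pages, pages[1:]))
    return counts

transitions = transition_counts(paths_by_client(LOG))
print(transitions.most_common(1))
```

Real usage mining would also split each client's requests into sessions by time gap; that step is omitted here to keep the sketch short.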
-
Web Medical Databases
Some on-line medical information sources include:
CancerNet provides information about cancer, including state-of-the-art information on cancer screening, prevention, treatment and supportive care, and summaries of clinical trials. (http://www.nci.nih.gov)
CancerNet for Patients and the Public includes access to PDQ (Physician Data Query) and related information on treatments; detection, prevention and genetics information; supportive care information; clinical trial information. (http://cancernet.nci.nih.gov/patient.htm)
CancerLit is a comprehensive archival file of more than one million bibliographic records (most with abstracts) describing 30 years of cancer research published in biomedical journals, proceedings of scientific meetings, books, technical reports, and other documents. (http://wwwicic.nci.nih.gov/canlit/canlit.htm)
CancerNet for Health Professionals includes access to PDQ and related information on treatments, screening, prevention and genetics; supportive care and advocacy issues; clinical trials; a directory of genetic counselors. (http://wwwicic.nci.nih.gov/health.htm)
-
Challenges to Medical Information Systems
Integrating various medical data sources such as server access logs, referrer logs, patient registration or patient profile information
Resolving difficulties in the diagnosis of diseases due to unique key attributes in the patient record which can easily be predicted
Predicting patient treatment
Prescribing patient medication
To help patients maintain their independence and maximum level of function within their own homes and communities
The goal is to educate patients in self-care and prolonged medical monitoring and supervision
-
Proposed Web-based Healthcare System
Medical tele-monitoring and tele-care facilities
System can easily expand to cover all the healthcare spectra
Interface provides a friendly environment both for the patient and for the physician
[Figure: Components of Medical System - Patients, Doctor and Medical Staff connected to a Knowledge Refereed Server]
-
Features of the system
A patient is able to send his/her medical problem via the WWW by completing the online web forms
The doctor is able to browse the data and check the patients
The proposed system is a user-friendly, cost-effective and powerful tool
The proposed medical system includes not only diagnosis and medical treatment but also prolonged medical monitoring and supervision
The goal of medical care is to control disease processes and to help patients maintain their maximum level of function within their own homes and communities
-
Motivation
The World Wide Web is one of the richest and densest sources of information
The Web/WAP portal is able to gather information from patients regardless of their location
As the data in the database expand as a result of the wide use of the portal, it becomes difficult to find information manually
Data mining provides algorithms which allow automatic pattern discovery and interactive analysis
The system can support the doctor's efforts by posting alerts whenever a patient's health is in a critical condition.
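The alerting idea can be sketched as a simple range check on incoming readings. Everything here is a labeled assumption: the slides do not say how a critical condition is detected, and the measure names and limits below are hypothetical placeholders, not clinical guidance.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    patient_id: str
    measure: str
    value: float

# Hypothetical limits for illustration only; not clinical guidance.
LIMITS = {
    "heart_rate_bpm": (40.0, 130.0),
    "temperature_c": (35.0, 39.5),
}

def alerts(readings, limits=LIMITS):
    """Return one alert message per reading that falls outside its limits."""
    messages = []
    for r in readings:
        bounds = limits.get(r.measure)
        if bounds is None:
            continue  # no rule defined for this measure
        low, high = bounds
        if not low <= r.value <= high:
            messages.append(
                f"ALERT patient {r.patient_id}: {r.measure}={r.value} outside [{low}, {high}]"
            )
    return messages

# Hypothetical readings submitted through the portal.
incoming = [
    Reading("p1", "heart_rate_bpm", 72.0),
    Reading("p2", "heart_rate_bpm", 150.0),
    Reading("p3", "temperature_c", 36.6),
]
posted = alerts(incoming)
print(posted)
```

In a mined system the fixed limits would presumably be replaced by patterns learned from the accumulated patient data.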
-
Summary
The Web has been adopted as a critical communication and information medium by a majority of the population
Web data is growing at a significant rate
A number of new data mining concepts and techniques have been developed for Web data
Many successful applications exist
Fertile area of research
Privacy
-
In our move towards becoming a developed nation, to provide an honorable and comfortable life to Indians
THANK YOU !!!