11 Data Mining


Uploaded by bharathimanian

11.3 Additional Themes on Data Mining

• Due to the broad scope of data mining and the large variety of data mining methodologies, not all themes on data mining can be covered thoroughly.

11.3.1 Theoretical Foundations of Data Mining

• A solid and systematic theoretical foundation is important because it helps provide a coherent framework for the development, evaluation, and practice of data mining technology.

1. Data reduction:

• In this theory, the basis of data mining is to reduce the data representation.

• Data reduction trades accuracy for speed in response to the need to obtain quick approximate answers to queries on very large databases.

• Data reduction techniques include singular value decomposition (the driving element behind principal components analysis), wavelets, regression, log-linear models, histograms, clustering, sampling, and the construction of index trees.
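As a rough illustration of trading accuracy for speed, the sketch below summarizes a column of values as an equal-width histogram and answers a range-count query from the bucket counts alone. The function names, the synthetic data, and the assumption that values are spread uniformly inside each bucket are all illustrative choices, not part of the original slides.

```python
def build_histogram(values, lo, hi, n_buckets):
    """Summarize values in [lo, hi) as n_buckets equal-width counts."""
    width = (hi - lo) / n_buckets
    counts = [0] * n_buckets
    for v in values:
        idx = min(int((v - lo) / width), n_buckets - 1)
        counts[idx] += 1
    return counts, width


def estimate_range_count(counts, width, lo, a, b):
    """Estimate how many values fall in [a, b) from the histogram only.

    Assumes values are uniformly spread inside each bucket, so a bucket
    that is only partly covered contributes a proportional share.
    """
    total = 0.0
    for i, c in enumerate(counts):
        b_lo = lo + i * width
        b_hi = b_lo + width
        overlap = max(0.0, min(b, b_hi) - max(a, b_lo))
        total += c * overlap / width
    return total


# Synthetic, deliberately non-uniform data (hypothetical).
values = [(i * i) % 1000 for i in range(10_000)]

counts, width = build_histogram(values, 0, 1000, 20)
approx = estimate_range_count(counts, width, 0, 130, 475)  # reads 20 counts
exact = sum(1 for v in values if 130 <= v < 475)           # scans all values

print(f"approximate: {approx:.1f}, exact: {exact}")
```

The estimate reads 20 bucket counts instead of 10,000 values. Its error comes only from the two buckets that the query range covers partially, so it can never exceed the combined count of those two buckets.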

2. Data compression:

• According to this theory, the basis of data mining is to compress the given data by encoding it in terms of bits, association rules, decision trees, clusters, and so on.

• The minimum description length (MDL) principle states that the “best” theory to infer from a set of data is the one that minimizes the combined length of the theory and of the data when encoded using the theory as a predictor for the data.

• This encoding is typically in bits.
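The MDL idea can be sketched with a two-part code for a 0/1 sequence: total cost = bits to state the theory + bits to encode the data given the theory. The two candidate "theories" below (a fair coin and a biased coin), the 0.5·log2(n) charge for the fitted parameter, and the smoothing of the estimate are common textbook simplifications assumed here for illustration; the slides do not prescribe them.

```python
import math


def data_bits(seq, p_one):
    """Ideal code length in bits of a 0/1 sequence under Bernoulli(p_one)."""
    bits = 0.0
    for x in seq:
        p = p_one if x == 1 else 1.0 - p_one
        bits += -math.log2(p)
    return bits


def description_length(seq, model):
    """Two-part code length: L(theory) + L(data | theory)."""
    n = len(seq)
    if model == "fair":
        # No free parameter to transmit, one bit per symbol.
        return 0.0 + data_bits(seq, 0.5)
    # "biased": one fitted parameter, charged 0.5 * log2(n) bits (assumption).
    p = (sum(seq) + 0.5) / (n + 1)  # smoothed so p is never exactly 0 or 1
    return 0.5 * math.log2(n) + data_bits(seq, p)


def best_model(seq):
    """Return the model with the shortest total description, plus all costs."""
    costs = {m: description_length(seq, m) for m in ("fair", "biased")}
    return min(costs, key=costs.get), costs


skewed = [1] * 90 + [0] * 10
balanced = [0, 1] * 50

print(best_model(skewed))
print(best_model(balanced))
```

For the skewed sequence the extra parameter pays for itself because the data become much cheaper to encode. For the balanced sequence the parameter is pure overhead, so the simpler theory wins.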

3. Pattern discovery:

• In this theory, the basis of data mining is to discover patterns occurring in the database, such as associations, classification models, sequential patterns, and so on.

• Areas such as machine learning, neural networks, association mining, sequential pattern mining, clustering, and several other subfields contribute to this theory.
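One of the simplest patterns of this kind is a frequent item pair. The following minimal sketch counts how many transactions contain each pair of items and keeps the pairs that meet a minimum support count. The transactions and the threshold are made up for illustration, and this brute-force counting is not a full association mining algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {(item_a, item_b): count} for pairs seen in >= min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Sort so each pair has one canonical order, e.g. ("bread", "milk").
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

print(frequent_pairs(transactions, min_support=3))
```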

4. Probability theory:

• This view is based on statistical theory.

• In this theory, the basis of data mining is to discover joint probability distributions of random variables, for example, Bayesian belief networks or hierarchical Bayesian models.
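To make the notion of a joint distribution concrete, here is a small sketch of a two-node Bayesian belief network (Rain → WetGrass). The joint probability factorizes as P(rain, wet) = P(rain) · P(wet | rain), and any other probability can be derived from it. The variables and all probability values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from fractions import Fraction as F

# Hypothetical network parameters.
p_rain = {True: F(1, 5), False: F(4, 5)}             # P(rain)
p_wet_given_rain = {True: F(9, 10), False: F(1, 5)}  # P(wet=True | rain)


def joint(rain, wet):
    """P(rain, wet) via the network factorization P(rain) * P(wet | rain)."""
    p_wet = p_wet_given_rain[rain]
    return p_rain[rain] * (p_wet if wet else 1 - p_wet)


def posterior_rain_given_wet():
    """P(rain=True | wet=True), obtained by normalizing the joint."""
    evidence = joint(True, True) + joint(False, True)
    return joint(True, True) / evidence


for rain in (True, False):
    for wet in (True, False):
        print(f"P(rain={rain}, wet={wet}) = {joint(rain, wet)}")
print("P(rain | wet) =", posterior_rain_given_wet())
```

Exact fractions are used instead of floats so that the four joint entries visibly sum to 1.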

5. Microeconomic view:

• The microeconomic view considers data mining as the task of finding patterns that are interesting only to the extent that they can be used in the decision-making process of some enterprise (e.g., regarding marketing strategies and production plans).

• This view is one of utility, in which patterns are considered interesting if they can be acted on.

• Enterprises are regarded as facing optimization problems, where the objective is to maximize the utility or value of a decision.

• In this theory, data mining becomes a nonlinear optimization problem.
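A toy version of the utility view: suppose a mined pattern supplies a predicted response rate for each customer segment, and the enterprise must decide which segments to target. The sketch below scores each decision by expected utility and keeps only those worth acting on. The segment names, rates, and costs are invented, and the utility function is deliberately linear, which is simpler than the nonlinear optimization the text refers to.

```python
def expected_utility(n_customers, response_rate, profit_per_response, cost_per_contact):
    """Expected profit from contacting every customer in a segment."""
    return n_customers * (response_rate * profit_per_response - cost_per_contact)


def choose_segments(segments):
    """Return segments with positive expected utility, best first, plus all utilities."""
    utilities = {name: expected_utility(**s) for name, s in segments.items()}
    chosen = sorted(
        (name for name, u in utilities.items() if u > 0),
        key=lambda name: -utilities[name],
    )
    return chosen, utilities


# Hypothetical segments; response_rate stands in for a mined pattern.
segments = {
    "A": dict(n_customers=1000, response_rate=0.10, profit_per_response=40.0, cost_per_contact=2.0),
    "B": dict(n_customers=5000, response_rate=0.02, profit_per_response=40.0, cost_per_contact=2.0),
    "C": dict(n_customers=2000, response_rate=0.06, profit_per_response=40.0, cost_per_contact=2.0),
}

chosen, utilities = choose_segments(segments)
print(chosen, utilities)
```

Under this view the pattern for segment B is not "interesting" even though it is the largest segment, because acting on it would lower the enterprise's utility.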

6. Inductive databases:

• According to this theory, a database schema consists of data and patterns that are stored in the database.

• Data mining is therefore the problem of performing induction on databases, where the task is to query the data and the theory (i.e., patterns) of the database.

• This view is popular among many researchers in database systems.
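The idea of querying data and patterns side by side can be sketched with an in-memory SQLite database that stores both raw purchases and previously mined rules. The table layout, the rules, and the confidence values are hypothetical. Real inductive database proposals define richer query languages than plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE purchases (tid INTEGER, item TEXT);
    CREATE TABLE patterns (antecedent TEXT, consequent TEXT, confidence REAL);
    """
)

# Data: one row per (transaction, item).
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [(1, "bread"), (1, "milk"), (2, "bread"), (3, "bread"), (3, "milk"), (4, "beer")],
)

# Patterns: rules "antecedent -> consequent", stored like any other table.
conn.executemany(
    "INSERT INTO patterns VALUES (?, ?, ?)",
    [("bread", "milk", 0.67), ("beer", "diapers", 0.30)],
)

# One query over both: for every confident rule, find transactions that
# contain the antecedent but not the consequent.
rows = conn.execute(
    """
    SELECT DISTINCT p.tid, r.consequent
    FROM purchases AS p
    JOIN patterns AS r ON p.item = r.antecedent
    WHERE r.confidence >= 0.5
      AND NOT EXISTS (
          SELECT 1 FROM purchases AS q
          WHERE q.tid = p.tid AND q.item = r.consequent
      )
    ORDER BY p.tid
    """
).fetchall()

print(rows)
```

Because the rules live in the same schema as the data, a single declarative query can filter patterns (by confidence) and data (by transaction contents) at once.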
