ORACLE Data Warehousing Guarino
8/6/2019 ORACLE Data Warehousing Guarino
Copyright 2008 - Oracle Corporation
-
Oracle Data Warehousing
Vincenzo Guarino, Technology Sales Consultant
-
Agenda
Scenario
Oracle Optimized Warehouse Initiative
Data Warehousing with Oracle Database
Oracle Data Warehouse Platform
-
Scenario
-
Oracle's BI & DW Product Strategy
Integrated Data Warehouse Database
Scalability, Availability, Manageability
Advanced analytic content, Data Quality and ETL/EL-T, Data Mining services, Spatial data
Integrated Business Intelligence Tools
Next Generation Business Intelligence Technology Platform
Integrated Analytic Applications
Enterprise Wide, Industry Specific Analytic and Corporate Performance Management Applications
Exploits Any Information
-
The Database Market
Source: IDC
http://www.oracle.com/database/number-one-database.html
http://www.oracle.com/solutions/business_intelligence/feature_dw_leadership.html
Oracle: 40.9%
IBM: 22.8%
Microsoft: 15.0%
Teradata: 9.6%
Other: 11.7%
-
Introducing Oracle Optimized Warehouse Initiative
-
A bit of methodology
Data Sources: Applications, Transactions, external data
Staging area: Reconciling, transforming and integrating source data; Data quality checks and corrections, recycling data loading errors
Centralized Repository (1st level Data Warehouse): 3rd Normal Form Data Model; Optimization for large volume of data; Enterprise Reporting and general queries; Data and Metadata integration
Dependent Data Marts (2nd level Data Warehouse): Dimensional schemas and views; Multidimensional objects; Complex analytical queries; High query performance
Diagram: Operational Systems, Staging Area, Atomic Data Layer, Performance Data Layer
-
Problem
This chain is composed of complex processes for transforming source data into analytical information: source applications, ETL/EL-T processes, dimensional data modeling, metadata management and integration
and of software products used to build up and use the information: Database, ETL/EL-T tool, Front-end tool, Data Hub, Multi-dimensional engine, Reporting environment, ...
But the hardware components are also involved in this context: Servers, CPU, Memory, Disks, Network, Devices, ...
Some percentage of customers inevitably end up with poorly configured data warehouses
Performance is the essence of the Oracle Optimized Warehouse program
Diagram: Operational Systems, Staging Area, Atomic Data Layer, Performance Data Layer
-
But where to start?
-
Full Range of DW Solution Options
Scale: from Flexibility to Pre-configured, Pre-installed, Validated

Custom
Flexibility for the most demanding data warehouse
Benefits: High performance; Unlimited scalability; Completely customizable; Industry-leading database and hardware
Database Options, Management Packs, Partitioning, RAC

Optimized Warehouse
Scalable systems pre-installed and pre-configured: ready to run out-of-the-box
Benefits: High performance; Simple to buy; Fast to implement; Easy to maintain; Competitively priced
Database Options, Management Packs

Reference Configuration
Documented best-practice configurations for data warehousing
Benefits: High performance; Simple to scale, modular building blocks; Industry-leading database and hardware
Available today with HP, IBM, Sun, EMC/Dell
-
Oracle Optimized Warehouse Initiative
Accelerate implementations and lower risk
Build from Scratch with Components: Pre-implementation system sizing, Acquisition of components, Installation and configuration, Testing and Validation (Weeks to Months)
Reference Configurations: Acquisition of components, Installation and configuration, Testing and Validation (Weeks to Months)
Oracle Optimized Warehouse: Take delivery of Oracle Optimized Warehouse (< 1 - 2 weeks)
Faster deployment, Lower Risk
-
Oracle Optimized Warehouse Initiative - OWI
Goals for Oracle data warehouse solutions:
Provide superior system performance
Provide a superior customer experience
One product for data warehouse
Database and options software, servers, storage
Pre-installed, pre-configured
Validated performance
Sold as a single product
Supported as a single product
-
OWI availability
[Table: Partner, Reference Configurations, Optimized Warehouses; legend: Soon]
-
Oracle Optimized Warehouse Initiative - OWI
O/S: Solaris | AIX | Linux
Server: E20K | P570 Power 6 | PE2950
Size: 10 TB | 5-20 TB | 1-4 TB
-
OWI Building Block Scale-Out
Validation and testing of incremental growth path
-
Oracle Optimized Warehouse Reference Configurations
What is it?
Documented balanced system configurations for pre-defined DW/BI environments
Starting point for sizing a system
Balanced system consists of CPU, memory, I/O, and cabling
Leverages scalable, modular components
Enables incremental growth (scale-in, scale-out)
Mitigates implementation risks
Available on HP, Sun, IBM, and Dell/EMC
Example Reference Configuration, with HP
-
http://www.oracle.com/solutions/business_intelligence/optimized-warehouse-initiative.html
-
Data Warehousing with Oracle Database
-
Oracle for Data Warehousing
Continuous innovation
Chart axes: Performance; DWH & Analytical features
-
Total Security
Oracle Database Enterprise Editions and Options
Storing and managing each type of data
ETL functionalities, advanced Analytic content, Data Mining built in the Database Kernel
High performance on large volume of data (VLDB and VLDW)
-
Oracle Data Warehouse Platform
-
Oracle Data Warehouse Platform
ELT & Data Quality: Data Integration, Data Modeling, Metadata Management, Data Profiling, SOA
Analytic Platform: Multi Dimensional Calculations, Time Series, Forecasting, Statistics, Data Mining
Scalable Data Management: Automatic Storage Management, Partitioning, Parallel Operations, Aggregation Management, Real Application Clusters
-
Oracle Enterprise Grid
Computing as a utility
A network of clients and service providers
Client-side: Simplicity
Request computation or information and receive it
Server-side: Sophistication
Availability, load balancing, utilization
Information sharing, data management
Virtualization
Clients see a large virtual server
Underlying infrastructure hidden
Manageability
Easy and automated management from a single Web console
High availability & Scalability
Diagram: Storage, Database, Middleware, Applications, Grid Control
-
Oracle Real Application Cluster - RAC
Capacity on demand for the Grid
Database clustering with shared disk
Low cost, highest quality of service
Scalability & availability
Add/drop servers as needs change
Automatically balance load across servers
On-line configuration of services and priorities
Proven
Hundreds of customers running enterprise applications
Figure: Transaction Services and Batch Job Services (ERP, CRM, SS, HOT, STD), each with a Priority (High, Med, Low) and a Warning/Critical Threshold (0.5/0.75 ms, 0.5/1.00 ms, 1.0/1.5 ms, 1.0/1.5 ms, 3.0/5.0 ms), running on Instances RAC01 to RAC04 on nodes CLUSNODE-1 to CLUSNODE-4
-
Extend and grow as needed
Chart: Workload (100%, 200%, 300%) over Months (3 to 24)
-
RAC for Data Warehousing
Manageability
Figure: allocation of the ETL, OLAP and Reports services in four situations:
During peak working hours of user queries and analysis
During intervals when the DW is loaded with new and modified data
Immediately after the data has been loaded
Without response-time requirements, all types of workload can run on all nodes
-
Partitioning
Partitioning addresses key issues in supporting very large tables and indexes
Decompose them into smaller and more manageable pieces called partitions
SQL queries and DML statements do not need to be modified in order to access partitioned tables
DDL statements can access and manipulate individual partitions rather than entire tables or indexes
Add a new partition, organize an existing partition, or drop a partition with minimal to zero interruption to a read-only application
Partitioning is entirely transparent to applications
Figure: an Application issues SQL against the Sales table, which is partitioned into Jan, Feb and Mar
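As a minimal sketch of the points above, a range-partitioned table might be declared roughly as follows. The column list and partition names are hypothetical and chosen only for illustration; the statements use standard Oracle range-partitioning syntax.

```sql
-- Hypothetical Sales table, range-partitioned by month on sales_date.
CREATE TABLE sales (
  sale_id     NUMBER        NOT NULL,
  sales_date  DATE          NOT NULL,
  revenue     NUMBER(12,2)
)
PARTITION BY RANGE (sales_date) (
  PARTITION sales_jan2007 VALUES LESS THAN (TO_DATE('2007-02-01', 'YYYY-MM-DD')),
  PARTITION sales_feb2007 VALUES LESS THAN (TO_DATE('2007-03-01', 'YYYY-MM-DD')),
  PARTITION sales_mar2007 VALUES LESS THAN (TO_DATE('2007-04-01', 'YYYY-MM-DD'))
);

-- Queries and DML are written against the table as usual; no partition is named.
SELECT SUM(revenue)
FROM   sales
WHERE  sales_date >= TO_DATE('2007-03-01', 'YYYY-MM-DD');

-- DDL can address one partition without touching the others.
ALTER TABLE sales TRUNCATE PARTITION sales_jan2007;
```

The query text would be the same for a non-partitioned table, which is what "transparent to applications" refers to.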
-
Partitioning
Using the partitioning methods can help tune SQL statements to avoid unnecessary index and table scans (using partition pruning)
Improve the performance of massive join operations when large amounts of data (for example, several million rows) are joined together by using partition-wise joins
Partitioning data greatly improves manageability of very large databases and dramatically reduces the time required for administrative tasks such as backup and restore
Figure: Sales table partitioned by month, Jan 2007 to Dec 2007; the Mar 2007, Jun 2007 and Sep 2007 partitions are highlighted for the query:
SELECT sum(revenue)
FROM Sales
WHERE sales_date IN
  (to_date('MAR-15-2007', 'MON-DD-YYYY'),
   to_date('JUN-10-2007', 'MON-DD-YYYY'),
   to_date('SEP-28-2007', 'MON-DD-YYYY'));
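One way to check whether partition pruning applies to a query like this is to look at its execution plan. The sketch below assumes a Sales table range-partitioned on sales_date (an assumption for illustration); EXPLAIN PLAN and DBMS_XPLAN.DISPLAY are standard Oracle facilities.

```sql
-- Store the optimizer's plan for the query in the plan table.
EXPLAIN PLAN FOR
SELECT SUM(revenue)
FROM   sales
WHERE  sales_date IN (TO_DATE('MAR-15-2007', 'MON-DD-YYYY'),
                      TO_DATE('JUN-10-2007', 'MON-DD-YYYY'),
                      TO_DATE('SEP-28-2007', 'MON-DD-YYYY'));

-- Display the stored plan; for partitioned tables the Pstart and Pstop
-- columns indicate which partitions the plan will access.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan shows all partitions being scanned, the predicate is probably not expressed on the partitioning key in a form the optimizer can prune on.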
-
Rolling Window Operations
Order Table (partitioned by quarter): Q4 06, Q1 07, Q2 07, Q3 07
Drop: Q4 06
Add: Q4 07
Other data & queries not affected
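The rolling window above can be sketched with two partition maintenance statements. The table name orders, the partition names and the partitioning column are hypothetical; the sketch also assumes the table has no MAXVALUE partition, since ADD PARTITION can only append above the current highest bound.

```sql
-- Add the new quarter at the top of the range.
ALTER TABLE orders
  ADD PARTITION q4_07 VALUES LESS THAN (TO_DATE('2008-01-01', 'YYYY-MM-DD'));

-- Drop the oldest quarter; UPDATE GLOBAL INDEXES keeps any global
-- indexes usable while the partition is removed.
ALTER TABLE orders
  DROP PARTITION q4_06 UPDATE GLOBAL INDEXES;
```

Both statements operate on a single partition, so queries against the remaining quarters continue to run.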
-
Information Lifecycle Management
Optimize storage cost and performance
Storage tiers: High Performance Storage Tier (Active), Low Cost Storage Tier (Less Active), Online Archive Storage Tier (Historical), Offline Archive Storage Tier (Archive)
Use Flashback Data Archive for long-term storage of old data
Use table, index partitioning to separate data into different tiers
Use new ILM assistant to establish policies, create scripts
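As a rough illustration of using partitioning to place data on different tiers, an older partition could be moved to a tablespace that sits on cheaper storage. The table, partition and tablespace names (orders, q1_07, low_cost_ts) are hypothetical, and the tablespace is assumed to exist already.

```sql
-- Move a less active partition to a low-cost tablespace and compress it.
-- UPDATE INDEXES maintains the affected index partitions during the move.
ALTER TABLE orders
  MOVE PARTITION q1_07
  TABLESPACE low_cost_ts
  COMPRESS
  UPDATE INDEXES;
```

In practice such statements would typically be generated and scheduled according to the lifecycle policies mentioned above rather than issued by hand.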
-
Key points
Declarative, graphical and Wizard driven development
The transformation engine is the target Oracle Database
Configurable ETL and/or EL-T mechanism
The transformation language is PL/SQL, automatically generated and optimized depending on the Database release
Open and standard (CWM) Metadata Repository
-
Oracle Warehouse Builder
Licensing Option Information
http://download.oracle.com/docs/cd/B28359_01/license.111/b28287/toc.htm
-
Oracle Warehouse Builder
OWB Core
Oracle Warehouse Builder Core ETL features
Included in any edition of Oracle Database
Advanced Relational AND OLAP Modeling
Design Experts to automate complex tasks
OWB Data Quality Option *
Advanced Data Profiling
Auto-derived or Custom Data Rules and Mappings
Data Auditors in the context of ETL Process Flows
6-Sigma Quality Indices
OWB Enterprise ETL Option *
Advanced ETL Features
Good for Large Scale, Complex ETL Deployments
Slowly Changing Dimensions, Pluggable Mappings
Guided Change Propagation, Complex Process Flows
OWB Connectors Option *
Enterprise ETL Connectors for Oracle E-Business Suite, PeopleSoft, Siebel, SAP R/3
* To be licensed separately
-
Bring the algorithms to the data, not the data to the algorithms
Unparalleled Analytic Power
Analytic computations done by the database: Statistics, OLAP, Data Mining
Scalability
Security
Simplicity
Single source of Truth
Low information latency
-
SQL Analytic and Statistic functions
Window Aggregate functions (moving and cumulative): avg, sum, min, max, count, variance, stddev, first_value, last_value
Ranking functions: rank, dense_rank, cume_dist, percent_rank, ntile
LAG/LEAD functions: direct inter-row reference using offsets
Reporting Aggregate functions: sum, avg, min, max, variance, stddev, count, ratio_to_report
Statistical Aggregates: correlation, linear regression family, covariance
Linear regression: fitting of an ordinary-least-squares regression line to a set of number pairs. Frequently combined with the COVAR_POP, COVAR_SAMP, and CORR functions.
Descriptive Statistics: average, standard deviation, variance, min, max, median (via percentile_cont), mode, group-by & roll-up
DBMS_STAT_FUNCS: summarizes numerical columns of a table and returns count, min, max, range, mean, stats_mode, variance, standard deviation, median, quantile values, +/- n sigma values, top/bottom 5 values
Correlations: Pearson's correlation coefficients, Spearman's and Kendall's (both nonparametric)
Cross Tabs: enhanced with % statistics: chi squared, phi coefficient, Cramer's V, contingency coefficient, Cohen's kappa
Hypothesis Testing: Student t-test, F-test, Binomial test, Wilcoxon Signed Ranks test, Chi-square, Mann Whitney test, Kolmogorov-Smirnov test, One-way ANOVA
Distribution Fitting: Kolmogorov-Smirnov Test, Anderson-Darling Test, Chi-Squared Test, Normal, Uniform, Weibull, Exponential
Pareto Analysis: 80:20 rule, cumulative results table
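A few of the function families listed above can be illustrated in two short queries. The table sales(region, sales_date, revenue, units) is hypothetical and used only for illustration; the functions themselves are standard Oracle SQL.

```sql
-- Window, ranking, LAG and reporting functions over a hypothetical sales table.
SELECT region,
       sales_date,
       revenue,
       -- cumulative sum per region
       SUM(revenue) OVER (PARTITION BY region ORDER BY sales_date) AS running_total,
       -- moving average over the current row and the two preceding rows
       AVG(revenue) OVER (PARTITION BY region ORDER BY sales_date
                          ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_3,
       -- rank of each row by revenue within its region
       RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS revenue_rank,
       -- revenue of the previous row in date order
       LAG(revenue, 1) OVER (PARTITION BY region ORDER BY sales_date) AS prev_revenue,
       -- each row's share of its region's total revenue
       RATIO_TO_REPORT(revenue) OVER (PARTITION BY region) AS share_of_region
FROM   sales;

-- Statistical aggregates: correlation, least-squares fit of revenue on units,
-- and the median via PERCENTILE_CONT.
SELECT region,
       CORR(units, revenue)                                 AS pearson_corr,
       REGR_SLOPE(revenue, units)                           AS ols_slope,
       REGR_INTERCEPT(revenue, units)                       AS ols_intercept,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY revenue) AS median_revenue
FROM   sales
GROUP  BY region;
```

In the REGR_ functions the first argument is the dependent variable and the second the independent one, so the slope here describes revenue per unit.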
-
Oracle OLAP 11g
Enhance the Oracle Data Warehouse and improve business intelligence applications by:
Delivering rich analytic content
Advanced analytic calculations using simple SQL
Any combination of Total and Sub-total available
Building and managing Multi-dimensional structures inside the database
Pre-calculated indicators always available
Time-series, non additive measures across dimensions, wide range of functions used with respect to the Time dimension
Every drill operation delivers coherent level data detail
Accelerating query performance
Completely transparent to the application
-
Calculated Measures
-
OLAP-based Materialized Views
Breakthrough Performance
A single cube provides the equivalent of thousands of MVs
Efficiently computed, compressed, maintained
The 11g SQL Query Optimizer treats OLAP cubes as MVs and rewrites queries to access cubes transparently
Diagram: SQL Query, Query Rewrite, Tables (Relational Star Schema), OLAP Cube
-
OLAP-based Materialized Views
Breakthrough Manageability
Like 10g MVs, provides fast incremental refresh of the cube as underlying data changes
A single object to maintain rather than thousands
Simple - Cube refresh syntax is identical to MV Refresh syntax
Diagram: Tables (Relational Star Schema), Cube Refresh, OLAP cube
-
Analytic Workspace Manager
Graphical interface for designing, creating and managing Multidimensional Structures
-
Oracle In-Database Data Mining
A Disruptive Technology
Provides a rich set of predictive algorithms
High efficiency Predictive Analysis
Multiple state-of-the-art supported algorithms
Easy to integrate and deploy
SQL functions, Java API JSR-73, PL/SQL API
Graphical easy-to-use interface
Third party products that extend the coverage
Scalable
Secure and Reliable
Changes the economics of analytics
-
Oracle Data Mining
Algorithms & Example Applications
Regression
Algorithms: Support Vector Machine, Generalized Linear Models (multivariate linear regression, logistic regression)
Predict a numeric value
Predict a purchase amount or cost
Predict the value of a home
Classification and Prediction
Algorithms: Decision Trees, Naive Bayes, Support Vector Machine, Adaptive Bayes Network* (*Deprecated)
Predict customers most likely to respond to a campaign or offer, incur the highest costs, etc.
Target your best customers
Develop customer profiles
Attribute Importance
Algorithm: Minimum Description Length
Identify most influential attributes for a target attribute
Figure: attribute ranking (A1 to A7) and a decision tree on Income (>$50K), Gender, Status, HH Size (4) and Age with leaves Buy = 0 / Buy = 1
-
Oracle Data Mining
Algorithms & Example Applications
Clustering
Algorithms: Enhanced k-means, Orthogonal Partitioning
Find naturally occurring groups
Market segmentation
Find disease subgroups
Anomaly Detection
Identify frauds, anomalies
Association Rules
Algorithm: Apriori
Find co-occurring items in a market basket
Suggest product combinations
Design better item placement on shelves
Feature Extraction
Algorithm: Non-Negative Matrix Factorization
Reduce a large dataset into representative new attributes
Useful for clustering and text mining
-
Oracle Data Mining
Algorithms & Example Applications
Text Mining
Combine data and text for better models
Add unstructured text (e.g. physician's notes) to structured data (e.g. age, weight, height, etc.) to predict outcomes
Classify and cluster documents
Combined with Oracle Text to develop advanced text mining applications
-
SQL Data Mining
Given a previously built response model (classification), predict who will respond to the campaign, and why
select cust_name,
       prediction(campaign_model using *) as responder,
       prediction_details(campaign_model using *) as reason
from customers;
-
Real-time Prediction
On-the-fly, single record apply with new data
with records as (
  select 178255 ANNUAL_INCOME, 0 CAPITAL_GAIN, 83 SAVINGS_BALANCE, 246 AVE_CHECKING_BALANCE,
         30 AGE, 'HIGH' EDUCATION, 'Mngr' WORKCLASS, 'Married' MARITAL_STATUS,
         'Sales' OCCUPATION, 'Husband' RELATIONSHIP, 'White' RACE,
         'Male' SEX, 70 HOURS_PER_WEEK, '?' NATIVE_COUNTRY, 98 PAYROLL_DEDUCTION
  from dual)
select s.prediction prediction, s.probability probability
from (select PREDICTION_SET(CD_BUYERS76485_DT, 1 USING *) pset
      from records) t,
     TABLE(t.pset) s;
-
Oracle Data Miner GUI interface
Oracle Data Miner's Activity Guides simplify & automate data mining for business users
Oracle Data Miner provides model performance and evaluation viewers
-
Technology Partnership
SPSS Clementine
Combine SPSS Clementine ease of use with ODM in-Database functionality & scalability
Build, store, browse and score models in the Database for optimal performance
InforSense - A Single Optimized Environment for Real Time Predictive Analytics within the Database
-
SQL free analytics: drag-drop application build
Visual analytics: interactive visualisation
Integrative analytics: unified analytical environment
Automated analytics: deploy to Oracle Portal and BPEL
Interact with (visualize) data at any step in the workflow
Deploy the analytic workflow as a Web Service
Deploy the analytic workflow as a service embedding to BPEL, SFA, CRM
Diagram: Oracle Data Sources; Oracle Functionalities (Data Mining, Preprocess, Statistics, Text, OLAP, Scheduler); Oracle Decision Tree Model; InforSense Service; Deployment
-
And last but not least
Oracle Database value innovation
-
Availability: Oracle Data Guard, Oracle Clusterware, Online Operations, Flashback Operations, Rolling Upgrades, Advanced Backup/Recovery, Streams, Replication, Oracle Real Application Clusters, Oracle Secure Backup
Manageability: Automatic Workload Repository, Automatic Memory Management, Automatic Database Diagnostic Monitor, Parallel Operations, Database Control & Grid Control, Tuning Pack, Diagnostic Pack, Change Management Pack, Configuration Management Pack, Provisioning Pack
Storage: Automatic Storage Management, Recovery Manager, Oracle Cluster File System, Information Lifecycle Management, Transportable Tablespaces, External Tables, Compression, Partitioning
Security: Virtual Private Database, Label Security (Fine Grained Audit), Identity Management, Secure Application Roles, Transparent Data Encryption, Database Vault, Audit Vault
The Data Warehousing Books
-
http://www.oracle.com/technology/documentation/database11gR1.html