Copyright © 2016 Oracle and/or its affiliates. All rights reserved. |
Mark Hornick January 2016
Oracle Advanced Analytics Oracle R Enterprise 1.5 – Hot New Features
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Agenda
• Oracle’s Advanced Analytics and R Technologies
• Overview of Oracle R Enterprise
• ORE 1.5 Features
Oracle’s Advanced Analytics
Multiple interfaces across platforms: SQL, R, GUI, Dashboards, Apps
• Users: Data / Business Analysts, R programmers, Business Analysts/Mgrs, Domain End Users
• Interfaces: SQL Developer / Oracle Data Miner, R Client, OBIEE, Applications
• Oracle Advanced Analytics - Database Option: SQL Data Mining & Analytic Functions + R Integration for Scalable, Distributed, Parallel in-Database ML Execution
• Platform: Oracle Database 12c (Oracle Database Enterprise Edition), Hadoop (ORAAH: parallel, distributed algorithms), Oracle Cloud
Oracle’s R Technologies
Supporting R, Oracle Database, and Big Data Appliance/Hadoop
• Oracle R Distribution
• ROracle
• Oracle R Enterprise: component of the Oracle Advanced Analytics Option to Oracle Database
• Oracle R Advanced Analytics for Hadoop: component of the Big Data Connectors Software Suite
Software available to R Community for free
Oracle R Enterprise
Component of Oracle Advanced Analytics option
• Scale R to Big Data
• Use Oracle Database as HPC environment
• Use in-database parallel and distributed machine learning algorithms
• Manage R scripts and R objects in Oracle Database
• Integrate R results into applications and dashboards via SQL
[Diagram: a Client R Engine with the ORE packages connects to the Database Server Machine, where Oracle Database holds the user tables and in-db stats; SQL Interfaces (SQL*Plus, SQLDeveloper, …) also connect to the database.]
IoT Use Case: Energy Demand
• Model each customer’s usage to understand behavior and predict individual usage and overall aggregate demand
• 200 thousand households, each with a utility “smart meter”
• 1 reading / meter / hr
• 200K x 8760 hrs / yr → 1.752B readings
• 3 years’ worth of data → 5.256B readings
• Each customer has 26280 readings
• If each model takes 10 seconds to build, 200K models take 555.6 hrs (23.1 days) serially; with 128 DOP, about 4.3 hrs
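The sizing figures on this slide can be reproduced with a few lines of base R. This is only a worked-arithmetic sketch; the variable names are illustrative, and the parallel estimate assumes ideal linear speedup at the stated degree of parallelism.

```r
# Sizing arithmetic for the smart meter scenario (inputs taken from the slide)
households     <- 200e3      # 200 thousand smart meters
hours_per_year <- 24 * 365   # 8760 readings per meter per year
years          <- 3
secs_per_model <- 10         # build time per model assumed on the slide
dop            <- 128        # degree of parallelism

readings_per_year     <- households * hours_per_year   # 1.752 billion
readings_total        <- readings_per_year * years     # 5.256 billion
readings_per_customer <- hours_per_year * years        # 26280

serial_hours   <- households * secs_per_model / 3600   # about 555.6 hours
serial_days    <- serial_hours / 24                    # about 23.1 days
parallel_hours <- serial_hours / dop                   # about 4.3 hours (ideal speedup)
```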
Scalable Analysis – Model Building
Smart meter scenario
[Diagram: in Oracle Database + ORE, the data is partitioned by customer (c1, c2, … ci, … cn). The R script that builds a model, f(dat,args,…), is invoked once per partition, producing Model c1, Model c2, … Model ci, … Model cn. The diagram also shows the R Script Repository and the R Datastore.]
Build models, partitioned on CUST_ID, and store in database
ore.groupApply(CUST_USAGE_DATA,
               CUST_USAGE_DATA$CUST_ID,
               function(dat, ds.name) {
                 cust_id <- dat$CUST_ID[1]
                 # one linear model per customer, excluding the partition key
                 mod <- lm(Consumption ~ . -CUST_ID, dat)
                 # drop bulky components to reduce the size of the stored object
                 mod$effects <- mod$residuals <- mod$fitted.values <- NULL
                 name <- paste("mod", cust_id, sep="")
                 assign(name, mod)
                 # one datastore per customer, e.g. myDatastore.<cust_id>
                 ds.name1 <- paste(ds.name, ".", cust_id, sep="")
                 ore.save(list=name, name=ds.name1, overwrite=TRUE)
                 TRUE
               },
               ds.name="myDatastore", ore.connect=TRUE, parallel=128
)
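For readers without a database connection, the partition-then-fit pattern can be sketched locally in base R. This is not ORE: it runs in a single process, the `usage` data frame is synthetic and hypothetical, and `build_model` is an illustrative name. It only shows what "one model per CUST_ID partition" means.

```r
# Local, single-process analog of the per-customer model build (base R only).
# usage is synthetic, hypothetical data standing in for CUST_USAGE_DATA.
set.seed(1)
usage <- data.frame(
  CUST_ID     = rep(1:3, each = 50),
  Temperature = runif(150, 0, 35),
  Hour        = rep(0:23, length.out = 150)
)
usage$Consumption <- 1 + 0.05 * usage$Temperature + rnorm(150, sd = 0.1)

build_model <- function(dat) {
  mod <- lm(Consumption ~ . - CUST_ID, data = dat)
  # Drop bulky components, as on the slide, to shrink the stored object
  mod$effects <- mod$residuals <- mod$fitted.values <- NULL
  mod
}

# split() partitions the rows by customer; lapply() fits one model per partition
models <- lapply(split(usage, usage$CUST_ID), build_model)
names(models) <- paste0("mod", names(models))
```

In the ORE version, the database performs the partitioning and runs the function in parallel R engines on the server, so the data does not have to be pulled to the client.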
So, what’s new in ORE 1.5?
Oracle R Enterprise 1.5 – New Features
• Upgraded R version compatibility: R 3.2.0
• Parallel distributed algorithms
– ore.randomForest
– svd
– prcomp
• ore.summary performance enhancement
• ore.grant and ore.revoke on R scripts and datastores
• Datatypes CLOB and BLOB supported for embedded R execution input and output, as well as for ore.create, ore.pull, and ore.push
• ore.groupApply supports partitioning on multiple columns
Upgraded R 3.2.x version compatibility
• ORE 1.5 is certified with R >= 3.2.0
– Open source R
– Oracle R Distribution
• R-3.2.0
– Performance improvements
– Big in-memory data objects
– Compatibility with more than 7000 community-contributed R packages
• Supporting packages for ORE
– New package: randomForest 4.6-10
– Updates to other packages: arules 1.1-9, Cairo 1.5-8, DBI 0.3-1, png 0.1-7, ROracle 1.2-1, statmod 1.4.21
Oracle R Enterprise Predictive Analytics algorithms in-Database
• Classification: Decision Tree, Logistic Regression, Naïve Bayes, RandomForest, Support Vector Machine
• Regression: Linear Model, Generalized Linear Model, Multi-Layer Neural Networks, Stepwise Linear Regression, Support Vector Machine
• Attribute Importance: Minimum Description Length
• Clustering: Hierarchical k-Means, Orthogonal Partitioning Clustering
• Feature Extraction: Nonnegative Matrix Factorization, Principal Component Analysis, Singular Value Decomposition
• Market Basket Analysis: Apriori – Association Rules
• Anomaly Detection: 1 Class Support Vector Machine
• Time Series: Single Exponential Smoothing, Double Exponential Smoothing
New in ORE 1.5 (per the feature list): RandomForest, Principal Component Analysis, Singular Value Decomposition
…plus open source R packages for algorithms in combination with embedded R data- and task-parallel execution
Random Forest Algorithm
• Ensemble learning technique for classification and regression
• Known for high accuracy models
• Constructs many “small” decision trees
• For classification, predicts mode of classes predicted by individual trees
• For regression, predicts mean prediction of individual trees
• Avoids overfitting, which is common for decision trees
• Developed by Leo Breiman and Adele Cutler combining the ideas of “bagging” and random selection of variables resulting in a collection of decision trees with controlled variance
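The two aggregation rules above (mode for classification, mean for regression) can be illustrated in base R. The prediction matrices are hypothetical, with one column per tree, and `majority_vote` is an illustrative helper rather than part of any package.

```r
# How an ensemble combines per-tree outputs (base R only).
# The matrices hold hypothetical predictions: rows = observations, columns = trees.
majority_vote <- function(votes) {
  # Most frequent class per row; ties go to the alphabetically first class
  unname(apply(votes, 1, function(v) names(which.max(table(v)))))
}

class_votes <- rbind(
  c("setosa",    "setosa",     "versicolor"),
  c("virginica", "versicolor", "virginica")
)
class_pred <- majority_vote(class_votes)   # classification: mode of tree predictions

reg_preds <- rbind(
  c(2.0, 2.5, 3.0),
  c(10,  11,  12)
)
reg_pred <- rowMeans(reg_preds)            # regression: mean of tree predictions
```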
ore.randomForest supports classification
• Enables performance and scalability for larger data sets
• Executes in parallel for model building and scoring
– ore.parallel global option used for preferred DOP
• Oracle R Distribution new randomForest function
– Reduces memory requirements over standard R (~7X)
– As a result, reduces memory requirements for ore.randomForest
– ORD randomForest supports classification only
• Can use Oracle R Distribution’s or R’s randomForest package
ore.randomForest – parallel distributed implementation
Exadata 5-2 – half rack, ORE DOP = 40
[Chart: R vs. ORE Random Forest Build Time. x-axis: # rows (ntree = 500); y-axis: Time (seconds), log scale from 1 to 10000]
• 10K rows: R 273 s, ORE 28 s
• 100K rows: R 1389 s, ORE 70 s
• 1M rows: R 6366 s, ORE 360 s
Chart annotations: “Order of magnitude faster”, “~2.8 hours”, “~17 minutes”
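The "order of magnitude" claim can be checked against the chart's data labels. The sketch below assumes the six labels read as R = 273, 1389, 6366 seconds and ORE = 28, 70, 360 seconds for 10K, 100K, and 1M rows; that assignment of labels to series is an assumption about the extracted figure, not something the transcript states.

```r
# Speedup of ORE over R implied by the chart labels (series assignment assumed)
rows     <- c("10K", "100K", "1M")
r_secs   <- c(273, 1389, 6366)
ore_secs <- c(28, 70, 360)

speedup <- setNames(r_secs / ore_secs, rows)
round(speedup, 1)   # roughly 10x to 20x across the three data sizes
```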
ore.randomForest
• ore.randomForest() builds a random forest model by growing trees in parallel
• Scoring method 'predict' runs in parallel
options(ore.parallel=4)
IRIS <- ore.push(iris)
mod <- ore.randomForest(Species~., IRIS)
tree10 <- grabTree(mod, k = 10, labelVar = TRUE)
ans <- predict(mod,IRIS,type="all",supplemental.cols="Species", cache.model=FALSE)
table(ans$Species, ans$prediction)
ore.randomForest Results
Singular Value Decomposition
Principal Component Analysis
• The functions svd and prcomp overloaded
– Execute in parallel
– Accept ore.frame objects
• In-database execution to improve scalability and performance
• No data movement
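Because the overloaded functions keep the base R interface, the relationship between the two can be shown on a local data.frame. This sketch uses base R only, not ORE, and is meant as a reminder of how prcomp output relates to an SVD of the centered data.

```r
# Base R on a local data.frame: how prcomp relates to svd
X  <- as.matrix(iris[, -5])
Xc <- scale(X, center = TRUE, scale = FALSE)   # prcomp centers columns by default

s  <- svd(Xc)
pc <- prcomp(X)

# Component standard deviations follow from the singular values
sdev_from_svd <- s$d / sqrt(nrow(X) - 1)
same_sdev <- isTRUE(all.equal(sdev_from_svd, pc$sdev))

# Loadings equal the right singular vectors, up to the sign of each column
same_loadings <- isTRUE(all.equal(abs(s$v), abs(unname(pc$rotation))))
```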
SVD example using ore.frame
# Set up the data
dat <- iris[,-5]; dat$IDX <- seq_len(nrow(dat))
ore.create(dat,table="DAT")
ore.exec("alter table DAT add constraint DAT primary key (\"IDX\")")
ore.sync(table = "DAT", use.keys = TRUE)
# Compute svd on ore.frame
sol <- svd(DAT[,-5])
plot(cumsum(sol$d^2/sum(sol$d^2))) # % explained variance
# Derive the U matrix since not provided with model
sol.U <- as.matrix(DAT[,-5]) %*% (sol$v) %*% diag(1./sol$d)
class(sol.U) # ore.tblmatrix
k<-1 # use one singular vector
recon1 <- (sol.U)[,1:k,drop=FALSE] %*%
diag((sol$d)[1:k,drop=FALSE],nrow=k,ncol=k) %*%
t((sol$v)[,1:k,drop=FALSE])
class(recon1) # ore.tblmatrix
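# mat and myviz() are not defined on this slide; presumably mat is the original
# data matrix and myviz() is a user-written plotting helper
myviz(mat,recon1,lab1="Iris data", lab2="Recon 1")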
Example inspiration: StackExchange Cross Validated
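The same steps can be tried on an in-memory matrix with base R, which also makes it possible to check the derived U against the one base svd() returns. This is a local sketch, not the ORE code path; `A`, `U`, `recon_k`, and `rel_err` are illustrative names.

```r
# Base R analog of the slide's steps on an in-memory matrix
A   <- as.matrix(iris[, -5])
sol <- svd(A)

# Derive U from A, V, and the singular values: U = A V D^-1
U <- A %*% sol$v %*% diag(1 / sol$d)
u_matches <- isTRUE(all.equal(unname(U), sol$u))

# Rank-k reconstruction from the first k singular triplets
k <- 1
recon_k <- U[, 1:k, drop = FALSE] %*%
  diag(sol$d[1:k], nrow = k, ncol = k) %*%
  t(sol$v[, 1:k, drop = FALSE])

# Relative Frobenius-norm error of the rank-k approximation
rel_err <- norm(A - recon_k, "F") / norm(A, "F")
```

The relative error equals the square root of the share of squared singular values that the truncation discards, which is the complement of the cumulative curve plotted in the slide's example.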
Performance Enhancement – ore.summary
• ore.summary(data, var, stats = c("n", "mean", "min", "max"), class = NULL, types = NULL, ways = NULL, weight = NULL, order = NULL, maxid = NULL, minid = NULL, mu = 0, no.type = FALSE, no.freq = FALSE)
• More than an order of magnitude performance improvement
> options("ore.parallel")
$ore.parallel
[1] 40
> system.time(res <- ore.summary(ONTIME_10M, var=c("ARRDELAY","DEPDELAY","DISTANCE"),
+ class=c("YEAR", "MONTH", "DAYOFWEEK", "UNIQUECARRIER", "CANCELLED"), order="-type"))
user system elapsed
0.018 0.000 17.248
> system.time(res <- ore.summary(ONTIME_1B, var=c("ARRDELAY","DEPDELAY","DISTANCE"),
+ class=c("YEAR", "MONTH", "DAYOFWEEK", "UNIQUECARRIER", "CANCELLED"), order="-type"))
user system elapsed
0.016 0.000 55.141
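A quick calculation on the two timings shows how elapsed time grows with data size. The row counts are assumed from the table names (10 million and 1 billion); the transcript does not state them explicitly.

```r
# Scaling implied by the two ore.summary timings (row counts assumed from table names)
rows    <- c(ONTIME_10M = 10e6, ONTIME_1B = 1e9)
elapsed <- c(ONTIME_10M = 17.248, ONTIME_1B = 55.141)   # seconds, from the output above

rows_per_sec <- rows / elapsed
data_ratio   <- rows[["ONTIME_1B"]] / rows[["ONTIME_10M"]]         # 100x more rows
time_ratio   <- elapsed[["ONTIME_1B"]] / elapsed[["ONTIME_10M"]]   # about 3.2x longer
```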
ore.summary
• Wide range of statistical functions available for stats argument
– "n" or "freq": Count of non-missing values
– "count" or "cnt": Count of all observations
– "nmiss": Count of missing values
– "mean" or "avg": Average of values
– "min": Minimum of values
– "max": Maximum of values
– "css": Corrected sum of squares
– "uss": Uncorrected sum of squares
– "cv": Coefficient of variation
– "sum": Sum of values
– "sumwgt": Weighted sum of values
– "range": Range of values
– "stddev" or "std": Standard deviation of values
– "stderr" or "stdmean": Standard error for the mean
– "variance" or "var": Variance of values
– "kurtosis" or "kurt": Kurtosis
– "skewness" or "skew": Skewness
– "loccount<" or "loc<": Number of observations whose values < supplied mu
– "loccount>" or "loc>": Number of observations whose values > supplied mu
– "loccount!" or "loc!": Number of observations whose values != supplied mu
– "loccount" or "loc": Number of observations whose values == supplied mu
– Percentile types "p0", "p1", "p5", "p10", "p25" or "q1", "p50" or "q2" or "median", "p75" or "q3", "p90", "p95", "p99", "p100": Percentile or quantile
– "qrange" or "iqr": Interquartile range, Q3-Q1
– "mode": Most frequently occurring value
– "lclm": 2-sided left confidence limit with confidence level of interval = 0.95
– "rclm": 2-sided right confidence limit with confidence level of interval = 0.95
– "clm": 2-sided confidence interval with confidence level of interval = 0.95
– "t": Student's t-test statistic
– "probt" or "prt": Two-tailed p-value for Student's t-test
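Several of the less familiar statistic names for the stats argument can be unpacked with base R on a small vector. The sketch uses conventional textbook definitions; the sample `x` and location `mu` are hypothetical, and ORE's exact conventions (for example percentile interpolation or the scaling of "cv") are not stated on the slide, so treat this as an illustration of the concepts rather than a specification of ore.summary.

```r
# Conventional definitions of selected statistics (base R, hypothetical data)
x  <- c(2, 4, 4, 5, 7, NA, 9)   # sample with one missing value
mu <- 4                          # location value used by "loccount*" and "t"
v  <- x[!is.na(x)]

stats <- list(
  n      = length(v),                       # non-missing values
  count  = length(x),                       # all observations
  nmiss  = sum(is.na(x)),                   # missing values
  css    = sum((v - mean(v))^2),            # corrected sum of squares
  uss    = sum(v^2),                        # uncorrected sum of squares
  range  = diff(range(v)),
  stderr = sd(v) / sqrt(length(v)),         # standard error of the mean
  "loccount<" = sum(v < mu),
  "loccount>" = sum(v > mu)
)

# One-sample t statistic against mu and its two-tailed p-value
stats$t     <- (mean(v) - mu) / stats$stderr
stats$probt <- 2 * pt(-abs(stats$t), df = stats$n - 1)

# Two-sided 95% confidence limits for the mean
half       <- qt(0.975, df = stats$n - 1) * stats$stderr
stats$lclm <- mean(v) - half
stats$rclm <- mean(v) + half
```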
ORE 1.5 Datastore – grant and revoke
• Save and load R objects using Oracle Database for persistence
• In ORE 1.4.1, each schema has a single datastore table that stores all named datastores
• In ORE 1.5, users can provide read-only access to datastores created “grantable”
– “Grantable” datastores created as individual tables in the user’s schema
– “Private” datastores still reside in a common table in the user’s schema
• Functions
– ore.save, ore.load
– ore.datastore, ore.datastoreSummary
– ore.delete
– ore.grant, ore.revoke
Datastore – granting access
ore.save(iris, name="ds_1")
ore.save(mtcars, name="ds_2", grantable=TRUE)
ore.grant(name="ds_2", type="datastore", user="User2")
ore.datastore(type="all")
[Diagram: Client 1 and Client 2 connect to Oracle Database, which holds the User1 and User2 schemas. Private datastores (schema local), such as ds_1 with the iris data frame, are shown with rq$datastoreinventory. Shared datastores use one table per datastore: ds_2, with the mtcars data frame, is shown as ore$ds2_20, and granting read access to User2 exposes it as User1.ore$ds2_20.]
R Script Repository – granting access
• To create/drop scripts, user must have RQADMIN role
• Ordinary ORE users are not allowed to create R scripts at database server
• RQADMIN user can drop any global R script
– Only the script creator can drop user scripts, i.e., those created with ore.scriptCreate(global = FALSE)
– A user can grant access to a script to individuals or to all (public)
• Any ORE user can execute (global or granted) repository R scripts
• Determining script visibility – ore.scriptList(type = …)
– "global" shows global scripts
– "user" shows all scripts created by the current user with global=FALSE
– "all" shows global and user scripts
– "grant" shows scripts the user has granted to others
– "granted" shows scripts the user has been granted access to
User-defined R functions – shared functions
# create an R script for the current user
ore.scriptCreate("privateFunction",
                 function(data, formula, ...) lm(formula, data, ...))
# create a global R script available to any user
ore.scriptCreate("globalFunction",
                 function(data, formula, ...) glm(formula=formula, data=data, ...),
                 global = TRUE)
# list R scripts
ore.scriptList()$NAME # type= "user" default
ore.scriptList(pattern="Function", type="all")$NAME
User-defined R Functions – ore.grant and ore.revoke
# load an R script by name to an R function object
ore.scriptLoad(name="privateFunction")
ore.scriptLoad(name="globalFunction", newname="privateFunction2")
# grant and revoke R script read privilege to and from public
ore.grant(name = "privateFunction", type = "rqscript")
ore.scriptList(pattern="Funct",type="grant")$NAME
ore.revoke(name = "privateFunction", type = "rqscript")
ore.scriptList(pattern="Funct",type="grant")$NAME
# drop an R script
ore.scriptDrop("privateFunction")
ore.scriptDrop("globalFunction", global=TRUE)
ore.scriptList(pattern="Funct",type="all")$NAME
ore.groupApply – multi-column INDEX
• INDEX: An ore.vector or ore.frame object containing ore.factor objects or columns, each of which is the same length as argument 'X'. It is used to partition the data in 'X' before sending it to function 'FUN'
res <- ore.groupApply(ONTIME_S[c(7,12,18,19,22)],
                      INDEX = ONTIME_S[,c(7,12)], # day of week, unique carrier
                      function(df) {
                        if(nrow(df) == 0)
                          NULL
                        else
                          list(df[1,1], df[1,2],
                               summary(lm(ARRDELAY ~ DEPDELAY+DISTANCE, data=df)))
                      },
                      parallel = 4)
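Partitioning on more than one column can be mimicked locally with split() on a list of grouping vectors. The sketch below is base R on mtcars, not ORE; the choice of columns, the model formula, and the minimum-row threshold are illustrative assumptions.

```r
# Local analog of a multi-column INDEX (base R): partition on two columns, fit per group
idx    <- list(cyl = mtcars$cyl, am = mtcars$am)
groups <- split(mtcars, idx, drop = TRUE)   # one data.frame per (cyl, am) pair present

res <- lapply(groups, function(df) {
  if (nrow(df) < 4) return(NULL)            # skip groups too small for two predictors
  list(cyl = df$cyl[1], am = df$am[1],
       fit = summary(lm(mpg ~ wt + hp, data = df)))
})
names(res)   # group labels combine both keys, e.g. "4.1" for cyl = 4, am = 1
```

As on the slide, each result carries the key values of its partition alongside the model summary, so the groups remain identifiable after the results are collected.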
Oracle R Enterprise 1.5 – Summary
• Upgraded R version compatibility: R 3.2.0
• Parallel distributed algorithms
– ore.randomForest
– svd
– prcomp
• ore.summary performance enhancement
• ore.grant and ore.revoke on R scripts and datastores
• Datatypes CLOB and BLOB supported for embedded R execution input and output, as well as for ore.create, ore.pull, and ore.push
• ore.groupApply supports partitioning on multiple columns
To Learn More about Oracle’s R Technologies…
http://oracle.com/goto/R