T. Agami Reddy, Ph.D., PE
Email: [email protected]
Website: www.auroenergy.com
Workshop on
Big Data in Building Operations
Carleton University, Ottawa, Canada
June 27-28, 2017
Building Energy Data Analytics:
Past, Present, Future
Data Analytics_June 2017_Reddy
Outline
Objectives
– Provide panorama or overview of applications
– Identify specific aspects of some of the application categories and provide status report
• Building Energy Data Analysis and Modeling Methods
• Brief Overview of Big Data and Relevant Applications
• Building Energy Data Analytics:
- Building design
- Building operations:
+ Auditing
+ M&V
+ Day-to-day O&M (Demand response, CC, condition monitoring/FDD, forecasting and supervisory control)
Building Energy Data Analysis and Modeling Methods
Statistical Theory/Concepts
Classical Parametric
Classical Non-Parametric
Resampling
Data Mining and Machine Learning
Non-Parametric (data driven)
Big Data
Transparent algebraic models
Opaque models with no clear analytical expression and output processed internally
Inferring Population Behavior from Samples
Classification/Clustering/Association (need large samples)
Data Analytics is the algorithmic implementation of these methods
• Three basic elements
- Huge amounts of heterogeneous multi-source data
- Datafication, storage, retrieval, and processing/analysis
- Big data mind set: unique and novel ways of how to tap data so as to unlock value in terms of useful and actionable knowledge
• Uses social media, open source and govt data,....
• Size of the data compensates for more noisy non-curated data
• Provides general non-causal trends and probabilities which were unanticipated
• Value of domain expertise matters less in identifying trends
• Whole suite of new tools, procedures and software for data capture, datafication, storing in databases, retrieval/querying, processing, ...
V. Mayer-Schonberger and K. Cukier, “Big Data”, John Murray, 2013
Then Now
- Is this simply a flashy leitmotif of the knowledge economy? Internet of Things (IoT)
- Increases dangers of false learning beliefs and unjustified confidence
- Does it provide the degree of quantification necessary for relevant and actionable
implementation?
Relevant Big Data Applications
Building Operations
Learning energy
consumption in residences
(DM using ANN
ensembles and adaptive)
Electric Utilities
DatamineSMI (cluster
customers for rate plan
modification)
Smart Grid Operations
City Level
Routine City
Operations
Extreme Events Mgmt
Development Planning for Aspirational
Goals (carbon neutrality,
sustainable cities)
Bldg Energy Data Analytics for Decision Support Systems
Bldg Design
Individual Bldg
Cluster of Bldgs (campus, community, city)
Bldg Operations
Past
• Heuristic trial and error
• Design of experiments and regression
• Computer science techniques (expert systems, ontology-based ...)
Present
• Inclusion of daylighting, passive strategies
• Design parameter sensitivity and range of variability
• Simulation based tools -Monte Carlo, optimiz. (GA)
• Enhanced visualization
Future
• Specialized simulation environments
• Sophisticated ML tools for processing batch simulations
• Interactive design assistants
• Dynamic filters for credibility checks on simulations
1/4
Individual Building Operations
Auditing
Benchmarking
Walk-thru audit
Investment grade audit
M&V
Day-to-Day O&M
Past
• Benchmarking using EUI
• Audit done using utility bills, spot measurements
• Engg calculations or calibrated simulation
• Heuristic expert systems ECM (FEDS program)
Present
• Asset score (ASHRAE)
• EUI+ regression (EPA-PM)
• Interval data + spot meas.
• Screening parameters of inverse change point models
• Improved calibration simulation
Future (remote, peer groups)
• Leverage SMI data using sophisticated ML tools
• Creation of databases for bldg load prototypes- DM
• Simulation based ECM (USDOE Asset Score Management tool)
2/4
Building Operations
Auditing M&V
Indiv. Bldg: Utility Bills
Indiv. Bldg: Monitoring
Cluster of Bldgs
Day-to-Day O&M
Past
• Utility bills (VBDD- PRISM, MMT)
• Interval data: Change Point or calibrated simulation
• Disaggregated data: ECM isolation or equipment calibration
Present
• Hourly or sub-hourly Change Point models or ANN models
• Robust modeling: cross-validation methods
• Improved calibrated simulation using end-use data
Future (baseline modeling)
• Non-parametric methods
• More sophisticated ML tech.
• DM methods (random forest)
• Short-term monitoring
• Portfolio analy. (group bldgs)
• Improve Change Point modeling and error metrics
3/4
The intent of this research is to identify simple modeling techniques to determine the best time to begin in-situ monitoring of building energy use, and to determine the least amount of data required for generating acceptable long term predictions.
[Figure: scatter plots and time series plots of WBE, CHW and HW]
Change point behavior creates problems for short-term monitoring
Singh, Reddy and Abushakra (2014). Predicting Annual Energy Use in Buildings Using
Short-Term Monitoring: The Dry Bulb Temperature Analysis (DBTA) Method, ASHRAE
Trans., January.
How can length of monitoring be reduced to save on M&V costs?
Subbarao, Etingov and Reddy (2014) , An Actuarial Approach to Retrofit Savings in Buildings. ASHRAE Trans.
Suitable for risk analysis by financial institutions for large investment decisions
Portfolio Analysis to Determine ECM Savings in Groups of Buildings
Building Operations
Auditing
Day-to-Day O&M
Individual Bldg
Monitoring
Residences Small/medium commercial
Large Commercial
Cluster of Bldgs
Integrated Energy Systems
M&V
Inverse Statistical Models
Data Mining and Machine Learning
Calibrated Simulation
Demand Side Mgmt.
Demand Response
Condition Monitoring
AFDDE
Commission
Supervisory Control
Forecasting, Control and Dispatch
4/4
Calibration of Detailed Bldg Simulation Programs
Individual Bldg
Urban Bldg Energy Modeling (UBEM)
Each and every bldg
Empirical modeling of city blocks
Archetype or prototype bldg. model (with and w/o canopy correction)
Weather research and forecasting (WRF) mesoscale models (1 km x 1 km)
Past
• Heuristic trial and error
• Simulation search methods (MC)-parameter sensitivity
• Simulation optim(GA)
Present/Future
• Machine Learning Optimization tools (Autotune, GA)
• Automated capture of bldg. imagery (GIS, satellite, airborne, land-based, cartographic) and HVAC prototypes (databases)
• Add-ons to EnergyPlus: canyon/canopy correction (impact of neighboring bldgs on radiation, airflow and temperature distributions)
• Coupling WRF grids with proper specs of buildings, roads and open spaces as grid macro inputs
- Fei Zhao (2012). Agent-based modelling of commercial building stocks for
energy policy and demand response analysis, Ph.D dissertation, Georgia Tech.
- Julia Sokol (2015). Deriving archetype templates for urban building energy models
based on measured monthly energy use, Ph.D dissertation, MIT
Difficulties during Day-to-Day O&M
•Building energy systems and equipment routinely perform
below expectations
•Building operators have limited time and expertise
•Unnecessary or improper manual overrides in response to
occupant complaints
•Buildings are dynamic entities which need to be retuned
constantly
•Limited number of sensors and manpower for smaller buildings
•Too many sensors in large buildings with EMCS
- Sensor info may overwhelm operator
•Degradation goes unnoticed for extended periods
Day-to-Day Operations
Condition Monitoring
Definition: the design and use of sensing equipment and analysis methods to monitor and report on current status and to detect a significant change due to improper operation or equipment degradation attributable to a fault (soft or hard)
Condition Monitoring
Report on current status
Improper Operation
Whole Building
Specific Equipment
Equip/System Perform. Degradation (FDDE)
Fault Detection (FD)
Fault Diagnosis
Evaluation/ Prognosis
EEOs Identification Detect Isolate Commission Replace
Day-to-Day Operations
Condition Monitoring
• Advantages: reduces energy use, prolongs equipment life and reduces cost associated with service and maintenance
• Involves monitoring, and identification of improper building start/stop, equipment hard/soft faults, occupant behavior
• Suite of sensing methods (visual, thermal, electrical, vibrational, optical, tribology...)
• Suite of analysis methods- generally requires higher level of modeling sophistication than needed for M&V
• Model residual behavior acquires greater importance
• Need refined and local error metrics superior to global MBE, RMSE or CV
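The last bullet can be illustrated with a small sketch. It assumes residual = observed minus predicted and divides by n rather than by degrees of freedom (both are simplifying assumptions, as are the function names and the toy data). It shows how a global MBE of zero can hide equal and opposite biases in the occupied and unoccupied periods, which a per-period (local) metric exposes.

```python
import math
from collections import defaultdict

def global_metrics(observed, predicted):
    """Global MBE, RMSE and CV(RMSE); residual = observed - predicted (assumed convention)."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mbe = sum(residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    cv = rmse / (sum(observed) / n)  # RMSE normalized by the mean observed value
    return mbe, rmse, cv

def mbe_by_group(observed, predicted, groups):
    """Mean residual within each group (e.g. period of day): one simple 'local' metric."""
    sums, counts = defaultdict(float), defaultdict(int)
    for o, p, g in zip(observed, predicted, groups):
        sums[g] += o - p
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}

# Toy data (hypothetical): the model under-predicts when occupied, over-predicts otherwise.
observed = [110.0, 110.0, 90.0, 90.0]
predicted = [100.0, 100.0, 100.0, 100.0]
periods = ["occupied", "occupied", "unoccupied", "unoccupied"]

mbe, rmse, cv = global_metrics(observed, predicted)
local = mbe_by_group(observed, predicted, periods)
print(mbe, rmse, cv, local)
```

Here the global MBE is zero while each period carries a bias of 10 units, which is the kind of patterned residual that matters for condition monitoring.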
[Figure: Plot of WBE [MJ], observed versus predicted, Phoenix, Medium Office Building, Weekdays, TMY3]
[Figure: Outlier Plot with Sigma Limits, WBE_Res_TMY versus row number. Sample mean = 0.000115598, std. deviation = 44.6723]
[Figure: Outlier Plot with Sigma Limits, WBE_Res_2013 versus row number. Sample mean = -12.0185, std. deviation = 50.9113]
CP-MLR Model Residuals for both Training and Prediction are patterned due to Thermostat Set-point Change
COOLING schedule:
Until 5 am = 26.7 C
5 am - 6 am = 25.6 C
6 am - 7 am = 25 C
7 am - 10 PM = 24 C
10 PM - midnight = 26.7 C
HEATING schedule:
Until 5 am = 15.6 C
5 am - 6 am = 17.8 C
6 am - 7 am = 20 C
7 am - 10 PM = 21 C
10 PM - midnight = 15.6 C
Tdb, Tdp, Tdb+, Tdp+, Elight, Eplug
[Figure: Outlier Plots with Sigma Limits, residuals versus row number]
CP-MLR Models
- WBE_Res_TMY: sample mean = 0.000115598, std. deviation = 44.6723
- WBE_Res_2013: sample mean = -12.0185, std. deviation = 50.9113
ANN_MLP-BP (using CP terms)
- Res_TMY_ANN: sample mean = -0.357654, std. deviation = 22.2957
- Res_2013_ANN: sample mean = 5.30096, std. deviation = 26.0908
[Figure: X-bar and Range charts of residuals versus subgroup (0 to 300)]
CP-MLR Models
- X-bar Chart for WBE_Res_TMY: CTR = 0.00, UCL = 24.41, LCL = -24.41
- Range Chart for WBE_Res_TMY: CTR = 155.24, UCL = 240.39, LCL = 70.10
ANN_MLP-BP (using CP terms)
- X-bar Chart for Res_TMY_ANN: CTR = -0.36, UCL = 13.88, LCL = -14.60
- Range Chart for Res_TMY_ANN: CTR = 90.59, UCL = 140.27, LCL = 40.90
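The control limits quoted for the X-bar and Range charts can be approximately reproduced from the center lines alone. The sketch below assumes standard Shewhart X-bar/R limits with subgroups of 24 residuals (for example, one day of hourly values). The slides do not state the subgroup size; it is inferred here from the ratio of the limits to the center lines. The tabulated constants for n = 24 are rounded to three decimals, so agreement is only to within about 0.1.

```python
# Shewhart chart constants for subgroup size n = 24 (standard tabulated values, 3 decimals).
A2, D3, D4 = 0.157, 0.451, 1.548

def chart_centers(subgroups):
    """Center lines: grand mean of subgroup means and mean of subgroup ranges."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    return sum(means) / len(means), sum(ranges) / len(ranges)

def limits_from_centers(xbar_center, range_center):
    """(LCL, UCL) for the X-bar chart and for the Range chart."""
    return {
        "xbar": (xbar_center - A2 * range_center, xbar_center + A2 * range_center),
        "range": (D3 * range_center, D4 * range_center),
    }

# Center lines as read from the charts (CTR values).
cp_mlr = limits_from_centers(0.00, 155.24)
ann = limits_from_centers(-0.36, 90.59)
print(cp_mlr)
print(ann)
```

Under these assumptions the tighter ANN limits follow directly from its smaller mean range (90.59 versus 155.24).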
Example 1: Condition Monitoring
Automatic Identification of Actionable Energy Efficiency Opportunities (EEOs) from Interval Data for Small-Medium Buildings
Types of EEOs Studied
• Constant Deviation
• Base load Deviation
• Late Shutdown
• Early Startup
Howard, Reddy and Runger (2016). Automated Data Mining Methods for Identifying Energy Efficiency Opportunities Using Whole-Building Electricity, ASHRAE Trans., January
Overview of Analytical Method
• Preprocess data
• Determine occupied & unoccupied period
• Infer early startup & late shutdown EEOs
• Adjust electricity for temperature effects
• Cluster temperature-adjusted values
• Infer constant & baseload deviation EEOs from clusters
• Quantify EEO savings potential
Stage 1: Schedule EEO Detection, CO Office
• Letting $y_i(t)$ denote the electricity consumption during hour $t$ of day $i$, we fit the following model for each day:
$$\hat{y}_i(t) = \beta_0 + \beta_1 t + \beta_2 (t - k_{i1})_+ + \beta_3 (t - k_{i2})_+ + \beta_4 (t - k_{i3})_+ + \beta_5 (t - k_{i4})_+ + \beta_6 (t - k_{i5})_+ + \beta_7 (t - k_{i6})_+$$
where $(u)_+ = \max(u, 0)$.
• Knot points $k_{ij}$ in the equation above are chosen for each day to minimize the residual sum of squares:
$$RSS_i = \sum_{t=1}^{24} [y_i(t) - \hat{y}_i(t)]^2$$
Six-knot spline regression found to be best for detecting startup and shutdown of office bldgs
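A minimal sketch of the knot search described above, assuming integer-hour candidate knots and an exhaustive search over all knot combinations. The slides do not say how the minimization is carried out, so the search strategy, the function names and the synthetic load profile are all illustrative. Least squares is solved with a small pure-Python routine to keep the example self-contained.

```python
from itertools import combinations

def hinge(t, k):
    """Truncated linear term (t - k)_+ = max(t - k, 0)."""
    return max(t - k, 0.0)

def solve_least_squares(X, y):
    """Ordinary least squares via the normal equations and Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

def fit_daily_spline(y, knots):
    """Fit y(t) = b0 + b1*t + sum_j b_j*(t - k_j)_+ for hours t = 1..len(y); return (beta, RSS)."""
    X = [[1.0, float(t)] + [hinge(t, k) for k in knots] for t in range(1, len(y) + 1)]
    beta = solve_least_squares(X, y)
    rss = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2 for row, yi in zip(X, y))
    return beta, rss

def best_knots(y, n_knots, candidates=range(2, 24)):
    """Knot combination with the smallest RSS (exhaustive search, assumed strategy)."""
    return min(combinations(candidates, n_knots), key=lambda ks: fit_daily_spline(y, ks)[1])

# Synthetic day (hypothetical): flat until hour 6, ramp up to hour 9, flat afterwards.
profile = [10.0 + 5.0 * hinge(t, 6) - 5.0 * hinge(t, 9) for t in range(1, 25)]
knots = best_knots(profile, 2)
print(knots)
```

The demonstration uses two knots so that it runs quickly. With six knots and 22 candidate hours the same search evaluates 74,613 combinations per day, which is workable but would normally call for a vectorized least-squares routine.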
Schedule EEO Results: CO Office
Behavior                       Number of days   Savings opportunity (%)   Annual savings opportunity (kWh)
normal                         159              0%                        0
late shutdown                  40               6.67%                     23,985
early startup                  25               3.41%                     7,969
early shutdown                 3                -3.51%                    -1,028
early startup, late shutdown   21               10.27%                    20,167
late startup                   2                -3.68%                    -709
other                          3                -1.88%                    -549
Stage 2: Amplitude EEO Detection
• Using the knot points identified in stage 1, we determine the hours in which the building was occupied and unoccupied for each day as follows:
– Occupied period: $k_{i2} < t < k_{i5}$
– Unoccupied period: $t < k_{i1}$ or $t > k_{i6}$
• Let $\bar{x}_{i,o}$ and $\bar{y}_{i,o}$ denote the mean hourly external temperature and the mean hourly electricity consumption during the occupied period of day $i$
– $\bar{x}_{i,U}$ and $\bar{y}_{i,U}$ denote the same quantities but for the unoccupied period of day $i$
• We fit the following spline regression models using robust regression, where $k_o$ and $k_U$ are chosen to minimize the robust residual sum of squares:
– $\hat{y}_{i,o} = \beta_{0,o} + \beta_{1,o} \bar{x}_{i,o} + \beta_{2,o} (\bar{x}_{i,o} - k_o)_+$
– $\hat{y}_{i,U} = \beta_{0,U} + \beta_{1,U} \bar{x}_{i,U} + \beta_{2,U} (\bar{x}_{i,U} - k_U)_+$
• The residuals from these models are then divided by their actual values $\bar{y}_{i,o}$ and $\bar{y}_{i,U}$, and the resulting values are clustered using DBSCAN
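The last step can be sketched as follows. This is a minimal, pure-Python DBSCAN for one-dimensional values, not the implementation used in the study. The parameters `eps` and `min_samples`, the convention of labeling noise as 0, and the toy daily values are all assumptions made for illustration. The fitted values are taken as given, so the robust regression step is not shown.

```python
def relative_residuals(actual, fitted):
    """Residual divided by the actual value, as described in the last bullet."""
    return [(a - f) / a for a, f in zip(actual, fitted)]

def dbscan_1d(values, eps, min_samples):
    """DBSCAN on scalar values. Returns labels 1, 2, ... for clusters and 0 for noise."""
    n = len(values)
    # Each point counts as its own neighbor.
    neighbors = [[j for j in range(n) if abs(values[i] - values[j]) <= eps] for i in range(n)]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = 0  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == 0:
                labels[j] = cluster  # noise point reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # only core points expand the cluster
    return labels

# Toy daily means (hypothetical): five typical days, three days about 15% high, one outlier.
actual = [100.0] * 9
fitted = [100.0, 99.0, 101.0, 98.0, 100.0, 85.0, 84.0, 86.0, 40.0]
values = relative_residuals(actual, fitted)
labels = dbscan_1d(values, eps=0.05, min_samples=3)
print(labels)
```

Days that share a label have similar relative deviations from the temperature-adjusted baseline, so each cluster can be reviewed as one candidate amplitude EEO.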
Amplitude Results: NM Office
Behavior              Number of days   Savings opportunity (%)   Annual savings opportunity (kWh)
Cluster 1 amplitude   182              -0.84%                    -1526
Cluster 2 amplitude   44               -13.59%                   -5903
Cluster 3 amplitude   12               13.17%                    1971
Cluster 0 amplitude   4                -2.66%                    -116
Example 2: Mining of Monitored Data from Residences
“Mining Hidden Knowledge from Measured Data for Improving
Building Energy Performance”, Zhun Yu, Ph.D thesis, Concordia
University, January 2012
• Developed a classification decision tree methodology for establishing a
predictive model for energy demand in residences (low or high EUI)
• Cluster analysis to identify occupant behavior patterns that could save
energy
• Association rules identified correlations in building operational data which
could lead to energy savings by modifying mechanical ventilation
equipment
• Above three methods combined into a methodology to allow identifying
occupant behavior which needs to be modified and provide feasible
recommendations
Types of Maintenance Related to
Equipment Degradation
Reactive Maintenance (done in small bldgs with limited staff)
Preventive Maintenance (done by service companies)
Predictive Maintenance (results in continuous commission)
Continuous-Commissioning/Re-Tuning
[Figure: Energy Consumption versus Time]
Typical commercial building behavior over time
Periodic Re-tuning Ensures Persistence
Continuous Re-tuning Maximizes Persistence
S. Katipamula (2012), Lessons learnt from Building Re-tuning Training,
ASHRAE Conf., Jan
Massieh Najafi (2010). Fault Detection and Diagnosis in Building HVAC Systems, Doctoral dissertation, Univ of California, Berkeley, Fall.
Definitions: FDDE
Fault: Abnormal operation
- hard and soft/incipient faults
- types: process, sensor, control
Detection: Signaling occurrence of a faulty condition, situation or operation
Diagnosis: Identifying the root cause of the fault- reasoning in the reverse direction: ascertain cause given effect
Evaluation: Impact of the fault on $ or equipment life
Action: Corrective measures taken --- Commissioning
D.M. Himmelblau (1978),
Fault Detection and Diagnosis
in Chemical and Petrochemical
Processes, Elsevier
Fault Detection
Model Based (Analytical) FD Methods
[Figure: COP predicted versus COP measured with an uncertainty band; points are labeled "ok" or "alarm"]
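A minimal sketch of model-based fault detection as pictured on this slide. It assumes the uncertainty band is set at plus or minus k sample standard deviations of the residuals from fault-free operation. The value of k, the function names and the sample COP values are illustrative assumptions, not taken from the slides.

```python
import statistics

def band_half_width(fault_free_residuals, k=3.0):
    """Half-width of the uncertainty band: k sample standard deviations (assumed rule)."""
    return k * statistics.stdev(fault_free_residuals)

def detect(measured, predicted, half_width):
    """Flag each point 'alarm' when measured minus predicted falls outside the band."""
    return ["alarm" if abs(m - p) > half_width else "ok"
            for m, p in zip(measured, predicted)]

# Hypothetical residuals (measured COP minus predicted COP) from fault-free operation.
training_residuals = [0.10, -0.10, 0.05, -0.05, 0.0]
half_width = band_half_width(training_residuals)

# Two new operating points: a small deviation and a large drop in measured COP.
flags = detect(measured=[4.0, 3.5], predicted=[4.1, 4.0], half_width=half_width)
print(half_width, flags)
```

A wider band reduces false alarms but delays or misses detection, which is the trade-off that the Type I and Type II error weighting addresses.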
[Figure: RP1043 Lab Chiller. COP (axis from 2.0 to 5.0) versus Test Number (1 to 27) for series Normal2, CF6, CF12, CF20 and CF30]
FDD method should be very sensitive:
Differences in COP between normal and faulty states are small as against
the large variation during normal operation
Variation of measured COP with operating conditions for Normal operation and under condenser fouling tests at four severity conditions
3/6
Fault Diagnosis
Phil Haves, LBNL, 2004
• Diagnosis requires knowledge of how different faults affect operation
• Generally, three methods of diagnosing faults:
- Analyze how differences between predicted and actual performance
vary with operating points, e.g., using IF/THEN rules
- Compare actual performance with different mathematical models
of faulty performance
- Estimate parameters of an on-line model extended to treat faults
(e.g. low estimated UA value indicates coil fouling)
Heuristic
Fault signatures
pattern recognition
Parameter estimation
D.M. Himmelblau (1978)
Chris Hobbs (2011?) Fault Tree Analysis with Bayesian Belief Networks for Safety-Critical
Software, QNX Software Systems
Appealing Method
Evaluating Different FDD Methods
Need to weight Type I and Type II errors by cost consequences (false alarms and missed opportunities)
Evaluation Methodology
$$\min J = \min\{J_1 + J_2 + \ldots\} = \min\Big\{ P_0 \cdot F_P \cdot C_{FP} + c_e \sum_{f=1}^{N_F} \big( P_f \cdot E_f \cdot F_{N,f} \cdot t_f \big) \Big\}$$
where
$J_1$: cost penalty due to false positives
$J_2$: cost in increased energy use due to missed detection of faults (or false negatives)
$c_e$: cost of electric energy
$C_{FP}$: cost of technician's time to verify complete system in response to a false alarm
$F_{N,f}$: false negative rate for fault $f$ (or missed opportunity rate)
$F_P$: false positive rate (or false alarm rate)
$f$: index for fault type
$N_F$: total number of possible faults in system
$P_f$: probability of occurrence of fault type $f$
$P_0$: probability of occurrence of no-fault (i.e., fractional time of fault-free chiller operation)
$E_f$: extra electric power required to provide necessary cooling as a result of fault type $f$
$t_f$: time period (hours) for which the fault $f$ has gone undetected or un-rectified
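The cost function above can be evaluated directly once the rates and costs are known. The sketch below only illustrates the arithmetic: every numeric value is hypothetical, and the dictionary keys and function name are illustrative.

```python
def fdd_cost(p0, fp_rate, c_fp, c_e, faults):
    """Return (J1, J2, J1 + J2) for one FDD method.

    J1 = P0 * F_P * C_FP                          (false alarm penalty)
    J2 = c_e * sum_f P_f * E_f * F_N,f * t_f      (energy cost of missed faults)
    """
    j1 = p0 * fp_rate * c_fp
    j2 = c_e * sum(f["p"] * f["extra_kw"] * f["fn_rate"] * f["hours"] for f in faults)
    return j1, j2, j1 + j2

# Hypothetical inputs: two fault types, electricity at 0.10 $/kWh, 200 $ per false alarm visit.
faults = [
    {"p": 0.06, "extra_kw": 20.0, "fn_rate": 0.25, "hours": 100.0},
    {"p": 0.04, "extra_kw": 50.0, "fn_rate": 0.50, "hours": 200.0},
]
j1, j2, total = fdd_cost(p0=0.90, fp_rate=0.05, c_fp=200.0, c_e=0.10, faults=faults)
print(j1, j2, total)
```

Comparing `total` across candidate FDD methods, each with its own false positive and false negative rates, gives the cost-weighted ranking that the slide describes.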
Simple to formulate- hard to implement realistically
Past and Current Status of FDD in HVAC
• Heuristic process based FDD systems most common in the past
• Often relies on steady state data
• Largely limited to thermal sensors and analysis techniques
• Lot of papers on rooftop units, chillers and AHU
• Component isolation approach is best if feasible (need more sensors)
• Grey-box models have appeal but not lived up to expectation (such as characteristic parameter approach)
• ANN tend to give better modeling accuracy- but how does one train them for fault-free and faulty states?
• Grey-box model trained online (with forgetting factor) more sensitive than stationary models but prone to volatility
• High initial costs of sensing equipment and customization software limited widespread use (there are platforms such as VOLTTRON which may alleviate the latter)
Current and Future Status of FDD in HVAC
• Several published papers proposing whole building diagnosticians
• Analytical redundancy and virtual sensor concepts proposed to reduce sensing hardware requirements
• Extension of single fault methods to multiple faults
• Clustering methods (proposed 20 years back) have been expanded into more comprehensive data mining methods
• Fault trees + Bayesian Belief Networks + fuzzy logic concepts also published
• Need to develop robust and automated data cleaning routines
• How best to train models online and update them?
• Development of self correcting sensor networks
• Better analysis methods for evaluation/prognostics (of fault impacts)
• Most important: very few real-time implementations of AFDD tools in the field to demonstrate benefits and reliability
Supervisory Control
• Heuristic
• Model based
• Adaptive
• Forecasting bldg. load, solar, PV output
• Whole building systems
• Cooling plants + thermal storage
• CHP
• Integrated energy systems with battery storage (distributed generation)
• How to explicitly include forecasting errors into scheduling and control strategies?
[Figure: schematic of an integrated energy system. Components shown: Prime Movers 1 and 2, Solar PV, Natural Gas, Boilers 1 and 2, VC Chillers 1 and 2, AC Chillers 1 and 2, CT-D, CT-VC1, CT-VC2, CT-AC1, CT-AC2, HXh, HXc, HXd, Air-Handlers, Electricity Meter, Grid, Building Thermal Loads, Building Electric Loads. Legend: decision variables, non-decision variables, provided forecasts. Streams are labeled with F, E, Q, T and m variables for each component, including E_bldg, Q_cooling, Q_heating, E_PV, E_gridbuy and E_gridsell.]
Parting Thoughts 1/2
1) How best to fulfill role of educators and academic research
- distill/assimilate past research and incorporate into curriculum (too many topics to cover and not enough course slots)
- better systemize and codify knowledge
- cross the "t"s and dot the "i"s (too many misconceptions/errors in current practice)
- balance between research and practice
- refrain from proposing esoteric methods with limited practicality
- theoretical development should not get too much ahead of practice
- always, always be critical of your own work!
(even if you do not express your thoughts aloud)
Parting Thoughts 2/2
2) Knowledge transfer of past work to current crop of R&D
engineers is getting harder
- too many journal and conference papers, hard to
absorb, repetitive, difficult to sort mediocre from good
3) What is the role of the domain expert in the age of data
mining?
4) Everything must be made as simple as possible, but not
simpler. -Einstein-