Big Data and Sustainability: The next step for Circular Economy
Professor Lukumon O. Oyedele
Director of Bristol Enterprise, Research and Innovation Centre (BERIC)
University of the West of England (UWE), Bristol
United Kingdom
July 2016
Outline
• Big Data
• Big Data Analytics (BDA)
• Sustainability and Circular Economy
• BDA & Circular Economy: Mutual Opportunities
Big Data
Definition
McKinsey Global Institute (2011)
Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Various Aspects
• Large dataset (Megabyte, Gigabyte, Terabyte, Petabyte, Exabyte)
• Unstructured data (networked data but fuzzy relationships)
• Data-driven research, business & decisions
• High skills (IT, statistics, etc.)
Big data characteristics – The 5Vs
• Variety: collectively analyzing the broadening variety (80% of the world's data is unstructured)
• Velocity: responding to the increasing velocity (30 billion RFID sensors and counting)
• Volume: cost-efficiently processing the growing volume (50x growth from 2010 to 2020, to 35 ZB)
• Veracity: establishing the veracity of big data sources (1 in 3 business leaders don't trust the information they use to make decisions)
• Value: identifying hidden data of value (almost every manager is concerned about the money they spend)
Big data architecture
Big data jobs in the UK
Big data analytics
• Big data in a vacuum is useless
• Big data analytics is the process of examining large data sets (Big data) to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information.
• Real game changer
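As a minimal sketch of what "uncovering unknown correlations" can look like in practice, the snippet below computes a Pearson correlation coefficient between two series. The readings (outdoor temperature and heating energy) and the function name `pearson` are illustrative assumptions, not data or code from the presentation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical daily readings: outdoor temperature (deg C) and heating energy (kWh).
temperature = [2, 5, 8, 11, 14, 17]
heating_kwh = [40, 36, 30, 25, 19, 15]

r = pearson(temperature, heating_kwh)
print(f"correlation = {r:.3f}")  # strongly negative for this made-up data
```

A value near -1 or +1 suggests a strong linear relationship worth investigating; it does not by itself establish cause.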
Understanding How Data Powers Big Project
[Figure: house value (low to high) plotted against time (past to future)]
Home Intelligence versus Data Science (Advanced Home Analytics)
Evolution of the Home Analytic Process
Home Intelligence
Typical Techniques and Data Types
• Standard and ad hoc reporting, dashboards, alerts, queries, details on demand
• Structured data, traditional sources, manageable data sets
Common Questions
• What happened last quarter?
• How much energy did we consume?
• Where is the problem? In which situations?
Data Science (Advanced Home Analytics)
Typical Techniques and Data Types
• Optimization, predictive modeling, forecasting, statistical analysis
• Structured/unstructured data, any types of sources, very large data sets
Common Questions
• What if…?
• What's the optimal scenario for our homes?
• What will happen next? What if these trends continue? Why is this happening?
Big data analytics lifecycle
Type of Analytics
Prescriptive Analytics
• IoT: Automation for Internet of Things
• Decision making under uncertainty: Determining best outcomes given variability
• Ensemble decision making: Multiple predictive models & other techniques involved in decision making
• Self optimizing: Continuously evolving, looking to get the best outcome
Automatic Analytics
• Backend applications: Batch processing, decision making, alerting, etc.
• Frontend applications: Decision making, scoring, alerting, etc.
• Real-time: Real-time predictions as data is processed
• Automatic updating: Rebuilding models is automated
• Embedded analytics: Advanced analytics is part of …
• What-if analytics: What happens if the data is changed?
Predictive Analytics
• Predictive modeling: What will happen next?
• Forecasting: What if these trends continue?
• Simulation: What could happen if …?
• Alerts: An action is needed
Descriptive Analytics
• Statistical analysis: Advanced statistical techniques, correlation, etc.
• Query/drill downs: What exactly is the problem?
• Ad-hoc reporting: How many, how often, where?
• Standard reporting: What happened?
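To make the difference between descriptive and predictive analytics concrete, here is a small sketch under stated assumptions: `summarize` answers "What happened?" and `linear_forecast` answers "What if these trends continue?" by fitting a straight line with least squares. The function names and the monthly figures are hypothetical.

```python
def summarize(values):
    """Descriptive: what happened? Return total, mean, and peak."""
    return {
        "total": sum(values),
        "mean": sum(values) / len(values),
        "peak": max(values),
    }


def linear_forecast(values, steps_ahead=1):
    """Predictive: fit y = a + b*t by least squares and extrapolate."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_t = (n - 1) / 2  # mean of the time indices 0..n-1
    mean_y = sum(values) / n
    slope = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values)) / sum(
        (t - mean_t) ** 2 for t in range(n)
    )
    intercept = mean_y - slope * mean_t
    # Evaluate the fitted line steps_ahead periods after the last observation.
    return intercept + slope * (n - 1 + steps_ahead)


# Hypothetical monthly energy consumption in kWh.
monthly_kwh = [100, 110, 120, 130]

report = summarize(monthly_kwh)
next_month = linear_forecast(monthly_kwh)
print(report, next_month)
```

A straight-line trend is the simplest possible forecast; real energy data would usually need seasonality and weather taken into account.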
Analytics techniques & algorithms (1/3)
Technique | Applicability | Algorithms
Classification | Most commonly used technique for predicting a specific outcome such as response / no-response, high / medium / low-value customer, likely to buy / not buy. | Logistic Regression, Naive Bayes, Support Vector Machine, Decision Tree
Regression | Technique for predicting a continuous numerical outcome such as customer lifetime value, house value, process yield rates. | Multiple Regression, Support Vector Machine
Clustering | Useful for exploring data and finding natural groupings. Members of a cluster are more like each other than they are like members of a different cluster. Common examples include finding new customer segments and life sciences discovery. | Enhanced K-Means, Orthogonal Partitioning Clustering, Expectation Maximization
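The clustering idea can be sketched with plain k-means (Lloyd's algorithm) on one-dimensional data. This is the textbook algorithm, not the "Enhanced K-Means" variant named in the table, and the function name, starting centroids, and spending figures are assumptions made for illustration.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Plain k-means on 1-D data; `centroids` holds the initial guesses."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged: assignments can no longer change
        centroids = updated
    return centroids, clusters


# Hypothetical monthly spend per customer; two segments are visible by eye.
spend = [12, 15, 14, 80, 85, 90]
centers, clusters = kmeans_1d(spend, centroids=[10, 100])
print(centers, clusters)
```

Members of each resulting cluster are closer to their own centroid than to the other one, which is the "more like each other" property described in the table.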
Analytics techniques & algorithms (2/3)
Technique | Applicability | Algorithms
Attribute Importance | Ranks attributes according to strength of relationship with target attribute. Use cases include finding factors most associated with customers who respond to an offer, factors most associated with healthy patients. | Minimum Description Length
Anomaly Detection | Identifies unusual or suspicious cases based on deviation from the norm. Common examples include health care fraud, expense report fraud, and tax compliance. | One-Class Support Vector Machine
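"Deviation from the norm" can be illustrated with a much simpler stand-in than the One-Class Support Vector Machine listed in the table: a z-score rule that flags values far from the mean. The threshold of 2.0, the function name, and the expense figures are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def flag_anomalies(values, threshold=2.0):
    """Return (index, value) pairs lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical: nothing deviates
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical expense claims in pounds; one claim is far larger than the rest.
claims = [120, 135, 110, 128, 140, 125, 950, 130]

suspicious = flag_anomalies(claims)
print(suspicious)
```

A z-score rule assumes a single, roughly bell-shaped "normal" group and is itself distorted by large outliers, which is one reason more robust methods are used on real fraud data.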
Analytics techniques & algorithms (3/3)
Technique | Applicability | Algorithms
Association | Finds rules associated with frequently co-occurring items; used for market basket analysis, cross-sell, root cause analysis. Useful for product bundling, in-store placement, and defect analysis. | Apriori
Feature Selection & Extraction | Produces new attributes as linear combinations of existing attributes. Applicable for text data, latent semantic analysis, data compression, data decomposition and projection, and pattern recognition. | Non-negative Matrix Factorization, Principal Components Analysis (PCA), Singular Value Decomposition
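Market basket analysis can be sketched as follows. This is a simplified illustration in the spirit of Apriori, limited to item pairs: it applies the Apriori pruning idea (a pair can only be frequent if both of its items are) and then reports support and confidence. The function name, the 0.5 support threshold, and the baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support=0.5):
    """Find item pairs whose support meets `min_support`, with rule confidences."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in set(basket))
    # Apriori pruning: only items that are frequent on their own can form frequent pairs.
    frequent_items = {item for item, c in item_counts.items() if c / n >= min_support}

    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    rules = {}
    for (a, b), c in pair_counts.items():
        if c / n >= min_support:
            rules[(a, b)] = {
                "support": c / n,                        # share of baskets with both items
                "confidence_a_to_b": c / item_counts[a],  # P(b in basket | a in basket)
                "confidence_b_to_a": c / item_counts[b],  # P(a in basket | b in basket)
            }
    return rules


# Hypothetical shopping baskets.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["butter", "milk"],
]

rules = frequent_pairs(baskets)
for pair, stats in sorted(rules.items()):
    print(pair, stats)
```

Full Apriori repeats the same prune-and-count step for itemsets of size three and above.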
Business initiative
BUSINESS IMPERATIVE
The number of organizations that see analytics as a competitive advantage is growing.
[Chart: 2010, 2013, 2016; 83%]
IBM IBV/MIT Sloan Management Review Study 2016
Studies show that organizations competing on analytics substantially outperform their peers:
• 1.6x revenue growth
• 2.0x EBITDA growth
• 2.5x stock price appreciation
IBM IBV/MIT Sloan Management Review Study 2016
Sustainable & Circular Economy
Circular Economy
• A circular economy is an alternative to a traditional linear economy (make, use, dispose) in which we keep resources in use for as long as possible, extract the maximum value from them whilst in use, then recover and regenerate products and materials at the end of each service life.
Circular Economy
• Why a circular economy is important
As well as creating new opportunities for growth, a more circular economy will:
1. reduce waste
2. drive greater resource productivity
3. deliver a more competitive UK economy
4. position the UK to better address emerging resource security/scarcity issues in the future
5. help reduce the environmental impacts of our production and consumption in both the UK and abroad
A Networked Circular Economy
Not a Leakage Economy
Big Data Opportunities
Mutual Opportunities
Food Analytics
• Sensors for occupancy, inside and outside temperature, humidity, lighting levels, air speed, and air quality
Smart Facilities
• Utility smart meters and sub-meters for electricity, gas, and water
• Sensors for occupancy, inside and outside temperature, humidity, lighting levels, air speed, and air quality
• Smart equipment including building automation system gateways, solar inverters, and remotely monitored and controlled distributed generation
• External data sources such as weather forecasts, energy prices, and demand response signals
Energy Analytics
• Energy Analytics
• Which buildings or parts of the building have the highest energy use per square foot?
• Are there similar buildings (square footage, space type, occupancy, HVAC type) that use less energy?
• What Energy Star rating do they have?
• Are the lights shutting off when the building is unoccupied?
• Is the economizer turning on when it should?
• Equipment Failure Prediction
• Are ducts leaking?
• Equipment Operations Optimization
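The first energy question above, which buildings use the most energy per square foot, reduces to a simple ratio and a sort. The sketch below assumes each building record carries annual consumption and floor area; the field names, building names, and figures are hypothetical.

```python
def energy_use_intensity(buildings):
    """Return (name, kWh per square foot) pairs, highest intensity first."""
    ranked = [(b["name"], b["annual_kwh"] / b["square_feet"]) for b in buildings]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical portfolio of buildings.
buildings = [
    {"name": "Block A", "annual_kwh": 120_000, "square_feet": 10_000},
    {"name": "Block B", "annual_kwh": 90_000, "square_feet": 5_000},
    {"name": "Block C", "annual_kwh": 150_000, "square_feet": 20_000},
]

ranking = energy_use_intensity(buildings)
for name, intensity in ranking:
    print(f"{name}: {intensity:.1f} kWh per sq ft")
```

Note that the building with the largest total consumption (Block C) has the lowest intensity here, which is why normalizing by floor area matters when comparing buildings.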
Water Usage Analytics
• Benefits:
• Helps water engineers visualize water usage patterns
• Helps in identifying potential leakages
• Big Data Requirements:
• Data from water meters and dedicated sensors are sent to a web-based warehouse
• Water usage can be visualized across a service area or for a specific account
• Levels of usage are categorized and patterns are drawn
• Sudden and sharp changes in usage trends are used to trace leakages and breakages
• IBM has become a market leader in water usage analytics
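The idea of tracing leaks through sudden, sharp changes in usage can be sketched as a comparison of each meter reading against the average of the readings just before it. The window of three readings, the 50% change threshold, the function name, and the data are all assumptions for illustration, not parameters from any real water analytics product.

```python
def sudden_changes(readings, window=3, threshold=0.5):
    """Return indices where a reading differs from the mean of the previous
    `window` readings by more than `threshold` (as a fraction of that mean)."""
    flagged = []
    for i in range(window, len(readings)):
        baseline = sum(readings[i - window:i]) / window
        if baseline > 0 and abs(readings[i] - baseline) / baseline > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily usage for one account, in litres; usage jumps on day 4.
daily_litres = [300, 310, 295, 305, 640, 650, 660]

alerts = sudden_changes(daily_litres)
print(alerts)
```

Once the higher usage becomes part of the moving baseline the alerts stop, so a sustained leak shows up as a short burst of flags at its onset rather than as a permanent alarm.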
Personalized Home Experience
• Personalized Data are Saved:
• Usage preference data are saved for each user
• An indoor positioning system links users with their preferences
• Techniques in Visible Light Communication (VLC), Bluetooth Low Energy (BLE) and inertial device sensors are used
• The large volumes of data generated are automatically saved on cloud servers
• On Next Home Arrival
• Room automatically adjusts light
• Sound and temperature are adjusted to your preferences every time you enter
• Kettle could anticipate and boil your water when you need it
• Potential benefits
• Lets homeowners feel comfortable in their homes
• Creates a competitive advantage for house builders
Supply Chain Collaborative BIM System for Minimising Waste in Projects
Aim
To develop an intelligent system using early supply chain involvement in projects.
Cost: £651,000
Funding body: Innovate UK
Timescale: 2014 - 2017
Partners:
DRIM - Deconstruction and Recovery Information Modelling
Aim
To develop an intelligence-based tool that will enable identification of reusable and recoverable materials at the end of life of projects.
Cost: £800,000
Funding body: EPSRC
Timescale: 2016 - 2019
Partners:
Smart Cities Information Modelling (SCIM) Tool
Aim
To develop a Smart Cities Information Modelling (SCIM) tool that will provide an impact estimator and an optimum service planner.
Cost: £700,000
Funding body: Innovate UK
Timescale: 2016 - 2018
Partners:
Conclusion
Big Data Analytics + Circular Economy
• Real game changer for delivering sustainability
• Reactive decision making – What happened?
• Being very proactive – What can we do better?
Thank You