1 Pivotal Confidential–Internal Use Only
Implementing a highly scalable stock prediction system with Apache Geode (incubating), Spring XD and Spark MLlib
William Markito (@william_markito)
Fred Melo (@fredmelo_br)
About us
Fred Melo
Technical Director for Data
@fredmelo_br
William Markito
Enterprise Architect for GemFire
@william_markito
A Simple Example
Data Sources
Look for patterns
Forecast
"Smart System"
Applicability
What do we want to build?
Smart System
Trading Data
Historical: learns with HISTORICAL TRENDS
Real-Time: evaluates LIVE DATA
Live data becomes historical over time
"How were the technical indicator readings when the latest price drops happened?"
"According to historical trends, there's an 80% chance this stock's price will go down within the next few minutes"
The Machine Learning Pipeline data flow
Data Temperature: Hot (Live Data, Apache Geode / GemFire), Cold (Apache HAWQ)
1 - Live data is ingested into the grid
2 - Trained ML model compares new data to historical patterns
3 - Results are pushed immediately to deployed applications
4 - "Hot" data ages, becoming part of the historical dataset
5 - Re-training triggered, ML model updated
Components: Spring XD, Machine Learning model
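The five steps of the data flow can be sketched as a toy, single-process loop. This is only an illustration under stated assumptions: the class `Pipeline`, its fields, and the "model" (a plain mean of historical prices) are hypothetical stand-ins, not the Geode, Spring XD or MLlib components used in the talk.

```python
from collections import deque
from statistics import mean


class Pipeline:
    """Toy model of the five-step flow. All names here are hypothetical."""

    def __init__(self, hot_capacity=3):
        self.hot = deque()        # stands in for the in-memory grid ("hot" data)
        self.cold = []            # stands in for the historical dataset
        self.hot_capacity = hot_capacity
        self.model_mean = None    # trivial "model": mean of historical prices
        self.pushed = []          # results delivered to applications

    def retrain(self):
        # Step 5: rebuild the model from the historical dataset.
        if self.cold:
            self.model_mean = mean(self.cold)

    def ingest(self, price):
        # Step 1: live data is ingested into the grid.
        self.hot.append(price)
        # Step 2: the trained model compares new data to historical patterns.
        if self.model_mean is not None:
            signal = "above" if price > self.model_mean else "not above"
            # Step 3: the result is pushed to applications right away.
            self.pushed.append((price, signal))
        # Step 4: hot data ages and joins the historical dataset.
        while len(self.hot) > self.hot_capacity:
            self.cold.append(self.hot.popleft())
            self.retrain()


pipeline = Pipeline(hot_capacity=2)
for price in [10.0, 12.0, 11.0, 13.0, 9.0]:
    pipeline.ingest(price)
print(pipeline.pushed)
```

No result is pushed until the first record has aged into the historical set, because the sketch has nothing to compare against before the first training run.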
Simplified Model
The Machine Learning Pipeline data flow
Data Temperature: Hot, Warm (Live Data, Apache Geode / GemFire)
1 - Live data is ingested into the grid
2 - Trained ML model compares new data to historical patterns
3 - Results are pushed immediately to deployed applications
5 - Re-training triggered, ML model updated
Components: Spring XD, Machine Learning model
[Architecture diagram]
Extensible, Open-Source, Fault-Tolerant, Horizontally Scalable, Cloud-Native
Sources: Real data, Simulator
SpringXD: Transform, Enrich, Filter, Split, Sink
Machine Learning: Indicators, Predict
Regions: /Stocks, /TechIndicators, /Predictions
Dashboard
Steps: 1, 2, 3
Too complex?? Eating it in small bites…
SpringXD GemFire
Apache Geode Concepts
• Cache: configurable through XML, Java
• Region: distributed java.util.Map on steroids; highly available, redundant
• Member: Locator, Server, Client
• Callbacks: Listener, Writer, AsyncEventListener, Parallel/Serial
Regions: /Stocks, /TechIndicators, /Predictions
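To make the Region and Callbacks ideas concrete, here is a minimal in-process sketch of a named key/value map with a writer hook that runs before a change (and may veto it by raising) and a listener hook that runs after it. The class `Region` and its methods are hypothetical illustrations written for this transcript; they are not the Apache Geode API, and the sketch has no distribution, redundancy or asynchronous listeners.

```python
class Region:
    """Toy, single-process stand-in for a region: a named map with callbacks."""

    def __init__(self, name):
        self.name = name
        self._data = {}
        self._writers = []    # invoked before a change; raising vetoes the put
        self._listeners = []  # invoked after a change has been applied

    def add_writer(self, callback):
        self._writers.append(callback)

    def add_listener(self, callback):
        self._listeners.append(callback)

    def put(self, key, value):
        for writer in self._writers:
            writer(self.name, key, value)
        old = self._data.get(key)
        self._data[key] = value
        for listener in self._listeners:
            listener(self.name, key, old, value)

    def get(self, key):
        return self._data.get(key)


def reject_negative_price(region_name, key, value):
    # Hypothetical validation rule, used only for this example.
    if value < 0:
        raise ValueError(f"negative price for {key} in {region_name}")


stocks = Region("/Stocks")
stocks.add_writer(reject_negative_price)
stocks.add_listener(lambda name, key, old, new: print(name, key, old, "->", new))
stocks.put("AAPL", 100.0)
stocks.put("AAPL", 101.5)
```

Keeping validation in the writer and notification in the listener means a rejected update never reaches subscribers.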
Apache Geode HA and Fault-Tolerance
SpringXD
Transform, Enrich, Filter, Split, Predict, Sink
Streams, Pipelines, Sources, Sinks, Filters, Taps
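The stream vocabulary above (source, filter, transform, tap, sink) can be illustrated with chained Python generators. This is a conceptual sketch only: the function names and the record layout are assumptions made for the example, and it does not use Spring XD or its stream definition language.

```python
def source(ticks):
    """Source: emits raw records into the stream."""
    yield from ticks


def filter_valid(stream):
    """Filter: drops records with a non-positive price."""
    for tick in stream:
        if tick["price"] > 0:
            yield tick


def transform(stream):
    """Transform: normalizes the symbol to upper case."""
    for tick in stream:
        yield {**tick, "symbol": tick["symbol"].upper()}


def tap(stream, seen):
    """Tap: observes records passing by without altering them."""
    for tick in stream:
        seen.append(tick)
        yield tick


def sink(stream, region):
    """Sink: writes the latest price per symbol into a dict standing in for a region."""
    for tick in stream:
        region[tick["symbol"]] = tick["price"]


# Hypothetical sample records, made up for this example.
ticks = [
    {"symbol": "aapl", "price": 100.0},
    {"symbol": "bad", "price": -1.0},
    {"symbol": "goog", "price": 500.0},
]
tapped, stocks_region = [], {}
sink(tap(transform(filter_valid(source(ticks))), tapped), stocks_region)
print(stocks_region)
```

Because each stage only consumes an iterator and yields records, stages can be reordered or swapped without touching their neighbours.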
Machine Learning Model (e.g. Linear Regression)
Features: price(x), medium avg (x), relative strength (x)
Label: medium avg (x+1)
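The features/label setup can be sketched with an ordinary least-squares fit in plain Python. This is a minimal illustration, assuming the three features and the label are already computed; it solves the normal equations directly instead of using Spark MLlib, and the helper names (`solve`, `fit`, `predict`) and all numbers are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit(features, labels):
    """Least-squares weights (intercept first) from the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature_row):
    """Predicted label for one feature row."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], feature_row))


# Made-up history: (price(x), medium avg(x), relative strength(x)) per row.
history = [(10.0, 9.0, 50.0), (11.0, 10.0, 60.0), (12.0, 10.0, 40.0),
           (9.0, 11.0, 55.0), (13.0, 12.0, 45.0)]
# Made-up labels: medium avg(x+1) for each row above.
next_avg = [9.4, 10.1, 10.6, 10.3, 12.2]
weights = fit(history, next_avg)
print(predict(weights, (12.0, 11.0, 52.0)))
```

The fit needs at least as many linearly independent rows as there are weights (four here, counting the intercept); otherwise the normal equations are singular and `solve` fails with a division by zero.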
Demo Time
Error
Source code and detailed instructions available at: https://github.com/Pivotal-Open-Source-Hub/StockInference-Spark
Follow us on Twitter!
William Markito (@william_markito)
Fred Melo (@fredmelo_br)