TransformingTransport - BDVA... · 2018-10-03 · TransformingTransport Big Data Value in Mobility...

TransformingTransport Big Data Value in Mobility and Logistics “Real-time prediction of flight arrival times using surveillance information” - SACBD@ECSA2018 Presentation, Madrid, September 25th 2018, David Scarlatti, Boeing R&T - Europe


TransformingTransport Big Data Value in Mobility and Logistics

“Real-time prediction of flight arrival times using surveillance information” - SACBD@ECSA2018 Presentation

Madrid, September 25th 2018

David Scarlatti, Boeing R&T - Europe


TT Project Context

• EC H2020-funded Lighthouse project with more than 40 partners across Europe

• Demonstrator style: focus on deploying technology in real business settings.

• Delivering “Pilots” in different Transport Industry Domains…


TT Project Context

• Smart Highways

– AUSOL Load Balancing Pilot

– Load Balancing Pilot – Norte Litoral

• Sustainable Connected Vehicles

– Sustainable Connected Cars Pilot

– Sustainable Connected Trucks Pilot

• Proactive Rail Infrastructures

– Predictive Rail Asset Management Pilot

– Predictive High Speed Network

• Ports as Intelligent Logistics Hubs

– Valencia Sea Port Pilot

– Duisport Inland Port Pilot

• Smart Airport Turnaround

– Smart Passenger Flow Pilot

– Smart Turnaround

• Integrated Urban Mobility

– Integrated Urban Mobility: Tampere pilot

– Valladolid Integrated Urban Mobility

• Dynamic Supply Networks


TT Project in Aviation

• Smart Airport Turnaround. Two Pilots:

– Smart Passenger Flow Pilot – @Athens Airport

– Smart Turnaround – @Milan Airport


Turnaround overview. Why does arrival time matter?

Aircraft process: Landing → Taxi-in → Turnaround → Taxi-out → Takeoff

Objective: increase the predictability of the process and airport situational awareness. As a result, we expect the turnaround to be improved and optimized.

Predictions along the process: arrival estimation, taxi-in predictions, boarding time predictions, taxi-out predictions.

While the aircraft is on the ground, the airline is not making money and the airport can’t serve another aircraft.


Estimated Time of Arrival: Model/Rule-Based vs. Data-Driven

• Traditional approach: you model aircraft behaviour and context (e.g. winds, airspace use, etc.). Then, knowing where the aircraft is, you predict what will happen until it lands. This gives you the ETA.

• Data-driven: following the current approach in many fields, let the data talk... ML can discover patterns, not apparent to humans, in what influences the arrival time.
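To make the contrast concrete, the model-based idea can be reduced to a deliberately naive rule: remaining distance divided by current ground speed. The sketch below is only an illustration of that rule, not the model used in the project; the function names, the example position, the speed, and the approximate Milan Malpensa coordinates are all assumptions introduced here.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def great_circle_nm(lat1, lon1, lat2, lon2):
    """Haversine distance between two points, in nautical miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def naive_seconds_to_land(lat, lon, ground_speed_kt, airport_lat, airport_lon):
    """Rule-based baseline: straight-line distance at constant ground speed.

    Ignores winds, routing, holding and descent profile, which is exactly
    the context a real model-based predictor would have to add.
    """
    if ground_speed_kt <= 0:
        raise ValueError("ground speed must be positive")
    distance_nm = great_circle_nm(lat, lon, airport_lat, airport_lon)
    return distance_nm / ground_speed_kt * 3600.0


# Hypothetical aircraft north of the airport, flying at 300 kt.
# Airport coordinates are approximate values assumed for illustration.
MXP_LAT, MXP_LON = 45.63, 8.72
seconds = naive_seconds_to_land(46.5, 8.7, 300, MXP_LAT, MXP_LON)
```

A data-driven model replaces this hand-written rule with a function learned from historical tracks and their observed landing times.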


But which “data”?

• We follow an iterative (Agile) approach: we start with one data source and add a new one in each increment.

1. First, just surveillance data (4D location) is used

2. Weather conditions to be added

3. Flight Plan data to be added

4. Other sources???


High Level Architecture

• The architecture matches a simple lambda architecture:

– Data collection (surveillance, weather, etc…)

– Historical data wrangling and ML training

– Real-time predictions / visualization


Data Collection

• Surveillance stream. We use FR24 live feed data, based mainly on ADS-B:

– Sample records

– Latitude, longitude, altitude, timestamp…

{"1de0d34c": ["4787B2", 51.20212, 12.81526, 354, 38000, 404, "7364", "5000", "B738", "LN-NGE", 1537045185, "MLA", "OSL", "DY1851", 0, -64, "NAX51J", 1537050813],
 "1de1108d": ["4B1903", 47.53462, 8.49345, 333, 2600, 178, "3002", "5000", "A343", "HB-JMH", 1537045156, "ZRH", "JNB", "LX288", 0, 1152, "SWR288", 0],
 "1de1108c": ["4B17FD", 47.45358, 8.5565, 129, 0, 0, "2000", "200", "BCS3", "HB-JCF", 1537045174, "ZRH", "", "", 1, 0, "SWR97Q", 0],
 "1de0e21b": ["4CA175", 40.33225, 22.82972, 112, 6300, 232, "4456", "200", "B738", "EI-FEF", 1537045188, "POZ", "SKG", "RR6505", 0, -704, "RYS6505", 1537045663],
 "1de0e4f2": ["4BBC53", 36.99613, 35.28329, 56, 0, 51, "5101", "200", "A320", "TC-OBS", 1537045190, "IST", "ADA", "8Q16", 1, 0, "OHY016", 1537045157]…
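As a rough illustration of how such records might be turned into 4D positions, the sketch below parses one record copied from the sample. The field positions (latitude at index 1, longitude at 2, altitude at 4, timestamp at 10) are inferred from the sample values and are an assumption, as are the `Position` class and `parse_feed` function names.

```python
import json
from dataclasses import dataclass
from typing import List

# Assumed positions of the fields inside each record (inferred from the sample).
LAT, LON, ALT, TS = 1, 2, 4, 10


@dataclass
class Position:
    flight_id: str
    latitude: float
    longitude: float
    altitude: int
    timestamp: int  # Unix epoch seconds


def parse_feed(payload: str) -> List[Position]:
    """Extract one 4D position per flight from a JSON feed payload."""
    positions = []
    for flight_id, fields in json.loads(payload).items():
        # Skip entries that are not flight records or are too short.
        if not isinstance(fields, list) or len(fields) <= TS:
            continue
        positions.append(
            Position(flight_id, float(fields[LAT]), float(fields[LON]),
                     int(fields[ALT]), int(fields[TS]))
        )
    return positions


sample = (
    '{"1de0d34c": ["4787B2", 51.20212, 12.81526, 354, 38000, 404, "7364", '
    '"5000", "B738", "LN-NGE", 1537045185, "MLA", "OSL", "DY1851", 0, -64, '
    '"NAX51J", 1537050813]}'
)
positions = parse_feed(sample)
```

Keeping the index constants in one place makes it easy to adjust the mapping if the feed layout turns out to differ from this assumption.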


Data Collection

• We use Apache Flume for the collection. Current Flume flow:

– Source: FR24 live feed (type = exec, selector.type = replicating)

– Channel 1: in-memory channel → Sink (type = HDFS, path = …/date=%Y-%m-%d)

– Channel 2: in-memory channel → Sink (type = Kafka)
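The behaviour of this flow can be mimicked in a few lines: a replicating selector copies every event to each channel, and the HDFS sink writes into a directory partitioned by date. The sketch below is plain Python, not Flume configuration; the base path `/hypothetical/fr24` stands in for the elided path, and using UTC for the date is an assumption.

```python
import queue
import time


def hdfs_partition(base_path, epoch_seconds):
    """Build a date-partitioned directory such as <base>/date=2018-09-15.

    Uses UTC here; which clock the real sink uses is an assumption.
    """
    return base_path + time.strftime("/date=%Y-%m-%d", time.gmtime(epoch_seconds))


def replicate(event, channels):
    """Replicating selector: every channel receives its own copy of the event."""
    for channel in channels:
        channel.put(event)


batch_channel = queue.Queue()  # would feed the HDFS sink (batch layer)
speed_channel = queue.Queue()  # would feed the Kafka sink (speed layer)

replicate('{"1de0d34c": ["4787B2", 51.20212, 12.81526]}', [batch_channel, speed_channel])
partition = hdfs_partition("/hypothetical/fr24", 1537045185)
```

Because both channels see the same events, the batch and speed layers stay consistent without the source having to know about either sink.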


Batch Layer

• We use Hadoop and Hive to prepare the training datasets:

Diagram components: End User GUI (Web); Data Lake with SQL-like access layer; Raw Data; MXP Airport Data; NOAA Data; ETL; Processed Data; Interactive Data Science; Training Dataset; ML model for real-time prediction.


Speed Layer

• We use Kafka for data distribution and the Python/H2O framework for the real-time prediction.

Diagram components: FR24 Surveillance; Data Distribution; Data Processing / Machine Learning; ML model for real-time prediction; Web GUI; VR.
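The speed-layer loop can be sketched without any external dependency: each incoming message is turned into features, passed to the trained model, and the predicted seconds to land are added to the observation timestamp. In the sketch below, an in-memory list stands in for the Kafka topic and `stub_model` stands in for the trained H2O model; both, along with the record field positions, are assumptions made for illustration.

```python
import json
from typing import Callable, Dict, Iterable, Iterator


def predict_etas(
    messages: Iterable[str],
    predict_seconds_to_land: Callable[[Dict], float],
) -> Iterator[Dict]:
    """Yield one ETA (Unix epoch seconds) per flight record in each message."""
    for raw in messages:
        for flight_id, fields in json.loads(raw).items():
            # Field positions are assumed from the sample surveillance records.
            features = {
                "latitude": fields[1],
                "longitude": fields[2],
                "altitude": fields[4],
                "timestamp": fields[10],
            }
            seconds = predict_seconds_to_land(features)
            yield {"flight_id": flight_id, "eta": fields[10] + seconds}


def stub_model(features: Dict) -> float:
    """Placeholder for the trained model: always predicts 30 minutes."""
    return 1800.0


topic = [
    '{"1de0d34c": ["4787B2", 51.20212, 12.81526, 354, 38000, 404, "7364", '
    '"5000", "B738", "LN-NGE", 1537045185]}'
]
etas = list(predict_etas(topic, stub_model))
```

Passing the model in as a callable keeps the streaming loop independent of the ML framework, so a retrained model can be swapped in without touching the consumer.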


Visualization


• We adapted Virtual Radar Server to show the live ETA:

Andrew Whewell. Virtual radar server open source project, 2010. http://www.virtualradarserver.co.uk/


Next iteration

Architecture: full picture

Diagram components: FR24 Surveillance; Data Distribution; Data Processing / Machine Learning; Data Lake with SQL-like access layer; Raw Data; MXP Airport Data; NOAA Data; Processed Data; NoSQL Document DB; ECTL B2B Flight Data; ETL; Interactive Data Science; End User GUI (Web); Web GUI; VR.


Machine Learning details

• The current version uses only surveillance-derived features: latitude, longitude, altitude, speeds, timestamp...

• Dependent variable: seconds to land

• Regression problem. Algorithm: Gradient Boosting Machine (GBM)

• Training dataset: ground truth for selected airlines, January 2016 to August 2017, more than 10 million observations. We use an 80 % split of this dataset for training, so approximately 2.3 million observations (the remaining 20 %) are used to test the model.
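A minimal sketch of how the dependent variable and the split could be produced is given below. It assumes each flight's actual landing time is known from the ground truth, and it uses a seeded random split at the observation level; how the split was actually performed is not stated, so that choice, the function names, and the example values are assumptions.

```python
import random


def add_seconds_to_land(track, landing_ts):
    """Label each track point with the target: landing time minus observation time."""
    return [
        dict(point, seconds_to_land=landing_ts - point["timestamp"])
        for point in track
        if point["timestamp"] <= landing_ts  # drop points recorded after landing
    ]


def split_80_20(rows, seed=0):
    """Shuffle reproducibly and return (train, test) with an 80/20 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical track: ten observations, one per minute, landing at t = 1000.
track = [{"timestamp": 400 + 60 * i, "altitude": 10000 - 1000 * i} for i in range(10)]
labelled = add_seconds_to_land(track, landing_ts=1000)
train, test = split_80_20(labelled)
```

With roughly 11.5 million labelled observations, a 20 % hold-out of this kind would give about 2.3 million test rows, in line with the figures on the slide.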


ETA results presentation: box plots over time.

This aircraft is about 85 minutes from landing: the ETA error is between -856 and +895 seconds, i.e. if the ETA is 8:30, it may land between 8:16 and 8:45.

This aircraft is about 35 minutes from landing: the ETA error is between -507 and +389 seconds, i.e. if the ETA is 8:30, it may land between 8:21 and 8:36.

Plot annotation: 60 minutes
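The conversion from error bounds in seconds to a landing window is simple arithmetic, sketched below. It assumes the error is defined as actual landing time minus predicted ETA, so the window is the ETA plus each bound; the calendar date and the function name are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta


def landing_window(eta, min_error_s, max_error_s):
    """Return the (earliest, latest) landing time implied by the error bounds."""
    return eta + timedelta(seconds=min_error_s), eta + timedelta(seconds=max_error_s)


eta = datetime(2018, 9, 25, 8, 30)  # ETA of 8:30 on an arbitrary date

# About 85 minutes from landing: error between -856 s and +895 s.
early_85, late_85 = landing_window(eta, -856, 895)
print(early_85.time(), late_85.time())  # 08:15:44 08:44:55

# About 35 minutes from landing: error between -507 s and +389 s.
early_35, late_35 = landing_window(eta, -507, 389)
print(early_35.time(), late_35.time())  # 08:21:33 08:36:29
```

The window shrinks from roughly 29 minutes to roughly 15 minutes as the aircraft gets closer to landing.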


ML Results evaluation

• Figure 2 compares the errors of the predictions issued by three services 1 hour before landing for 736 flights bound for Milan Malpensa airport between September 2017 and December 2017 (new data never seen by the TT algorithm).


Future Work

• Adding new data sources:

– METAR reports as weather data

  • New Flume flow, new Hive tables to add new features to the training models, and a new Kafka topic with the METAR data

– Flight plans from Eurocontrol

  • Adding MongoDB to the batch layer (HDFS is poorly suited to millions of small .xml files)

• Trying other algorithms:

– eXtreme Gradient Boosting (XGBoost)


Thank You!


This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 731932