Connected hubs: an analysis of the Lufthansa network in Europe

Connected Hubs 1 February 2017 CHAN Sau Yee & WANG Xi



Plan

● Objective

● Lufthansa Open API

● Methodology

● Data analysis

● Data visualisation


Objective

● To produce a map that shows the location of airports in Europe and the direct flights between them

○ What we need...

■ list of European airports

■ direct flights between any two airports

● To analyse the importance of airports based on the number of connections needed

○ What we need...

■ list of European airports

■ number of direct flights for each airport

■ rank of airports by number of destinations


Definitions

● Case study in Europe (defined as EU28 including Britain, plus Switzerland and Norway)

● Connection: the smallest number of transfers needed to travel between two airports

○ Connection = 0: direct flight (without transfer)

○ Connection = 1: with 1 transfer

○ Connection > 1: with more than 1 transfer
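
The definition above amounts to a shortest-path search over the graph of direct flights. The following is a minimal sketch, assuming the direct flights are held in a dictionary that maps each airport to the set of airports it serves directly; the function name and the four-airport network are hypothetical and not taken from the slides.

```python
from collections import deque


def connections_from(origin, direct):
    """Return {airport: smallest number of transfers} for every airport
    reachable from `origin`.

    `direct` maps an airport code to the set of airports it reaches
    with a direct flight.
    """
    hops = {origin: 0}           # number of flights taken to reach each airport
    queue = deque([origin])
    while queue:                 # breadth-first search visits nearest airports first
        airport = queue.popleft()
        for nxt in direct.get(airport, ()):
            if nxt not in hops:
                hops[nxt] = hops[airport] + 1
                queue.append(nxt)
    # One flight means zero transfers, two flights mean one transfer, and so on.
    return {a: h - 1 for a, h in hops.items() if a != origin}


# Hypothetical network: A-B, B-C and C-D are the only direct flights.
direct = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(connections_from("A", direct))  # {'B': 0, 'C': 1, 'D': 2}
```

Breadth-first search is used here because every flight counts the same, so the first time an airport is reached is also the route with the fewest transfers.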


[Diagram: example network of four airports A, B, C and D]


Lufthansa Open API

● Reference data: Countries, Cities, Airports

● Operations: Flight Schedules

In principle, the data are not limited to Lufthansa flights


Structure of data in the API

Example: Berlin-Tegel airport (TXL) in XML


“Airport”, “RailwayStation” or “BusStation”


Data model

countries: countryCode, zoneCode, countryName

cities: cityCode, countryCode, cityName, lat, lon

airports: airportCode, airportName, cityCode, countryCode, lat, lon

flights: origin, des, httpcode, date

Legend: primary key, foreign key
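
The data model above could be declared in SQLite roughly as follows. This is only a sketch: the table and column names come from the slide, but the column types and the choice of primary and foreign keys are assumptions, since the colour legend did not survive in the transcript.

```python
import sqlite3

# Column names follow the slide; types and key choices are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS countries (
    countryCode TEXT PRIMARY KEY,
    zoneCode    TEXT,
    countryName TEXT
);
CREATE TABLE IF NOT EXISTS cities (
    cityCode    TEXT PRIMARY KEY,
    countryCode TEXT REFERENCES countries(countryCode),
    cityName    TEXT,
    lat         REAL,
    lon         REAL
);
CREATE TABLE IF NOT EXISTS airports (
    airportCode TEXT PRIMARY KEY,
    airportName TEXT,
    cityCode    TEXT REFERENCES cities(cityCode),
    countryCode TEXT REFERENCES countries(countryCode),
    lat         REAL,
    lon         REAL
);
CREATE TABLE IF NOT EXISTS flights (
    origin   TEXT REFERENCES airports(airportCode),
    des      TEXT REFERENCES airports(airportCode),
    httpcode INTEGER,
    date     TEXT,
    PRIMARY KEY (origin, des, date)
);
"""


def create_schema(conn):
    """Create the four tables if they do not exist yet."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
create_schema(conn)
```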


Methodology

3 MOOCs on Coursera (University of Michigan):

- “Python Data Structures”

- “Using Python to Access Web Data”

- “Using Databases with Python”

2 books on Python:

- “Think Python” - Allen B. Downey

- “Python for Everybody” - Charles Severance


Methodology

Charles Severance, “Python for Everybody”, Chap. 16


Methodology

Charles Severance, “Python for Everybody”, Chap. 16


How to GET data

Step 1: Acquire all reference data on Countries, Cities and Airports...

Problem: 1,261 airports in total

→ get all records in several loops by altering the value of offset

Number of records returned per request: maximum is 100!
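
The offset loop described above can be sketched as follows. The function `fetch_page` stands in for the actual HTTP request and is hypothetical, as is the fake airport list; only the page size of 100 and the total of 1,261 come from the slide.

```python
def fetch_all(fetch_page, total, limit=100):
    """Collect `total` records by calling fetch_page(offset, limit) repeatedly,
    moving the offset forward by one page each time."""
    records = []
    for offset in range(0, total, limit):
        records.extend(fetch_page(offset, limit))
    return records


# Stand-in for the real request: serves slices of a fake list of 1,261 airports.
fake_airports = ["AP%04d" % i for i in range(1261)]


def fake_fetch_page(offset, limit):
    return fake_airports[offset:offset + limit]


all_airports = fetch_all(fake_fetch_page, total=1261)
print(len(all_airports))  # 1261, gathered in 13 requests
```

With 1,261 records and at most 100 per request, 13 requests are needed, the last one returning 61 records.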


How to GET data (2)

Step 2: Information on all flights between European airports over a week (2017/01/20-2017/01/26)

Obtain a list of European airports by SQL

→ 2 loops to create all possible pairs

220 × 220 = 48,400 pairs → about 3 h of execution per day!


Each request must always include origin, destination and date
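
A short sketch of the pair generation and of the running-time arithmetic. The airport codes are placeholders, and the rate of 5 requests per second is an assumption borrowed from the rate limit stated elsewhere in the deck; at that rate 48,400 requests take about 2.7 hours, which is in line with the 3 h quoted above.

```python
from itertools import product


def airport_pairs(codes):
    """All ordered (origin, destination) pairs, as two nested loops would produce."""
    return list(product(codes, repeat=2))


def estimated_hours(n_requests, requests_per_second=5):
    """Running time in hours if requests are sent at a steady rate."""
    return n_requests / requests_per_second / 3600


codes = ["A%03d" % i for i in range(220)]     # placeholder airport codes
pairs = airport_pairs(codes)
print(len(pairs))                             # 48400
print(round(estimated_hours(len(pairs)), 2))  # 2.69
```

Note that 220 × 220 includes the 220 pairs where origin and destination are the same airport; skipping them would leave 48,180 requests per day.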


Authorisation: OAuth 2

A token must be acquired before requests can be sent
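
The token step can be sketched with the standard library alone. The token URL and the form fields below follow the usual OAuth 2 client-credentials pattern and are assumptions, not taken from the slides; they should be checked against the API documentation. The request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; verify against the API documentation before use.
TOKEN_URL = "https://api.lufthansa.com/v1/oauth/token"


def build_token_request(client_id, client_secret, token_url=TOKEN_URL):
    """Build (but do not send) the POST request that asks for an access token."""
    body = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
    }).encode("ascii")
    return urllib.request.Request(token_url, data=body, method="POST")


def parse_token(response_text):
    """Extract the access token from the JSON body of the token response."""
    return json.loads(response_text)["access_token"]


def auth_header(token):
    """Header to attach to every subsequent API request."""
    return {"Authorization": "Bearer " + token}
```

Sending the request with `urllib.request.urlopen` and passing the decoded body to `parse_token` would complete the flow; the token then has to be renewed when it expires.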


Rate Limit

● 5 requests / second

● 1,000 → 10,000 requests / hour

● Decorator “RateLimited”

An error is thrown when limits are exceeded
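
The slides do not show the “RateLimited” decorator itself. A minimal sketch of one common way to write such a decorator is given below; it spaces calls evenly by sleeping, and the names `rate_limited` and `get_schedule` are hypothetical.

```python
import functools
import time


def rate_limited(max_per_second):
    """Decorator that delays calls so the wrapped function runs at most
    `max_per_second` times per second."""
    min_interval = 1.0 / max_per_second

    def decorator(func):
        last_call = [None]  # time of the previous call, shared across calls

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if last_call[0] is not None:
                wait = min_interval - (time.monotonic() - last_call[0])
                if wait > 0:
                    time.sleep(wait)  # pause until the interval has elapsed
            last_call[0] = time.monotonic()
            return func(*args, **kwargs)

        return wrapper

    return decorator


@rate_limited(5)  # at most 5 calls per second
def get_schedule(origin, destination, date):
    # Placeholder body: the real function would send the HTTP request.
    return "%s-%s on %s" % (origin, destination, date)
```

This only addresses the per-second limit; staying under the hourly quota would need a separate counter.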


Inserting data into our database

import sqlite3

→ CREATE TABLE IF NOT EXISTS, INSERT INTO ____ VALUES...
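
As an illustration of the two statements named above, here is a sketch that stores airport records with `sqlite3`. The column types, the use of `INSERT OR IGNORE` to skip rows that are already present, and the sample row (including its approximate coordinates) are assumptions made for the example.

```python
import sqlite3


def save_airports(conn, airports):
    """Create the airports table if needed and insert the given rows."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS airports (
               airportCode TEXT PRIMARY KEY,
               airportName TEXT,
               cityCode    TEXT,
               countryCode TEXT,
               lat         REAL,
               lon         REAL)"""
    )
    # Placeholders (?) let sqlite3 handle quoting; OR IGNORE skips duplicates.
    conn.executemany(
        "INSERT OR IGNORE INTO airports VALUES (?, ?, ?, ?, ?, ?)", airports
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
rows = [("TXL", "Berlin-Tegel", "BER", "DE", 52.56, 13.29)]  # illustrative values
save_airports(conn, rows)
save_airports(conn, rows)  # running the script again does not duplicate the row
```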


Analysis and visualisations


The most important airports around the world, according to Lufthansa


European hubs of LH


LH flights in Europe


Data analysis in SQL

- 5 airports as origin with the greatest number of direct connections

- Data over a week

- Net flights per day

- Frequency by week

- 5 hubs based on Lufthansa DB
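
The slide's query is not reproduced in the transcript. One way such a ranking could be written is sketched below, assuming that a row in `flights` with `httpcode` 200 means at least one flight was found for that origin, destination and date; the sample rows are made up.

```python
import sqlite3

# Rank origins by the number of distinct destinations served directly.
TOP_ORIGINS = """
SELECT origin, COUNT(DISTINCT des) AS destinations
FROM flights
WHERE httpcode = 200
GROUP BY origin
ORDER BY destinations DESC, origin
LIMIT 5
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights (origin TEXT, des TEXT, httpcode INTEGER, date TEXT)"
)
conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?)", [
    ("FRA", "MUC", 200, "2017-01-20"),  # made-up sample rows
    ("FRA", "MUC", 200, "2017-01-21"),
    ("FRA", "VIE", 200, "2017-01-20"),
    ("MUC", "FRA", 200, "2017-01-20"),
    ("MUC", "VIE", 404, "2017-01-20"),
])
top = conn.execute(TOP_ORIGINS).fetchall()
print(top)  # [('FRA', 2), ('MUC', 1)]
```

`COUNT(DISTINCT des)` counts each destination once even when it appears on several days of the week; swapping `origin` and `des` gives the ranking by destination.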


Data analysis in SQL (2)

- 5 airports as destination with the greatest number of direct connections


Data analysis in SQL (3)

- Airports in France as origin in Lufthansa DB


Data analysis in SQL (4)

- 5 airports as origin with the fewest direct connections


Data analysis in SQL (5)

- Connections of 5 hubs as origin in Lufthansa DB

Airport           Connection = 0   Connection = 1   Connection > 1
Frankfurt (FRA)   91               105              24
Munich (MUC)      92               103              25
Vienna (VIE)      58               132              30
Zurich (ZRH)      51               132              37
Brussels (BRU)    48               117              55


Data analysis in SQL (6)

Airports with Connection = 1 from Frankfurt, sorted by country
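
A possible form of this query is sketched below as a self-join on the `flights` table: a first leg out of the hub, a second leg from the transfer airport, and an exclusion of airports that already have a direct flight. The query is an illustration rather than the authors' own; it ignores dates, assumes `httpcode` 200 marks an existing flight, and uses made-up sample rows.

```python
import sqlite3

ONE_TRANSFER = """
SELECT DISTINCT a.countryCode, a.airportCode
FROM flights AS leg1
JOIN flights AS leg2 ON leg2.origin = leg1.des
JOIN airports AS a ON a.airportCode = leg2.des
WHERE leg1.origin = :hub
  AND leg1.httpcode = 200
  AND leg2.httpcode = 200
  AND leg2.des <> :hub
  AND leg2.des NOT IN (
      SELECT des FROM flights WHERE origin = :hub AND httpcode = 200)
ORDER BY a.countryCode, a.airportCode
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airports (airportCode TEXT, countryCode TEXT)")
conn.execute(
    "CREATE TABLE flights (origin TEXT, des TEXT, httpcode INTEGER, date TEXT)"
)
conn.executemany("INSERT INTO airports VALUES (?, ?)", [
    ("FRA", "DE"), ("MUC", "DE"), ("VIE", "AT"), ("SZG", "AT"), ("NCE", "FR"),
])
conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?)", [
    ("FRA", "MUC", 200, "2017-01-20"),  # made-up sample rows
    ("FRA", "VIE", 200, "2017-01-20"),
    ("MUC", "FRA", 200, "2017-01-20"),
    ("MUC", "VIE", 200, "2017-01-20"),
    ("MUC", "NCE", 200, "2017-01-20"),
    ("VIE", "SZG", 200, "2017-01-20"),
])
one_transfer = conn.execute(ONE_TRANSFER, {"hub": "FRA"}).fetchall()
print(one_transfer)  # [('AT', 'SZG'), ('FR', 'NCE')]
```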


Visualisation: Force-directed graph in D3.js

● Physical model: forces of attraction and repulsion

● Algorithm defined in D3.js (JavaScript), a popular package for data visualisation

Drawings obtained with force-directed algorithms

Source: https://cs.brown.edu/~rt/gdhandbook/chapters/force-directed.pdf
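
To feed a force-directed layout, the flight data first has to be turned into a list of nodes and a list of links. The sketch below writes the nodes/links JSON shape commonly seen in D3 force-layout examples; the function name, the choice to merge both directions of a route into one link, and the sample routes are assumptions.

```python
import json


def to_d3_graph(routes):
    """Convert (origin, destination) pairs into a nodes/links dictionary."""
    nodes = sorted({code for pair in routes for code in pair})
    # Treat A->B and B->A as a single undirected link.
    links = sorted({tuple(sorted(pair)) for pair in routes})
    return {
        "nodes": [{"id": code} for code in nodes],
        "links": [{"source": a, "target": b} for a, b in links],
    }


routes = [("FRA", "MUC"), ("MUC", "FRA"), ("FRA", "VIE")]  # made-up sample routes
graph = to_d3_graph(routes)
print(json.dumps(graph, indent=2))
```

On the JavaScript side, the link force would then need to be told to match `source` and `target` against the node `id` field.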


Force-directed graph (1)


Force-directed graph (2)

Central node: Frankfurt


Tree


Limitations and perspectives

- Limitations

  - Quality of data

  - Exclusivity of data

- Perspectives

  - A map that shows the frequency of service between airports

  - Country profile: domestic vs local flights

  - Airlines: legacy vs budget


Thank you!
