Connected hubs: an analysis of the Lufthansa network in Europe

Post on 06-Apr-2017


Connected Hubs

1 February 2017

CHAN Sau Yee & WANG Xi

Plan

● Objective

● Lufthansa Open API

● Methodology

● Data analysis

● Data visualisation


Objective

● To produce a map that shows the location of airports in Europe and the direct flights in between

○ What we need...

■ list of European airports

■ direct flights between any two airports

● To analyse the importance of airports based on the number of connections needed

○ What we need...

■ list of European airports

■ number of direct flights for each airport

■ rank of airports by number of destinations


Definitions

● Case study in Europe (defined as EU28 including Britain, plus Switzerland and Norway)

● Connection: the smallest number of transfers needed to travel between two airports

○ Connection = 0: direct flight (without transfer)

○ Connection = 1: with 1 transfer

○ Connection > 1: with more than 1 transfer
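The definition above amounts to a shortest-path search over the network of direct flights. Here is a minimal sketch in Python using breadth-first search. The four-airport network and the helper name `connections_from` are hypothetical, chosen only to illustrate the definition.

```python
from collections import deque


def connections_from(origin, direct_flights):
    """Return {airport: smallest number of transfers} for airports reachable from origin.

    direct_flights maps an airport code to the airports it serves directly.
    """
    legs = {origin: 0}  # number of flight legs needed to reach each airport
    queue = deque([origin])
    while queue:
        airport = queue.popleft()
        for nxt in direct_flights.get(airport, ()):
            if nxt not in legs:
                legs[nxt] = legs[airport] + 1
                queue.append(nxt)
    # One leg is a direct flight (Connection = 0), two legs need 1 transfer, etc.
    return {a: n - 1 for a, n in legs.items() if a != origin}


# Hypothetical network: A-B, B-C and C-D are linked by direct flights.
example_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(connections_from("A", example_network))
```

In this toy network, B is a direct flight from A (Connection = 0), C needs one transfer and D needs two.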


[Figure: example network with airports A, B, C and D]

Lufthansa Open API

● Reference data: Countries, Cities, Airports

● Operations: Flight Schedules

In principle, the data are not limited to Lufthansa flights


Structure of data in the API

Example: Berlin-Tegel airport (TXL) in XML


“Airport”, “RailwayStation” or “BusStation”

Data model

● countries: countryCode, zoneCode, countryName

● cities: cityCode, countryCode, cityName, lat, lon

● airports: airportCode, airportName, cityCode, countryCode, lat, lon

● flights: origin, des, httpcode, date

Legend: primary key, foreign key

Methodology

3 MOOCs on Coursera (University of Michigan):

- “Python Data Structures”

- “Using Python to Access Web Data”

- “Using Databases with Python”

2 books on Python:

- “Think Python” - Allen B. Downey

- “Python for Everybody” - Charles Severance


Methodology

Charles Severance, “Python for Everybody”, Chap. 16


How to GET data

Step 1: Acquire all reference data on Countries, Cities and Airports...

Problem: 1,261 airports in total

→ get all records in several loops by altering the value of offset

Number of records returned: maximum is 100!
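The offset loop described above can be sketched as follows. This is an illustration only: `fetch_page` stands in for a real API call, and the fake data source below is hypothetical, sized to match the 1,261 airports mentioned on the slide.

```python
def fetch_all(fetch_page, limit=100):
    """Collect every record by requesting pages of at most `limit` records."""
    records = []
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        records.extend(page)
        if len(page) < limit:  # a short (or empty) page means we reached the end
            break
        offset += limit
    return records


# Hypothetical stand-in for the reference-data endpoint: 1,261 fake airport codes.
FAKE_AIRPORTS = [f"AP{i:04d}" for i in range(1261)]


def fake_fetch_page(limit, offset):
    return FAKE_AIRPORTS[offset:offset + limit]


all_airports = fetch_all(fake_fetch_page)
print(len(all_airports))  # 1261
```

With a page size of 100, the 1,261 records arrive in 13 requests: 12 full pages and a final page of 61.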


How to GET data (2)

Step 2: Information on all flights between European airports over a week (2017/01/20-2017/01/26)

Obtain a list of European airports by SQL

→ 2 loops to create all possible pairs

220 × 220 = 48,400 pairs ≈ 3 h of execution per day of schedule data!
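The arithmetic behind this estimate can be checked with a short sketch. The airport codes are placeholders, and the duration assumes the requests are sent at exactly the limit of 5 requests per second quoted in these slides, ignoring network latency, so it is a lower bound rather than a measurement.

```python
from itertools import product

# Placeholder codes standing in for the 220 European airports.
codes = [f"A{i:03d}" for i in range(220)]

# Two nested loops over the same list, written with itertools.product.
pairs = list(product(codes, repeat=2))

REQUESTS_PER_SECOND = 5  # rate limit quoted in the slides
seconds = len(pairs) / REQUESTS_PER_SECOND
hours = seconds / 3600

print(len(pairs), round(hours, 2))
```

Note that this count includes the 220 pairs where origin and destination are the same airport. Skipping them would leave 220 × 219 = 48,180 requests.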


Need to always include origin, destination and date in the request
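As an illustration, a request URL carrying the three mandatory values could be built as below. The base URL and the path layout are assumptions based on our reading of the Lufthansa Open API and should be checked against its documentation. `schedule_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Assumed base URL of the API; verify against the official documentation.
BASE_URL = "https://api.lufthansa.com/v1"


def schedule_url(origin, destination, date):
    """Build a flight-schedule URL from origin, destination and date (YYYY-MM-DD)."""
    # Assumed path layout: /operations/schedules/{origin}/{destination}/{date}
    return (
        f"{BASE_URL}/operations/schedules/"
        f"{quote(origin)}/{quote(destination)}/{quote(date)}"
    )


print(schedule_url("FRA", "TXL", "2017-01-20"))
```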

Authorisation: OAuth 2

Token acquisition before requests can be sent
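A sketch of the token step, assuming the OAuth 2 client-credentials grant. The token URL is an assumption to verify against the API documentation, and the helper names are ours. The code only builds the request and the header, and does not contact the server.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed token endpoint; check the API documentation before use.
TOKEN_URL = "https://api.lufthansa.com/v1/oauth/token"


def build_token_request(client_id, client_secret):
    """Prepare (but do not send) a client-credentials token request."""
    body = urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(TOKEN_URL, data=body, headers=headers, method="POST")


def auth_header(access_token):
    """Header to attach to every later request once a token has been obtained."""
    return {"Authorization": f"Bearer {access_token}"}


token_request = build_token_request("my-client-id", "my-client-secret")
print(token_request.get_method(), token_request.full_url)
```

Sending the request with `urllib.request.urlopen(token_request)` should return a JSON body whose `access_token` field is then passed to `auth_header`.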


Rate Limit

● 5 requests / second

● 1,000 → 10,000 requests / hour

● Decorator “RateLimited”

Error is thrown when limits are exceeded


Inserting data into our database

import sqlite3

→ CREATE TABLE IF NOT EXISTS,

INSERT INTO ____ VALUES...
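A minimal sketch of these two statements with `sqlite3`, using the airports columns from the data model. The in-memory database, the choice of `airportCode` as primary key, and the two sample rows (with approximate coordinates) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path would be used for a persistent database
cur = conn.cursor()

# Create the table only on the first run.
cur.execute("""
    CREATE TABLE IF NOT EXISTS airports (
        airportCode TEXT PRIMARY KEY,
        airportName TEXT,
        cityCode    TEXT,
        countryCode TEXT,
        lat         REAL,
        lon         REAL
    )
""")

# Illustrative rows; coordinates are approximate.
sample_rows = [
    ("TXL", "Berlin-Tegel", "BER", "DE", 52.56, 13.29),
    ("FRA", "Frankfurt", "FRA", "DE", 50.03, 8.57),
]

# Placeholders (?) keep values separate from the SQL text;
# OR IGNORE skips rows whose primary key is already stored.
cur.executemany("INSERT OR IGNORE INTO airports VALUES (?, ?, ?, ?, ?, ?)", sample_rows)
cur.executemany("INSERT OR IGNORE INTO airports VALUES (?, ?, ?, ?, ?, ?)", sample_rows)
conn.commit()

airport_count = cur.execute("SELECT COUNT(*) FROM airports").fetchone()[0]
print(airport_count)  # 2
```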


Analysis and visualisations


The most important airports around the world, according to Lufthansa


European hubs of LH


LH flights in Europe


Data analysis in SQL

- 5 airports as origin with the greatest number of direct connections

- Data over a week

- Net flights per day

- Frequency by week

- 5 hubs based on Lufthansa DB
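A query of this kind might look like the sketch below, run against the flights table from the data model. Two assumptions are ours: that `httpcode = 200` marks a pair for which the API returned a direct flight, and the handful of sample rows, which are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (origin TEXT, des TEXT, httpcode INTEGER, date TEXT)")

# Invented sample rows, for illustration only.
conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?)", [
    ("FRA", "MUC", 200, "2017-01-20"),
    ("FRA", "MUC", 200, "2017-01-21"),  # same route on another day
    ("FRA", "VIE", 200, "2017-01-20"),
    ("FRA", "ZRH", 200, "2017-01-20"),
    ("MUC", "FRA", 200, "2017-01-20"),
    ("MUC", "VIE", 200, "2017-01-20"),
    ("VIE", "FRA", 200, "2017-01-20"),
    ("VIE", "TXL", 404, "2017-01-20"),  # no flight found for this pair
])

# COUNT(DISTINCT des) counts each destination once over the whole week.
top_origins = conn.execute("""
    SELECT origin, COUNT(DISTINCT des) AS destinations
    FROM flights
    WHERE httpcode = 200
    GROUP BY origin
    ORDER BY destinations DESC, origin
    LIMIT 5
""").fetchall()

print(top_origins)
```

Swapping `origin` and `des` gives the ranking by destination, and `ORDER BY destinations ASC` gives the airports with the fewest direct connections.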


Data analysis in SQL (2)

- 5 airports as destination with the greatest number of direct connections


Data analysis in SQL (3)

- Airports in France as origin in Lufthansa DB
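Filtering origins by country requires joining flights to airports on the airport code. The sketch below assumes `httpcode = 200` marks an existing flight and uses invented sample rows. It illustrates the join and is not the authors' query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airports (airportCode TEXT PRIMARY KEY, countryCode TEXT)")
conn.execute("CREATE TABLE flights (origin TEXT, des TEXT, httpcode INTEGER, date TEXT)")

# Invented sample rows, for illustration only.
conn.executemany("INSERT INTO airports VALUES (?, ?)", [
    ("CDG", "FR"), ("NCE", "FR"), ("LYS", "FR"), ("FRA", "DE"), ("MUC", "DE"),
])
conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?)", [
    ("CDG", "FRA", 200, "2017-01-20"),
    ("NCE", "MUC", 200, "2017-01-20"),
    ("FRA", "CDG", 200, "2017-01-20"),
    ("LYS", "FRA", 404, "2017-01-20"),  # no flight found for this pair
])

# Keep origins located in France that have at least one flight on record.
french_origins = [row[0] for row in conn.execute("""
    SELECT DISTINCT f.origin
    FROM flights AS f
    JOIN airports AS a ON a.airportCode = f.origin
    WHERE a.countryCode = 'FR' AND f.httpcode = 200
    ORDER BY f.origin
""")]

print(french_origins)
```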


Data analysis in SQL (4)

- 5 airports as origin with the fewest direct connections


Data analysis in SQL (5)

- Connections of 5 hubs as origin in Lufthansa DB


Airport           Connection = 0   Connection = 1   Connection > 1

Frankfurt (FRA)   91               105              24

Munich (MUC)      92               103              25

Vienna (VIE)      58               132              30

Zurich (ZRH)      51               132              37

Brussels (BRU)    48               117              55
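As a quick consistency check, the three counts in each row of the table should add up to the same number of airports. The figures below are copied from the table. The variable names are ours.

```python
# (Connection = 0, Connection = 1, Connection > 1) per hub, copied from the table.
hub_connections = {
    "FRA": (91, 105, 24),
    "MUC": (92, 103, 25),
    "VIE": (58, 132, 30),
    "ZRH": (51, 132, 37),
    "BRU": (48, 117, 55),
}

totals = {hub: sum(counts) for hub, counts in hub_connections.items()}
print(totals)
```

Every row sums to 220, which matches the 220 European airports used to build the origin-destination pairs.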

Data analysis in SQL (6)

- Airports with Connection = 1 from Frankfurt, sorted by country


Visualisation: Force-directed graph in D3.js

● Physical model: forces of attraction and repulsion

● Algorithm defined in D3.js (JavaScript), a popular package for data visualisation

Drawings obtained with force-directed algorithms

Source: https://cs.brown.edu/~rt/gdhandbook/chapters/force-directed.pdf
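D3 force-layout examples commonly read a JSON object with a `nodes` list and a `links` list of `source`/`target` pairs. The Python sketch below shows how route pairs could be converted to that shape. Whether the authors exported their data this way is an assumption, and `to_d3_graph` is a hypothetical helper.

```python
import json


def to_d3_graph(routes):
    """Convert (origin, destination) pairs into a nodes/links dictionary."""
    airports = sorted({code for pair in routes for code in pair})
    return {
        "nodes": [{"id": code} for code in airports],
        "links": [{"source": o, "target": d} for o, d in routes],
    }


# Illustrative routes between three of the hubs named in these slides.
graph = to_d3_graph([("FRA", "MUC"), ("FRA", "VIE")])
print(json.dumps(graph, indent=2))
```

On the JavaScript side, the links would be matched to nodes by their `id` field.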


Force-directed graph (1)


Force-directed graph (2)

Central node: Frankfurt


Tree


Limitations and perspectives

- Limitations

  - Quality of data

  - Exclusivity of data

- Perspectives

  - A map that shows the frequency of service between airports

  - Country profile: domestic vs local flights

  - Airlines: legacy vs budget


Thank you!
