Lessons From the Trenches: using Mobile Phone Data for Official Statistics Maarten Vanhoof Orange Labs/Newcastle University [email protected] @Metti Hoof MaartenVanhoof.com

Page 1

Lessons From the Trenches: using Mobile Phone Data for Official Statistics

Maarten Vanhoof

Orange Labs/Newcastle University

[email protected]

@Metti Hoof

MaartenVanhoof.com

Page 2

Mobile Phone Data (Call Detail Records)

Page 3

Mobile Phone Data (Signaling)

Page 4

Mobile Phone Data (Call Detail Records)

Metadata

• Caller (phone)

• Called phone

• Timestamp

• Type of event

• Duration of call/Length of text

• Location of celltower

• …
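The metadata fields above can be pictured as one record per call or text. The following is a minimal sketch, not an operator's actual schema: the field names, the comma-separated layout and the `parse_cdr_line` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class CDRRecord:
    """One Call Detail Record; all field names are hypothetical."""
    caller_id: str               # pseudonymised identifier of the calling phone
    callee_id: str               # pseudonymised identifier of the called phone
    timestamp: datetime          # start time of the event
    event_type: str              # assumed to be "call" or "text"
    duration_s: Optional[int]    # call duration in seconds, None for texts
    text_length: Optional[int]   # text length in characters, None for calls
    tower_lon: float             # longitude of the cell tower that handled the event
    tower_lat: float             # latitude of the cell tower that handled the event


def parse_cdr_line(line: str) -> CDRRecord:
    """Parse one line of an assumed CSV layout:
    caller,callee,timestamp,type,size,lon,lat
    where `size` is seconds for calls and characters for texts."""
    caller, callee, ts, kind, size, lon, lat = line.strip().split(",")
    is_call = kind == "call"
    return CDRRecord(
        caller_id=caller,
        callee_id=callee,
        timestamp=datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"),
        event_type=kind,
        duration_s=int(size) if is_call else None,
        text_length=None if is_call else int(size),
        tower_lon=float(lon),
        tower_lat=float(lat),
    )


record = parse_cdr_line("a1f3,9c2e,2016-03-01 21:15:00,call,125,2.3522,48.8566")
print(record.event_type, record.duration_s, record.timestamp.hour)
```

Note that the location is that of the cell tower, not of the phone itself, which is what makes the spatial questions later in the talk non-trivial.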

Page 5

Mobile Phone Data (Call Detail Records)

Toole et al. (2015) Coupling Human Mobility and Social Ties.

Page 6

Individual indicators: Bandicoot

https://github.com/yvesalexandre/bandicoot

http://bandicoot.mit.edu/demo/

• active days

• number of contacts

• number of interactions

• call duration

• percent nocturnal

• percent initiated interactions

• response delay text

• entropy of contacts

• balance of contacts

• interactions per contact

• inter-event time

• percent pareto interactions

• percent pareto durations

• number of antennas

• entropy of antennas

• percent at home

• radius of gyration

• frequent antennas
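To make one of these indicators concrete, here is a rough sketch of a radius of gyration computed from the tower locations of a user's records. It is written from the textbook definition (root mean square distance of the visited locations to their centre) and is not Bandicoot's own implementation. The function names are hypothetical, and taking the plain average of latitudes and longitudes as the centre is a simplifying assumption that only holds for small areas.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def radius_of_gyration_km(visits):
    """visits: list of (lat, lon) tower positions, one entry per record.

    Returns the root mean square distance (km) of the visits to their centre.
    """
    if not visits:
        raise ValueError("need at least one visit")
    n = len(visits)
    # Simplification: the centre is the arithmetic mean of the coordinates.
    centre_lat = sum(lat for lat, _ in visits) / n
    centre_lon = sum(lon for _, lon in visits) / n
    mean_sq = sum(
        haversine_km(lat, lon, centre_lat, centre_lon) ** 2 for lat, lon in visits
    ) / n
    return math.sqrt(mean_sq)


# A user seen only at one tower has (close to) zero radius of gyration.
print(radius_of_gyration_km([(48.85, 2.35)] * 3))
# A user split evenly between two towers 2 degrees apart on the equator.
print(radius_of_gyration_km([(0.0, 0.0), (0.0, 2.0)]))
```

Because each record counts once, a tower that is used often pulls the centre towards it, so the result depends on how actively the phone is used as well as on where the user goes.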

Page 7

Individual indicators for official statistics

Behavioural

Individual mobility

e.g. diversity of mobility

Contextual

• Car ownership

• Access to public transport

• Income

• Marital status

• Membership

• Home location

• Etc.

Page 8

Individual indicators for official statistics

Pappalardo, Vanhoof, et al. (2016) An Analytical Framework to Nowcast Well-Being using Mobile Phone Data.

Page 9

Individual indicators for official statistics

Pappalardo, Vanhoof, et al. (2016) An Analytical Framework to Nowcast Well-Being using Mobile Phone Data.

Page 10

Individual indicators for official statistics

Page 11

(Geographical) Veracity

Spatial allocation

Spatial delineation

Spatial aggregation

Page 12

(Geographical) Veracity

Spatial allocation

Spatial delineation

Spatial aggregation

Page 13

Spatial allocation: Home detection

Pappalardo, Vanhoof, et al. (2016) An Analytical Framework to Nowcast Well-Being using Mobile Phone Data.

Page 14

Spatial allocation: Home detection

Uncertainty of home allocation algorithms

• No knowledge of how confidently we can geographically pinpoint users

• Because no ground truth is available
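For readers unfamiliar with home detection, the sketch below shows one simple heuristic of the kind at stake here: pick the tower a user is seen at most often during night hours. It is an illustrative assumption, not the specific criteria evaluated in the talk. The function name, the 19:00 to 07:00 window and the "share" score are hypothetical choices.

```python
from collections import Counter
from datetime import datetime


def detect_home(records, night_start=19, night_end=7):
    """Guess a home tower from (timestamp, tower_id) pairs.

    Heuristic (an assumption for illustration): the tower with the most
    events between night_start and night_end. Falls back to all records
    when the user has no night-time activity.
    Returns (tower_id, share), where share is the fraction of the
    considered events that fall on the chosen tower.
    """
    night = [
        tower for ts, tower in records
        if ts.hour >= night_start or ts.hour < night_end
    ]
    pool = night if night else [tower for _, tower in records]
    if not pool:
        return None, 0.0
    tower, hits = Counter(pool).most_common(1)[0]
    return tower, hits / len(pool)


records = [
    (datetime(2016, 3, 1, 22, 0), "T1"),
    (datetime(2016, 3, 2, 6, 30), "T1"),
    (datetime(2016, 3, 2, 12, 0), "T2"),
    (datetime(2016, 3, 2, 13, 0), "T2"),
    (datetime(2016, 3, 2, 14, 0), "T2"),
    (datetime(2016, 3, 2, 23, 0), "T3"),
]
home, share = detect_home(records)
print(home, round(share, 2))
```

The share is only an internal consistency score. It says how dominant the chosen tower is, not whether the user actually lives there, which is exactly the gap left by the missing ground truth.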

Page 15

Spatial allocation: Home detection

Page 16

Spatial allocation: Home detection

Performance

Uncertainty

Vanhoof et al. (Submitted) Investigating Performance and Spatial Uncertainty of Home Detection Criteria for CDR data

Page 17

Page 18

Spatial allocation: Solution?

• In the short term, we need to:

  • Create a better understanding of the uncertainty that comes with home detection

  • Test heuristics for home detection on different databases and for different countries

  • Design surveys to gather ground truth at the individual level

• In the long term, we need to:

  • Understand how changes in mobile phone use and in available datasets influence allocation

  • Decide on standardizing home detection and error assessment

  • Design a platform where all operators, researchers and policy makers can easily do this and compare results between different datasets

Page 19

(Geographical) Veracity

Spatial allocation

Spatial delineation

Spatial aggregation

Page 20

Spatial delineation

• Uneven delineations of space:

  • Between antennas (high-density vs. low-density, operator 1 vs. operator 2, ...)

  • Between antennas and administrative regions (cell-tower coverage vs. municipalities)

  • Between different definitions of urban areas (Urban Units vs. Urban Areas)

• These create errors that are poorly understood and challenging to address

• This is relevant for:

  • Population density estimations

  • Mobility derivation

  • Parameter estimation (e.g. for urban scaling laws) in statistical analysis

  • Error/uncertainty assessment

Page 21

Spatial delineation: Mobility Entropy

Vanhoof et al. (Submitted) Correcting Mobility Entropy from CDR data for large-scale comparison of individual movement patterns
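As background, mobility entropy is commonly computed as the Shannon entropy of the share of a user's records at each cell tower. The sketch below uses that common definition and is not the correction proposed in the cited paper. The function name and the toy tower identifiers are hypothetical. It illustrates why delineation matters: the same sequence of visits yields a different entropy when it is observed through a finer or a coarser tower layout.

```python
import math
from collections import Counter


def mobility_entropy(tower_ids, normalise=False):
    """Shannon entropy (in bits) of the distribution of records over towers.

    With normalise=True the value is divided by log2(number of distinct
    towers), giving a score between 0 and 1.
    """
    counts = Counter(tower_ids)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one record")
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    if normalise:
        return entropy / math.log2(len(counts)) if len(counts) > 1 else 0.0
    return entropy


# The same six visits, seen through a dense tower layout ...
fine = ["A1", "A2", "B1", "B2", "A1", "B1"]
# ... and through a sparse layout where A1/A2 and B1/B2 are each one tower.
coarse = [tower[0] for tower in fine]

print(round(mobility_entropy(fine), 3), round(mobility_entropy(coarse), 3))
```

In this toy case the dense layout gives a noticeably higher entropy for identical movement, so comparing users across regions with different tower densities calls for some form of correction.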

Page 22

Spatial delineation: Mobility Entropy

Vanhoof et al. (Submitted) Correcting Mobility Entropy from CDR data for large-scale comparison of individual movement patterns

Page 23

Spatial delineation: Urban scaling laws

Cottineau et al. (2016) Paradoxical Interpretations of Urban Scaling Laws
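An urban scaling law is usually written as Y = Y0 · N^β, with N the population of a city and Y some urban quantity. The exponent β is typically estimated by a straight-line fit in log-log space. The sketch below shows that estimation step only, with a hypothetical function name and made-up numbers. Since N and Y both depend on where the city boundary is drawn, the fitted β can change when the delineation changes.

```python
import math


def fit_scaling_exponent(populations, quantities):
    """Ordinary least squares fit of log(Y) = log(Y0) + beta * log(N).

    Returns (beta, y0).
    """
    xs = [math.log(n) for n in populations]
    ys = [math.log(y) for y in quantities]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    y0 = math.exp(mean_y - beta * mean_x)
    return beta, y0


# Synthetic cities that follow Y = 2 * N^1.15 exactly (made-up numbers).
populations = [10_000, 100_000, 1_000_000]
quantities = [2.0 * n ** 1.15 for n in populations]
beta, y0 = fit_scaling_exponent(populations, quantities)
print(round(beta, 3), round(y0, 3))
```

Re-running the same fit with populations and quantities recomputed under another city definition is a simple way to see how sensitive the exponent is.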

Page 24

Spatial delineation: Solution?

• In the short term, we need to work on:

  • Minimizing the influence of spatial delineations on our measurements

  • Techniques that allow translation between different spatial delineations

  • Assessments of the influence of spatial delineation (geo-computation)

• In the long term, we need to:

  • Think through the possibilities for standardizing spatial delineations

  • Develop practices in Official Statistics that express the effect of spatial delineation
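One simple way to translate between delineations is areal weighting: counts observed per cell-tower coverage area are split over administrative regions in proportion to the overlapping surface. The sketch below assumes that the overlap areas were computed beforehand with a GIS tool and that activity is spread uniformly inside each coverage area. Both are strong assumptions, and the function and identifiers are hypothetical.

```python
def reallocate(cell_counts, overlap_km2):
    """Redistribute counts from cells to regions by overlap area.

    cell_counts: {cell_id: count}
    overlap_km2: {(cell_id, region_id): overlapping area in km^2}
    Returns {region_id: estimated count}.
    """
    # Total overlapping area per cell, used to turn areas into shares.
    cell_area = {}
    for (cell, _region), area in overlap_km2.items():
        cell_area[cell] = cell_area.get(cell, 0.0) + area

    region_counts = {}
    for (cell, region), area in overlap_km2.items():
        share = area / cell_area[cell]
        contribution = cell_counts.get(cell, 0.0) * share
        region_counts[region] = region_counts.get(region, 0.0) + contribution
    return region_counts


cell_counts = {"c1": 100, "c2": 50}
overlap_km2 = {
    ("c1", "m1"): 3.0,   # three quarters of cell c1 lies in municipality m1
    ("c1", "m2"): 1.0,
    ("c2", "m2"): 2.0,   # cell c2 lies entirely in municipality m2
}
print(reallocate(cell_counts, overlap_km2))
```

The totals are preserved, but the uniformity assumption introduces an error of its own, which is one reason the influence of the delineation needs to be assessed rather than assumed away.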

Page 25

Spatial delineation: Urban scaling laws

Cottineau et al. (2016) Paradoxical Interpretations of Urban Scaling Laws

Page 26

(Geographical) Veracity

Spatial allocation

Spatial delineation

Spatial aggregation

Page 27

Spatial aggregation

• Scale does matter for:

  • Unintended selective filtering (e.g. highly active persons, communities)

  • Objective construction of indicators (e.g. 5 km in Paris or in the Pyrenees)

  • Representativeness of single operators (e.g. distorted market shares)

  • Personal behaviour (e.g. long-distance vs. short-distance trips)

  • Geographical, economic, sociological, ecological, etc. context (e.g. transport infrastructure)

• Still, there is no evidence that current (spatial) aggregation practices take any of these into account when studying mobile phone data.

• In addition, given the rapidly changing nature of mobile phone use, it is my hypothesis that behavioural data is even more prone to this fallacy.

Page 28

Spatial aggregation

Cell-tower level / IRIS level

Population Density Estimation vs. Official Statistics

Relations between indicators

Page 29

Spatial aggregation: Solution?

• In the short term, we need to work on:

  • Techniques that define the best spatial scale for studying certain processes

    • Both empirical, quantitative (e.g. optimal raster sizes for population density estimations)

    • And theoretical, qualitative (e.g. expert judgment)

  • Techniques that express the changing nature of observations when (spatially) aggregating

    • E.g. representativeness in population terms of single-operator data at different scales

  • Techniques that investigate, or even incorporate, the sensitivity of definitions to spatial scale

    • E.g. fragmented definitions of distance according to scale

  • Techniques that investigate the sensitivity of data to spatial aggregation

    • E.g. spatial autocorrelation

• In the long term, we need to:

  • See how all of this evolves over time as human behaviour and mobile phone use change
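Spatial autocorrelation, mentioned above as one way to probe sensitivity to aggregation, is often summarised with Moran's I. The sketch below implements the standard global Moran's I for a small set of spatial units. The function name, the binary neighbour matrix and the values are made up for illustration, and this is not presented as the analysis used in the talk.

```python
def morans_i(values, weights):
    """Global Moran's I.

    values:  list of observations, one per spatial unit
    weights: square matrix (list of lists), weights[i][j] > 0 when
             units i and j are considered neighbours
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    variance_sum = sum(d * d for d in deviations)
    return (n / weight_sum) * (cross / variance_sum)


# Four units in a row; each unit neighbours the units next to it.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], chain))   # smooth gradient: positive
print(morans_i([1, 4, 1, 4], chain))   # alternating values: negative
```

Computing the statistic again after aggregating the same data to larger units gives a first impression of how much of the spatial structure survives the aggregation.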

Page 30

Thoughts

• Why start from individual indicators?

  • Privacy issues (newer datasets don’t allow this)

  • Computationally expensive treatment

  • Temporal resolution is far from optimal

  • Difficult to communicate/visualise

• Why not use the ‘big’ aspect of the data and work with patterns?

  • Activity patterns of cell-towers

  • High-level communication/commuting patterns

  • Population presence registration
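As an illustration of the first pattern-based option, an activity pattern of a cell tower can be as simple as the share of its events falling in each hour of the day. The sketch below is a hypothetical minimal version: the function name and the input format of (timestamp, tower_id) pairs are assumptions. No individual identifiers are needed for it.

```python
from collections import defaultdict
from datetime import datetime


def hourly_activity_profiles(events):
    """Build a 24-value activity profile per tower.

    events: iterable of (timestamp, tower_id) pairs
    Returns {tower_id: [share of events in hour 0, ..., hour 23]}.
    """
    counts = defaultdict(lambda: [0] * 24)
    for timestamp, tower in events:
        counts[tower][timestamp.hour] += 1

    profiles = {}
    for tower, hours in counts.items():
        total = sum(hours)
        profiles[tower] = [h / total for h in hours]
    return profiles


events = [
    (datetime(2016, 3, 1, 9, 5), "T1"),
    (datetime(2016, 3, 1, 9, 40), "T1"),
    (datetime(2016, 3, 1, 18, 10), "T1"),
    (datetime(2016, 3, 1, 22, 30), "T1"),
    (datetime(2016, 3, 1, 22, 45), "T2"),
]
profiles = hourly_activity_profiles(events)
print(profiles["T1"][9], profiles["T2"][22])
```

Towers with similar profiles could then be grouped, for example to separate areas that are busy during office hours from areas that are busy in the evening.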

Page 31

High-level analysis: Learning Urban Areas

Combes, de Bellefon and Vanhoof (Submitted) Understanding urban centers organization and influence with mobile phone data

Page 32

High-level analysis: Learning Urban Areas

Combes, de Bellefon and Vanhoof (Submitted) Understanding urban centers organization and influence with mobile phone data

Page 33

Don’t be Batman.

The same problems and scientific questions will persist. Only now they are less visible and, as such, raised less often.

Page 34

Conclusion

• ‘Work from the trenches’ on individual data identifies problems but:

  • Is done by a limited number of researchers

    • Not a priority for operators (never was, never will be)

    • Lack of data and knowledge at the institutions (but they are catching up)

    • Limited rewards in academia, limited scientific community

  • Is threatened by protective measures on data

    • Impossibility to continue pursuing in-depth research

    • Research has fled to African data, but the quality of official statistics there is limited

    • Shared platforms for analysis are being developed, but these simplify workflows

  • Is mostly limited to one dataset, one operator

    • Comparison of findings is absolutely necessary for better insights and methods

    • The dream of full population coverage is feasible but needs strong policy

Page 35

Thank you,

The end.

[email protected]

@Metti Hoof

MaartenVanhoof.com