Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Listening to the pulse of our cities fusing Social Media Streams and Call Data Records Emanuele Della Valle [email protected] http://emanueledellavalle.org 18th International Conference on Business Information Systems 24-26 June 2015, Poznań, Poland


Page 1: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Emanuele Della Valle

[email protected]

http://emanueledellavalle.org

18th International Conference on Business Information Systems

24-26 June 2015, Poznań, Poland

Page 2: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Me

Assistant Professor at DEIB, Politecnico di Milano

Expert in semantic technologies and stream computing

Inventor of stream reasoning: an approach to master the velocity and variety dimensions of Big Data

15 years of experience in research and innovation projects

Start-up founder: fluxedo.com

Emanuele Della Valle

[email protected]

http://emanueledellavalle.org

http://streamreasoning.org

http://fluxedo.com

Page 3: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Acknowledgements

Politecnico di Milano
• DEIB
  – What: scientific direction, semantic technologies, stream processing, data science
  – Who: Emanuele Della Valle, Marco Balduini
• Density Design Lab
  – What: visual analytics
  – Who: Paolo Ciuccarelli, Matteo Azzi

Telecom Italia
• SKIL Lab
  – What: Big Data technology, data science
  – Who: Fabrizio Antonelli, Roberto Larcher

Funding agency

Page 4: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Agenda

• Context
• Problem
• Experimental setting
• Solution
• Evaluation
• Conclusions

Page 5: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

The digital reflection of our cities is sharpening

[photo: http://hoglundassociates.com/Images/Cloud_Gate.jpg]

Page 6: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

The digital reflection of our cities is sharpening

[photo: http://hoglundassociates.com/Images/Cloud_Gate.jpg]

because the urban environment is captured in open datasets

Page 7: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

The digital reflection of our cities is sharpening

[photo: http://hoglundassociates.com/Images/Cloud_Gate.jpg]

and streams of information flow through our cities thanks to

Page 8: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

The digital reflection of our cities is sharpening

[photo: http://hoglundassociates.com/Images/Cloud_Gate.jpg]

and streams of information flow through our cities thanks to the pervasive deployment of sensors

Page 9: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

The digital reflection of our cities is sharpening

[photo: http://hoglundassociates.com/Images/Cloud_Gate.jpg]

and streams of information flow through our cities thanks to the wide adoption of smartphones

Page 10: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

The digital reflection of our cities is sharpening

[photo: http://hoglundassociates.com/Images/Cloud_Gate.jpg]

and streams of information flow through our cities thanks to the usage of (location-based) social networks

Page 11: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

and it is tracking changes with a decreasing delay

Page 12: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

and it is tracking changes with a decreasing delay

Data source         By when         Frequency       Delay
Census data         100s of years   years           months
Newspaper           100s of years   days            1 day
Weather sensors     10s of years    hours/minutes   hours/minutes
TV news             10s of years    hours           minutes
Traffic sensors     years           15 minutes      minutes
Call Data Records   years           15 minutes      hours
Social media        years           seconds         seconds
IoT                 recently        milliseconds    milliseconds

Page 13: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Data pile up without making decisions any easier

I have to decide: A or B?

Why not C? What if D?

mayor

Page 14: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

But smarter Big Data can …

… advance our ability to feel the pulse of our cities

fusing all those data sources

making sense of the fused information

to improve decision making and deliver innovative services

mayor

Definitely E!

Page 15: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Let's focus on a concrete research question

Can we collect, analyse and repurpose

• social media and

• Call Data Records

to allow

• perceiving emerging patterns and

• observing their dynamics?

[photo: https://www.flickr.com/photos/debord/4932655275]

Page 16: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

More precisely, the research question is

Can we collect, analyse and repurpose

• social media captured at places and events and

• privacy-preserving aggregates of Call Data Records

to allow visually

• perceiving emerging patterns and

• observing their dynamics?

[photo: https://www.flickr.com/photos/debord/4932655275]

Page 17: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

How to set up an experiment?

[photo: https://www.flickr.com/photos/myfuturedotcom/6053042920]

Question                 Answer
Which city?              Milan
Comparing what?          Milan Design Week vs. Milan in general
Experimental subjects?   Event managers & casual audience

Page 18: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

What's Milan Design Week?

[map: http://www.fuorisalone.it]

The Milan Design Week (MDW) is a city-scale event
• held yearly in Milan,
• featuring around 1,200 events
• in 500+ places spread across the city and
• attracting about half a million people from all over the world.

Page 19: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Ingredients of the proposed solution

Big Data technologies
- Address "velocity" of data streams in memory
- Address "volume" of data that do not fit in memory

Semantic technologies
- Address "variety" using Ontology-Based Data Access
- Named Entity Recognition and Linking

Data science
- Statistical modelling
- Anomaly detection

Visual analytics
- Allow non-expert access to data
- Tell stories out of data

Page 20: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

CitySensing - a solution for event managers (2013)

F. Antonelli, M. Azzi, M. Balduini, P. Ciuccarelli, E. Della Valle, R. Larcher: City sensing: visualising mobile and social data about a city scale event. AVI 2014: 337-338

http://jol.telecomitalia.com/jolskil/citysensing/

Page 21: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

CitySensing - a solution for casual audience (2014)

M. Balduini, E. Della Valle, M. Azzi, R. Larcher, F. Antonelli, and P. Ciuccarelli: CitySensing: Fusing City Data for Visual Storytelling. IEEE MultiMedia. TO APPEAR

http://jol.telecomitalia.com/jolskil/citysensing/

http://citysensing.fuorisalone.it/

Page 22: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

How CitySensing works – step 0

Set up a conceptual model (FraPPE) to master the variety in the data sources

M. Balduini, E. Della Valle: FraPPE: a vocabulary to represent heterogeneous spatio-temporal data to support visual analytics. ISWC 2015 TO APPEAR

Page 23: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

How CitySensing works – step 0

FraPPE
• Goal: a vocabulary to represent heterogeneous spatio-temporal data to support visual analytics

FraPPE offers a homogeneous view to the visual analytics interface built on heterogeneous data

Page 24: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

How CitySensing works – step 1

For every pixel compute the volume of Call Data Records (using privacy-preserving aggregation)

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 25: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

How CitySensing works – step 2

Find the anomalous pixels by comparing the current volumes with a model of the volumes in this time period

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 26: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

How CitySensing works – step 3

Map anomalies to the districts of Milano Design Week

Brera

Tortona

What's this?

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 27: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

How CitySensing works – step 4

For every anomalous pixel capture the hashtags and semantic entities named in the social media streams

Brera

Tortona

What's this?

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 28: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

How CitySensing works – step 5

Take away the hashtags and semantic entities that are systematically used

Brera

Tortona

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 29: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Logical architecture of CitySensing – setup time

[diagram components: Capture Data Stream, Capture Static Data, Analyse Data Stream, Build Models; label: MDW]

Page 30: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Logical architecture of CitySensing – run time

[diagram components: Capture Data Stream, Capture Static Data, Analyse Data Stream, Build Models, Detect Anomalies, Store Analysis, Visualize Analysis; label: MDW]

Page 31: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Capturing static data via FraPPE

The frame duration was fixed to 15 minutes

Milano area was covered with
• 1 grid (100x100)
• 10,000 cells
• 250x250 meters in each cell (the size of the mobile network cells in the centre of Milan)

During the Milano Design Week a total of 5.76 Mln pixels were captured

More than 1,000 events in more than 600 places were collected using the crowd-sourced databases of fuorisalone.it, breradesigndistrict.it and tortonaroundesign.com thanks to a partnership with studiolabo

[figure: cells in which there are places hosting Milan Design Week 2013 events]
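The grid and frame figures on this slide can be turned into a small sketch of how a single observation might be assigned to a pixel. Only the 100x100 grid, the 250 m cell size and the 15-minute frame come from the slide; the row-major cell numbering, the use of metre offsets from the grid's south-west corner and the function names are assumptions made for illustration. The last lines check that 5.76 Mln pixels is consistent with 10,000 cells, 96 frames per day and six days of observation (six days is an assumption, matching the Tuesday-to-Sunday plots later in the deck).

```python
from datetime import datetime

GRID_SIDE = 100       # cells per side (from the slide)
CELL_METERS = 250     # cell edge in metres (from the slide)
FRAME_MINUTES = 15    # frame duration (from the slide)


def cell_of(x_m: float, y_m: float) -> int:
    """Map metre offsets from the grid's south-west corner to a cell id in 0..9999.

    Row-major numbering is a hypothetical choice, not the authors' scheme.
    """
    col = int(x_m // CELL_METERS)
    row = int(y_m // CELL_METERS)
    if not (0 <= col < GRID_SIDE and 0 <= row < GRID_SIDE):
        raise ValueError("point outside the grid")
    return row * GRID_SIDE + col


def frame_of(ts: datetime) -> int:
    """Index of the 15-minute frame within the day, in 0..95."""
    return (ts.hour * 60 + ts.minute) // FRAME_MINUTES


def pixel_of(x_m: float, y_m: float, ts: datetime):
    """A pixel is identified here by (cell, calendar day, frame of the day)."""
    return (cell_of(x_m, y_m), ts.date(), frame_of(ts))


# Consistency check of the slide's total, assuming six observed days.
frames_per_day = 24 * 60 // FRAME_MINUTES
pixels_six_days = GRID_SIDE * GRID_SIDE * frames_per_day * 6
```

With these assumptions, `frames_per_day` is 96 and `pixels_six_days` is 5,760,000.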

Page 32: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Processing Telecom Italia Call Data Records

1.92 Mln Gaussian models were built
• one for each pixel (i.e., for each frame and cell)
• grouping the frames by working and week-end days
• using two months of Call Data Records, and
• verifying that the volume of CDR has a Gaussian distribution with an Anderson-Darling test at a significance level of 0.05

Built on Pig, R and Cascalog

The processing on 7 m1.large EC2 machines took 24 hours

[figure: histograms and Q-Q plots of the CDR volume for a bad case and a good case]
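As a rough illustration of the model-building step, the sketch below groups historical volumes by cell, frame of the day and day type, and keeps a mean and a standard deviation for each group. It is not the authors' Pig/R/Cascalog pipeline: the record layout, the `day_type` helper and the rule that Saturday and Sunday are the week-end days are assumptions. The Anderson-Darling normality check is left out; in Python one option would be `scipy.stats.anderson`. The final line checks that 1.92 Mln is consistent with 10,000 cells, 96 frames per day and 2 day types.

```python
from collections import defaultdict
from statistics import mean, stdev


def day_type(weekday: int) -> str:
    """0 = Monday ... 6 = Sunday; Saturday and Sunday count as week-end (assumption)."""
    return "weekend" if weekday >= 5 else "working"


def fit_models(history):
    """Fit one Gaussian (mu, sigma) per (cell, frame, day type).

    history: iterable of (cell, frame, weekday, volume) observations,
    a hypothetical record layout chosen for this sketch.
    """
    groups = defaultdict(list)
    for cell, frame, weekday, volume in history:
        groups[(cell, frame, day_type(weekday))].append(volume)
    return {
        key: (mean(volumes), stdev(volumes))
        for key, volumes in groups.items()
        if len(volumes) >= 2  # the sample standard deviation needs two points
    }


# Consistency check of the slide's total number of models.
n_models = 100 * 100 * 96 * 2
```

With two months of data each group would hold several observations per working or week-end frame, which is what makes a per-group normality test meaningful.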

Page 33: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Processing Telecom Italia Call Data Records

Volume of CDR captured in Milan during the Design Week

What                     2013          2014
Calls                    16,743,875    19,719,629
SMSs                     19,454,497    20,240,485
Internet data accesses   137,381,761   197,767,245

Calls, SMS and Internet access were aggregated (with privacy-preserving methods) and an anomaly index was computed for each of the 5.76 Mln pixels

The processing of 1 day on 7 m1.large EC2 machines took 20 mins

[image: https://cerijayne.files.wordpress.com/2011/10/outliersss.png]

Page 34: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Do CDR-anomalous pixels relate to events?

CDR-anomalous pixels = pixels in which the anomaly index is high (> +2σ or < -2σ)
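The definition above can be sketched as a two-sided threshold test. The slides do not spell out how the anomaly index is computed; the version below assumes it is the distance of the observed volume from the mean of the pixel's Gaussian model, measured in standard deviations. Function and parameter names are hypothetical.

```python
def anomaly_index(volume: float, mu: float, sigma: float) -> float:
    """Distance of the observed volume from the model mean, in units of sigma.

    Treating the anomaly index as a z-score is an assumption of this sketch.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (volume - mu) / sigma


def is_anomalous(volume: float, mu: float, sigma: float, threshold: float = 2.0) -> bool:
    """True when the volume lies more than `threshold` sigmas above or below the mean."""
    return abs(anomaly_index(volume, mu, sigma)) > threshold
```

For a pixel whose model has mean 100 and standard deviation 10, a volume of 130 or of 75 would be flagged, while 115 would not.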

To test if the anomalous pixels were related to the events of the Milan Design Week
• We used three ground truths
  – the pixels of Milan
  – the pixels of Brera district
  – the pixels of Tortona district
  where there was at least one event of Milan Design Week 2013
• We computed
  – Precision
  – Recall
  of the anomalous pixels in finding pixels in those three ground truths
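A minimal sketch of the evaluation, assuming that both the detected pixels and the ground truth are represented as sets of pixel identifiers for a given time frame. The function name and the convention of returning 0.0 for an empty set are choices made here, not taken from the slides.

```python
def precision_recall(detected: set, ground_truth: set):
    """Precision and recall of a set of detected pixels against a ground truth.

    precision = |detected ∩ truth| / |detected|
    recall    = |detected ∩ truth| / |truth|
    """
    hits = len(detected & ground_truth)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Toy example: 4 anomalous pixels, 6 pixels hosting events, 2 in common.
example_precision, example_recall = precision_recall({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8})
```

In the toy example half of the detected pixels host an event (precision 0.5), but only a third of the event pixels are detected (recall 1/3), which is the kind of gap the "low recall at city scale" lesson refers to.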

Page 35: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Do CDR-anomalous pixels relate to events?

[figure: precision of the CDR-anomalous pixels over time, one panel each for Milan, Brera and Tortona; x-axis from 09 04:00 to 14 22:00 in 6-hour steps (Tuesday to Sunday), y-axis from 0 to 1]

Page 36: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Do CDR-anomalous pixels relate to events?

[figure: recall of the CDR-anomalous pixels over time, one panel each for Milan, Brera and Tortona; x-axis from 09 04:00 to 14 22:00 in 6-hour steps (Tuesday to Sunday), y-axis from 0 to 1]

Page 37: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Do CDR-anomalous pixels relate to events?

[figure: precision and recall of the CDR-anomalous pixels over time, one panel each for Milan, Brera and Tortona; x-axis from 09 04:00 to 14 22:00 in 6-hour steps (Tuesday to Sunday), y-axis from 0 to 1]

Lesson learnt

• High precision

• Low recall at city scale

• High recall in Brera and Tortona

Page 38: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Processing Social Streams

The machinery: the Streaming Linked Data framework

M. Balduini, E. Della Valle, D. Dell'Aglio, M. Tsytsarau, T. Palpanas, and C. Confalonieri: Social Listening of City Scale Events Using the Streaming Linked Data Framework. International Semantic Web Conference (2) 2013: 1-16

[diagram components: Data Source; Streaming Linked Data Server with Adapter, Decorator, Analyser, Publisher and Stream Bus; HTML5 Browser with Visualizer; connections over HTTP]

Page 39: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Processing Social Streams

Decoration at work

Happily into a bottle of Heineken bear #heinekendesignweek @ the Heineken Magazzini

City-Scale Event: Milano Design Week

Event: Heineken Design Week

Location: The Magazzini

hosts

takesPlaceIn

M. Balduini, A. Bozzon, E. Della Valle, Y. Huang, G-J Houben: Recommending Venues Using Continuous Predictive Social Media Analytics. IEEE Internet Computing 18(5): 28-35 (2014)

Page 40: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Processing Social Streams

Predictive models were built
• for hashtags and semantic entities systematically present
• using a Holt-Winters method
• grouping the frames by
  – working and week-end days and
  – early morning, morning, afternoon, evening, and late night
• analysing 300,000 geo-located micro-posts collected over 6 months in the Milano area (November 2013 to April 2014)
• taking a few seconds per hashtag/semantic entity on a 60€/month VM in an IaaS

[figure: time series showing Data, Fitted and Forecast values with Lower 2.5% and Upper 97.5% bounds]
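To make the Holt-Winters idea concrete, here is a minimal sketch of the additive variant in plain Python. The slides do not say which variant, initialisation or smoothing parameters were used, so everything below (additive seasonality, the simple initialisation from the first two seasons, the default `alpha`, `beta`, `gamma`) is an assumption. The prediction bounds shown in the figure are not computed here.

```python
def holt_winters_additive(series, season_len, alpha=0.5, beta=0.1, gamma=0.1, horizon=1):
    """Additive Holt-Winters smoothing; returns `horizon` forecasts past the end of `series`.

    A textbook-style sketch under the assumptions stated above, not the authors' code.
    """
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons of data")

    first = series[:season_len]
    second = series[season_len:2 * season_len]
    # Initial level: mean of the first season.
    level = sum(first) / season_len
    # Initial trend: average per-step change between the first two seasons.
    trend = (sum(second) - sum(first)) / season_len ** 2
    # Initial seasonal offsets: deviation of the first season from its mean.
    seasonal = [x - level for x in first]

    for t in range(season_len, len(series)):
        s = seasonal[t % season_len]
        prev_level = level
        level = alpha * (series[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (series[t] - level) + (1 - gamma) * s

    n = len(series)
    return [level + (h + 1) * trend + seasonal[(n + h) % season_len] for h in range(horizon)]


def anomalous_usage(observed, predicted):
    """Observed minus predicted usage, frame by frame."""
    return [o - p for o, p in zip(observed, predicted)]
```

With five time slots per day (early morning to late night), `season_len=5` would model the daily cycle of a hashtag; subtracting the forecast from the observed counts leaves the part of the usage that the systematic pattern does not explain.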

Page 41: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Processing Social Streams

Usage of #milan in the weeks around Milan Design Week

[figure: usage of #milan per time slot (200 – 700, 700 – 1100, 1100 – 1400, 1400 – 1900, 1900 – 200) over alternating working-day (WD) and week-end (WE) periods, with Milan Design Week marked]

Subtracting the predicted usage of #milan

[figure: same time slots and WD/WE periods, after the subtraction]

Page 42: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Processing Social Streams

The difference between the observed and the predicted usage of #milan perfectly fits the usage of #mdw (the official hashtag of Milan Design Week)

[figure: two panels, "Anomalous usage of #milan" and "Usage of #mdw", per time slot (200 – 700, 700 – 1100, 1100 – 1400, 1400 – 1900, 1900 – 200) over alternating working-day (WD) and week-end (WE) periods, with Milan Design Week marked]

Page 43: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Processing Social Streams

Geo-referenced micro-posts were captured, semantically annotated, cleansed using the predictive models and analyzed in the Milan area

What                                  2013     2014
Geo-located micro-posts               57,154   21,782
Linked to Milano Design Week          3,569    3,499
Linked to a specific location/event   761      547

For each pixel with at least 1 micro-post we computed
• the volume related to Milano Design Week
• the top-10 hashtags
• the top-3 locations/events

Real-time processing was possible with our in-memory C-SPARQL engine and the Streaming Linked Data framework on a 20€/month VM in an IaaS
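The per-pixel ranking of hashtags can be sketched with a counter per pixel. This is a plain-Python stand-in, not the C-SPARQL queries used in the system. The input layout, the regular expression and, in particular, the `systematic` parameter are assumptions: dropping a fixed set of tags is a crude simplification of the cleansing done with the predictive models.

```python
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#\w+")


def top_hashtags_per_pixel(posts, systematic=frozenset(), k=10):
    """Rank hashtags inside each pixel.

    posts: iterable of (pixel, text) pairs, a hypothetical input layout.
    systematic: lower-cased hashtags to ignore (simplified cleansing, an assumption).
    Returns {pixel: [(hashtag, count), ...]} with at most k entries per pixel.
    """
    counts = defaultdict(Counter)
    for pixel, text in posts:
        # Count each hashtag at most once per micro-post, ignoring case.
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        counts[pixel].update(tags - set(systematic))
    return {pixel: counter.most_common(k) for pixel, counter in counts.items()}
```

The same pattern, with `k=3` and linked locations or events in place of hashtags, would give the top-3 locations/events per pixel.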

Page 44: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Do socially active pixels relate to events?

socially active pixels = pixels in which we captured social media that talk about Milan Design Week

We computed
• precision
• recall
of the socially active pixels in finding pixels in the three ground truths about Milan, Brera district and Tortona district

Page 45: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Do socially active pixels relate to events?

[figure: precision of the socially active pixels over time, one panel each for Milan, Brera and Tortona; x-axis from 09 04:00 to 14 22:00 in 6-hour steps (Tuesday to Sunday), y-axis from 0 to 1]

Page 46: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Do socially active pixels relate to events?

[figure: recall of the socially active pixels over time, one panel each for Milan, Brera and Tortona; x-axis from 09 04:00 to 14 22:00 in 6-hour steps (Tuesday to Sunday), y-axis from 0 to 1]

Page 47: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Do socially active pixels relate to events?

[figure: precision and recall of the socially active pixels over time, one panel each for Milan, Brera and Tortona; x-axis from 09 04:00 to 14 22:00 in 6-hour steps (Tuesday to Sunday), y-axis from 0 to 1]

Page 48: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Do socially active pixels relate to events?

[figure: precision and recall of the socially active pixels over time, one panel each for Milan, Brera and Tortona; x-axis from 09 04:00 to 14 22:00 in 6-hour steps (Tuesday to Sunday), y-axis from 0 to 1]

Lesson learnt

• High precision

• Acceptable recall in the districts

Page 49: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Are CDR-anomalous and socially active pixels similar?

Which of the following four scenarios?

[figure: four scenarios, with columns Anomalous, Socially active, Intersection and Similar?]

Page 50: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Are CDR-anomalous and socially active pixels similar?

More formally
• Jaccard: $J(A,B) = \frac{|A \cap B|}{|A \cup B|}$
• E.g., [figure: two pairs of sets A and B, one with J(A,B) = 8/11 and one with J(A,B) = 3/11]
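The Jaccard index is straightforward to compute on sets of pixel identifiers. The sketch below also rebuilds the two values shown on the slide with toy sets; the concrete sets are invented for illustration (the slide only gives the resulting ratios), and returning 1.0 for two empty sets is a convention chosen here.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    if not union:
        return 1.0  # convention chosen in this sketch for two empty sets
    return len(a & b) / len(union)


# Toy sets reproducing the slide's two ratios over a union of 11 elements.
high_overlap = jaccard(set(range(0, 9)), set(range(1, 11)))   # 8 shared elements
low_overlap = jaccard(set(range(0, 7)), set(range(4, 11)))    # 3 shared elements
```

Applied per time frame to the CDR-anomalous and the socially active pixels of a district, a value close to 1 means the two signals light up nearly the same pixels.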

Page 51: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Are CDR-anomalous and socially active pixels similar?

[figure: recall of the CDR-anomalous pixels, recall of the socially active pixels and Jaccard index over time, one panel each for Brera and Tortona; x-axis from 09 04:00 to 14 22:00 in 6-hour steps (Tuesday to Sunday), y-axis from 0 to 1]

Lesson learnt

At district level, in the large majority of the cases the socially active pixels are also CDR-anomalous pixels

Page 52: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records


Visualizing for a casual audience

Page 53: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records


See it in action!

http://youtu.be/MOBie09NHxM

Page 54: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Evaluation methodology for the casual audience

Guessability study
• Can you guess what I mean without any explanation?

E.g.

Dinosaur extinction

"The Shining" by Stephen King

Page 55: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Evaluation of interface guessability

Page 56: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

The patterns you should have got

The CDR-anomaly and the social activity are

Correlated / Partially correlated / Not correlated

Page 57: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Evaluation of interface guessability

Q: In Brera District the volume of social media signal is partially correlated with the value of mobile anomaly signal

A: [bar chart: share of FALSE, UNCERTAIN and TRUE answers, axis from 0 to 1]

Page 58: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Evaluation of interface guessability

Q: In Porta Romana the volume of social media signal is strongly correlated with the value of mobile anomaly signal

A: [bar chart: share of FALSE, UNCERTAIN and TRUE answers, axis from 0 to 1]

Page 59: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Evaluation of interface guessability

Q: In Tortona District the volume of social media signal is strongly correlated with the value of mobile anomaly signal

A: [bar chart: share of FALSE, UNCERTAIN and TRUE answers, axis from 0 to 1]

Page 60: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Back to the research question

[photo: https://www.flickr.com/photos/debord/4932655275]

Can we collect, analyse and repurpose

• social media captured at places and events and

• privacy-preserving aggregates of Call Data Records

to allow visually

• perceiving emerging patterns and

• observing their dynamics?

Yes!

at least, in Milano Design Week 2013 and 2014

[photo: https://flic.kr/p/beuDaX ]

Page 61: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records


Take home message … guess it :-)

Page 62: Listening to the pulse of our cities fusing Social Media Streams and Call Data Records

Take home message … guess it :-)

Emanuele Della Valle

[email protected]

http://emanueledellavalle.org

Thank you!

Any questions?
