Delivering near real-time mobility insights at Swisscom

François Garillot, @huitseeker


Agenda
Intro
Smart-Data
Big Data Architecture
Streaming
Data challenges

Introduction : Positioning

Positioning users in a modern network

no radio-goniometer at scale
cell of attachment has position, beam characteristics
over history, best position ~200m

Positioning at specific locations

handovers at specific cell-to-cell location
phone needs to be active

Positioning with more precision

better positioning with excellent data sources:
3G: GPEH
4G: LTE-CTR

Trajectory data mining
time series reconstruction
trajectory segmentation
map matching, clustering
mode of transport detection
...
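
As an illustration of the trajectory segmentation step, here is a minimal sketch that splits a time-ordered series of observation timestamps wherever two consecutive observations are further apart than a threshold. The talk does not say how segmentation is done; the gap-based rule, the `Segmentation` object and the `maxGapSeconds` parameter are all assumptions made for this example.

```scala
object Segmentation {
  /** Splits time-ordered timestamps (in seconds) into segments,
    * starting a new segment whenever the gap to the next observation
    * exceeds maxGapSeconds. Hypothetical rule, not the talk's method.
    */
  def splitOnGaps(times: List[Long], maxGapSeconds: Long): List[List[Long]] =
    times.foldRight(List.empty[List[Long]]) {
      // The next observation is close enough: extend the current segment.
      case (t, (next :: rest) :: segments) if next - t <= maxGapSeconds =>
        (t :: next :: rest) :: segments
      // Otherwise (or for the last observation): open a new segment.
      case (t, segments) =>
        List(t) :: segments
    }
}
```

For example, `Segmentation.splitOnGaps(List(0L, 10L, 20L, 500L, 510L), 60L)` separates the first three observations from the last two.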

How to create value with positioning at Swisscom?

with competitive analytics & data sources, and by making sure it embodies the right values.

Smart Data

On (not) tracking (any users)
"Swisscom strictly complies with all applicable legislations, in particular with the telecommunications law and the data protection initiative."

Jürg Studerus, Swisscom Senior Manager, Corporate Responsibility

Smart Data : Big Data without Big Brother
Privacy preservation is an asset
It makes sense to care as much about your customer as they do about you.

We technically enforce this
answering only synoptic questions, no individual ones,
with data flow control: we neutralize quasi-identifiers at every stage
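
One way to picture "synoptic questions only" is an aggregation that returns counts per place and time slot and suppresses any group that is too small. This is a sketch under assumptions: the `Sighting` record, the `SynopticCounts` object and the `MinGroupSize` threshold are hypothetical and do not come from the talk.

```scala
// Hypothetical record: a pseudonymous device seen in a zone during an hour.
final case class Sighting(deviceKey: String, zone: String, hour: Int)

object SynopticCounts {
  // Assumed suppression threshold; not a figure given in the talk.
  val MinGroupSize = 20

  /** Distinct-device counts per (zone, hour). Groups with fewer than
    * minGroupSize devices are dropped, so no answer describes an individual.
    */
  def countsPerZoneHour(
      sightings: Seq[Sighting],
      minGroupSize: Int = MinGroupSize
  ): Map[(String, Int), Int] =
    sightings
      .groupBy(s => (s.zone, s.hour))
      .map { case (key, group) => key -> group.map(_.deviceKey).distinct.size }
      .filter { case (_, devices) => devices >= minGroupSize }
}
```

The device key never appears in the result: only the aggregated count per zone and hour leaves the function.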

Swisscom mobile subscribers

source: xavierstuder.com, MD&A reports

Our choices
public good applications: making Switzerland run better,
understanding places, not individuals,
all results presented aggregated, anonymized.

Markets

A first product : City

"It's a dream for civil engineers" -- Alexandre Machu, Urban systems engineer, Pully

Demo time

Usages
New roads to divert transit traffic out of downtown (informs a 50M$ project)
Parking lot expansion and transformation (informs a 10M$ project)
Electric car charging station deployment

Big Data architecture

In the backend

Spark configuration essentials for enterprise jobs

    spark.executor.memory="not the default 1g"
    spark.kryo.registrator="something custom" // and companions
    spark.shuffle.service.enabled="true"
    spark.dynamicAllocation.enabled="true"
    spark.deploy.recoveryMode="ZOOKEEPER"
    spark.deploy.recoveryDirectory="/path/to/state"
    spark.deploy.zookeeper.url="quorumMachine1:2181, ..."

NOT the only valuable settings, see https://techsuppdiva.github.io for more

See Also

In the front-end

Scala (1/2)

    // Histories of UE updates, tagged with their ordering at the type level
    type ChronoHistory = List[UEupdate] @@ Chronological
    type AnteChronoHistory = List[UEupdate] @@ AnteChronological

    implicit class Chrono(l: List[UEupdate]) {
      def asChrono: ChronoHistory = {
        chronoCheck(l)
        l.asInstanceOf[ChronoHistory]
      }
      def asAnteChrono: AnteChronoHistory = {
        anteChronoCheck(l)
        l.asInstanceOf[AnteChronoHistory]
      }
    }

Scala (2/2)

    implicit def reverseChrono(l: ChronoHistory): AnteChronoHistory =
      l.reverse.asAnteChrono
    implicit def reverseAnteChrono(l: AnteChronoHistory): ChronoHistory =
      l.reverse.asChrono
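
The slides leave `@@`, `UEupdate`, `chronoCheck` and `anteChronoCheck` undefined. The following self-contained sketch shows how the same idea can work without a tagging library. The `Tagged` trait, the fields of `UEupdate`, the `isSorted` check and the `latest` function are assumptions made for illustration, not the definitions used at Swisscom.

```scala
object TaggedHistories {
  // Minimal tagging encoding, in the spirit of the `@@` operator
  // offered by libraries such as shapeless (assumed stand-in).
  trait Tagged[U]
  type @@[T, U] = T with Tagged[U]

  sealed trait Chronological
  sealed trait AnteChronological

  // Hypothetical stand-in for the talk's UEupdate.
  final case class UEupdate(timestamp: Long, cellId: Int)

  type ChronoHistory = List[UEupdate] @@ Chronological
  type AnteChronoHistory = List[UEupdate] @@ AnteChronological

  private def isSorted(ts: List[Long]): Boolean =
    ts.zip(ts.drop(1)).forall { case (a, b) => a <= b }

  implicit class Chrono(l: List[UEupdate]) {
    // Checks the ordering once, then records it in the type.
    def asChrono: ChronoHistory = {
      require(isSorted(l.map(_.timestamp)), "history is not chronological")
      l.asInstanceOf[ChronoHistory]
    }
    def asAnteChrono: AnteChronoHistory = {
      require(isSorted(l.reverse.map(_.timestamp)), "history is not antechronological")
      l.asInstanceOf[AnteChronoHistory]
    }
  }

  // Only accepts most-recent-first histories, so `head` is the latest update.
  def latest(h: AnteChronoHistory): Option[UEupdate] = h.headOption
}
```

With these definitions, passing a plain `List[UEupdate]` or a `ChronoHistory` to `latest` is a compile-time error, while the ordering check itself runs only once, at the point where the tag is applied.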

Streaming Analytics

Selecting users on a path of interest

Massive discrepancy between # of users (2-3E6) and # of interesting users (1.5E3 on test segments)
Filtering interesting time series.
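
As a rough picture of this filtering step, the sketch below keeps only the users whose history of cells visits the cells of a path of interest in order. Representing cells as integer identifiers and a path as an ordered sequence of cells is an assumption; the talk does not describe the actual selection rule, and `PathFilter` is a hypothetical name.

```scala
object PathFilter {
  /** True if the cells of `path` occur in `history` in the same order,
    * possibly with other cells in between.
    */
  def followsPath(history: Seq[Int], path: Seq[Int]): Boolean = {
    // Consume the path cell by cell as matching cells show up in the history.
    val remaining = history.foldLeft(path) { (todo, cell) =>
      if (todo.nonEmpty && todo.head == cell) todo.tail else todo
    }
    remaining.isEmpty
  }

  /** Users (by pseudonymous key) whose history follows the path. */
  def usersOnPath(histories: Map[String, Seq[Int]], path: Seq[Int]): Set[String] =
    histories.collect { case (user, cells) if followsPath(cells, path) => user }.toSet
}
```

A single linear pass per history like this one is cheap, which matters when millions of histories have to be discarded to keep a few thousand.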

Graph matching

Locality-sensitive hashing short histories
A family $H$ of hashing functions is $(r, cr, p_1, p_2)$-sensitive if:

if $\|p - q\| \le r$ then $P_H[h(q) = h(p)] \ge p_1$

if $\|p - q\| \ge cr$ then $P_H[h(q) = h(p)] \le p_2$

More:
Locality Sensitive Hashing By Spark, Uber, Spark Summit
A Gentle Introduction to Locality-Sensitive Hashing with Apache Spark, Scala by The Bay
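
To make the definition concrete, here is a small MinHash sketch over sets of cell identifiers. MinHash is one well-known locality-sensitive family: two sets receive the same min-hash with probability equal to their Jaccard similarity. The talk does not say which family was used, so the choice of MinHash, the linear hash functions (which only approximate random permutations) and every name below are assumptions.

```scala
object MinHashSketch {
  private val Prime = 2147483647L // 2^31 - 1

  /** `n` hash functions x => (a * x + b) mod Prime with seeded coefficients. */
  def hashFamily(n: Int, seed: Long): Vector[Int => Long] = {
    val rnd = new scala.util.Random(seed)
    Vector.fill(n) {
      val a = 1L + rnd.nextInt(Int.MaxValue - 1)
      val b = rnd.nextInt(Int.MaxValue).toLong
      (x: Int) => Math.floorMod(a * x + b, Prime)
    }
  }

  /** One minimum per hash function, taken over the cells of a short history. */
  def signature(cells: Set[Int], family: Vector[Int => Long]): Vector[Long] = {
    require(cells.nonEmpty, "cannot hash an empty history")
    family.map(h => cells.map(h).min)
  }

  /** Fraction of matching positions: an estimate of the Jaccard similarity. */
  def estimatedSimilarity(s1: Vector[Long], s2: Vector[Long]): Double =
    s1.zip(s2).count { case (x, y) => x == y }.toDouble / s1.size
}
```

Histories whose signatures agree on many positions are likely to share most of their cells, so signatures can be used to bucket similar histories together before any exact comparison.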

Computing speeds: Solving graph constraints

a speed comes from a user well-positioned, twice
plus route knowledge
given a history of cells, where was the user, exactly?
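
The arithmetic behind "well-positioned, twice, plus route knowledge" can be sketched as follows: once a user has two precise fixes on a known route, the average speed is the distance along the route divided by the elapsed time. The `Fix` type, with a position expressed as an offset in metres along the route, is a hypothetical simplification made for this example.

```scala
// Hypothetical: a precise position expressed as an offset along a known route.
final case class Fix(timeSeconds: Long, routeOffsetMeters: Double)

object SegmentSpeed {
  /** Average speed in km/h between two fixes on the same route,
    * or None when the second fix is not strictly later than the first.
    */
  def kmPerHour(first: Fix, second: Fix): Option[Double] = {
    val elapsed = second.timeSeconds - first.timeSeconds
    if (elapsed <= 0) None
    else {
      val metersPerSecond = (second.routeOffsetMeters - first.routeOffsetMeters) / elapsed
      Some(metersPerSecond * 3.6) // 1 m/s = 3.6 km/h
    }
  }
}
```

For instance, two fixes 1000 m apart along the route and 60 s apart in time give an average of about 60 km/h.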

Solving graph constraints

just a few users left in computation at this stage
so we can invest a lot in algorithms of more than linear complexity

Data Challenges

Crucial elements
Quality, reliability of data sources
Automated ground truth checking

sensors
TEMS fleet

What's the ground truth for mode of transport, domicile, etc.?
Volunteers among colleagues and friends

In the works
Accuracy improvements
More features (see you at Spark Summit EU!)
Streaming for City

Thank you