Real User Monitoring at scale
Jarosław Bloch, Albert Łącki
PL
SK
HU
RS
CH
20 million real users
7 million requests every minute
150 million page views (PV) daily
3 data centers
3 000 servers and devices
130 Gbps traffic
40 teams
> 250 deployments daily
300 specialists
Speed
search +0.4 s: PV/UU -0.5%
load +35%: Revenue -5%
load +250 ms: PV/UU -1.8%
Revenue/UU -4.3%
Component monitoring
Service monitoring
User experience
Our responsibility
Under our control
External monitoring
Real User Monitoring
?
collection, ingestion, processing, presentation
Data collection
<script>
  var data = collect();
  data.send();
</script>
Data collection
Navigation Timing API
Phases: Redirects, DNS, TCP, Request, Response, DOM Processing, onLoad
Timestamps: navigationStart, fetchStart, requestStart, responseStart, responseEnd, domLoading, domInteractive, domComplete
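A minimal sketch of how the timestamps above could be turned into phase durations. It assumes an object carrying the Navigation Timing attribute names (in a browser, window.performance.timing); the function name, the phase grouping and the sample values are illustrative assumptions, not the speakers' implementation.

```javascript
// Sketch: derive phase durations (ms) from Navigation Timing attributes.
// `timing` is any object exposing the attribute names from the slide,
// e.g. window.performance.timing in a browser.
function phaseDurations(timing) {
  return {
    // redirects and unloading the previous document
    beforeFetch: timing.fetchStart - timing.navigationStart,
    // cache lookup, DNS and TCP work done before the request is sent
    connection: timing.requestStart - timing.fetchStart,
    // waiting for the first byte of the response
    request: timing.responseStart - timing.requestStart,
    // downloading the response
    response: timing.responseEnd - timing.responseStart,
    // parsing and building the DOM
    domProcessing: timing.domComplete - timing.domLoading,
    // whole navigation up to domComplete
    total: timing.domComplete - timing.navigationStart,
  };
}

// Hypothetical sample values (ms since epoch), for illustration only.
const sampleTiming = {
  navigationStart: 1000,
  fetchStart: 1010,
  requestStart: 1060,
  responseStart: 1260,
  responseEnd: 1300,
  domLoading: 1265,
  domInteractive: 1700,
  domComplete: 2100,
};
console.log(phaseDurations(sampleTiming));
```

Computing differences in the browser keeps the payload small: only a handful of durations has to be sent, not every raw timestamp.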
Resource Timing API
JS Errors
try { … } catch (ex) { collect(ex); }
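Besides wrapping code in try/catch, uncaught errors can be picked up by a single "error" listener. The sketch below is one possible way to do that, not necessarily what the speakers used; installErrorCollector and the report fields are hypothetical names, and collect stands in for the slide's collect().

```javascript
// Sketch: report uncaught errors through one listener instead of wrapping
// every call site in try/catch.
// `target` is expected to be `window` in a browser; `collect` is a callback
// standing in for the slide's collect().
function installErrorCollector(target, collect) {
  target.addEventListener("error", function (event) {
    const err = event.error; // may be null for cross-origin scripts
    collect({
      message: event.message || (err && err.message) || "unknown error",
      source: event.filename || "",
      line: event.lineno || 0,
      stack: err && err.stack ? String(err.stack) : "",
    });
  });
}

// Browser usage (assumption): installErrorCollector(window, collect);
```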
Data collection
data
DC-1
Accelerator
DC-2
GET
HTTP 202
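A rough sketch of the GET beacon shown in the diagram: the collected data is encoded into the query string and the server only needs to acknowledge it (HTTP 202), since the response body is never read. The endpoint URL, the field names and the image-based transport are assumptions made for illustration.

```javascript
// Sketch: encode collected data as a query string for a GET beacon.
// Keys are sorted so the same data always yields the same URL.
function buildBeaconUrl(endpoint, data) {
  const params = new URLSearchParams();
  for (const key of Object.keys(data).sort()) {
    params.append(key, String(data[key]));
  }
  return endpoint + "?" + params.toString();
}

// Assumption: in a browser, pointing an image at the URL issues the GET.
// The response body is ignored, so an empty 202 reply is enough.
function fireBeacon(url) {
  if (typeof Image !== "undefined") {
    new Image().src = url;
  }
}

// Hypothetical endpoint and fields.
const beaconUrl = buildBeaconUrl("https://rum.example.com/collect", {
  page: "/search",
  load: 1100,
});
fireBeacon(beaconUrl); // no-op outside a browser
```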
Data collection
data
Script injection
Script injection
<js>
Data ingestion
5 min
Data ingestion
Kafka
Partition 1
Partition 2
Partition 3
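To illustrate why partitions help with ordering and parallelism, here is a simplified sketch of key-based partition assignment. It is deliberately not Kafka's real partitioner; the hash is a plain string hash chosen for readability, and partitions are numbered from 0 here, not from 1 as on the slide.

```javascript
// Sketch: map a message key to one of N partitions with a simple string hash.
// This is NOT Kafka's actual partitioner; it only shows that equal keys
// always land in the same partition.
function partitionFor(key, partitionCount) {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return hash % partitionCount;
}

// Hypothetical keys, e.g. one per monitored service.
for (const key of ["search", "listing", "checkout"]) {
  console.log(key, "-> partition", partitionFor(key, 3));
}
```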
Data replication
DC-2 DC-1
DATA
DATA EVERYWHERE
METRICS
METRICS EVERYWHERE
Storm
Topology
metrics
data
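The data-to-metrics step can be pictured as a simple aggregation. The sketch below is plain JavaScript, not Storm code; it only shows the kind of logic a topology might apply, assuming a hypothetical event shape { name, value, ts } with ts in seconds.

```javascript
// Sketch: fold raw timing events into per-minute averages per metric name.
function aggregatePerMinute(events) {
  const buckets = new Map();
  for (const e of events) {
    const minute = Math.floor(e.ts / 60) * 60; // start of the minute
    const key = e.name + "@" + minute;
    const b = buckets.get(key) || { name: e.name, minute: minute, sum: 0, count: 0 };
    b.sum += e.value;
    b.count += 1;
    buckets.set(key, b);
  }
  // One output record per (metric, minute), in first-seen order.
  return Array.from(buckets.values()).map(function (b) {
    return { name: b.name, minute: b.minute, avg: b.sum / b.count, count: b.count };
  });
}

// Hypothetical events, for illustration only.
const sampleEvents = [
  { name: "load", value: 100, ts: 120 },
  { name: "load", value: 300, ts: 150 },
  { name: "load", value: 500, ts: 185 },
];
console.log(aggregatePerMinute(sampleEvents));
```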
Presentation
Graphite
sum(metric1, metric2)
highestMax(services.*, 10)
…
Graphite
3 million metrics
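For reference, a small sketch of how a single data point can be written in Graphite's plaintext format, one "path value timestamp" line per point. The metric path used here is a hypothetical naming scheme, not one taken from the talk.

```javascript
// Sketch: format one data point in Graphite's plaintext protocol,
// "<metric path> <value> <unix timestamp in seconds>\n".
function toGraphiteLine(path, value, timestampSeconds) {
  return path + " " + value + " " + timestampSeconds + "\n";
}

// Hypothetical metric path and values.
const graphiteLine = toGraphiteLine("rum.search.load.avg", 200, 120);
console.log(JSON.stringify(graphiteLine));
```

With dotted paths like this, wildcard queries such as services.* select whole groups of metrics at once.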
Custom alerts
Druid
Real User Monitoring
Deployments
Instant feedback
Thank you!