What the hell is your software doing at runtime?
![Page 1: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/1.jpg)
Firenze, November 17th 2015
Roberto “FRANK” Franchini
@robfrankie
Increase business value, measure it!
What the hell is your software doing at runtime?
![Page 2: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/2.jpg)
whoami(1)
More than 15 years of experience, proud to be a programmer
Member of OrientDB team, tech lead for the full-text, spatial, JDBC and Docker images
Wrote software for NLP and opinion mining (at scale)
Played with servers, then bought a sysadmin
JUG-Torino co-lead
![Page 3: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/3.jpg)
Agenda
Quotes
System monitoring
Coding
Application monitoring
All together
Feedback
Sample scenario
![Page 4: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/4.jpg)
Quotes
![Page 5: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/5.jpg)
Business value
Our code generates business value
when it runs, not when we write it.
We need to know what our code does when it runs.
We can’t do this unless we measure it.
(Coda Hale)
![Page 6: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/6.jpg)
SLA driven
Have an SLA for your service
Measure and report performance against the SLA
(Ben Treynor, Google Inc.)
![Page 7: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/7.jpg)
System monitoring
![Page 8: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/8.jpg)
Infrastructure monitoring
Sysadmins have monitored infrastructure
since the beginning of IT
With the right tools, a single BOFH
can handle hundreds of servers
![Page 9: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/9.jpg)
Tools
On premises
collectd zabbix zenoss
nagios cacti graphite/grafana
Cloud based
datadog newrelic
![Page 10: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/10.jpg)
Measures
CPU load
Network traffic
Disk I/O
Memory
And many more
![Page 11: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/11.jpg)
Charts
![Page 12: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/12.jpg)
Dashboard
![Page 13: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/13.jpg)
Cool, black dashboard
![Page 14: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/14.jpg)
Code and deploy
![Page 15: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/15.jpg)
Write
TDD
SOLID principles
Design Patterns
Code metrics
![Page 16: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/16.jpg)
Build
unit tests
integration tests
performance tests
test coverage
code quality reports
![Page 17: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/17.jpg)
Deploy
Deployment pipeline
Microservices
Container
Cloud
![Page 18: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/18.jpg)
Rest
All done, take your rest
Umh
I don’t think so anymore
![Page 19: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/19.jpg)
Application monitoring
![Page 20: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/20.jpg)
The day after deployment
How do we monitor our service status?
How do we measure it?
How does it behave?
How does it interact with other parts of the system?
Multiply by the number of µ-services
![Page 21: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/21.jpg)
Monitorability
Design software to be monitorable
Expose metrics (JMX)
Expose status (REST API)
Send metrics to monitoring tools
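
As an illustration of "Expose metrics (JMX)", here is a minimal sketch that uses only the JDK's standard MBean support. The class names, the `com.example.metrics` domain and the `ProcessedCount` attribute are hypothetical, chosen for this example rather than taken from the talk.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxMetricsSketch {

    // Standard MBean convention: the interface is named <ClassName>MBean.
    public interface DocumentStatsMBean {
        long getProcessedCount();
    }

    // Hypothetical metric holder: counts processed documents.
    public static class DocumentStats implements DocumentStatsMBean {
        private final AtomicLong processed = new AtomicLong();

        public void markProcessed() {
            processed.incrementAndGet();
        }

        @Override
        public long getProcessedCount() {
            return processed.get();
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Assumed object name; any "domain:key=value" name would do.
        ObjectName name = new ObjectName("com.example.metrics:type=DocumentStats");

        DocumentStats stats = new DocumentStats();
        server.registerMBean(stats, name);

        stats.markProcessed();
        stats.markProcessed();

        // Read the value back the way a JMX client such as JConsole would.
        System.out.println(server.getAttribute(name, "ProcessedCount"));
    }
}
```

Once registered, the attribute can be inspected by any JMX client attached to the JVM, without touching the application code again.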
![Page 22: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/22.jpg)
We need application monitoring
“Application monitoring? WHAT?”
“Ok, let me explain:
What is the app doing right now?
How is the app performing right now?
And then graph it!”
“Ok, I got it!”
“Let me see”
![Page 23: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/23.jpg)
5 minutes later

    public class PoorManJavaMetrics {
        // Hand-rolled counters: plain fields, not thread-safe
        int called;
        long totalTime;

        public void doThings() {
            final long start = System.currentTimeMillis();
            // heavy business logic
            called++;
            final long end = System.currentTimeMillis();
            final long duration = end - start;
            totalTime += duration;
        }

        public void logStats() {
            System.out.println("---stats---");
            // Here be DRAGONS
        }
    }
![Page 24: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/24.jpg)
Sketch by Luca Franchini
![Page 25: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/25.jpg)
Use the right tool
Use a library (e.g. Dropwizard Metrics)
Count events, measure duration
Log metric values
Send application metrics
to the same backend as system metrics
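
To make "count events, measure duration" concrete, the sketch below shows in miniature the bookkeeping that a metrics library takes care of. It is a JDK-only stand-in with hypothetical names, not the Dropwizard Metrics API; a real library adds rates, histograms and reporters on top.

```java
import java.util.concurrent.atomic.LongAdder;

public class SimpleTimerSketch {
    // LongAdder keeps the counters safe under concurrent updates.
    private final LongAdder count = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    // Runs the given work, counting the call and recording its duration,
    // even when the work throws.
    public void time(Runnable work) {
        long start = System.nanoTime();
        try {
            work.run();
        } finally {
            count.increment();
            totalNanos.add(System.nanoTime() - start);
        }
    }

    public long count() {
        return count.sum();
    }

    // Mean duration per call in milliseconds, or 0.0 before the first call.
    public double meanMillis() {
        long n = count.sum();
        return n == 0 ? 0.0 : (totalNanos.sum() / (double) n) / 1_000_000.0;
    }

    public static void main(String[] args) {
        SimpleTimerSketch timer = new SimpleTimerSketch();
        for (int i = 0; i < 3; i++) {
            timer.time(() -> { /* heavy business logic goes here */ });
        }
        System.out.println("calls=" + timer.count() + " mean_ms=" + timer.meanMillis());
    }
}
```

Compared with hand-written fields and `System.currentTimeMillis()`, this keeps the measurement out of the business logic and uses the monotonic `System.nanoTime()` clock for durations.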
![Page 26: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/26.jpg)
Don’t forget naming!
A naming pattern:
<namespace>.<instrumented section>.<target (noun)>.<action (past tense verb)>
Such as:
accounts.authentication.password.failed
Use an environment prefix (prod, test, dev, local):
prod.accounts.authentication.password.failed
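
The pattern can be enforced in one place with a small helper. This is a sketch only: the `MetricNames` class and its `name` method are hypothetical, not part of any library.

```java
import java.util.StringJoiner;

public class MetricNames {

    // Builds <environment>.<namespace>.<section>.<target>.<action>,
    // rejecting empty parts so malformed names fail fast.
    public static String name(String environment, String namespace, String section,
                              String target, String action) {
        StringJoiner joiner = new StringJoiner(".");
        for (String part : new String[] {environment, namespace, section, target, action}) {
            if (part == null || part.isEmpty()) {
                throw new IllegalArgumentException("metric name parts must be non-empty");
            }
            joiner.add(part);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // prints "prod.accounts.authentication.password.failed"
        System.out.println(name("prod", "accounts", "authentication", "password", "failed"));
    }
}
```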
![Page 27: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/27.jpg)
Which metrics?
Rate of documents processed
Latency
Transactions per second (€€€€)
Total number of errors
Mean user interaction time
![Page 28: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/28.jpg)
All together now
![Page 29: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/29.jpg)
Code on systems
Don’t cross the streams
Enabling code metrics means
sysadmins and devs in the same room
talking to each other
to improve business value
![Page 30: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/30.jpg)
Send
application metrics to
the same backend
as system metrics
![Page 31: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/31.jpg)
Correlate application
and
system metrics
![Page 32: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/32.jpg)
Repeat with me
![Page 33: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/33.jpg)
Correlate application
and
system metrics
(Cross the streams!)
![Page 34: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/34.jpg)
Single metrics backend
collectd (system metrics) and applications send to Graphite
Grafana draws the dashboards on top of Graphite
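
Applications can feed Graphite through its plaintext protocol, where each sample is one line of the form `<metric path> <value> <unix timestamp>`, conventionally sent to port 2003. The sketch below formats such a line and shows how it could be written to a socket; the class name and the host are hypothetical, and a reporter from a metrics library would normally do this for you.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class GraphiteLineSketch {

    // One sample in plaintext form: "<path> <value> <epoch seconds>\n".
    public static String formatLine(String path, double value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds + "\n";
    }

    // Opens a TCP connection, writes the line and closes the connection.
    public static void send(String host, int port, String line) throws IOException {
        try (Socket socket = new Socket(host, port);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write(line);
            out.flush();
        }
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000L;
        String line = formatLine("prod.accounts.authentication.password.failed", 3, now);
        System.out.print(line);
        // With a reachable Graphite instance (hypothetical host):
        // send("graphite.example.com", 2003, line);
    }
}
```

Because collectd can write to the same Graphite instance, system and application series end up side by side and can be placed on the same Grafana dashboard.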
![Page 35: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/35.jpg)
To do what?
Discover bottlenecks
post-mortem analysis
SLA monitoring
IO impact
Network traffic
Memory utilization
![Page 36: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/36.jpg)
To do what?
Why does it perform better on a dev laptop?
Why does it take 24h on customer infrastructure (when our old test server takes 1h)?
Mechanical sympathy at large: the new service is fucking up the I/O
![Page 37: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/37.jpg)
Implement THE User Story
Given the application running
when the manager comes
then I want to show a big green number
![Page 38: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/38.jpg)
The answer
42
![Page 39: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/39.jpg)
Application metrics dashboard
![Page 40: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/40.jpg)
Get feedback
It’s all about feedback
Our code is talking to us
Listen to it
And take decisions
![Page 41: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/41.jpg)
Decisions
Set new SLAs
Refactor bottlenecks
Buy new hardware
Expand the cloud
Drop a product
![Page 42: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/42.jpg)
write code
deploy it
measure it
get feedback
![Page 43: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/43.jpg)
Iterative
10 define some metrics
20 deploy
30 add other metrics
40 goto 10
Are you able to deploy every day?
![Page 44: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/44.jpg)
Sample scenario
![Page 45: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/45.jpg)
Infrastructure
45 bare metal servers
Nginx, Jetty, PostgreSQL
GlusterFS, Queues,
Redis, Jenkins (cron on steroids)
![Page 46: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/46.jpg)
Software
Java shop
deploy with Docker
More than 120 webapps
More than 100 batch jobs
NRT stream processing jobs running 24x7
![Page 47: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/47.jpg)
Monitoring
collectd, Graphite, Grafana for system monitoring
Dropwizard Metrics inside code for application monitoring
Application metrics reported to Graphite too
![Page 48: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/48.jpg)
Feedback and decisions
WTF happened last night?
How is it going this morning?
Do you think we can survive the message flood?
Hey boss, it’s time to buy a new server, we are running out of resources.
![Page 49: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/49.jpg)
Wrap up
![Page 50: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/50.jpg)
Shopping list
Define your SLAs/target
Code and deploy with good practices
Code with monitorability in mind
Monitor your app/service
Correlate system and application metrics
Get feedback
Take decisions
![Page 51: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/51.jpg)
References
https://dropwizard.github.io/metrics/3.1.0/
https://dl.dropboxusercontent.com/u/2744222/2011-04-09-Metrics-Metrics-Everywhere.pdf
http://graphite.wikidot.com/
http://grafana.org/
http://matt.aimonetti.net/posts/2013/06/26/practical-guide-to-graphite-monitoring/
https://www.usenix.org/sites/default/files/conference/protected-files/srecon15_slides_limoncelli.pdf
![Page 52: What the hell is your software doing at runtime?](https://reader031.fdocuments.net/reader031/viewer/2022022409/58a9f5a71a28ab36018b6e17/html5/thumbnails/52.jpg)
Credits
Sketches by my sons
Andrea (Andrew) and Luca (Luke) Franchini
Cool dashboards are made with Grafana