Page 1: Take My Logs. Please!

Take my logs. Please.

Mike Brittain
Director of Engineering, Infrastructure
Etsy.com

@mikebrittain

Page 2: Take My Logs. Please!

(hello?)

Page 3: Take My Logs. Please!

This sounds boooooorrrrring...
No, no... hang in there!

Page 5: Take My Logs. Please!

25 MM uniques/month
150 Countries
$300 MM+ sales last year

Page 6: Take My Logs. Please!

Apache, PHP, MySQL, PostgreSQL, Memcache, Gearman, Solr, etc.

Page 7: Take My Logs. Please!

What’s working?

Page 8: Take My Logs. Please!

What’s working?
Performance

Page 9: Take My Logs. Please!

What’s working?
Performance
Operability

Page 10: Take My Logs. Please!

What’s working?
Performance
Operability
Simplicity

Page 11: Take My Logs. Please!

Logging + Trending

Page 12: Take My Logs. Please!

App logging
(Apache access and error logs)

Page 13: Take My Logs. Please!

LogFormat "%h %l %u %t \"%r\" %>s %b"

“Common”

Page 14: Take My Logs. Please!

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""

“Combined”

Page 15: Take My Logs. Please!

mod_log_config

%f Filename requested

%k # of keepalive requests served on this connection

%T Time taken to serve the request, in seconds

Page 16: Take My Logs. Please!

mod_log_config

%f Filename requested

%k # of keepalive requests served on this connection

%D Time taken to serve the request, in microseconds

Page 17: Take My Logs. Please!

mod_log_config

%f Filename requested

%k # of keepalive requests served on this connection

%D Time taken to serve the request, in microseconds

%{foobar}n Contents of “note” foobar from another module

Page 18: Take My Logs. Please!

apache_note()

apache_note("foobar", $whatever);

Page 19: Take My Logs. Please!

LogFormat "%{True-Client-IP}i %l %t \"%r\" \
%>s %b \"%{Referer}i\" \
\"%{User-Agent}i\" %V \
%{user_id}n %{shop_id}n %{uaid}n \
%{ab_selections}n %{request_uid}n \
%{api_consumer_key}n \
%{api_method_name}n \
%{php_bytes}n %{php_microsec}n %D"

“Steroids”
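
A log line written in this format can be pulled apart again with a short script. The following is a minimal sketch, not Etsy's tooling: the function names and the sample line are invented for illustration, and it assumes no escaped quotes inside the quoted fields. Because a note such as ab_selections may contain spaces, the sketch parses the leading fields with a regular expression and takes the last three fields (php_bytes, php_microsec, %D) from the end of the line.

```python
import re

# Leading fields, in the order the format lists them. The remaining
# note fields are captured as one string in `rest`.
HEAD = re.compile(
    r'(?P<client_ip>\S+) (?P<ident>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)" (?P<vhost>\S+) '
    r'(?P<rest>.*)'
)


def to_int(value):
    """Convert a logged number to int; Apache logs "-" for an unset note."""
    try:
        return int(float(value))
    except ValueError:
        return None


def parse_line(line):
    """Return a dict of fields, or None if the line does not fit the format."""
    match = HEAD.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    tail = fields.pop("rest").split()
    if len(tail) < 3:
        return None
    # The format ends with %{php_bytes}n %{php_microsec}n %D.
    fields["php_bytes"] = to_int(tail[-3])
    fields["php_microsec"] = to_int(tail[-2])
    fields["total_microsec"] = to_int(tail[-1])
    return fields


# Hypothetical sample line, invented for illustration.
SAMPLE = (
    '203.0.113.7 - [04/Mar/2011:16:27:48 -0500] "GET /listing/123 HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0" www.example.com 42 - abc123 easy_reg=1; '
    'mk04gw1p71 - - 1048576 251018 260400'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Reading the timing fields from the end of the line keeps the sketch working when fields are added in the middle of the format, as long as the last three stay in place.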

Page 21: Take My Logs. Please!

$GLOBALS['timer'] = microtime(true) * 1000000;

Page 22: Take My Logs. Please!

$GLOBALS['timer'] = microtime(true) * 1000000;

register_shutdown_function('pageStats');

function pageStats() {

}

Page 23: Take My Logs. Please!

$GLOBALS['timer'] = microtime(true) * 1000000;

register_shutdown_function('pageStats');

function pageStats() {

$timer_end = microtime(true) * 1000000;

$diff = $timer_end - $GLOBALS['timer'];

}

Page 24: Take My Logs. Please!

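// Record the start time in microseconds.
$GLOBALS['timer'] = microtime(true) * 1000000;

// Run pageStats() when the script finishes.
register_shutdown_function('pageStats');

function pageStats() {
    $timer_end = microtime(true) * 1000000;
    $diff = $timer_end - $GLOBALS['timer'];

    // apache_note() takes string values; mod_log_config reads them
    // back as %{php_microsec}n and %{php_bytes}n.
    apache_note('php_microsec', (string) round($diff));
    apache_note('php_bytes', (string) memory_get_peak_usage());
}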

Page 25: Take My Logs. Please!

What about “%D”?

Page 26: Take My Logs. Please!

LogFormat "%{True-Client-IP}i %l %t \"%r\" \
%>s %b \"%{Referer}i\" \
\"%{User-Agent}i\" %V \
%{user_id}n %{shop_id}n %{uaid}n \
%{ab_selections}n %{request_uid}n \
%{api_consumer_key}n \
%{api_method_name}n \
%{php_bytes}n %{php_microsec}n %D"

“Steroids”

Page 29: Take My Logs. Please!

LogFormat %{True-Client-IP}i %l %t \"%r\"

%>s %b \"%{Referer}i\"

\"%{User-Agent}i\" %V

%{user_id}n %{shop_id}n %{uaid}n

%{ab_selections}n ...

easy_reg=1; personalize_widget=0;

icon_in_cornflower_blue=1;

“Steroids”

Page 30: Take My Logs. Please!

Coming soon...

%{locale}n (i18n)

%{platform}n (desktop vs. mobile)

Page 31: Take My Logs. Please!

Coming soon...

%{locale}n (i18n)

%{platform}n (desktop vs. mobile)

OPS-1805, OPS-1827

etsy.com/careers

Page 32: Take My Logs. Please!

Using something else?

time, http method, request uri, response code, referer, user-agent, response time, response memory, custom segmentation fields...

Page 33: Take My Logs. Please!

Quick averages

grep "GET /listing/" access.log | \
awk '{sum=sum+$(NF-1)} END {print sum/NR}'
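
The same quick average can be sketched in Python for cases where the filter needs to grow beyond a one-liner. This is an illustrative equivalent, not part of the talk: the function name is made up, and unlike the awk version it skips lines whose second-to-last field is not a number instead of counting them as zero.

```python
def average_php_time(lines, needle="GET /listing/"):
    """Average the second-to-last field (php_microsec) of matching lines."""
    total = 0.0
    count = 0
    for line in lines:
        if needle not in line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            total += float(fields[-2])
        except ValueError:
            # An unset note is logged as "-"; leave it out of the average.
            continue
        count += 1
    return total / count if count else None


if __name__ == "__main__":
    # access.log is the file name used on the slide.
    with open("access.log") as log:
        print(average_php_time(log))
```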

Page 34: Take My Logs. Please!

Quick graphs

grep "GET /listing/" access.log | \
perl -pe 's/.*\[.*\d{4}:(\d{2}):(\d{2}):\d{2}.*\]/$1:$2/' | \
awk '{print $1, $(NF-1)}' > /tmp/pagetimes.dat

gives you...

Page 35: Take My Logs. Please!

Quick graphs

# /tmp/pagetimes.dat
18:37 251.0
18:38 252.1
18:39 253.5
18:40 251.0
18:45 250.0

and then...

Page 36: Take My Logs. Please!

Quick graphs

# GNUPLOT
set terminal png
set output 'listings.png'
set yrange [0:2000]
set xdata time
# The data file holds HH:MM in column 1, so parse it as such.
set timefmt "%H:%M"
set format x "%H:%M"
plot '/tmp/pagetimes.dat' using 1:2 with points
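
The grep, perl and awk steps that build /tmp/pagetimes.dat could also be written as one small Python script. The sketch below is an assumption-laden equivalent rather than the talk's method: the function name is invented, and it expects the standard Apache %t timestamp ([day/month/year:hour:minute:second zone]) with php_microsec as the second-to-last field.

```python
import re

# Matches the %t timestamp and captures the hour and the minute.
TIMESTAMP = re.compile(r"\[\d{2}/\w{3}/\d{4}:(\d{2}):(\d{2}):\d{2}[^\]]*\]")


def page_times(lines, needle="GET /listing/"):
    """Yield (HH:MM, php_microsec) pairs for matching access log lines."""
    for line in lines:
        if needle not in line:
            continue
        match = TIMESTAMP.search(line)
        fields = line.split()
        if match is None or len(fields) < 2:
            continue
        yield f"{match.group(1)}:{match.group(2)}", fields[-2]


if __name__ == "__main__":
    with open("access.log") as log, open("/tmp/pagetimes.dat", "w") as out:
        for minute, microsec in page_times(log):
            out.write(f"{minute} {microsec}\n")
```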

Page 37: Take My Logs. Please!

Quick graphs

Page 38: Take My Logs. Please!

Error logs

PHP + Apache errors in one file
Simple logging interface

Page 39: Take My Logs. Please!

Error logs

Levels: error, info, debug
Namespace: perf, sql, __class__

Page 40: Take My Logs. Please!

Logger::error("Query exceeded 5 sec: $query",
              "sql_long_query");

Page 41: Take My Logs. Please!

web0054 [Fri Mar 04 16:27:48 2011] [error] [sql_long_query] [mk04gw1p71] Query exceeded 5 sec: SELECT * FROM ...

Page 43: Take My Logs. Please!

$ grep "16:27:48" access.log | wc -l

1527

Page 44: Take My Logs. Please!

web0054 [Fri Mar 04 16:27:48 2011] [error] [sql_long_query] [mk04gw1p71] Query exceeded 5 sec: SELECT * FROM ...

Page 45: Take My Logs. Please!

IOW: error.log -> request_uid -> access.log

request uri, ab selections, user id, locale, platform, api key, etc.
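
The lookup from an error line to its request can be sketched as follows. This is a minimal illustration, assuming the error log layout shown on the earlier slide (host, timestamp, level, namespace, request_uid, message) and assuming the request_uid appears as its own whitespace-separated token in the access log line; the function and pattern names are invented.

```python
import re

# host [timestamp] [level] [namespace] [request_uid] message
ERROR_LINE = re.compile(
    r"^\S+ \[[^\]]+\] \[(?P<level>[^\]]+)\] \[(?P<namespace>[^\]]+)\] "
    r"\[(?P<request_uid>[^\]]+)\] (?P<message>.*)$"
)


def find_request(error_line, access_lines):
    """Return the access log line that shares the error's request_uid."""
    match = ERROR_LINE.match(error_line.strip())
    if match is None:
        return None
    uid = match.group("request_uid")
    for line in access_lines:
        # Compare whole tokens so a uid never matches part of another field.
        if uid in line.split():
            return line
    return None
```

For a large access log, building a dictionary keyed on request_uid once would avoid rescanning the file for every error.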

Page 46: Take My Logs. Please!

Filtering

tail -f error.log | grep -v "sql_long_query" | ...

Page 47: Take My Logs. Please!

web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Help me, Rhonda.
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Heeeeeeellllllllllllllppppp!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
web0001 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
web0201 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0034 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web1101 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0201 [04:28:54 2011] [error] [client 10.101.x.x] You've been eaten by a grue.
web0055 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!!!
web0002 [04:28:54 2011] [warning] [client 10.101.x.x] Sky is falling.
web0089 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0020 [04:28:54 2011] [error] [client 10.101.x.x] Sky is falling.
web1101 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
web0055 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0034 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0087 [04:28:54 2011] [fatal] [client 10.101.x.x] Sky is falling.
web0002 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
web0201 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
web0077 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0355 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
web0052 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0003 [04:28:54 2011] [error] [client 10.101.x.x] You've been eaten by a grue.
web0066 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!!!
web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Sky is falling
web0020 [04:28:54 2011] [error] [client 10.101.x.x] Sky is falling.
web1101 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
web0055 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0034 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
web0087 [04:28:54 2011] [fatal] [client 10.101.x.x] Sky is falling.

Page 48: Take My Logs. Please!

Trending

fatals errors warnings

Page 49: Take My Logs. Please!

Logster

Run by cron
Maintains a cursor on log files
Simple parsing & aggregation
Output to Ganglia or Graphite
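
The idea of keeping a cursor on a log file, so that each cron run only sees lines written since the previous run, can be sketched like this. It is a simplified stand-in, not Logster's own implementation: the function name and the cursor file are hypothetical, the cursor is a plain byte offset stored in a text file, and rotation is detected only by the file getting shorter.

```python
import os


def read_new_lines(log_path, cursor_path):
    """Return complete lines appended to log_path since the previous call."""
    offset = 0
    if os.path.exists(cursor_path):
        with open(cursor_path) as cursor:
            offset = int(cursor.read().strip() or 0)

    # A file shorter than the saved offset was probably rotated: start over.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()

    # Consume only up to the last newline so a half-written line is
    # picked up whole on the next run.
    end = data.rfind(b"\n") + 1
    with open(cursor_path, "w") as cursor:
        cursor.write(str(offset + end))

    return data[:end].decode("utf-8", errors="replace").splitlines()
```

Each call returns only what is new, which is what lets a parser turn counts into per-second rates over the interval between runs.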

github.com/etsy

Page 50: Take My Logs. Please!

web0054 [Fri Mar 04 16:27:48 2011] [error] [login] [mk04gw1p71] User login failed. Reason: wrong password for ...

Page 51: Take My Logs. Please!

^\S+ \[[^\]]+\] \[(?P<log_level>[^\]]+)\]
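
A pattern like this is easy to check against the sample line from the previous slide. The sketch below uses negated character classes ([^\]]+) so that each group stays inside a single pair of brackets; with a greedy .+ the capture can slide to a later bracketed field, such as the request_uid, when a line has several of them. The constant names are invented for the example.

```python
import re

# host, then [timestamp], then [log_level]
LOG_LEVEL = re.compile(r"^\S+ \[[^\]]+\] \[(?P<log_level>[^\]]+)\]")

SAMPLE = (
    "web0054 [Fri Mar 04 16:27:48 2011] [error] [login] [mk04gw1p71] "
    "User login failed."
)

match = LOG_LEVEL.match(SAMPLE)
fields = match.groupdict() if match else {}
print(fields.get("log_level"))
```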

Page 52: Take My Logs. Please!

if (fields['log_level'] == "fatal"):
    self.fatals += 1
elif (fields['log_level'] == "error"):
    self.errors += 1
elif (fields['log_level'] == "warning"):
    self.warnings += 1
...

Page 53: Take My Logs. Please!

MetricObject("fatals", (self.fatals / self.duration), "per sec")
MetricObject("errors", (self.errors / self.duration), "per sec")
MetricObject("warnings", (self.warnings / self.duration), "per sec")
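
Put together, the regex, the counters and the rate calculation form one small parser. The sketch below is self-contained and only modeled on the fragments shown in the slides: MetricObject is a local stand-in (a namedtuple), and the class name and method names are assumptions rather than Logster's confirmed interface.

```python
import re
from collections import namedtuple

# Local stand-in for the metric container used on the slides.
MetricObject = namedtuple("MetricObject", ["name", "value", "units"])


class ErrorLevelParser:
    """Count fatal, error and warning lines and report them per second."""

    def __init__(self):
        self.fatals = 0
        self.errors = 0
        self.warnings = 0
        self.reg = re.compile(r"^\S+ \[[^\]]+\] \[(?P<log_level>[^\]]+)\]")

    def parse_line(self, line):
        match = self.reg.match(line)
        if match is None:
            return  # not in the expected format; ignore it
        level = match.group("log_level")
        if level == "fatal":
            self.fatals += 1
        elif level == "error":
            self.errors += 1
        elif level == "warning":
            self.warnings += 1

    def get_state(self, duration):
        """Convert the counts to rates over `duration` seconds."""
        return [
            MetricObject("fatals", self.fatals / duration, "per sec"),
            MetricObject("errors", self.errors / duration, "per sec"),
            MetricObject("warnings", self.warnings / duration, "per sec"),
        ]
```

Feeding it the lines read since the last run, with the run interval as the duration, gives the three series that the trending graph plots.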

Page 54: Take My Logs. Please!

fatals errors warnings

Page 55: Take My Logs. Please!

Logster

Signed-in vs. Signed-out

Page 56: Take My Logs. Please!

github.com/etsy

Page 57: Take My Logs. Please!

Log a plethora of data.
Don’t be afraid to use one file.

Page 58: Take My Logs. Please!

Use custom fields to segment data.
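
As one illustration of segmenting on a custom field, the sketch below averages the second-to-last field (php_microsec in the format shown earlier) separately for each variant of an A/B selection. The function names are invented, and the way a variant is detected (a literal easy_reg=1; or easy_reg=0; token in the line) is an assumption based on the example values shown for ab_selections.

```python
from collections import defaultdict


def average_by_segment(lines, segment_of):
    """Average the second-to-last field per segment label."""
    totals = defaultdict(lambda: [0.0, 0])  # label -> [sum, count]
    for line in lines:
        fields = line.split()
        try:
            value = float(fields[-2])
        except (IndexError, ValueError):
            continue  # too short, or the note was logged as "-"
        bucket = totals[segment_of(line)]
        bucket[0] += value
        bucket[1] += 1
    return {label: total / count for label, (total, count) in totals.items()}


def easy_reg_variant(line):
    """Hypothetical segment function for the easy_reg A/B selection."""
    if "easy_reg=1;" in line:
        return "easy_reg=1"
    if "easy_reg=0;" in line:
        return "easy_reg=0"
    return "unknown"
```

Swapping in a different segment function (signed-in vs. signed-out, locale, platform) reuses the same aggregation.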

Page 59: Take My Logs. Please!

Correlate errors to specific requests.

Page 60: Take My Logs. Please!

Make f#@k!ng graphs.

Page 61: Take My Logs. Please!

Convert rates to trend lines.

Page 62: Take My Logs. Please!

Take my logs. Please!

Page 63: Take My Logs. Please!

Mike Brittain
Director of Engineering, Infrastructure
Etsy.com

@mikebrittain

codeascraft.etsy.com
github.com/etsy

Thank you.