Metrics

X-Ray your app

Morten Siebuhr

@msiebuhr / GitHub

Questions!

METRICS?

Grab a dictionary

№1

Metric: Greek metrikē, from feminine of metrikos in meter, by measure, from metron measure — more at measure

Merriam-Webster

№2

Metric: A standard of measurement

Merriam-Webster

№3

No metric exists that can be applied directly to happiness

Scientific Monthly / Merriam-Webster

Bullshit!

In the beginning

Kennedy

Let's go to the moon...

Today

Boeing 787

~ 500 GB / flight

Meanwhile in our little world

Typically

We already have that

Stuff slips through the cracks

Your stuff slips through the cracks

Costs $$$

(Did for us! And still does...)

Built on librrd

Quick and Dirty

log.info()

printf()

+

grep
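A minimal sketch of the quick-and-dirty approach: log a line per event, then "grep" and count afterwards. The log format (`<ISO timestamp> <level> <message>`), the function name and the sample lines are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical log format: "2013-05-01T12:00:03 INFO request served"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2} (\w+) (.*)$")

def count_per_minute(lines, needle):
    """Count log lines whose message contains `needle`, bucketed by minute."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and needle in match.group(3):
            counts[match.group(1)] += 1
    return counts

sample = [
    "2013-05-01T12:00:03 INFO request served",
    "2013-05-01T12:00:41 INFO request served",
    "2013-05-01T12:01:07 ERROR upstream timeout",
    "2013-05-01T12:01:09 INFO request served",
]
per_minute = count_per_minute(sample, "request served")
```

This works, but every question means re-reading every log line, which is where the approach starts to hurt.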

The Good

The Bad

The Ugly

Process all webserver logs:

15+ min

SLOOOOOOW

Logging still needed

We have no idea why these filesystems began to act up.

But we do know when and where to look in the logs.

KISS

StatsD

(Originally built by Etsy, inspired by Flickr)

Basic idea:

Send UDP with identifier, value & type

Collect & generate regular statistics

Store, Graph and act
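The "identifier, value & type" triple travels as one short text line of the form `<name>:<value>|<type>`, with `c` for counters, `g` for gauges, `ms` for timings and `s` for sets. A small sketch of building such payloads; the helper name and the metric names are hypothetical.

```python
def format_metric(name, value, metric_type):
    """Build one StatsD payload: '<name>:<value>|<type>'."""
    return f"{name}:{value}|{metric_type}".encode("ascii")

# Metric names below are made up for illustration.
packets = [
    format_metric("web.requests", 1, "c"),           # counter
    format_metric("queue.length", 42, "g"),          # gauge
    format_metric("web.response_time", 320, "ms"),   # timing in milliseconds
    format_metric("users.unique", "alice", "s"),     # set member
]
```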

Basic setup:

Your application + some client

StatsD

Graphite / Hosted / Other

Easy metrics

Client in a tweet

Send over UDP

FAST

Small overhead

Doesn't add to failure domain
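Roughly what a tweet-sized client could look like: one UDP datagram, no reply expected, and any error is swallowed so metrics never take the application down. This is a sketch rather than any particular client library; the function name is hypothetical, and the host and port defaults assume a StatsD daemon listening locally on its usual port 8125.

```python
import socket

def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget: send one UDP datagram and do not wait for an answer."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
        return True
    except OSError:
        # Losing a metric is acceptable; breaking the application is not.
        return False
    finally:
        sock.close()
```

Usage would be something like `send_metric(b"web.requests:1|c")`. If nothing is listening, the datagram is simply lost.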

StatsD

(~2500 LOC, including test-suite & packaging)

Types

Counters

Gives you the absolute reported value + normalized per-second numbers
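A sketch of what the server side does with counter increments at each flush, assuming a 10-second flush interval. The function and key names are illustrative, not StatsD's own.

```python
def flush_counter(increments, flush_interval_s=10):
    """Sum the increments seen in one flush interval and normalize per second."""
    total = sum(increments)
    return {"count": total, "rate": total / flush_interval_s}

# Five packets arrived during the interval, adding up to 10 events.
result = flush_counter([1, 1, 1, 5, 2], flush_interval_s=10)
```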

Types

Gauges

Stay the same until you tell them otherwise
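Gauge behaviour in a few lines: the last reported value is repeated at every flush until a new one arrives. The class and metric names are hypothetical.

```python
class GaugeStore:
    """Keeps the most recent value per gauge; flushing does not reset it."""

    def __init__(self):
        self.values = {}

    def report(self, name, value):
        self.values[name] = value

    def flush(self):
        # Return a snapshot; the stored values stay in place.
        return dict(self.values)

gauges = GaugeStore()
gauges.report("queue.length", 42)
first = gauges.flush()
second = gauges.flush()           # nothing reported in between
gauges.report("queue.length", 7)
third = gauges.flush()
```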

Types

Sets

Outputs the number of distinct values seen
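Sets sketched the same way: collect the reported values and output how many distinct ones were seen. The reset on flush is an assumption of this sketch, and the class and metric names are hypothetical.

```python
class SetStore:
    """Tracks distinct values per set; each flush reports the count and resets."""

    def __init__(self):
        self.seen = {}

    def report(self, name, value):
        self.seen.setdefault(name, set()).add(value)

    def flush(self):
        counts = {name: len(values) for name, values in self.seen.items()}
        self.seen.clear()
        return counts

sets = SetStore()
for user in ["alice", "bob", "alice", "carol", "bob"]:
    sets.report("users.unique", user)
first = sets.flush()
second = sets.flush()
```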

Types

Timings

Min/mean/median/std. dev./max + 90th percentile for all reported values
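How such a timing summary could be computed from the raw samples of one flush interval. This is a sketch only: the key names are illustrative, the standard deviation is the population one, and the 90th percentile uses a simple nearest-rank rule, which may differ in detail from what a given StatsD version does.

```python
import math
import statistics

def summarize_timings(samples_ms, percentile=90):
    """Summarize the timing samples (in milliseconds) from one flush interval."""
    ordered = sorted(samples_ms)
    # Nearest-rank percentile: the smallest value covering `percentile` % of samples.
    rank = max(1, math.ceil(len(ordered) * percentile / 100))
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "std_dev": statistics.pstdev(ordered),
        f"upper_{percentile}": ordered[rank - 1],
    }

summary = summarize_timings([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
```

With these ten samples the 90th percentile is 90 ms, so a single 100 ms outlier shows up in the max but not in the percentile.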

Pluggable backends

amqp datadog ganglia Graphite

librato OpenTSDB socket.io statsd zabbix

...

Node.js

Don't worry - clones in pretty much all languages.

Ends up in Graphite
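On the Graphite side, the plaintext protocol is one line per data point: metric path, value and Unix timestamp, separated by spaces. A sketch of formatting such a line; the helper name and the metric path are hypothetical, and actually sending it (typically over TCP to Carbon's plaintext port, 2003 by default) is left out.

```python
import time

def graphite_line(path, value, timestamp=None):
    """Format one data point as '<path> <value> <unix timestamp>' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"

# Hypothetical metric path and a fixed timestamp for illustration.
line = graphite_line("stats.web.requests", 1.0, timestamp=1367409600)
```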

Play with it

Correlate odd stuff

Objective:

Challenge your assumptions

Three things happen

№ 1

You will be surprised

OMG! WTF?

We have a hit-rate <50%

Assumption:

Varnish outputs more data than it ingests.

Data says:

Varnish outputs less data than it ingests!

№2

Bug / Configuration error

(Go fix it!)

№3

Make it a KPI

(Put it on a dashboard, send e-mails, rotating lights, klaxons, &c!)

Rinse, repeat

Do It Yourself

Took me a few hours

(Mostly because Graphite is a PITA to set up)

Boss day 1:

Don't waste too much time on it!

Boss day 90:

How about 1000x data?

TL;DL

Questions / Discussion

Thanks

Images

Haribo, NASA, One.com
