If Engineering at Etsy has a religion, it’s the Church of Graphs. If it moves, we track it. Sometimes we’ll draw a graph of something that isn’t moving yet, just in case it decides to make a run for it. In general, we tend to measure at three levels: network, machine, and application. (You can read more about our graphs in Mike’s Tracking Every Release post.)
Application metrics are usually the hardest, yet most important, of the three. They’re very specific to your business, and they change as your applications change (and Etsy changes a lot). Instead of trying to plan out everything we wanted to measure and putting it in a classical configuration management system, we decided to make it ridiculously simple for any engineer to get anything they can count or time into a graph with almost no effort. (And, because we can push code anytime, anywhere, it’s easy to deploy the code too, so we can go from “how often does X happen?” to a graph of X happening in about half an hour, if we want to.)
Meet StatsD
StatsD is a simple NodeJS daemon (and by “simple” I really mean simple — NodeJS makes event-based systems like this ridiculously easy to write) that listens for messages on a UDP port. (See Flickr’s “Counting & Timing” for a previous description and implementation of this idea, and check out the open-sourced code on github to see our version.) It parses the messages, extracts metrics data, and periodically flushes the data to graphite.
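To make the parsing step concrete, here is a minimal PHP sketch of how one line of an incoming datagram could be decoded. It assumes the plain-text wire format used by the open-sourced StatsD (`name:value|type`, with an optional `|@rate` suffix). The function name `parse_statsd_line` and the shape of the returned array are invented for illustration; the real daemon is written in NodeJS and may differ in detail.

```php
<?php
// Hypothetical helper: decode one StatsD-style line such as "grue.dinners:1|c|@0.1".
// Returns null for anything that does not look like a counter or timer.
function parse_statsd_line(string $line): ?array
{
    $pattern = '/^([^:|]+):(-?\d+(?:\.\d+)?)\|(c|ms)(?:\|@(\d+(?:\.\d+)?))?$/';
    if (!preg_match($pattern, trim($line), $m)) {
        return null;
    }
    return [
        'name'        => $m[1],
        'value'       => (float) $m[2],
        'type'        => $m[3] === 'c' ? 'counter' : 'timer',
        // A missing "@rate" suffix means every event was sent.
        'sample_rate' => isset($m[4]) ? (float) $m[4] : 1.0,
    ];
}

print_r(parse_statsd_line('grue.dinners:1|c'));
```

Returning null for malformed input keeps the sketch in the spirit of the daemon: a bad packet is simply ignored rather than treated as an error.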
We like graphite for a number of reasons: it’s very easy to use, and has very powerful graphing and data manipulation capabilities. We can combine data from StatsD with data from our other metrics-gathering systems. Most importantly for StatsD, you can create new metrics in graphite just by sending it data for that metric. That means there’s no management overhead for engineers to start tracking something new: simply tell StatsD you want to track “grue.dinners” and it’ll automagically appear in graphite. (By the way, because we flush data to graphite every 10 seconds, our StatsD metrics are near-realtime.)
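The "just send data" property comes from Graphite's plaintext protocol, where each line is `<path> <value> <timestamp>`. The sketch below shows how a flush could format counters into such lines. The function name `format_graphite_lines`, the `stats.` prefix, and the per-second normalisation are assumptions made for this example, not a description of the daemon's actual code.

```php
<?php
// Hypothetical flush step: turn counters accumulated during one interval
// into lines for Graphite's plaintext protocol ("<path> <value> <timestamp>\n").
function format_graphite_lines(array $counters, int $flushSeconds, int $timestamp): string
{
    $lines = '';
    foreach ($counters as $name => $count) {
        // Report a per-second rate so graphs stay comparable if the interval changes.
        $rate = $count / $flushSeconds;
        $lines .= sprintf("stats.%s %s %d\n", $name, $rate, $timestamp);
    }
    return $lines;
}

// Thirty grue dinners counted during a 10-second flush interval.
echo format_graphite_lines(['grue.dinners' => 30], 10, time());
```

The resulting string would then be written to Graphite's carbon listener (conventionally TCP port 2003), and a path that has never been seen before shows up as a new metric.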
Not only is it super easy to start capturing the rate or speed of something, it’s also very easy to view, share, and brag about the resulting graphs.
Why UDP?
So, why do we use UDP to send data to StatsD? Well, it’s fast — you don’t want to slow your application down in order to track its performance — but also sending a UDP packet is fire-and-forget. Either StatsD gets the data, or it doesn’t. The application doesn’t care if StatsD is up, down, or on fire; it simply trusts that things will work. If they don’t, our stats go a bit wonky, but the site stays up. Because we also worship at the Church of Uptime, this is quite alright. (The Church of Graphs makes sure we graph UDP packet receipt failures though, which the kernel usefully provides.)
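A fire-and-forget send can be sketched in a few lines of PHP. This is an illustration under assumptions, not the code of our PHP library: the function name `send_stat` is hypothetical, and the host and port defaults are placeholders (8125 is the port StatsD setups commonly use).

```php
<?php
// Hypothetical fire-and-forget sender. Host and port are placeholder defaults.
function send_stat(string $payload, string $host = '127.0.0.1', int $port = 8125): void
{
    try {
        // UDP is connectionless, so "opening" the socket does not contact the server.
        $socket = @fsockopen("udp://$host", $port, $errno, $errstr, 0.1);
        if ($socket === false) {
            return; // Could not even create a socket: drop the stat.
        }
        @fwrite($socket, $payload);
        fclose($socket);
    } catch (Throwable $e) {
        // Metrics must never take the site down, so swallow everything.
    }
}

// Works the same whether or not anything is listening on the other end.
send_stat('grue.dinners:1|c');
```

The point of the design is that every failure path ends in "do nothing": the caller never sees an exception and never waits on the stats server.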
Measure Anything
Here’s how we do it using our PHP StatsD library:
StatsD::increment("grue.dinners");
That’s it. That line of code will create a new counter on the fly and increment it every time it’s executed. You can then go look at your graph and bask in the awesomeness, or for that matter, spot someone up to no good in the middle of the night:
We can use graphite’s data-processing tools to take the data above and make a graph that highlights deviations from the norm:
(We sometimes use the “rawData=true” option in graphite to get a stream of numbers that can feed into automatic monitoring systems. Graphs like this are very “monitorable.”)
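As a rough idea of what feeding that stream into a monitor could look like, here is a PHP sketch. It assumes each raw line has the shape `target,start,end,step|v1,v2,...`, with `None` marking missing points. The helper names `parse_graphite_raw` and `latest_exceeds`, and the threshold check itself, are hypothetical.

```php
<?php
// Hypothetical parser for one line of Graphite raw output, assumed to look like
// "target,start,end,step|v1,v2,...", with "None" marking missing points.
function parse_graphite_raw(string $line): array
{
    [$header, $data] = explode('|', trim($line), 2);
    $fields = explode(',', $header);
    // Pop from the right so a target that itself contains commas survives intact.
    $step  = (int) array_pop($fields);
    $end   = (int) array_pop($fields);
    $start = (int) array_pop($fields);
    $values = array_map(
        fn(string $v): ?float => $v === 'None' ? null : (float) $v,
        explode(',', $data)
    );
    return [
        'target' => implode(',', $fields),
        'start'  => $start,
        'end'    => $end,
        'step'   => $step,
        'values' => $values,
    ];
}

// Hypothetical check: true when the most recent known value is above a threshold.
function latest_exceeds(array $series, float $threshold): bool
{
    $known = array_filter($series['values'], fn($v) => $v !== null);
    return $known !== [] && end($known) > $threshold;
}

$series = parse_graphite_raw('stats.grue.dinners,1297728000,1297728030,10|2.0,None,9.5');
var_dump(latest_exceeds($series, 5.0));
```

A monitoring job could run a check like this on a schedule and page someone when it returns true.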
We don’t just track trivial things like how many people are signing into the site — we also track really important stuff, like how much coffee is left in the kitchen:
Time Anything Too
In addition to plain counters, we can track times too:
$start = microtime(true);
eat_adventurer();
StatsD::timing("grue.dinners", (microtime(true) - $start) * 1000);
StatsD automatically tracks the count, mean, maximum, minimum, and 90th percentile times (which is a good measure of “normal” maximum values, ignoring outliers). Here, we’re measuring the execution times of part of our search infrastructure:
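For a feel of what those summary numbers mean, the sketch below computes them in PHP for one batch of timings. It is an illustration only: `summarize_timings` is a hypothetical name, and the nearest-rank cut used for the 90th percentile is an assumption, not necessarily the exact rule StatsD applies.

```php
<?php
// Hypothetical summary of one flush interval's timings (in milliseconds).
// The percentile is a simple nearest-rank cut: the largest value that remains
// once the slowest (100 - percentile)% of samples are set aside.
function summarize_timings(array $timings, int $percentile = 90): ?array
{
    $count = count($timings);
    if ($count === 0) {
        return null;
    }
    sort($timings);
    // Number of samples kept below the cut; always at least one.
    $kept = max(1, (int) round($count * $percentile / 100));
    return [
        'count'             => $count,
        'mean'              => array_sum($timings) / $count,
        'lower'             => $timings[0],
        'upper'             => $timings[$count - 1],
        "upper_$percentile" => $timings[$kept - 1],
    ];
}

print_r(summarize_timings([120, 95, 110, 2400, 105, 98, 101, 99, 97, 103]));
```

In the sample call, the single 2400 ms outlier dominates the maximum but is excluded from the 90th percentile value, which is why that figure is a better guide to "normal" worst-case behaviour.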
Sampling Your Data
One thing we found early on is that if we want to track something that happens really, really frequently, we can start to overwhelm StatsD with UDP packets. To cope with that, we added the option to sample data, i.e. to only send packets a certain percentage of the time. For very frequent events, this still gives you a statistically accurate view of activity.
To record only one in ten events:
StatsD::increment("adventurer.heartbeat", 0.1);
What’s important here is that the packet sent to StatsD includes the sample rate, and so StatsD then multiplies the numbers to give an estimate of a 100% sample rate before it sends the data on to graphite. This means we can adjust the sample rate at will without having to deal with rescaling the y-axis of the resulting graph.
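Both halves of that idea can be sketched in PHP. The client decides whether to send and tags the packet with the rate using the `|@rate` suffix; the receiver scales the count back up. The function names `sampled_payload` and `scaled_count` are hypothetical, and the random roll is passed in as a parameter purely so the example is easy to reason about.

```php
<?php
// Hypothetical client side: decide whether to send, and tag the packet with the rate.
// $roll is a random number in [0, 1]; the event is sent only when it falls below the rate.
function sampled_payload(string $name, float $sampleRate, float $roll): ?string
{
    if ($sampleRate < 1 && $roll >= $sampleRate) {
        return null; // Skipped this time: nothing goes on the wire.
    }
    return $sampleRate < 1 ? "$name:1|c|@$sampleRate" : "$name:1|c";
}

// Hypothetical server side: scale a received count back up to a 100% estimate.
function scaled_count(int $received, float $sampleRate): float
{
    return $received * (1 / $sampleRate);
}

// In real use the roll would be random on every call.
$payload = sampled_payload('adventurer.heartbeat', 0.1, mt_rand() / mt_getrandmax());
var_dump($payload);
```

Because the scaling happens on the receiving side, seven packets received at a 0.1 sample rate are graphed as roughly seventy events, and changing the rate later does not change the scale of the graph.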
Measure Everything
We’ve found that tracking everything is key to moving fast, but the only way to do it is to make tracking anything easy. Using StatsD, we enable engineers to track what they need to track, at the drop of a hat, without requiring time-sucking configuration changes or complicated processes.
Try StatsD for yourself: grab the open-sourced code from github and start measuring. We’d love to hear what you think of it.
reference:
https://codeascraft.com/2011/02/15/measure-anything-measure-everything/