Log Monitoring System
Collectd: gather statistics on performance, processes, and overall status of the system
Graphite: Store numeric time-series data & Render graphs of this data on demand
Grafana: visualization and dashboarding tool for time series data
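A log monitoring system consists of three parts: log collection, log storage, and log visualization.
These correspond to Collectd, Graphite (without Graphite's own web tool), and Grafana respectively.
Alternatives for the time-series database include Elasticsearch and InfluxDB.
Logstash can be seen as the bridge between log collection and log storage, because its input/output configuration makes it easy to move data between different systems.
Without Logstash as the bridge, getting the collected logs into storage becomes a problem: you would have to call the storage system's client API yourself.
So how do these systems communicate, and how are they organized?
- collectd collects the data and sends it through its network plugin to a designated Logstash port
- Logstash's input listens on the port from step 1, and its output writes to a storage system such as ES or InfluxDB
- Grafana reads the data from the storage system through a configured data source and visualizes it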
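Component | Version | Host | Description |
elasticSearch | 2.3.3 | 192.168.6.52 | index database |
logstash | 2.3.2 | local machine | has Input/Output, so it can connect all kinds of pipelines |
collectd | - | 192.168.6.52 | collects machine information: performance, processes, etc. |
Grafana | - | 192.168.6.52 | visualization; supports different data sources such as ES and InfluxDB |
influxdb | 0.13 | 192.168.6.52 | time-series database |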
ELK
LogStash
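Standard I/O
Input and output on the command line, run from a script; the codec specifies how events are decoded on input or encoded on output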
$ cd logstash-2.3.2
$ bin/logstash -e "input {stdin{}} output {stdout{}}" <<< 'Hello World'
Settings: Default pipeline workers: 4
Pipeline main started
2016-05-25T04:00:29.531Z zqhmac Hello World
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}

$ bin/logstash -e 'input{stdin{}}output{stdout{codec=>rubydebug}}' <<< 'Hello World'
{
    "message" => "Hello World",
    "@version" => "1",
    "@timestamp" => "2016-05-25T04:01:09.095Z",
    "host" => "zqhmac"
}

$ vi logstash-simple.conf
input { stdin {} }
output {
  stdout { codec=> rubydebug }
}
$ bin/logstash agent -f logstash-simple.conf --verbose
File Input
$ bin/logstash agent -f logstash-file.conf --verbose
input {
  file {
    path => "/usr/install/cassandra/logs/system.log"
    start_position => beginning
    type => "cassandra"
  }
}
output {
  stdout { codec=> rubydebug }
}
Log file
[qihuang.zheng@dp0652 logstash-1.5.0]$ head /usr/install/cassandra/logs/system.log
ERROR [metrics-graphite-reporter-thread-1] 2016-06-13 18:30:39,502 GraphiteReporter.java:281 - Error sending to Graphite:
java.net.SocketException: 断开的管道
    at java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.7.0_51]
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113) ~[na:1.7.0_51]
    at java.net.SocketOutputStream.write(SocketOutputStream.java:159) ~[na:1.7.0_51]
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) ~[na:1.7.0_51]
    at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282) ~[na:1.7.0_51]
    at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125) ~[na:1.7.0_51]
    at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207) ~[na:1.7.0_51]
    at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129) ~[na:1.7.0_51]
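By default each line becomes one event. For log files that contain exception stack traces, that does not work without further processing: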
Using version 0.1.x input plugin 'file'. This plugin isn't well supported by the community and likely has no maintainer. {:level=>:info}
Using version 0.1.x codec plugin 'plain'. This plugin isn't well supported by the community and likely has no maintainer. {:level=>:info}
Using version 0.1.x output plugin 'stdout'. This plugin isn't well supported by the community and likely has no maintainer. {:level=>:info}
Using version 0.1.x codec plugin 'rubydebug'. This plugin isn't well supported by the community and likely has no maintainer. {:level=>:info}
Registering file input {:path=>["/usr/install/cassandra/logs/system.log"], :level=>:info}
No sincedb_path set, generating one based on the file path {:sincedb_path=>"/home/qihuang.zheng/.sincedb_261f7476b9c9830f1fa5a51db2793e1e", :path=>["/usr/install/cassandra/logs/system.log"], :level=>:info}
Pipeline started {:level=>:info}
Logstash startup completed
{
    "message" => "ERROR [metrics-graphite-reporter-thread-1] 2016-06-13 18:30:39,502 GraphiteReporter.java:281 - Error sending to Graphite:",
    "@version" => "1",
    "@timestamp" => "2016-06-20T08:42:35.543Z",
    "type" => "cassandra",
    "host" => "dp0652",
    "path" => "/usr/install/cassandra/logs/system.log"
}
{
    "message" => "java.net.SocketException: 断开的管道",
    "@version" => "1",
    "@timestamp" => "2016-06-20T08:42:35.544Z",
    "type" => "cassandra",
    "host" => "dp0652",
    "path" => "/usr/install/cassandra/logs/system.log"
}
{
    "message" => "\tat java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.7.0_51]",
    "@version" => "1",
    "@timestamp" => "2016-06-20T08:42:35.544Z",
    "type" => "cassandra",
    "host" => "dp0652",
    "path" => "/usr/install/cassandra/logs/system.log"
}
Adding multiline support
input {
  file {
    path => "/usr/install/cassandra/logs/system.log"
    start_position => beginning
    type => "cassandra"
    codec => multiline {
      pattern => "^\s"
      what => "previous"
    }
  }
}
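This merges every line that begins with whitespace into the previous line, but the result is still not ideal. The SocketException line does not start with whitespace, so it opens a new event instead of joining the ERROR line above it. The two records below should really be a single event: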
{
    "@timestamp" => "2016-06-20T08:49:54.521Z",
    "message" => "ERROR [metrics-graphite-reporter-thread-1] 2016-06-13 18:30:39,850 GraphiteReporter.java:281 - Error sending to Graphite:",
    "@version" => "1",
    "type" => "cassandra",
    "host" => "dp0652",
    "path" => "/usr/install/cassandra/logs/system.log"
}
{
    "@timestamp" => "2016-06-20T08:49:54.521Z",
    "message" => "java.net.SocketException: 断开的管道\n\tat java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.7.0_51]\n\tat java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113) ~[na:1.7.0_51]\n\tat java.net.SocketOutputStream.write(SocketOutputStream.java:159) ~[na:1.7.0_51]\n\tat sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) ~[na:1.7.0_51]\n\tat sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282) ~[na:1.7.0_51]\n\tat sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125) ~[na:1.7.0_51]\n\tat java.io.OutputStreamWriter.write(OutputStreamWriter.java:207) ~[na:1.7.0_51]\n\tat java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129) ~[na:1.7.0_51]\n\tat java.io.BufferedWriter.write(BufferedWriter.java:230) ~[na:1.7.0_51]\n\tat java.io.Writer.write(Writer.java:157) ~[na:1.7.0_51]\n\tat com.yammer.metrics.reporting.GraphiteReporter.sendToGraphite(GraphiteReporter.java:271) [metrics-graphite-2.2.0.jar:na]\n\tat com.yammer.metrics.reporting.GraphiteReporter.sendObjToGraphite(GraphiteReporter.java:265) [metrics-graphite-2.2.0.jar:na]\n\tat com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:304) [metrics-graphite-2.2.0.jar:na]\n\tat com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:26) [metrics-graphite-2.2.0.jar:na]\n\tat com.yammer.metrics.core.Gauge.processWith(Gauge.java:28) [metrics-core-2.2.0.jar:na]\n\tat com.yammer.metrics.reporting.GraphiteReporter.printRegularMetrics(GraphiteReporter.java:247) [metrics-graphite-2.2.0.jar:na]\n\tat com.yammer.metrics.reporting.GraphiteReporter.run(GraphiteReporter.java:213) [metrics-graphite-2.2.0.jar:na]\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_51]\n\tat java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_51]\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_51]\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_51]\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51]\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]\n\tat java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]",
    "@version" => "1",
    "tags" => [
        [0] "multiline"
    ],
    "type" => "cassandra",
    "host" => "dp0652",
    "path" => "/usr/install/cassandra/logs/system.log"
}
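One way to get a single event per log entry is to start a new event only when a line begins with a log level, and to fold every other line into the event before it. The sketch below illustrates that grouping rule in Python rather than in Logstash configuration. The function name `group_events` and the list of levels are assumptions made for illustration; in Logstash the same idea would presumably be expressed with the multiline codec's `negate` option.

```python
import re

# Assumption: every Cassandra log event starts with one of these levels.
EVENT_START = re.compile(r"^(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\s")

def group_events(lines):
    """Merge continuation lines into the preceding event."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if EVENT_START.match(line) or not events:
            events.append(line)          # a new event begins here
        else:
            events[-1] += "\n" + line    # continuation of the previous event
    return events

sample = [
    "ERROR [metrics-graphite-reporter-thread-1] 2016-06-13 18:30:39,502 "
    "GraphiteReporter.java:281 - Error sending to Graphite:",
    "java.net.SocketException: 断开的管道",
    "\tat java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.7.0_51]",
    "INFO [main] a later, unrelated entry (made up for this example)",
]
events = group_events(sample)
print(len(events))
```

With this rule the exception message and its stack trace stay attached to the ERROR line, which is what the whitespace-only pattern misses.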
A Logstash configuration for Cassandra log files found online: https://github.com/rustyrazorblade/dotfiles/blob/master/logstash.conf
output {
  elasticsearch {
    hosts => ["192.168.6.52:9200"]
    index => "logstash-%{type}-%{+YYYY.MM.dd}"
    document_type => "%{type}"
    workers => 2
    flush_size => 1000
    idle_flush_time => 5
    template_overwrite => true
  }
  stdout { }
}
input {
  file {
    path => "/usr/install/cassandra/logs/system.log"
    start_position => beginning
    type => cassandra_system
  }
}
filter {
  # Note: the input above sets type => cassandra_system, so this
  # condition does not match those events as written.
  if [type] == "cassandra" {
    grok {
      match => {"message" => "%{LOGLEVEL:level} \[%{WORD:class}:%{NUMBER:line}\] %{TIMESTAMP_ISO8601:timestamp} %{WORD:file}\.java:%{NUMBER:line2} - %{GREEDYDATA:msg}"}
    }
  }
}
Performance test
input {
  generator {
    count => 30000000
  }
}
output {
  stdout {
    codec => dots
  }
  kafka {
    broker_list => "localhost:9092"
    topic_id => "test"
    compression_codec => "snappy"
  }
}
bin/logstash agent -f out.conf | pv -Wbart > /dev/null
topic_id => "test"
compression_codec => "snappy"
request_required_acks => 1
serializer_class => "kafka.serializer.StringEncoder"
request_timeout_ms => 10000
producer_type => 'async'
message_send_max_retries => 5
retry_backoff_ms => 100
queue_buffering_max_ms => 5000
queue_buffering_max_messages => 10000
queue_enqueue_timeout_ms => -1
batch_num_messages => 1000
Collectd
Install collectd on host 52 (192.168.6.52). The network plugin sends this machine's metrics to port 25826 on the remote server 10.57.2.26.
$ sudo yum install collectd
$ sudo mv /etc/collectd.conf /etc/collectd_backup.conf
$ sudo vi /etc/collectd.conf
Hostname "dp0652"
FQDNLookup true
LoadPlugin interface
LoadPlugin cpu
LoadPlugin memory
LoadPlugin network
LoadPlugin df
LoadPlugin disk
<Plugin interface>
Interface "eth0"
IgnoreSelected false
</Plugin>
<Plugin network>
Server "10.57.2.26" "25826"
</Plugin>
Include "/etc/collectd.d"
$ sudo service collectd start
Starting collectd: [ OK ]
$ sudo service collectd status
collectd (pid 9295) is running...
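The development machine's address is 10.57.2.26. Logstash there listens on port 25826 and receives what collectd sends.
So the flow is: collectd gathers system information on 192.168.6.52 and sends it to Logstash on 10.57.2.26.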
$ bin/plugin list | grep collect
logstash-codec-collectd

$ vi collectd.conf
input {
  udp {
    port => 25826
    buffer_size => 1452
    codec => collectd { }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

$ bin/logstash -f collectd.conf
Settings: Default pipeline workers: 4
Pipeline main started
{
    "host" => "dp0652",
    "@timestamp" => "2016-05-25T03:49:52.000Z",
    "plugin" => "cpu",
    "plugin_instance" => "21",
    "collectd_type" => "cpu",
    "type_instance" => "system",
    "value" => 1220568,
    "@version" => "1"
}....
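If nothing shows up in Logstash, it can help to check whether collectd's datagrams reach the machine at all. The following Python sketch binds a UDP socket and reports the sender and size of each datagram; it does not decode the collectd binary format. The helper names `open_listener` and `receive` are made up for this example, and the demonstration at the bottom uses a loopback socket on an ephemeral port. For a real check one would stop Logstash and call `open_listener(25826)` instead.

```python
import socket

def open_listener(port, host="0.0.0.0"):
    """Bind a UDP socket, as the Logstash udp input does on its port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock

def receive(sock, count, bufsize=1452, timeout=5.0):
    """Return (sender_ip, payload_size) for up to `count` datagrams."""
    sock.settimeout(timeout)
    seen = []
    for _ in range(count):
        try:
            data, (sender_ip, _sender_port) = sock.recvfrom(bufsize)
        except socket.timeout:
            break                         # nothing arrived in time
        seen.append((sender_ip, len(data)))
    return seen

# Loopback self-check on an ephemeral port (port 0 lets the OS pick one).
listener = open_listener(0, host="127.0.0.1")
port = listener.getsockname()[1]
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
    sender.sendto(b"hello", ("127.0.0.1", port))
result = receive(listener, 1)
listener.close()
print(result)
```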
ElasticSearch
Starting ES
$ cd elasticsearch-2.3.3
$ vi config/elasticsearch.yml
cluster.name: es52
network.host: 192.168.6.52
#discovery.zen.ping.multicast.enabled: false
#http.cors.allow-origin: "/.*/"
#http.cors.enabled: true

$ bin/elasticsearch -d
$ curl http://192.168.6.52:9200/
LogStash+CollectD+ElasticSearch
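Above, the data collected from collectd was printed to the console; now change the output so that it is stored in Elasticsearch. Note that collectd events carry no type field, so the %{type} reference in the index name and document type is left unresolved, as the query result below shows.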
$ vi elastic.conf
input {
  udp {
    port => 25826
    buffer_size => 1452
    codec => collectd { }
  }
}
output {
  elasticsearch {
    hosts => ["192.168.6.52:9200"]
    index => "logstash-%{type}-%{+YYYY.MM.dd}"
    document_type => "%{type}"
    workers => 2
    flush_size => 1000
    idle_flush_time => 5
    template_overwrite => true
  }
}

$ bin/logstash -f elastic.conf

$ curl http://192.168.6.52:9200/_search?pretty
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 613,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-%{type}-2016.05.25",
      "_type" : "%{type}",
      "_id" : "AVTmTRv8ihrmowv7fKbo",
      "_score" : 1.0,
      "_source" : {
        "host" : "dp0652",
        "@timestamp" : "2016-05-25T05:04:42.000Z",
        "plugin" : "cpu",
        "plugin_instance" : "22",
        "collectd_type" : "cpu",
        "type_instance" : "steal",
        "value" : 0,
        "@version" : "1"
      }
    },
    ......]
  }
}
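The literal `%{type}` in the `_index` and `_type` values above comes from field interpolation: a `%{field}` reference is only replaced when the event has that field. The Python sketch below is a simplified imitation of that behaviour, not Logstash's actual implementation; the function name `interpolate` is an assumption, and date references such as `%{+YYYY.MM.dd}` are deliberately left alone.

```python
import re

# Matches %{field} but not date references such as %{+YYYY.MM.dd}.
FIELD_REF = re.compile(r"%\{([^}+]+)\}")

def interpolate(template, event):
    """Replace %{field} with the event's value; keep unknown references as-is."""
    def substitute(match):
        field = match.group(1)
        if field in event:
            return str(event[field])
        return match.group(0)            # unresolved: leave the text untouched
    return FIELD_REF.sub(substitute, template)

collectd_event = {"host": "dp0652", "plugin": "cpu"}           # no "type" field
typed_event = {"host": "dp0652", "type": "cassandra_system"}

print(interpolate("logstash-%{type}", collectd_event))
print(interpolate("logstash-%{type}", typed_event))
```

Giving the events a `type` (for example by setting the `type` option on the input) would let the reference resolve.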
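The final network topology is as follows:
In practice collectd runs on many different server nodes (Nginx servers, for example), while Logstash and ES only need a single machine: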
Kibana
After downloading, if Elasticsearch is installed on the same machine, simply start bin/kibana with the default settings.
https://www.elastic.co/guide/en/kibana/current/getting-started.html
wget -c https://www.elastic.co/guide/en/kibana/3.0/snippets/shakespeare.json
wget -c -O accounts.zip 'https://github.com/bly2k/files/blob/master/accounts.zip?raw=true'
wget -c https://download.elastic.co/demos/kibana/gettingstarted/logs.jsonl.gz
unzip accounts.zip
gunzip logs.jsonl.gz

Structure of the shakespeare data
{
    "line_id": INT,
    "play_name": "String",
    "speech_number": INT,
    "line_number": "String",
    "speaker": "String",
    "text_entry": "String",
}

Create the mapping
curl -XPUT http://192.168.6.52:9200/shakespeare -d '
{
 "mappings" : {
  "_default_" : {
   "properties" : {
    "speaker" : {"type": "string", "index" : "not_analyzed" },
    "play_name" : {"type": "string", "index" : "not_analyzed" },
    "line_id" : { "type" : "integer" },
    "speech_number" : { "type" : "integer" }
   }
  }
 }
}
';

Structure of the bank data
{
    "account_number": INT,
    "balance": INT,
    "firstname": "String",
    "lastname": "String",
    "age": INT,
    "gender": "M or F",
    "address": "String",
    "employer": "String",
    "email": "String",
    "city": "String",
    "state": "String"
}

# The mapping has to be applied to each of logstash-2015.05.18, logstash-2015.05.19 and logstash-2015.05.20
curl -XPUT http://192.168.6.52:9200/logstash-2015.05.20 -d '
{
  "mappings": {
    "log": {
      "properties": {
        "geo": {
          "properties": {
            "coordinates": {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}
';

curl -XPOST '192.168.6.52:9200/bank/account/_bulk?pretty' --data-binary @accounts.json
curl -XPOST '192.168.6.52:9200/shakespeare/_bulk?pretty' --data-binary @shakespeare.json
curl -XPOST '192.168.6.52:9200/_bulk?pretty' --data-binary @logs.jsonl
curl '192.168.6.52:9200/_cat/indices?v'

[qihuang.zheng@dp0652 ~]$ curl '192.168.6.52:9200/_cat/indices?v'
health status index                                pri rep docs.count docs.deleted store.size pri.store.size
yellow open   logstash-%{type}-2016.05.25            5   1      12762            0      1.5mb          1.5mb
yellow open   logstash-cassandra_system-2016.06.20   5   1      17125            0      3.9mb          3.9mb
yellow open   logstash-cassandra_system-2016.06.21   5   1        262            0    233.7kb        233.7kb
yellow open   bank                                   5   1       1000            0    442.2kb        442.2kb
yellow open   .kibana                                1   1          5            0     23.3kb         23.3kb
yellow open   shakespeare                            5   1     111396            0     18.4mb         18.4mb
green  open   graylog_0                              1   0       8123            0      2.3mb          2.3mb
yellow open   logstash-2015.05.20                    5   1       4750            0     28.7mb         28.7mb
yellow open   logstash-2015.05.18                    5   1       4631            0     27.4mb         27.4mb
yellow open   logstash-cassandra_system-2016.06.22   5   1         42            0    146.5kb        146.5kb
yellow open   logstash-2015.05.19                    5   1       4624            0     27.8mb         27.8mb
InfluxDB
wget -c https://dl.influxdata.com/influxdb/releases/influxdb-0.13.0_linux_amd64.tar.gz
wget -c https://dl.influxdata.com/telegraf/releases/telegraf-0.13.1_linux_i386.tar.gz
wget -c https://dl.influxdata.com/chronograf/releases/chronograf-0.13.0-1.x86_64.rpm
wget -c https://dl.influxdata.com/kapacitor/releases/kapacitor-0.13.1_linux_amd64.tar.gz
sudo yum localinstall chronograf-0.13.0-1.x86_64.rpm
Generate a configuration file and pass it at startup
$ cd influxdb-0.13.0-1
$ usr/bin/influxd config > influxdb.generated.conf
$ nohup usr/bin/influxd -config influxdb.generated.conf &
Use the command-line client to create a database, insert data, and query it
$ usr/bin/influx
Connected to http://localhost:8086 version 0.13.x
InfluxDB shell 0.13.x
> CREATE DATABASE mydb
> USE mydb
> INSERT cpu,host=serverA,region=us_west value=0.64
> INSERT cpu,host=serverA,region=us_east value=0.45
> INSERT temperature,machine=unit42,type=assembly external=25,internal=37
> SELECT host, region, value FROM cpu
> SELECT * FROM temperature
> SELECT * FROM /.*/ LIMIT 1
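The text after each INSERT is InfluxDB line protocol: a measurement name, optional comma-separated tags, a space, then one or more fields. The Python sketch below builds such lines from dictionaries. It is a minimal illustration under stated assumptions: the function name `to_line_protocol` is made up, values are assumed to be plain numbers, and no escaping of spaces, commas, or quotes is attempted.

```python
def to_line_protocol(measurement, tags, fields):
    """Format one point as 'measurement,tag=v,... field=v,...'.

    Assumes names and values need no escaping and fields are numeric.
    """
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={value}" for key, value in fields.items())
    return f"{measurement}{tag_part} {field_part}"

cpu_line = to_line_protocol(
    "cpu", {"host": "serverA", "region": "us_west"}, {"value": 0.64}
)
temperature_line = to_line_protocol(
    "temperature",
    {"machine": "unit42", "type": "assembly"},
    {"external": 25, "internal": 37},
)
print(cpu_line)
print(temperature_line)
```

The two generated lines correspond to the first and third INSERT statements in the session above.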
Data can also be added from the web page:
Grafana
grafana-1.9.1
Because Grafana 1.x is purely static, you only need to download the release, unpack it, and deploy it behind Nginx, or serve it with Python's SimpleHTTPServer
$ wget http://grafanarel.s3.amazonaws.com/grafana-1.9.1.tar.gz
$ tar zxf grafana-1.9.1.tar.gz
$ cd grafana-1.9.1
$ python -m SimpleHTTPServer 8383
With no data source configured, the page is blank:
After adding a data source, for example influxdb:
$ cp config.sample.js config.js
$ vi config.js
datasources: {
  influxdb: {
    type: 'influxdb',
    url: "http://192.168.6.52:8086/db/cassandra-metrics",
    username: 'admin',
    password: 'admin',
  },
  grafana: {
    type: 'influxdb',
    url: "http://192.168.6.52:8086/db/grafana",
    username: 'admin',
    password: 'admin',
    grafanaDB: true
  },
},
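After restarting the Python process, a little more shows up on the page (if no admin user has been added to influxdb, the username and password above can be removed),
but no configuration-related buttons are visible. Is it a permissions problem? This version also has no login page at all.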
grafana-2.x
https://grafanarel.s3.amazonaws.com/builds/grafana-3.0.3-1463994644.linux-x64.tar.gz
From version 2.5 onward, including 3.0, the directory layout is very different from 1.9
[qihuang.zheng@dp0652 grafana-1.9.1]$ tree -L 1
.
├── app
├── build.txt
├── config.js
├── config.sample.js
├── css
├── font
├── img
├── index.html
├── LICENSE.md
├── NOTICE.md
├── plugins
├── README.md
├── test
└── vendor

[qihuang.zheng@dp0652 grafana-3.0.3-1463994644]$ tree -L 2
.
├── bin
│   ├── grafana-cli
│   ├── grafana-cli.md5
│   ├── grafana-server
│   └── grafana-server.md5
├── conf
│   ├── defaults.ini
│   └── sample.ini
├── LICENSE.md
├── NOTICE.md
├── public
│   ├── app
│   ├── css
│   ├── dashboards
│   ├── emails
│   ├── fonts
│   ├── img
│   ├── robots.txt
│   ├── sass
│   ├── test
│   ├── vendor
│   └── views
├── README.md
└── vendor
    └── phantomjs
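If you again serve it with python -m SimpleHTTPServer 8383, the browser displays nothing.
Following the official installation guide takes only a few minutes, and it does not seem to depend on a separate web server (the package ships its own grafana-server binary).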
$ sudo yum install https://grafanarel.s3.amazonaws.com/builds/grafana-3.0.4-1464167696.x86_64.rpm
$ sudo service grafana-server start
Starting Grafana Server: .... FAILED
$ ps -ef|grep grafana
grafana 16078 1 0 13:19 ? 00:00:00 /usr/sbin/grafana-server
--pidfile=/var/run/grafana-server.pid --config=/etc/grafana/grafana.ini
cfg:default.paths.data=/var/lib/grafana cfg:default.paths.logs=/var/log/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins
$ vi /var/log/grafana/grafana.log
$ sudo service grafana-server status
grafana-server (pid 16078) is running...
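Although the start appears to have failed, the log file contains no errors. Open http://192.168.6.52:3000/
The login page appears; log in with admin/admin
The grafana icon is now clickable and there is a data source menu. First add the influxdb data source
Add a dashboard ①, then add a panel ②, and choose influxdb as the data source
The default query: SELECT mean("value") FROM "measurement" WHERE $timeFilter GROUP BY time($interval) fill(null)
Change it to: SELECT mean("value") FROM "cpu" WHERE "host" = "serverA" AND $timeFilter GROUP BY time($interval)
A graph then appears at the top; click close to leave edit mode. Be careful not to click the eye icon: once it turns gray, the query is no longer sent
Normally the q parameter carries the query statement: curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT value FROM cpu_load_short WHERE region='us-west'"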
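The same request URL can be assembled in Python, which makes the URL encoding of the `q` parameter explicit. This is a minimal sketch: the helper name `build_query_url` is an assumption, and it only builds the URL without contacting a server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_query_url(base_url, db, query, pretty=True):
    """Build an InfluxDB /query URL with db and q as URL-encoded parameters."""
    params = {"db": db, "q": query}
    if pretty:
        params["pretty"] = "true"
    return f"{base_url}/query?{urlencode(params)}"

query = "SELECT value FROM cpu_load_short WHERE region='us-west'"
url = build_query_url("http://localhost:8086", "mydb", query)
print(url)

# Decoding the query string again recovers the original statement.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
```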
Insert a few points into influxdb:
cpu,host=serverA,region=us_west value=0.36
cpu,host=serverA,region=us_west value=0.85
The time range can be selected in the upper-right corner. If the selected range contains no data, N/A is shown.
Cassandra+InfluxDB+Grafana
Monitoring Cassandra with Grafana takes two steps:
- Collect the Cassandra metrics into InfluxDB
- Configure an InfluxDB data source in Grafana and display the Cassandra metrics
References:
http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2
https://www.pythian.com/blog/monitoring-cassandra-grafana-influx-db/
There are several options for step 1:
- Report the metrics in Graphite format and send them to InfluxDB
- Collect the metrics with Telegraf input plugins
Graphite
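1) Download metrics-graphite.jar and put it in Cassandra's lib directory
2) Edit the influxdb configuration file. The database setting must match a database name in influxdb, here cassandra-metrics (this database has to be created in influxdb manually)
The reference document uses config.toml as the configuration file; with the newer version, the configuration file generated for influxdb earlier serves the same purpose.
The old input section is [input_plugins.graphite]; in newer versions it is [[graphite]].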
[qihuang.zheng@dp0652 influxdb-0.13.0-1]$ vi influxdb.generated.conf
[[graphite]]
  enabled = true
  bind-address = ":2003"
  protocol = "tcp"
  batch-size = 5000
  batch-pending = 10
  batch-timeout = "1s"
  consistency-level = "one"
  separator = "."
  udp-read-buffer = 0
  # database = "graphite"
  database = "cassandra-metrics"
  udp_enabled = true
With this configuration InfluxDB opens port 2003 and accepts metrics in the Graphite format
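A Graphite plaintext metric is a single line of the form `path value timestamp`, terminated by a newline and sent over TCP. The Python sketch below formats such a line and, optionally, sends it to a listener like the one configured above. The function names are made up for illustration, and the example only formats a line; calling `send_metric` requires a reachable listener, such as 192.168.6.52:2003 in this setup.

```python
import socket
import time

def format_metric(path, value, timestamp=None):
    """Return one Graphite plaintext line: '<path> <value> <unix time>\\n'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"

def send_metric(host, port, path, value, timestamp=None):
    """Send a single metric line over TCP and return the line that was sent."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))
    return line

# Formatting only; the metric name and numbers are illustrative.
example = format_metric("Node1.jvm.daemon_thread_count", 42, 1466400000)
print(repr(example))
```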
3) Restart influxdb so the configuration takes effect
$ service influxdb reload # when installed via RPM; no restart needed
$ influxdb -config influxdb.generated.conf reload
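4) Create the influx reporting configuration file in Cassandra's conf directory. host is the address of the influxdb server, and port is the 2003 port from influxdb.generated.conf above.
If influxdb is installed on a different node, host below must point to that node. prefix is usually set to the IP address of the current machine.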
[qihuang.zheng@dp0652 conf]$ vi /usr/install/cassandra/conf/influx-reporting.yaml
graphite:
  -
    period: 60
    timeunit: 'SECONDS'
    prefix: 'Node1'
    hosts:
     - host: '192.168.6.52'
       port: 2003
    predicate:
      color: "white"
      useQualifiedName: true
      patterns:
        - ".*"
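This configuration means that every metric of the current Cassandra node gets the prefix Node1 and is sent to port 2003 on host 52, that is, the Cassandra metrics are sent to InfluxDB.
It also shows that the destination is specified only by address and port, so the target can be any store that accepts Graphite-format data, not necessarily InfluxDB.
5) Install grafana. There is no need to edit config.js, since data sources can also be added from the web page
6) Add a -D startup option to cassandra-env.sh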
[qihuang.zheng@dp0652 conf]$ vi /usr/install/cassandra/conf/cassandra-env.sh
JVM_OPTS="$JVM_OPTS -Dcassandra.metricsReporterConfigFile=influx-reporting.yaml"
7) Restart cassandra
INFO [main] YYYY-MM-DD HH:MM:SS,SSS CassandraDaemon.java:353 - Trying to load metrics-reporter-config from file: influx-reporting.yaml
INFO [main] YYYY-MM-DD HH:MM:SS,SSS GraphiteReporterConfig.java:68 - Enabling GraphiteReporter to 192.168.6.52:2003
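8) Verify that the Cassandra metrics are collected into influxdb through metrics-graphite.jar
Pick a measurement and verify that data has been written: select * FROM "Node1.jvm.daemon_thread_count"
9) Configure the influxdb data source in grafana, changing the database name to cassandra-metrics
Configure the monitoring graphs and change the measurement
10) Summary flow chart of collecting Cassandra metrics through Graphite+InfluxDB:
A regular expression can be used to aggregate the metrics of multiple nodes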
select mean(value) from /.*org.apache.cassandra.metrics.ClientRequest.Read/
select mean(value) from "192.168.6.53.org.apache.cassandra.metrics.ClientRequest.Read.Latency.15MinuteRate"
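To see which measurements such a regular expression would select, the pattern can be tried against a list of names. The Python sketch below uses `re.search`, which, like the unanchored regex in the query, matches anywhere in the name. The measurement names are made up for this example and follow the prefix convention described earlier (node IP as prefix).

```python
import re

# Same pattern as in the InfluxDB query. The unescaped dots match any
# character; writing them as \. would be stricter.
pattern = re.compile(r".*org.apache.cassandra.metrics.ClientRequest.Read")

# Hypothetical measurement names from two nodes.
measurements = [
    "192.168.6.52.org.apache.cassandra.metrics.ClientRequest.Read.Latency.15MinuteRate",
    "192.168.6.53.org.apache.cassandra.metrics.ClientRequest.Read.Latency.15MinuteRate",
    "192.168.6.53.org.apache.cassandra.metrics.ClientRequest.Write.Latency.15MinuteRate",
]

matching = [name for name in measurements if pattern.search(name)]
for name in matching:
    print(name)
```

Both Read measurements are selected regardless of the node prefix, while the Write measurement is not.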
Telegraf
Hekad
http://hekad.readthedocs.io/en/v0.10.0/index.html
Atlas
https://github.com/Netflix/atlas
Graylog
http://www.cnblogs.com/wjoyxt/p/4961262.html
http://docs.graylog.org/en/2.0/pages/installation/manual_setup.html
Preparation: install and start mongodb
mkdir -p /home/qihuang.zheng/data/mongodb
curl -O https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-3.2.7.tgz
tar zxf mongodb-linux-x86_64-3.2.7.tgz
nohup mongodb-linux-x86_64-3.2.7/bin/mongod --dbpath /home/qihuang.zheng/data/mongodb &

sudo rpm -ivh pwgen-2.07-1.el6.x86_64.rpm && sudo yum install perl-Digest-SHA
[qihuang.zheng@dp0652 ~]$ pwgen -N 1 -s 96
XNamqyHbxtV46AHXNMlMAvVeV2dutp2pQaeY9IaOSf9XnwgyYGXqN97SSCQ2OLyR2HR41BtCpxwSMH4kFnr1VHmRNUQvYyic
[qihuang.zheng@dp0652 ~]$ echo -n XNamqyHbxtV46AHXNMlMAvVeV2dutp2pQaeY9IaOSf9XnwgyYGXqN97SSCQ2OLyR2HR41BtCpxwSMH4kFnr1VHmRNUQvYyic | shasum -a 256
cb535aa3ff35e81f69f9014005bcf1ad032048cc123dad735bbf87970eb2cacb  -
[qihuang.zheng@dp0652 ~]$ echo -n admin | shasum -a 256
8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918  -
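If pwgen and shasum are not available, the same two values (a random 96-character secret and the SHA-256 hex digest of the admin password) can be produced with Python's standard library. This is a sketch of an equivalent, not part of the Graylog tooling; the function names are made up for illustration.

```python
import hashlib
import secrets
import string

def generate_secret(length=96):
    """Random alphanumeric string, comparable to `pwgen -N 1 -s 96`."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

def sha256_hex(text):
    """Hex digest of the text, like `echo -n text | shasum -a 256`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

password_secret = generate_secret()
root_password_sha2 = sha256_hex("admin")
print(len(password_secret))
print(root_password_sha2)
```

The digest printed for "admin" should match the shasum output shown above.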
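These are the default options in the graylog configuration file. On a single machine almost nothing has to change apart from password_secret and root_password_sha2, which are filled in below, but it is best to replace localhost with the machine's IP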
$ wget https://fossies.org/linux/misc/graylog-2.0.2.tgz
$ tar zxf graylog-VERSION.tgz && cd graylog
$ vi graylog.conf.example
is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret =
root_password_sha2 =
rest_listen_uri = http://127.0.0.1:12900/
#elasticsearch_cluster_name = graylog
#elasticsearch_discovery_zen_ping_unicast_hosts = 127.0.0.1:9300, 127.0.0.2:9500
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
mongodb_uri = mongodb://localhost/graylog
The configuration file path /etc/graylog/server/server.conf is hard-coded in the bin/graylogctl script
$ sudo mkdir -p /etc/graylog/server/
$ sudo cp graylog.conf.example /etc/graylog/server/server.conf
$ cat /etc/graylog/server/server.conf |grep -v grep|grep -v ^#|grep -v ^$

password_secret=XNamqyHbxtV46AHXNMlMAvVeV2dutp2pQaeY9IaOSf9XnwgyYGXqN97SSCQ2OLyR2HR41BtCpxwSMH4kFnr1VHmRNUQvYyic
password=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
cluster=`cat ~/elasticsearch-2.3.3/config/elasticsearch.yml |grep cluster.name | awk '{print $2}'`
sed -i -e "s#password_secret =#password_secret = $password_secret#g" server.conf
sed -i -e "s#root_password_sha2 =#root_password_sha2 = $password#g" server.conf
sed -i -e "s#127.0.0.1#192.168.6.52#g" server.conf
sed -i -e "s#localhost#192.168.6.52#g" server.conf
sed -i -e "s#elasticsearch_shards = 4#elasticsearch_shards = 1#g" server.conf
sed -i -e "s#web_listen_uri = http://127.0.0.1:9000/#web_listen_uri = http://192.168.6.52:9999/#g" server.conf
######################################
elasticsearch_cluster_name = $cluster
elasticsearch_discovery_zen_ping_unicast_hosts = 192.168.6.52:9300
The graylog start script is bin/graylogctl; the actual start command is java -jar graylog.jar
[qihuang.zheng@dp0652 graylog-2.0.2]$ ll
drwxr-xr-x 2 qihuang.zheng users     4096 6月  20 13:37 bin
-rw-r--r-- 1 qihuang.zheng users    35147 5月  26 23:31 COPYING
drwxr-xr-x 4 qihuang.zheng users     4096 6月  20 12:58 data
-rw-r--r-- 1 qihuang.zheng users    23310 5月  26 23:31 graylog.conf.example
-rw-r--r-- 1 qihuang.zheng users 80950701 5月  26 23:34 graylog.jar ⬅️
drwxr-xr-x 3 qihuang.zheng users     4096 6月  20 11:55 lib
drwxr-xr-x 2 qihuang.zheng users     4096 6月  20 13:02 log
drwxr-xr-x 2 qihuang.zheng users     4096 5月  26 23:33 plugin
To customize the log configuration, edit the start section of bin/graylogctl and add the configuration file path before -jar
-Dlog4j.configurationFile=file:///home/qihuang.zheng/graylog-2.0.2/log4j2.xml -jar "${GRAYLOG_SERVER_JAR}" server
Starting graylog also starts the web service. The default port is 9000, which conflicts with HDFS on this machine, so web_listen_uri was changed to 9999 above
[qihuang.zheng@dp0652 ~]$ sudo graylog-2.0.2/bin/graylogctl start
Starting graylog-server ...
[qihuang.zheng@dp0652 ~]$ sudo graylog-2.0.2/bin/graylogctl status
graylog-server running with PID 8600
[qihuang.zheng@dp0652 ~]$ ll /etc/graylog/server/
-rw-r--r-- 1 root root    36 6月  20 12:58 node-id
-rw-r--r-- 1 root root 23496 6月  20 12:57 server.conf
[qihuang.zheng@dp0652 ~]$ cat /etc/graylog/server/node-id
eb943e44-3464-47be-9c07-2a554de71428
[qihuang.zheng@dp0652 graylog-2.0.2]$ ps -ef|grep graylog
root 33748 1 48 14:17 pts/0 00:01:20 /home/qihuang.zheng/jdk1.8.0_91/bin/java -Djava.library.path=bin/../lib/sigar -Xms1g -Xmx1g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow -jar graylog.jar server -f /etc/graylog/server/server.conf -p /tmp/graylog.pid
506 38283 29183 0 14:20 pts/0 00:00:00 grep graylog
[qihuang.zheng@dp0652 graylog-2.0.2]$ cat /tmp/graylog.pid
33748
Log file
[qihuang.zheng@dp0652 graylog-2.0.2]$ cat log/graylog-server.log
2016-06-20 14:17:46,602 INFO : kafka.log.LogManager - Loading logs.
2016-06-20 14:17:46,662 INFO : kafka.log.LogManager - Logs loading complete.
2016-06-20 14:17:46,662 INFO : org.graylog2.shared.journal.KafkaJournal - Initialized Kafka based journal at data/journal
2016-06-20 14:17:46,676 INFO : org.graylog2.shared.buffers.InputBufferImpl - Initialized InputBufferImpl with ring size <65536> and wait strategy <BlockingWaitStrategy>, running 2 parallel message handlers.
2016-06-20 14:17:46,706 INFO : org.mongodb.driver.cluster - Cluster created with settings {hosts=[192.168.6.52:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=5000}
2016-06-20 14:17:46,731 INFO : org.mongodb.driver.cluster - No server chosen by ReadPreferenceServerSelector{readPreference=primary} from cluster description ClusterDescription{type=UNKNOWN, connectionMode=SINGLE, all=[ServerDescription{address=192.168.6.52:27017, type=UNKNOWN, state=CONNECTING}]}. Waiting for 30000 ms before timing out
2016-06-20 14:17:46,752 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:1, serverValue:40}] to 192.168.6.52:27017
2016-06-20 14:17:46,753 INFO : org.mongodb.driver.cluster - Monitor thread successfully connected to server with description ServerDescription{address=192.168.6.52:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 2, 7]}, minWireVersion=0, maxWireVersion=4, maxDocumentSize=16777216, roundTripTimeNanos=512946}
2016-06-20 14:17:46,758 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:2, serverValue:41}] to 192.168.6.52:27017
2016-06-20 14:17:46,929 INFO : org.graylog2.plugin.system.NodeId - Node ID: eb943e44-3464-47be-9c07-2a554de71428
2016-06-20 14:17:46,978 INFO : org.elasticsearch.node - [graylog-eb943e44-3464-47be-9c07-2a554de71428] version[2.3.2], pid[33748], build[b9e4a6a/2016-04-21T16:03:47Z]
2016-06-20 14:17:46,978 INFO : org.elasticsearch.node - [graylog-eb943e44-3464-47be-9c07-2a554de71428] initializing ...
2016-06-20 14:17:46,984 INFO : org.elasticsearch.plugins - [graylog-eb943e44-3464-47be-9c07-2a554de71428] modules [], plugins [graylog-monitor], sites []
2016-06-20 14:17:48,243 INFO : org.elasticsearch.node - [graylog-eb943e44-3464-47be-9c07-2a554de71428] initialized
2016-06-20 14:17:48,304 INFO : org.hibernate.validator.internal.util.Version - HV000001: Hibernate Validator 5.2.4.Final
2016-06-20 14:17:48,419 INFO : org.graylog2.shared.buffers.ProcessBuffer - Initialized ProcessBuffer with ring size <65536> and wait strategy <BlockingWaitStrategy>.
2016-06-20 14:17:49,918 INFO : org.graylog2.bindings.providers.RulesEngineProvider - No static rules file loaded.
2016-06-20 14:17:50,510 INFO : org.graylog2.bootstrap.ServerBootstrap - Graylog server 2.0.2 (4da1379) starting up
2016-06-20 14:17:50,514 WARN : org.graylog2.shared.events.DeadEventLoggingListener - Received unhandled event of type <org.graylog2.plugin.lifecycles.Lifecycle> from event bus <AsyncEventBus{graylog-eventbus}>
2016-06-20 14:17:50,532 INFO : org.graylog2.shared.initializers.PeriodicalsService - Starting 24 periodicals ...
2016-06-20 14:17:50,538 INFO : org.elasticsearch.node - [graylog-eb943e44-3464-47be-9c07-2a554de71428] starting ...
2016-06-20 14:17:50,544 INFO : org.graylog2.periodical.IndexRetentionThread - Elasticsearch cluster not available, skipping index retention checks.
2016-06-20 14:17:50,546 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:4, serverValue:43}] to 192.168.6.52:27017
2016-06-20 14:17:50,546 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:3, serverValue:42}] to 192.168.6.52:27017
2016-06-20 14:17:50,553 INFO : org.graylog2.periodical.IndexerClusterCheckerThread - Indexer not fully initialized yet. Skipping periodic cluster check.
2016-06-20 14:17:50,573 INFO : org.graylog2.shared.initializers.PeriodicalsService - Not starting [org.graylog2.periodical.UserPermissionMigrationPeriodical] periodical. Not configured to run on this node.
2016-06-20 14:17:50,648 INFO : org.elasticsearch.transport - [graylog-eb943e44-3464-47be-9c07-2a554de71428] publish_address {127.0.0.1:9350}, bound_addresses {[::1]:9350}, {127.0.0.1:9350}
2016-06-20 14:17:50,654 INFO : org.elasticsearch.discovery - [graylog-eb943e44-3464-47be-9c07-2a554de71428] es52/rWpoduohQ1CXcyoAuwzNtg
2016-06-20 14:17:53,239 INFO : org.glassfish.grizzly.http.server.NetworkListener - Started listener bound to [192.168.6.52:9999]
2016-06-20 14:17:53,241 INFO : org.glassfish.grizzly.http.server.HttpServer - [HttpServer] Started.
2016-06-20 14:17:53,242 INFO : org.graylog2.initializers.WebInterfaceService - Started Web Interface at <http://192.168.6.52:9999/>
2016-06-20 14:17:53,657 WARN : org.elasticsearch.discovery - [graylog-eb943e44-3464-47be-9c07-2a554de71428] waited for 3s and no initial state was set by the discovery
2016-06-20 14:17:53,657 INFO : org.elasticsearch.node - [graylog-eb943e44-3464-47be-9c07-2a554de71428] started
2016-06-20 14:17:53,730 INFO : org.elasticsearch.cluster.service - [graylog-eb943e44-3464-47be-9c07-2a554de71428] detected_master {Daisy Johnson}{ZVzCrnWoRsKVnRryYLE6BQ}{192.168.6.52}{192.168.6.52:9300}, added {{Daisy Johnson}{ZVzCrnWoRsKVnRryYLE6BQ}{192.168.6.52}{192.168.6.52:9300},}, reason: zen-disco-receive(from master [{Daisy Johnson}{ZVzCrnWoRsKVnRryYLE6BQ}{192.168.6.52}{192.168.6.52:9300}])
2016-06-20 14:17:56,443 INFO : org.glassfish.grizzly.http.server.NetworkListener - Started listener bound to [192.168.6.52:12900]
2016-06-20 14:17:56,443 INFO : org.glassfish.grizzly.http.server.HttpServer - [HttpServer-1] Started.
2016-06-20 14:17:56,444 INFO : org.graylog2.shared.initializers.RestApiService - Started REST API at <http://192.168.6.52:12900/>
2016-06-20 14:17:56,445 INFO : org.graylog2.shared.initializers.ServiceManagerListener - Services are healthy
2016-06-20 14:17:56,446 INFO : org.graylog2.shared.initializers.InputSetupService - Triggering launching persisted inputs, node transitioned from Uninitialized [LB:DEAD] to Running [LB:ALIVE]
2016-06-20 14:17:56,446 INFO : org.graylog2.bootstrap.ServerBootstrap - Services started, startup times in ms: {JournalReader [RUNNING]=1, BufferSynchronizerService [RUNNING]=1, OutputSetupService [RUNNING]=1, InputSetupService [RUNNING]=2, MetricsReporterService [RUNNING]=2, KafkaJournal [RUNNING]=4, PeriodicalsService [RUNNING]=51, WebInterfaceService [RUNNING]=2705, IndexerSetupService [RUNNING]=3211, RestApiService [RUNNING]=5912}
2016-06-20 14:17:56,451 INFO : org.graylog2.bootstrap.ServerBootstrap - Graylog server up and running.
2016-06-20 14:18:00,548 INFO : org.graylog2.indexer.Deflector - Did not find an deflector alias. Setting one up now.
2016-06-20 14:18:00,552 INFO : org.graylog2.indexer.Deflector - There is no index target to point to. Creating one now.
2016-06-20 14:18:00,554 INFO : org.graylog2.indexer.Deflector - Cycling deflector to next index now.
2016-06-20 14:18:00,555 INFO : org.graylog2.indexer.Deflector - Cycling from <none> to <graylog_0>
2016-06-20 14:18:00,555 INFO : org.graylog2.indexer.Deflector - Creating index target <graylog_0>...
2016-06-20 14:18:00,614 INFO : org.graylog2.indexer.indices.Indices - Created Graylog index template "graylog-internal" in Elasticsearch.
2016-06-20 14:18:00,698 INFO : org.graylog2.indexer.Deflector - Waiting for index allocation of <graylog_0>
2016-06-20 14:18:00,800 INFO : org.graylog2.indexer.Deflector - Done!
2016-06-20 14:18:00,800 INFO : org.graylog2.indexer.Deflector - Pointing deflector to new target index....
2016-06-20 14:18:00,845 INFO : org.graylog2.system.jobs.SystemJobManager - Submitted SystemJob <bd3c64c0-36ae-11e6-bcbc-02423384d6ab> [org.graylog2.indexer.ranges.CreateNewSingleIndexRangeJob]
2016-06-20 14:18:00,845 INFO : org.graylog2.indexer.Deflector - Done!
2016-06-20 14:18:00,845 INFO : org.graylog2.indexer.ranges.CreateNewSingleIndexRangeJob - Calculating ranges for index graylog_0.
2016-06-20 14:18:00,964 INFO : org.graylog2.indexer.ranges.MongoIndexRangeService - Calculated range of [graylog_0] in [117ms].
2016-06-20 14:18:00,973 INFO : org.graylog2.indexer.ranges.CreateNewSingleIndexRangeJob - Created ranges for index graylog_0.
2016-06-20 14:18:00,973 INFO : org.graylog2.system.jobs.SystemJobManager - SystemJob <bd3c64c0-36ae-11e6-bcbc-02423384d6ab> [org.graylog2.indexer.ranges.CreateNewSingleIndexRangeJob] finished in 128ms.
|
Check whether Elasticsearch has created the index:
    [qihuang.zheng@dp0652 graylog-2.0.2]$ curl http://192.168.6.52:9200/_cat/indices
    yellow open logstash-%{type}-2016.05.25 5 1 12762 0 1.5mb 1.5mb
    green  open graylog_0                   1 0     0 0  159b  159b
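The same check can be scripted. The following is a minimal sketch, not part of the original setup: it parses the plain-text `_cat/indices` output and reports whether a given index exists. The function names and the column order are assumptions inferred from the sample output above; other Elasticsearch versions may print different columns.

```python
# Column order assumed from the sample output above (Elasticsearch 2.x).
COLUMNS = ["health", "status", "index", "pri", "rep",
           "docs.count", "docs.deleted", "store.size", "pri.store.size"]


def parse_cat_indices(text):
    """Parse the plain-text output of GET /_cat/indices into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # skip blank lines and rows with an unexpected shape
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def has_index(text, name):
    """Return True if an index called `name` appears in the output."""
    return any(row["index"] == name for row in parse_cat_indices(text))


# Sample taken from the curl output shown above.
SAMPLE = """\
yellow open logstash-%{type}-2016.05.25 5 1 12762 0 1.5mb 1.5mb
green open graylog_0 1 0 0 0 159b 159b
"""

if __name__ == "__main__":
    print(has_index(SAMPLE, "graylog_0"))
```

In practice the text would come from the `curl` call above rather than from the hard-coded sample.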
Check whether MongoDB has created the database:
    [qihuang.zheng@dp0652 graylog-2.0.2]$ ~/mongodb-linux-x86_64-3.2.7/bin/mongo
    MongoDB shell version: 3.2.7
    connecting to: test
    > show dbs;
    graylog 0.001GB
    local 0.000GB
    test 0.000GB
    > show collections
    cluster_config
    cluster_events
    collectors
    content_packs
    grok_patterns
    index_failures
    index_ranges
    nodes
    notifications
    pipeline_processor_pipelines
    pipeline_processor_pipelines_streams
    pipeline_processor_rules
    roles
    sessions
    system_messages
    users
    > db.nodes.count()
    1
    > db.cluster_config.count()
    11
    > db.nodes.find()
    { "_id" : ObjectId("5767780ab01412219843147b"), "is_master" : true, "hostname" : "dp0652", "last_seen" : 1466404508, "transport_address" : "http://192.168.6.52:12900/", "type" : "SERVER", "node_id" : "eb943e44-3464-47be-9c07-2a554de71428" }
    > db.cluster_config.distinct("type")
    [
        "org.graylog2.bundles.ContentPackLoaderConfig",
        "org.graylog2.cluster.UserPermissionMigrationState",
        "org.graylog2.indexer.management.IndexManagementConfig",
        "org.graylog2.indexer.retention.strategies.ClosingRetentionStrategyConfig",
        "org.graylog2.indexer.retention.strategies.DeletionRetentionStrategyConfig",
        "org.graylog2.indexer.rotation.strategies.MessageCountRotationStrategyConfig",
        "org.graylog2.indexer.rotation.strategies.SizeBasedRotationStrategyConfig",
        "org.graylog2.indexer.rotation.strategies.TimeBasedRotationStrategyConfig",
        "org.graylog2.indexer.searches.SearchesClusterConfig",
        "org.graylog2.periodical.IndexRangesMigrationPeriodical.MongoIndexRangesMigrationComplete",
        "org.graylog2.plugin.cluster.ClusterId"
    ]
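Note: the elasticsearch_cluster_name setting in server.conf must be changed to match the cluster name of the already-installed Elasticsearch; otherwise Graylog reports this error: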
    2016-06-20 14:06:47,920 ERROR: org.graylog2.shared.rest.exceptionmappers.AnyExceptionClassMapper - Unhandled exception in REST resource
    org.elasticsearch.discovery.MasterNotDiscoveredException
        at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:226) ~[graylog.jar:?]
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:236) ~[graylog.jar:?]
        at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:804) ~[graylog.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
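A quick way to catch this mismatch before starting Graylog is to compare the two names directly. The sketch below is illustrative only. It assumes server.conf uses `key = value` lines and that the Elasticsearch root endpoint (`GET http://192.168.6.52:9200/`) returns JSON containing a `cluster_name` field. The sample values (`es52`, `Daisy Johnson`) are guesses based on the startup log above, not confirmed settings.

```python
import json


def read_setting(conf_text, key):
    """Return the value of `key = value` from a properties-style file, or None."""
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # ignore blanks, comments and malformed lines
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip()
    return None


def cluster_names_match(conf_text, es_root_json):
    """Compare Graylog's configured cluster name with the one Elasticsearch reports."""
    configured = read_setting(conf_text, "elasticsearch_cluster_name")
    actual = json.loads(es_root_json).get("cluster_name")
    return configured is not None and configured == actual


# Hypothetical excerpts; real values come from server.conf and from the
# Elasticsearch root endpoint.
SERVER_CONF = """
# server.conf (excerpt)
elasticsearch_cluster_name = es52
"""
ES_ROOT = '{"name": "Daisy Johnson", "cluster_name": "es52"}'

if __name__ == "__main__":
    print(cluster_names_match(SERVER_CONF, ES_ROOT))
```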
Log in at http://192.168.6.52:9999 with admin/admin. The password is the "admin" that was hashed with echo -n admin | shasum -a 256.
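For reference, the hash can also be produced without the shell. This is a small sketch using Python's standard `hashlib`; the function name is made up for illustration. The source does not say where the hash is stored; in Graylog's server.conf it is presumably the `root_password_sha2` setting, which is an assumption here.

```python
import hashlib


def password_sha256(password):
    """Hex SHA-256 digest of the password, with no trailing newline added."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Same input as `echo -n admin | shasum -a 256`.
    print(password_sha256("admin"))
```

The `-n` flag matters: without it, `echo` appends a newline, the digest changes, and the login fails.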
Sending data to Graylog
UDP
1. On the System/Inputs page of the web interface, add a Raw/Plaintext UDP input.
2. Simulate sending data to its port:

    [qihuang.zheng@dp0652 graylog-2.0.2]$ echo "Hello Graylog, let's be friends." | nc -w 1 -u 127.0.0.1 5555

3. The message can now be found on the Search page.
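The `nc` step above can also be done from a script. A minimal sketch, assuming the Raw UDP input listens on 127.0.0.1:5555 as in the command above; `send_raw_udp` is an illustrative name, not a Graylog API.

```python
import socket


def send_raw_udp(message, host="127.0.0.1", port=5555):
    """Send one plain-text message as a single UDP datagram; return bytes sent."""
    data = message.encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(data, (host, port))


if __name__ == "__main__":
    send_raw_udp("Hello Graylog, let's be friends.")
```

UDP is fire-and-forget, so a successful send does not prove that Graylog received the message; the Search page is still the place to confirm it.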
GELF
1. Configure a GELF HTTP input.
2. Send GELF data:

    curl -XPOST http://192.168.6.52:12201/gelf -p0 -d '{"short_message":"Hello there", "host":"example.org", "facility":"test", "_foo":"bar"}'
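The same request can be sketched in Python with only the standard library. The helper names are illustrative. The `version` field and the rule that custom fields carry a leading underscore follow the GELF 1.1 payload format as I understand it; treat both as assumptions and check them against the Graylog documentation.

```python
import json
import urllib.request


def build_gelf(short_message, host, **extra):
    """Build a GELF message dict; custom fields get a leading underscore."""
    payload = {"version": "1.1", "host": host, "short_message": short_message}
    for key, value in extra.items():
        name = key if key.startswith("_") else "_" + key
        payload[name] = value
    return payload


def post_gelf(url, payload, timeout=5):
    """POST the payload as JSON to a GELF HTTP input; return the HTTP status."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    message = build_gelf("Hello there", "example.org", foo="bar")
    print(json.dumps(message))
    # Needs a running GELF HTTP input, e.g. the one configured above:
    # post_gelf("http://192.168.6.52:12201/gelf", message)
```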
Logstash
Configure the input as GELF UDP on port 12202, to keep it separate from the GELF HTTP input on port 12201.
    ➜ logstash-2.3.2 bin/logstash -e 'input { stdin {} } output { gelf {host => "192.168.6.52" port => 12202 } }'
    logstash message
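To test the GELF UDP input without Logstash, a message can be sent directly. The sketch below assumes that GELF UDP inputs accept zlib-compressed JSON and that the message fits in one datagram. Larger messages need GELF chunking, which is not implemented here. `send_gelf_udp` is an illustrative name.

```python
import json
import socket
import zlib


def send_gelf_udp(payload, host, port, compress=True):
    """Send one GELF message as a single UDP datagram; return the bytes sent."""
    data = json.dumps(payload).encode("utf-8")
    if compress:
        data = zlib.compress(data)  # optional; uncompressed JSON is also assumed to work
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(data, (host, port))
    return data


if __name__ == "__main__":
    message = {"version": "1.1", "host": "example.org",
               "short_message": "logstash message"}
    # Address and port of the GELF UDP input configured above.
    send_gelf_udp(message, "192.168.6.52", 12202)
```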
syslog
kafka
1. Create the topic
    bin/kafka-topics.sh --create --zookeeper 192.168.6.55:2181,192.168.6.56:2181,192.168.6.57:2181 --replication-factor 1 --partitions 1 --topic graylog-test
    bin/kafka-console-producer.sh --broker-list 192.168.6.55:9092,192.168.6.56:9092,192.168.6.57:9092 --topic graylog-test
    hello msg from kafka
2. Configure a Kafka input
3. View the collected logs
Graylog-Cassandra
https://github.com/Graylog2/graylog-plugin-metrics-reporter