lz1130

Linux Nagios + PNP installation and configuration, with SMS alerts

Software packages used:
nagios-cn-3.2.0.tar.bz2
nagios-plugins-1.4.14.tar.gz
nrpe-2.12.tar.gz
rrdtool-1.0.50.tar.gz
pnp-0.4.14.tar.gz

1. Installing Nagios on the monitoring server

Install Apache, PHP, and the required libraries
yum -y install gd gd-devel
yum -y install httpd php php-gd


Create the run-time user and group
useradd nagios
groupadd nagcmd
usermod -a -G nagcmd nagios
usermod -a -G nagcmd apache
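
It can be worth confirming that both accounts really ended up in nagcmd before going on. The following is a minimal sketch, not part of the original procedure; has_group is a hypothetical helper name, and it only inspects the group list that `id -Gn` prints.

```shell
# Hypothetical helper: succeeds if the group list on stdin (as printed by
# `id -Gn USER`) contains the group named in $1.
has_group() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Usage on the monitoring server (not run here):
#   id -Gn nagios | has_group nagcmd && echo "nagios is in nagcmd"
#   id -Gn apache | has_group nagcmd && echo "apache is in nagcmd"
```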


Install the Nagios core (run from the unpacked source directory)
./configure --with-command-group=nagcmd
make all
make install
make install-init
make install-config
make install-commandmode
make install-webconf


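Create a user named nagiosadmin for logging in to the Nagios web interface (nagiosadmin is the default administrator name; if you use a different name, update cgi.cfg accordingly)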
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
/etc/init.d/httpd restart


Install the Nagios plugins
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install


Install NRPE (used for monitoring Linux hosts)
./configure
make all
make install-plugin


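Configure the monitoring server (read the files under etc and the official configuration documentation carefully)
vi /usr/local/nagios/etc/objects/commands.cfg
Append the following at the end of the file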
########################################################################
#
# 2009.10.17 add by sapling
# NRPE COMMAND
#
########################################################################
# 'check_nrpe' command definition
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

Example service definition for monitoring a Linux server
define service{
        use                      local-service         ; Name of service template to use
        host_name                    10.3.37.110
        service_description             CHECK-USERS
        check_command               check_nrpe!check_users   ; the text after ! is the command to run on the remote host
        }
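
To make the macro substitution concrete, here is a small sketch of the command line Nagios would end up running for the service above. It assumes $USER1$ is /usr/local/nagios/libexec (the usual resource.cfg value for this install prefix); expand_check_nrpe is a hypothetical name used only for illustration.

```shell
# Sketch: print the command line Nagios would build for a service whose
# check_command is "check_nrpe!<ARG1>".
# Assumption: $USER1$ is /usr/local/nagios/libexec.
expand_check_nrpe() {
    user1="/usr/local/nagios/libexec"   # stands in for $USER1$
    hostaddress="$1"                     # stands in for $HOSTADDRESS$
    arg1="$2"                            # stands in for $ARG1$ (the text after "!")
    printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$hostaddress" "$arg1"
}

expand_check_nrpe 10.3.37.110 check_users
```

Running the printed command by hand on the monitoring server is a quick way to debug a service that shows UNKNOWN in the web interface.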


mkdir /usr/local/nagios/etc/objects/host
chown nagios:nagios /usr/local/nagios/etc/objects/host
vi /usr/local/nagios/etc/nagios.cfg
Comment out the default localhost file and add a directory that holds the per-host configuration files
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
cfg_dir=/usr/local/nagios/etc/objects/host


vi /usr/local/nagios/etc/objects/host/localhost.cfg
Example file:

###############################################################################
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 05-31-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple 
#       example of how you can create configuration entries to monitor
#       the local (Linux) machine.
#
###############################################################################




###############################################################################
###############################################################################
#
# HOST DEFINITION
#
###############################################################################
###############################################################################

# Define a host for the local machine

define host{
        use                     linux-server
        host_name               127.0.0.1 
        alias                   localhost
        address                 127.0.0.1
        }

###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################


# Define a service to "ping" the local machine

define service{
        use                             local-service         ; Name of service template to use
        host_name                       127.0.0.1 
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }


# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       127.0.0.1 
        service_description             DISK 
        check_command                   check_local_disk!20%!10%!/
        }



# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       127.0.0.1 
        service_description             USERS 
        check_command                   check_local_users!20!50
        }


# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       127.0.0.1 
        service_description             PROCS
        check_command                   check_local_procs!250!400!RSZDT
        }



# Define a service to check the load on the local machine. 

define service{
        use                             local-service         ; Name of service template to use
        host_name                       127.0.0.1 
        service_description             LOAD 
        check_command                   check_local_load!10.0,8.0,4.0!30.0,20.0,10.0
        }



# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 30% is free

define service{
        use                             local-service         ; Name of service template to use
        host_name                       127.0.0.1 
        service_description             SWAP 
        check_command                   check_local_swap!30!10
        }



# Define a service to check SSH on the local machine.
# Notifications are enabled here; set notifications_enabled to 0 if not all hosts run SSH.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       127.0.0.1
        service_description             SSH
        check_command                   check_tcp!22!1.0!10.0
        notifications_enabled           1
        }



# Define a service to check HTTP on the local machine.
# Notifications are enabled here; set notifications_enabled to 0 if not all hosts run HTTP.

define service{
        use                             local-service         ; Name of service template to use
        host_name                       127.0.0.1 
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           1
        }

define service{
        use                             local-service         ; Name of service template to use
        host_name                       127.0.0.1
        service_description             FTP
        check_command                   check_ftp
        notifications_enabled           1
        process_perf_data               0
        }


vi /usr/local/nagios/etc/objects/host/10.3.37.110.cfg
Example file:

###############################################################################
# 10.3.37.110.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING A REMOTE MACHINE
#
# Last Modified: 05-31-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple 
#       example of how you can create configuration entries to monitor
#       a remote (Linux) machine through NRPE.
#
###############################################################################




###############################################################################
###############################################################################
#
# HOST DEFINITION
#
###############################################################################
###############################################################################

# Define a host for the remote machine

define host{
        use                     linux-server
        host_name               10.3.37.110
        alias                   10.3.37.110
        address                 10.3.37.110
        }

###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################

define service{
        use                             local-service         ; Name of service template to use
        host_name                       10.3.37.110
        service_description             CHECK-DISK
        check_command                   check_nrpe!check_sda1     ; must match a command[...] name in the monitored host's nrpe.cfg
        }

define service{
        use                             local-service         ; Name of service template to use
        host_name                       10.3.37.110
        service_description             CHECK-USERS
        check_command                   check_nrpe!check_users
        }

define service{
        use                             local-service         ; Name of service template to use
        host_name                       10.3.37.110
        service_description             CHECK-LOAD
        check_command                   check_nrpe!check_load
        }

define service{
        use                             local-service         ; Name of service template to use
        host_name                       10.3.37.110
        service_description             CHECK-ZOMBIE-PROCS
        check_command                   check_nrpe!check_zombie_procs
        }

define service{
        use                             local-service         ; Name of service template to use
        host_name                       10.3.37.110
        service_description             CHECK-TOTAL-PROCS
        check_command                   check_nrpe!check_total_procs
        }
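
Since every monitored host gets a file of the same shape in the cfg_dir directory, the file can be generated rather than copied by hand. This is a rough sketch under that assumption; gen_nrpe_host_cfg is a hypothetical helper, and it reuses the linux-server and local-service templates from the examples above.

```shell
# Hypothetical generator: print a host definition plus one NRPE service per
# command name. Usage: gen_nrpe_host_cfg HOST_IP NRPE_COMMAND...
gen_nrpe_host_cfg() {
    host="$1"
    shift
    printf 'define host{\n'
    printf '        use                     linux-server\n'
    printf '        host_name               %s\n' "$host"
    printf '        alias                   %s\n' "$host"
    printf '        address                 %s\n' "$host"
    printf '        }\n'
    for cmd in "$@"; do
        # check_users -> CHECK-USERS, matching the naming used in this article
        desc=$(printf '%s' "$cmd" | tr 'a-z_' 'A-Z-')
        printf '\ndefine service{\n'
        printf '        use                             local-service\n'
        printf '        host_name                       %s\n' "$host"
        printf '        service_description             %s\n' "$desc"
        printf '        check_command                   check_nrpe!%s\n' "$cmd"
        printf '        }\n'
    done
}

# Example (10.3.37.111 is a made-up second host):
#   gen_nrpe_host_cfg 10.3.37.111 check_users check_load \
#       > /usr/local/nagios/etc/objects/host/10.3.37.111.cfg
```

Each command name passed in still has to exist as a command[...] entry in that host's nrpe.cfg.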


vi /usr/local/nagios/etc/objects/host/group.cfg
Example file:

###############################################################################
###############################################################################
#
# HOST GROUP DEFINITION
#
###############################################################################
###############################################################################

# Define an optional hostgroup for Linux machines

define hostgroup{
        hostgroup_name  linux-servers ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        members         *     ; Comma separated list of hosts that belong to this group
        }


Put SELinux into permissive mode (only needed if you run into permission errors)
setenforce 0


Verify the configuration and start Nagios
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios start
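
Because a broken object file stops Nagios from starting, the two commands above can be chained so that a restart only happens after the check passes. A minimal sketch follows; safe_restart and the NAGIOS_BIN, NAGIOS_CFG and NAGIOS_INIT override variables are hypothetical names, with defaults taken from the paths in this article.

```shell
# Sketch: restart Nagios only when the pre-flight check passes.
# NAGIOS_BIN, NAGIOS_CFG and NAGIOS_INIT are hypothetical override variables;
# the defaults are the paths used in this article.
safe_restart() {
    bin="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
    cfg="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"
    init="${NAGIOS_INIT:-/etc/init.d/nagios}"
    if "$bin" -v "$cfg" >/dev/null 2>&1; then
        "$init" restart
    else
        echo "configuration check failed; Nagios was not restarted" >&2
        return 1
    fi
}

# Usage (as root, not run here):
#   safe_restart
```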


Open the monitoring web interface
http://localhost/nagios/


2. Installing on the monitored host

Install xinetd if it is not already installed
yum -y install xinetd


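Install the Nagios plugins (the nagios user must exist on this host as well; create it with useradd nagios if needed)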
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install


Install NRPE
./configure
make all
make install-daemon
make install-daemon-config
make install-xinetd


Configure NRPE to start through xinetd
vi /etc/xinetd.d/nrpe
service nrpe
{
        flags           = REUSE
        socket_type     = stream    
        port            = 5666    
        wait            = no
        user            = nagios
        group           = nagios
        server          = /usr/local/nagios/bin/nrpe
        server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        only_from       = 127.0.0.1 10.3.37.110
# only_from: IP addresses allowed to connect, i.e. the monitoring server; separate multiple IPs with spaces
}


vi /etc/services

Add the following line:
nrpe            5666/tcp                        # nrpe


Restart the xinetd service
/etc/init.d/xinetd restart
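
Before testing from the monitoring server, it may help to confirm locally that xinetd is actually listening on the NRPE port. The sketch below assumes the `netstat -tln` output format found on typical Linux systems; nrpe_listening is a hypothetical helper name.

```shell
# Hypothetical helper: reads `netstat -tln` output on stdin and succeeds if
# some socket is listening on TCP port 5666.
nrpe_listening() {
    grep -Eq ':5666[[:space:]].*LISTEN'
}

# Usage on the monitored host (not run here):
#   netstat -tln | nrpe_listening && echo "xinetd is listening for NRPE"
```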


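Check that NRPE is working
Run the following on the monitoring server; if it prints the NRPE version, the connection works.
/usr/local/nagios/libexec/check_nrpe -H <monitored-host-IP>
NRPE v2.12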


Configure the monitoring commands
vi /usr/local/nagios/etc/nrpe.cfg


# The following examples use hardcoded command arguments...
###############
command[check_users]=/usr/local/nagios/libexec/check_users -w 10 -c 20
command[check_load]=/usr/local/nagios/libexec/check_load -w 16,10,8 -c 30,25,20
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_sda2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda2
command[check_sda5]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda5
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 300 -c 360 
command[check_http]=/usr/local/nagios/libexec/check_http -H 10.3.37.110 -u /nagios.php
command[check_ftp]=/usr/local/nagios/libexec/check_ftp -H 10.3.37.110 -p 21
command[check_ssh]=/usr/local/nagios/libexec/check_ssh 10.3.37.110
command[check_alive]=/usr/local/nagios/libexec/check_ping -H 10.3.37.110 -w 100,20% -c 500,60% -p 4
command[check_105mysql]=/usr/local/nagios/libexec/check_mysql -H 10.3.37.110 -P 3306 -u nagios -p ***
##############
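
Every name used after the ! in a check_nrpe service on the monitoring server has to appear here as a command[...] entry, so listing the defined names makes mismatches easy to spot. This is a small sketch; list_nrpe_commands is a hypothetical helper and simply extracts the names from uncommented command lines.

```shell
# Hypothetical helper: print the name of every active command[...] entry in
# the nrpe.cfg file given as $1 (commented-out entries are skipped).
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)\]=.*/\1/p' "$1"
}

# Usage on the monitored host (not run here):
#   list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```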


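Check that the monitoring commands work
Run the following on the monitoring server; a plugin result like the one below means the command is working.
/usr/local/nagios/libexec/check_nrpe -H <monitored-host-IP> -c check_load

OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;

Note: the warning thresholds in this sample output (15, 10, 5) match the stock nrpe.cfg setting -w 15,10,5; with the -w 16,10,8 configured above, the output should show 16, 10 and 8 instead.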
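
The part of the output after the | is the performance data that the graphing step in the next section consumes: each item has the form label=value;warn;crit;min. The following is a rough sketch of how that string breaks down; perfdata_items is a hypothetical helper and only handles simple labels without spaces or quotes.

```shell
# Sketch: print "label value warn crit" for each perfdata item in a plugin
# output line (perfdata is everything after the first "|").
perfdata_items() {
    printf '%s\n' "$1" | cut -d'|' -f2- | tr ' ' '\n' |
        awk -F'[=;]' 'NF > 1 { print $1, $2, $3, $4 }'
}

perfdata_items 'OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;'
```

For the sample line this prints one row per load average (load1, load5, load15), each followed by its value and its warning and critical thresholds.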


3. Nagios performance graphs

PNP: a tool that graphs how monitored services change over time

Install the libraries that RRDtool (the graphing tool) may need
yum install cairo pango libart_lgpl libart_lgpl-devel zlib zlib-devel freetype freetype-devel


Install RRDtool
./configure
make
make install


Edit the main Nagios configuration file, nagios.cfg
vi /usr/local/nagios/etc/nagios.cfg
Set the following:
process_performance_data=1
host_perfdata_command=process-host-perfdata
service_perfdata_command=process-service-perfdata


To graph a particular monitored object, its host or service definition must include the following directive:
process_perf_data 1

Edit commands.cfg and replace the command line of the process-service-perfdata command with the PNP script:
define command{
        command_name    process-service-perfdata
#       command_line    /usr/bin/printf "%b" "$LASTSERVICECHECK$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEATTEMPT$\t$SERVICESTATETYPE$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$\n" >> /usr/local/nagios/var/service-perfdata.out

        command_line    /usr/local/nagios/libexec/process_perfdata.pl

#       command_line    /usr/bin/perl /usr/local/nagios/sbin/insert.cgi
        }


Install PNP
./configure --with-rrdtool=/usr/local/rrdtool-1.0.50/bin/rrdtool
make all
make install


Verify the configuration and restart Nagios
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios restart


Open the PNP web interface
http://localhost/nagios/pnp/index.php


4. Integrating the Fetion robot to send SMS alerts

Fetion robot download: http://www.it-adv.net/

Add the libACE library files that Fetion needs at run time
tar zxvf fetion20091117-linux.tar.gz -C /usr/local/
mv /usr/local/fx /usr/local/fetion


Install the Fetion robot (set permissions and ownership)
chmod -R 755 /usr/local/fetion
chown -R nagios:nagios /usr/local/fetion


Add Fetion's .so files to the system library search path
vi /etc/ld.so.conf.d/fetion.conf
Add one line:
/usr/local/fetion/
Then refresh the linker cache:
ldconfig


Send a test SMS
/usr/local/fetion/fetion --hide --mobile=137*** --pwd=*** --to=136*** --msg-utf8="test"


Add the Fetion notification commands to commands.cfg
vi /usr/local/nagios/etc/objects/commands.cfg

# 'host-notify-by-fei' command definition
define command {
             command_name            host-notify-by-fei
             command_line            /usr/local/fetion/fetion --hide --mobile=136******** --pwd=*** --to=$CONTACTPAGER$ --msg-utf8="Host $HOSTSTATE$ alert for $HOSTNAME$! on '$LONGDATETIME$'" $CONTACTPAGER$
             }

# 'service-notify-by-fei' command definition
define command {
             command_name         service-notify-by-fei
             command_line         /usr/local/fetion/fetion --hide --mobile=136******** --pwd=*** --to=$CONTACTPAGER$ --msg-utf8="$HOSTADDRESS$ $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ on $LONGDATETIME$" $CONTACTPAGER$
             }
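
To preview what the SMS will say, the service message template above can be filled in by hand. This is only an illustrative sketch: service_alert_text is a hypothetical helper, its positional parameters stand in for the Nagios macros, and the sample date is made up.

```shell
# Sketch: the text of the service alert after Nagios substitutes its macros.
# The positional parameters stand in for $HOSTADDRESS$, $HOSTALIAS$,
# $SERVICEDESC$, $SERVICESTATE$ and $LONGDATETIME$.
service_alert_text() {
    printf '%s %s/%s is %s on %s\n' "$1" "$2" "$3" "$4" "$5"
}

# Made-up sample values for illustration only
service_alert_text 10.3.37.110 10.3.37.110 CHECK-LOAD CRITICAL 'Sat Oct 17 10:00:00 CST 2009'
```

Passing the resulting text to the fetion test command shown earlier is a way to check message length and encoding before relying on real alerts.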


Edit the contacts file, contacts.cfg
vi /usr/local/nagios/etc/objects/contacts.cfg
Add the two *-notify-by-fei lines and the pager line
define contact{
        contact_name                    nagiosadmin             ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user
        service_notification_commands   service-notify-by-fei
        host_notification_commands      host-notify-by-fei
        email                           242427255@qq.com        ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        pager                           136********
        }


Verify the configuration and restart Nagios
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios restart