Analyzing Nginx Logs with Shell Scripts

zhengdl126


The shell scripts in this article cover two cases. In the first, Nginx is the front-most load balancer in an Nginx+Keepalived cluster, and the script looks like this:

    #!/bin/bash
    # File: log-nginx.sh (created with: vim log-nginx.sh)
    # Usage: sh log-nginx.sh <access-log>
    if [ $# -eq 0 ]; then
        echo "Error: please specify logfile."
        exit 1
    else
        LOG=$1
    fi
    if [ ! -f "$LOG" ]; then
        echo "Sorry, sir, I can't find this apache log file, pls try again!"
        exit 1
    fi
    ################################
    # Top 10 client IPs (field 1)
    echo "Most of the ip:"
    echo "-------------------------------------------"
    awk '{ print $1 }' "$LOG" | sort | uniq -c | sort -nr | head -10
    echo
    echo
    ################################
    # Top 10 busiest minutes: characters 14-18 of field 4 are HH:MM
    echo "Most of the time:"
    echo "--------------------------------------------"
    awk '{ print $4 }' "$LOG" | cut -c 14-18 | sort | uniq -c | sort -nr | head -10
    echo
    echo
    ################################
    # Top 10 values of field 11 (the referer in the default combined log format)
    echo "Most of the page:"
    echo "--------------------------------------------"
    awk '{ print $11 }' "$LOG" | sed 's/^.*\(.cn*\)\"/\1/g' | sort | uniq -c | sort -rn | head -10
    echo
    echo
    ################################
    # For each of the 10 busiest minutes, the top 10 client IPs in that minute
    echo "Most of the time / Most of the ip:"
    echo "--------------------------------------------"
    awk '{ print $4 }' "$LOG" | cut -c 14-18 | sort -n | uniq -c | sort -nr | head -10 > timelog
    for i in $(awk '{ print $2 }' timelog)
    do
        num=$(grep "$i" timelog | awk '{ print $1 }')
        echo "$i $num"
        ip=$(grep "$i" "$LOG" | awk '{ print $1 }' | sort -n | uniq -c | sort -nr | head -10)
        echo "$ip"
        echo
    done
    rm -f timelog
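
Each section of the script repeats the same counting pipeline (sort, uniq -c, sort -nr, head). As a minimal sketch, that pipeline could be pulled into two small functions. The names top_field and top_minute are hypothetical helpers, not part of the original script, and the timestamp layout ([31/Mar/2011:15:31:02) is assumed to be that of the default Nginx access log.

```shell
#!/bin/bash
# top_field FIELD [N]: read log lines on stdin and print the N most frequent
# values of whitespace-separated field FIELD, one "count value" per line.
# (Hypothetical helper name, introduced here for illustration only.)
top_field() {
    local field="$1" n="${2:-10}"
    awk -v f="$field" '{ print $f }' | sort | uniq -c | sort -nr | head -n "$n"
}

# top_minute [N]: the same count for the HH:MM part of field 4.
# Assumes field 4 looks like [31/Mar/2011:15:31:02, so HH:MM starts at
# character 14 and is 5 characters long.
top_minute() {
    local n="${1:-10}"
    awk '{ print substr($4, 14, 5) }' | sort | uniq -c | sort -nr | head -n "$n"
}
```

With these in place, the first two sections would reduce to something like top_field 1 10 < "$LOG" and top_minute 10 < "$LOG".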

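In the second case, Nginx serves as the web tier behind LVS, so the LVS IP addresses have to be filtered out, for example the public IP addresses of the LVS servers (such as 203.93.236.141 and 203.93.236.145). The first script needs only a slight adjustment: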

    #!/bin/bash
    if [ $# -eq 0 ]; then
        echo "Error: please specify logfile."
        exit 1
    fi
    if [ ! -f "$1" ]; then
        echo "Sorry, sir, I can't find this apache log file, pls try again!"
        exit 1
    fi
    # Drop requests whose client IP (field 1) is one of the LVS servers;
    # the filtered copy is written to a file named LOG in the current directory
    grep -Ev '^203\.93\.236\.(141|145) ' "$1" > LOG
    ###################################################
    # Top 10 client IPs (field 1)
    echo "Most of the ip:"
    echo "-------------------------------------------"
    awk '{ print $1 }' LOG | sort | uniq -c | sort -nr | head -10
    echo
    echo
    ####################################################
    # Top 10 busiest minutes: characters 14-18 of field 4 are HH:MM
    echo "Most of the time:"
    echo "--------------------------------------------"
    awk '{ print $4 }' LOG | cut -c 14-18 | sort | uniq -c | sort -nr | head -10
    echo
    echo
    ####################################################
    # Top 10 values of field 11 (the referer in the default combined log format)
    echo "Most of the page:"
    echo "--------------------------------------------"
    awk '{ print $11 }' LOG | sed 's/^.*\(.cn*\)\"/\1/g' | sort | uniq -c | sort -rn | head -10
    echo
    echo
    ####################################################
    # For each of the 10 busiest minutes, the top 10 client IPs in that minute
    echo "Most of the time / Most of the ip:"
    echo "--------------------------------------------"
    awk '{ print $4 }' LOG | cut -c 14-18 | sort -n | uniq -c | sort -nr | head -10 > timelog
    for i in $(awk '{ print $2 }' timelog)
    do
        num=$(grep "$i" timelog | awk '{ print $1 }')
        echo "$i $num"
        ip=$(grep "$i" LOG | awk '{ print $1 }' | sort -n | uniq -c | sort -nr | head -10)
        echo "$ip"
        echo
    done
    rm -f timelog
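
One detail worth spelling out when excluding addresses with an extended regular expression: a pattern of the form '203.93.236.141|145' is parsed as two alternatives, "203.93.236.141" and "145", so any line containing 145 anywhere (a byte count, a URL, another IP) is dropped as well, and the unescaped dots match any character. A minimal sketch of a stricter filter follows. drop_lvs is a hypothetical helper name, and it assumes the client IP is the first field of each line.

```shell
#!/bin/bash
# drop_lvs: copy stdin to stdout, dropping lines whose first field is one of
# the two example LVS addresses from the text.
# (Hypothetical helper; assumes the client IP is the first field on the line.)
drop_lvs() {
    # ^ anchors at the start of the line, \. matches a literal dot, and the
    # trailing [[:space:]] keeps 203.93.236.1410 (or similar) from matching.
    grep -Ev '^203\.93\.236\.(141|145)[[:space:]]'
}
```

It would be used as drop_lvs < access.log > LOG, where access.log stands for whatever log file is passed to the script.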

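We can use this script to analyze a file named www_tomcat_20110331.log:

    [root@localhost 03]# sh counter_nginx.sh www_tomcat_20110331.log

Like me, you will probably care most about the first two results: which IPs visit the site most, and during which time periods the most requests arrive. They look like this: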

    Most of the ip:
    -------------------------------------------
       5440 117.34.91.54
          9 119.97.226.226
          4 210.164.156.66
          4 173.19.0.240
          4 109.230.251.35
          2 96.247.52.15
          2 85.91.140.124
          2 74.168.71.253
          2 71.98.41.114
          2 70.61.253.194
    Most of the time:
    --------------------------------------------
     12 15:31
     11 09:45
     10 23:55
     10 21:45
     10 21:37
     10 20:29
     10 19:54
     10 19:44
     10 19:32
     10 19:13
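
For a busy minute such as 15:31 in the listing above, the per-minute breakdown of client IPs can also be computed with an exact comparison on the timestamp field. A plain grep for 15:31 would also match a line stamped 10:15:31, or a URL that happens to contain that string. The following is a minimal sketch under the same assumptions as the script (IP in field 1, timestamp in field 4); ips_in_minute is a hypothetical helper name.

```shell
#!/bin/bash
# ips_in_minute HH:MM [N]: read log lines on stdin and print the N busiest
# client IPs among requests whose timestamp falls in the given minute.
# (Hypothetical helper; assumes field 4 looks like [31/Mar/2011:15:31:02.)
ips_in_minute() {
    local minute="$1" n="${2:-10}"
    # Compare only characters 14-18 of field 4 (the HH:MM part) to the target
    awk -v m="$minute" 'substr($4, 14, 5) == m { print $1 }' |
        sort | uniq -c | sort -nr | head -n "$n"
}
```

A call such as ips_in_minute 15:31 10 < www_tomcat_20110331.log would then list the top ten IPs for that minute.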
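
If your log analysis needs are modest, you can analyze Linux logs directly with Awk and Sed (or with Perl, if you are fluent in it). For detailed analysis there is Awstats, which is especially well suited to web servers and mail servers. And if you have special requirements, you can set up a dedicated log server to collect the logs from your Linux servers. In short: it all depends on what you need.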

2012-03-13 19:41