Fixing a production incident: a Python script for analyzing logs

kongxuan


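This is getting out of hand: one project, three incidents, and I got woken up in the middle of the night to deal with them...

Once things settled, I had to pull data out of the logs to repair the historical records.

I started with shell, but my shell skills are not even half-baked. When that went nowhere I switched to Python. After a lot of stumbling and searching on Baidu, I finally got it done.

Posting it here: the result of a night's work.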

(1) For gz files, taking one of the archived logs as an example: running sudo python parse_gz_log.py /home/q/www/hms.***.com/logs/access.2014-01-09.log.gz generates the corresponding SQL file, access.2014-01-09.log.gz.sql, under /home/q/www/hms.***.com/logs/.

(2) For the only uncompressed log file, access.2014-01-10.log: running sudo python parselog.py /home/q/www/***.***.com/logs/access.2014-01-10.log generates the corresponding SQL file, access.2014-01-10.log.sql, under /home/q/www/***.***.com/logs/.
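Both commands write the SQL file next to the input and take the order date from the log's file name. A minimal sketch of that naming rule, written for Python 3; the helper name output_names is hypothetical, and it assumes the logs are always named access.<date>.log or access.<date>.log.gz:

```python
import os


def output_names(log_path):
    """Return (sql_path, order_date) for a log named access.<date>.log[.gz]."""
    base = os.path.basename(log_path)   # e.g. access.2014-01-09.log.gz
    order_date = base.split('.')[1]     # second dot-separated field is the date
    sql_path = log_path + ".sql"        # the SQL file lands beside the log
    return sql_path, order_date
```

Working from the base name matters because the directory part of the path can contain dots of its own (a host name, for instance), which would shift the position of the date field.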

Code:

# -*- coding: UTF-8 -*-

import sys

#line="192.168.224.18 - - [10/Jan/2014:02:05:42 +0800] \"GET /commission/order/user/edit.htm?fromDate=2014-01-12&toDate=2014-01-13&roomId=1147845&QHFP=ZSD_A9F061AB&QHP=ZSB_A9F08213&from=q***rHotel&partnerUserId=&bd_sign= HTTP/1.0\" 200 39745 \"http://hotel.q***r.com/booksystem/Booking_Main.jsp?local=zh&full=false&ttsSign=05facbd629f70c21d00c1e7353a7060a&priceCut=0&tid=5357743&required1=2014-01-12&required2=2014-01-13&payment=0&CPCB=wiqu***ar0004&roomId=1147845&requestID=c0a8f8af-m2ijp-6mn&lpsp=np&ppb=0&stat=30148&retailPrice=1&detailType=guru&from=q***rHotel&sgroup=A&filterid=973515f5-5323-4abf-bd47-68804ba091b6_C&QHFP=ZSD_A9F061AB&QHP=ZSB_A9F08213&stat2=1124511&required0=concepcion&codeBase=wiqun***r0004&hotelSEQ=concepcion_44" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "U.pudh0348" "192.168.0.207_429c0d20_13a91dbf8d5_66ea|1351068665199\" 192.168.11.202"


indexstr="commission/order/user/edit.htm"
indexfrom="fromDate="
indextodate="&toDate="
indexroomId="&roomId="
indexroomIdEnd="&QHFP="

data={}

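#fileName="access.2014-01-10.log"
# Log file to analyze (first command-line argument)
fileName= sys.argv[1]
# Take the date from the file name only; splitting the full path on '.'
# would pick up a piece of the host name in the directory instead.
orderdate=fileName.split('/')[-1].split('.')[1]

# Generate the SQL statements
#sqlFile="update_username.sql"
sqlFile=fileName+".sql"
sf=open(sqlFile,'w')
readfile=open(fileName)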
for line in readfile:
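    # Skip lines that are not edit.htm requests carrying all of the expected parameters
    if(line.find(indexstr)<0 or line.find(indexfrom)<0 or line.find(indextodate)<0 or line.find(indexroomId)<0 or line.find(indexroomIdEnd)<0):
        continue
    # fromDate: the text between "fromDate=" (9 characters) and "&toDate="
    fromdate=line[line.find(indexfrom) + 9:line.find(indextodate)]
    # toDate: the text between "&toDate=" (8 characters) and "&roomId="
    todate=line[line.find(indextodate) + 8:line.find(indexroomIdEnd)]
    todate=todate[0:todate.find(indexroomId)]
    # roomId: the text between "&roomId=" (8 characters) and "&QHFP="
    roomId=line[line.find(indextodate) + 8:line.find(indexroomIdEnd)]
    roomId=roomId[roomId.find(indexroomId)+8:]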
    
    
    
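    # Extract the username: the third space-separated field from the end, e.g. "U.pudh0348"
    fields=line.split(' ')
    if(len(fields)<3):continue
    username=fields[len(fields)-3]
    # Drop the "U." prefix. "192.168." does not occur in this field, so find()
    # returns -1 and the slice end of -1 strips the trailing double quote.
    username=username[username.find("U.") + 2:username.find("192.168.")]

    # A request without a user is logged as "-"; skip it
    if(username == '-' or username ==''):continue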
    
    userdate=fromdate+","+todate+","+roomId+","+username
    # Deduplicate: identical records collapse onto the same dict key
    data[userdate]="1"
    print fromdate,todate,roomId,username
    
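# Loop over the unique records and write one UPDATE statement per record
for key in data.keys():
    keys=key.split(',')
    # Each key has four parts: fromDate, toDate, roomId, username
    if(len(keys)<4):continue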
    sql="update commission_order set user_name='"+keys[3]+"' where from_date='"+keys[0]+"' and to_date='"+keys[1]+"' and room_id='"+keys[2]+"' and order_date='"+orderdate+"';"
    sf.write(sql+"\n")
sf.close()
readfile.close()
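The fixed offsets above (+ 9, + 8) depend on the parameters appearing in exactly that order. As an alternative, here is a hedged sketch that parses the request's query string with the standard library instead. It is written for Python 3, while the script above is Python 2. The function name parse_edit_request is hypothetical, the sample line is a shortened version of the one in the script's comment with example.com as a placeholder host, and it assumes the username field always starts with "U.":

```python
from urllib.parse import urlsplit, parse_qs

EDIT_PATH = "/commission/order/user/edit.htm"

# Shortened sample line; the host name is a placeholder.
SAMPLE_LINE = ('192.168.224.18 - - [10/Jan/2014:02:05:42 +0800] '
               '"GET /commission/order/user/edit.htm?fromDate=2014-01-12'
               '&toDate=2014-01-13&roomId=1147845&QHFP=ZSD_A9F061AB HTTP/1.0" '
               '200 39745 "http://hotel.example.com/" '
               '"Mozilla/5.0 (Windows NT 6.1)" "U.pudh0348" '
               '"192.168.0.207_429c0d20" 192.168.11.202')


def parse_edit_request(line):
    """Return (from_date, to_date, room_id, username), or None if the line does not match."""
    quoted = line.split('"')
    if len(quoted) < 2:
        return None
    request = quoted[1].split(' ')      # e.g. ['GET', '/path?query', 'HTTP/1.0']
    if len(request) < 2:
        return None
    url = urlsplit(request[1])
    if not url.path.endswith(EDIT_PATH):
        return None
    query = parse_qs(url.query)         # parameters with empty values are dropped
    try:
        from_date = query["fromDate"][0]
        to_date = query["toDate"][0]
        room_id = query["roomId"][0]
    except KeyError:
        return None
    fields = line.split(' ')
    if len(fields) < 3:
        return None
    token = fields[-3].strip('"')       # third field from the end, quotes removed
    if not token.startswith("U.") or len(token) == 2:
        return None
    return from_date, to_date, room_id, token[2:]


if __name__ == "__main__":
    print(parse_edit_request(SAMPLE_LINE))
```

Because the parameters are looked up by name, this version keeps working if their order in the URL changes or if &QHFP= is absent.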

Below is the code for handling gz files:

# -*- coding: UTF-8 -*-

import sys
import gzip

#line="192.168.224.18 - - [10/Jan/2014:02:05:42 +0800] \"GET /commission/order/user/edit.htm?fromDate=2014-01-12&toDate=2014-01-13&roomId=1147845&QHFP=ZSD_A9F061AB&QHP=ZSB_A9F08213&from=***&partnerUserId=&bd_sign= HTTP/1.0\" 200 39745 \"http://hotel.***.com/booksystem/Booking_Main.jsp?local=zh&full=false&ttsSign=05facbd629f70c21d00c1e7353a7060a&priceCut=0&tid=5357743&required1=2014-01-12&required2=2014-01-13&payment=0&CPCB=wiqu***ar0004&roomId=1147845&requestID=c0a8f8af-m2ijp-6mn&lpsp=np&ppb=0&stat=30148&retailPrice=1&detailType=guru&from=***&sgroup=A&filterid=973515f5-5323-4abf-bd47-68804ba091b6_C&QHFP=ZSD_A9F061AB&QHP=ZSB_A9F08213&stat2=1124511&required0=concepcion&codeBase=***&hotelSEQ=concepcion_44" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "U.pudh0348" "192.168.0.207_429c0d20_13a91dbf8d5_66ea|1351068665199\" 192.168.11.202"


indexstr="/commission/order/user/edit.htm"
indexfrom="fromDate="
indextodate="&toDate="
indexroomId="&roomId="
indexroomIdEnd="&QHFP="

data={}

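#fileName="access.2014-01-09.log.gz"
# Gzip-compressed log file to analyze (first command-line argument)
fileName= sys.argv[1]
# Take the date from the file name only; splitting the full path on '.'
# would pick up a piece of the host name in the directory instead.
orderdate=fileName.split('/')[-1].split('.')[1]

# Generate the SQL statements
#sqlFile="update_username.sql"
sqlFile=fileName+".sql"
sf=open(sqlFile,'w')
readfile = gzip.GzipFile(fileName)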
for line in readfile:
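    # Skip lines that are not edit.htm requests carrying all of the expected parameters
    if(line.find(indexstr)<0 or line.find(indexfrom)<0 or line.find(indextodate)<0 or line.find(indexroomId)<0 or line.find(indexroomIdEnd)<0):
        continue
    # fromDate: the text between "fromDate=" (9 characters) and "&toDate="
    fromdate=line[line.find(indexfrom) + 9:line.find(indextodate)]
    # toDate: the text between "&toDate=" (8 characters) and "&roomId="
    todate=line[line.find(indextodate) + 8:line.find(indexroomIdEnd)]
    todate=todate[0:todate.find(indexroomId)]
    # roomId: the text between "&roomId=" (8 characters) and "&QHFP="
    roomId=line[line.find(indextodate) + 8:line.find(indexroomIdEnd)]
    roomId=roomId[roomId.find(indexroomId)+8:]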
    
    
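    # Extract the username: the third space-separated field from the end, e.g. "U.pudh0348"
    fields=line.split(' ')
    if(len(fields)<3):continue
    username=fields[len(fields)-3]
    # Drop the "U." prefix. "192.168." does not occur in this field, so find()
    # returns -1 and the slice end of -1 strips the trailing double quote.
    username=username[username.find("U.") + 2:username.find("192.168.")]

    # A request without a user is logged as "-"; skip it
    if(username == '-' or username ==''):continue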
    
    userdate=fromdate+","+todate+","+roomId+","+username
    # Deduplicate: identical records collapse onto the same dict key
    data[userdate]="1"
    print fromdate,todate,roomId,username
    
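# Loop over the unique records and write one UPDATE statement per record
for key in data.keys():
    keys=key.split(',')
    # Each key has four parts: fromDate, toDate, roomId, username
    if(len(keys)<4):continue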
    sql="update commission_order set user_name='"+keys[3]+"' where from_date='"+keys[0]+"' and to_date='"+keys[1]+"' and room_id='"+keys[2]+"' and order_date='"+orderdate+"';"
    sf.write(sql+"\n")
sf.close()
readfile.close()
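The two scripts differ only in how the file is opened, so they could probably be merged. A rough Python 3 sketch of the shared pieces follows; the names open_log, sql_quote and build_updates are hypothetical. It assumes the database accepts standard SQL quoting, where a single quote inside a value is doubled. If the server also treats backslashes as escapes, that is not enough, and running the updates through a database driver with bound parameters would be safer.

```python
import gzip


def open_log(path):
    """Open a plain or gzip-compressed log file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def sql_quote(value):
    """Wrap a value in single quotes, doubling any quote inside it."""
    return "'" + value.replace("'", "''") + "'"


def build_updates(records, order_date):
    """Build one UPDATE per unique (from_date, to_date, room_id, username) tuple."""
    statements = []
    # set() removes duplicates; sorted() makes the output order stable
    for from_date, to_date, room_id, username in sorted(set(records)):
        statements.append(
            "update commission_order set user_name=%s where from_date=%s"
            " and to_date=%s and room_id=%s and order_date=%s;"
            % (sql_quote(username), sql_quote(from_date), sql_quote(to_date),
               sql_quote(room_id), sql_quote(order_date)))
    return statements
```

Keeping the records as tuples in a set also avoids the comma-joined dict key used above, which would break if a username ever contained a comma.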

Bookmarking this for myself...


2014-01-10 07:41