Scraping Google search result URLs with Python

Posted: 2011-12-22
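import sys
import time
import urllib
import urllib2
import simplejson

# Assumption: the post never showed how `search` and `page` were set or which
# loop the `continue` belonged to. Here the query comes from the command line
# and `page` steps through result offsets, 8 results (rsz=8) per request.
search = urllib.quote(' '.join(sys.argv[1:]))
for page in range(0, 64, 8):
    url = ('https://ajax.googleapis.com/ajax/services/search/web'
           '?v=1.0&q=%s&rsz=8&start=%s') % (search, page)
    try:
        request = urllib2.Request(
            url, None, {'Referer': 'http://www.baidu.com'})
        response = urllib2.urlopen(request)
        # Process the JSON string.
        results = simplejson.load(response)
        info = results['responseData']['results']
    except Exception as e:
        # On any failure, report it, wait, and move on to the next page.
        print e
        time.sleep(5)
        continue

    # Print the URL of each result on this page.
    for minfo in info:
        if 'url' in minfo:
            print 'url:%s' % minfo['url']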
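
For anyone on Python 3, the URL building and the URL extraction can be sketched as two small functions that need no network access. This is only a sketch: the names `build_url` and `extract_urls` and the sample response body are made up for illustration, and the endpoint is copied from the post without assuming it still responds.

```python
import json
from urllib.parse import urlencode

# Endpoint copied from the post above.
BASE = 'https://ajax.googleapis.com/ajax/services/search/web'


def build_url(search, page, rsz=8):
    """Build the request URL, escaping the query string (hypothetical helper)."""
    params = {'v': '1.0', 'q': search, 'rsz': rsz, 'start': page}
    return BASE + '?' + urlencode(params)


def extract_urls(payload):
    """Return the 'url' field of every result in a JSON response body."""
    data = json.loads(payload)
    # responseData may be null when the request failed, so fall back to {}.
    response_data = data.get('responseData') or {}
    return [r['url'] for r in response_data.get('results', []) if 'url' in r]


# Made-up response body, shaped the way the post's code expects it.
sample = ('{"responseData": {"results": ['
          '{"url": "http://example.com/a"}, {"title": "no url"}]}}')
print(build_url('python urllib2', 8))
print(extract_urls(sample))
```

Using `urlencode` instead of `%s` formatting means a query containing spaces or `&` does not break the URL.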
Posted: 2011-12-23
1. https://ajax.googleapis.com/ajax/services/search/web is an old service. Its results are not as good as those from google.com, especially the ranking (the order of the resulting URLs).
2. urllib2.urlopen leaks. If you run this a million times, your machine's network is likely to go down.
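
One common precaution for point 2 is to close every response explicitly rather than leaving it to the garbage collector. Whether that covers the leak described above is an assumption. A minimal sketch, in which `fetch_all` and the fake opener are hypothetical names and nothing touches the network:

```python
import io
from contextlib import closing


def fetch_all(urls, opener):
    """Read each URL and close its response before opening the next one."""
    bodies = []
    for url in urls:
        # closing() calls response.close() even if read() raises.
        with closing(opener(url)) as response:
            bodies.append(response.read())
    return bodies


# Stand-in for a real opener so the sketch runs offline (illustration only).
opened = []


def fake_opener(url):
    response = io.BytesIO(url.encode('utf-8'))
    opened.append(response)
    return response


print(fetch_all(['a', 'b'], fake_opener))
print(all(r.closed for r in opened))
```

In real use the opener would be `urllib2.urlopen` on Python 2 or `urllib.request.urlopen` on Python 3; `contextlib.closing` is available in both.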
Posted: 2012-07-13
I want to pick up another language, so I have been reading about Python lately. Learned something here...