Subject: Scraping Google search URLs with Python (locked old thread)
Posted: 2011-12-22
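import sys
import time
import urllib2

import simplejson

# `search` (the URL-encoded query) and `page` (the result offset) are set
# elsewhere; the post does not show that part.
url = ('https://ajax.googleapis.com/ajax/services/search/web'
       '?v=1.0&q=%s&rsz=8&start=%s') % (search, page)

# The original `continue` implies an enclosing loop that the post omits;
# a retry loop is assumed here.
while True:
    try:
        request = urllib2.Request(
            url, None, {'Referer': 'http://www.baidu.com'})
        response = urllib2.urlopen(request)
        # Process the JSON string.
        results = simplejson.load(response)
        info = results['responseData']['results']
    except Exception, e:
        print e
        time.sleep(5)
        continue
    break

# Print the URL of every search result that has one.
for minfo in info:
    if 'url' in minfo:
        print 'url:%s' % minfo['url']
# (The posted code is cut off at this point.)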
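The post builds the query URL by string formatting and then walks the JSON for `url` keys. Here is a minimal Python 3 sketch of just those two steps, kept free of network calls so it can be run as-is. The helper names `build_search_url` and `extract_urls` and the sample payload are made up for illustration; the only things taken from the post are the endpoint, the `v`, `q`, `rsz` and `start` parameters, and the `responseData` / `results` / `url` keys. Treating a missing or null `responseData` as "no results" is an assumption, not something the post states.

```python
import json
from urllib.parse import urlencode

# Endpoint and parameter names are copied from the post above.
BASE_URL = "https://ajax.googleapis.com/ajax/services/search/web"


def build_search_url(query, start, page_size=8):
    """Build the request URL, letting urlencode escape the query text."""
    params = [("v", "1.0"), ("q", query), ("rsz", page_size), ("start", start)]
    return BASE_URL + "?" + urlencode(params)


def extract_urls(payload):
    """Collect the 'url' field of every result in a decoded JSON response.

    Assumption: a missing or null 'responseData' means there are no results.
    """
    data = payload.get("responseData") or {}
    results = data.get("results") or []
    return [item["url"] for item in results if "url" in item]


# Hypothetical payload with the same shape the post's code expects.
sample = json.loads(
    '{"responseData": {"results": ['
    '{"url": "http://example.com/a"}, {"title": "entry without a url"}]}}'
)
print(build_search_url("python scraping", 0))
print(extract_urls(sample))
```

Using `urlencode` instead of `%s` formatting means the query does not have to be escaped by hand before it goes into the URL.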
Posted: 2011-12-23
1. https://ajax.googleapis.com/ajax/services/search/web is an old service. Its results are not quite as good as those from google.com, especially the ranking (the order of the resulting URLs).
2. urllib2.urlopen has leaks. If you run this a million times, your machine's network connection is likely to go down.
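The reply does not say where the leak is. One common way to run out of sockets in a long-running loop is to leave response objects unclosed, so here is a rough Python 3 sketch that assumes this is the concern: every response is closed explicitly, and the retry count is bounded instead of retrying forever. The function name `fetch_json` and its `opener` and `sleep` parameters are hypothetical; they exist only so the retry logic can be exercised without touching the network.

```python
import json
import time
from contextlib import closing
from urllib.request import Request, urlopen


def fetch_json(url, referer, retries=3, delay=5.0, opener=urlopen, sleep=time.sleep):
    """Fetch a URL and decode its body as JSON, closing every response.

    Makes at most `retries` attempts, waiting `delay` seconds between them.
    """
    request = Request(url, headers={"Referer": referer})
    last_error = None
    for attempt in range(retries):
        try:
            # closing() calls response.close() even if read() or decoding fails.
            with closing(opener(request)) as response:
                return json.loads(response.read().decode("utf-8"))
        except (OSError, ValueError) as exc:
            # URLError is an OSError; malformed JSON raises a ValueError.
            last_error = exc
            if attempt < retries - 1:
                sleep(delay)
    raise RuntimeError("giving up after %d attempts: %s" % (retries, last_error))
```

A call would look like `fetch_json(url, "http://www.baidu.com")`, reusing the Referer from the original post. Whether this avoids the problem the reply has in mind is not something the thread confirms.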
Posted: 2012-07-13
I want to learn another language, so I have been looking at Python lately. Learned something here...