Posted: 2010-12-01
Websites keep getting more complex and integrate more and more components, and some things are simply beyond what httpunit can do. As the HttpClient documentation puts it:
HttpClient is NOT a browser. HttpClient's purpose is to transmit and receive HTTP messages. HttpClient will not attempt to cache content, execute javascript embedded in HTML pages, try to guess content type, or reformat request / redirect location URIs, or other functionality unrelated to the HTTP transport.
If you need to simulate browser operations, which tool should you use? JWebUnit? PHP - HttpUnit? Or something else?
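Posted: 2010-12-01
yang02301 wrote:
satikey wrote:
yang02301 wrote:
A question for the OP: for RenRen, login can be handled with the attached program, but t.qq.com uses the GET method and attaches cookies. How did you handle that? Any pointers would be appreciated. Many thanks!

(satikey) If it is a GET, you can append the parameters to the URL, for example t.qq.com?username=xxx&password=xxx, roughly like that. Log in once through the web page and watch how the browser's address changes. I'll look at Tencent Weibo when I have time.

(yang02301) For t.qq.com LOGON:
Step 1):
    sb = notify("http://ptlogin2.qq.com/check?uin=@hdrive20&appid=46000101&r=0.617148618189815");
    System.out.println("Verify Code = '" + sb.substring(18, 22) + "'");
This returns the verify code, for example: ptui_checkVC('0','!BMF'); where ptui_checkVC is defined in login_div.js.
The problem: can httpcomponents-client-4.0.3 execute ptui_checkVC, and if so, how?
Step 2): After the verify code is obtained, the user enters the password and clicks "Log in", and the browser should then send a GET request such as:
    url = "http://ptlogin2.qq.com/login?u=@hdrive20&p=67E5A3B52AE29D6FC6FAFB1587F8D8F3&verifycode=" + sb.substring(18, 22) + "&low_login_enable=1&low_login_hour=720&aid=46000101&u1=http%3A%2F%2Ft.qq.com&ptredirect=1&h=1&from_ui=1&dumy=&fp=loginerroralert";
    sb = notify(url);
    System.out.println(sb);
This returns: ptuiCB('3','0','','0','您输入的密码有误,请重试。'); (the message says "The password you entered is incorrect, please try again.")
That is expected: the parameter 'p' is the user's password after it has been processed together with the verify code, and I don't yet know how to compute it, hence the error. Any advice is welcome.

That's really interesting, heh. I'll look into it when I have time; I've been working on other things lately.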
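On the question of "executing" ptui_checkVC: since HttpClient does not run JavaScript, one option is to treat the response as plain text and pull the code out directly. The sketch below does that with a regular expression instead of the fixed substring(18, 22) offsets. The class and method names are hypothetical, and the response format is assumed to be exactly the ptui_checkVC('0','!BMF'); shape quoted in the post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HttpClient or of the original program.
class CheckVcParser {
    // Matches ptui_checkVC('<flag>','<code>') as quoted in the thread.
    private static final Pattern CHECK_VC =
            Pattern.compile("ptui_checkVC\\('([^']*)','([^']*)'\\)");

    /** Returns the second argument (the verify code), or null if the text does not match. */
    static String extractVerifyCode(String response) {
        if (response == null) {
            return null;
        }
        Matcher m = CHECK_VC.matcher(response);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // Sample response taken from the post above.
        System.out.println(extractVerifyCode("ptui_checkVC('0','!BMF');"));
    }
}
```

Unlike fixed offsets, this keeps working if the first argument or the code changes length, and it returns null rather than throwing when the server answers with something unexpected.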
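Posted: 2010-12-01
People say that scraping Tencent Weibo data with HttpClient is hard, and I'd like to give it a try. Who wants to sign up and work on it together?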
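Posted: 2010-12-01
satikey wrote: People say that scraping Tencent Weibo data with HttpClient is hard, and I'd like to give it a try. Who wants to sign up and work on it together?

The hard part is that the MD5 code written in JavaScript (in login_div.js) is really nasty. In ajax_Submit(), the snippet that generates the password looks like this:
    if (E[A].name == "p") {
        alert(E.verifycode.value);
        alert(E.p.value)
        var F = "";
        F += E.verifycode.value;
        F = F.toUpperCase();
        B += md5(md5_3(E.p.value) + F)
    }
E.p.value is the actual password and E.verifycode.value is the returned verify code. MD5 is applied four times in total:
    function md5_3(B) {
        var A = new Array;
        A = core_md5(A, B.length * chrsz);
        A = core_md5(A, 16 * chrsz);
        A = core_md5(A, 16 * chrsz);
        return binl2hex(A);
    }
Could someone post a Java translation of this MD5 code to share? Thanks!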
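A possible Java translation of the scheme described above, as a minimal sketch: MD5 is applied three times to the password (each round over the raw 16-byte digest), the result is hex-encoded in upper case, the upper-cased verify code is appended, and the whole string is hashed once more. This follows the approach of the Java code posted later in this thread; the md5_3 snippet as quoted looks incomplete, so operating on raw digest bytes is an assumption. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper illustrating the four-round MD5 scheme discussed in the thread.
class PtLoginHash {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static byte[] md5(byte[] input) {
        try {
            return MessageDigest.getInstance("MD5").digest(input);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    static String toUpperHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0x0F]).append(HEX[b & 0x0F]);
        }
        return sb.toString();
    }

    /** Three MD5 rounds; rounds two and three hash the raw digest bytes (assumption). */
    static String md5_3(String password) {
        byte[] d = md5(password.getBytes(StandardCharsets.UTF_8));
        d = md5(d);
        d = md5(d);
        return toUpperHex(d);
    }

    /** Value for the 'p' parameter: MD5(md5_3(password) + VERIFYCODE), upper-case hex. */
    static String encodePassword(String password, String verifyCode) {
        String salted = md5_3(password) + verifyCode.toUpperCase();
        return toUpperHex(md5(salted.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        // Placeholder credentials; the verify code comes from the check request.
        System.out.println(encodePassword("your password", "!BMF"));
    }
}
```

Because the verify code changes on every check request, the value of p has to be recomputed for each login attempt.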
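Posted: 2010-12-02
Very interesting. While breaking in to t.qq.com I found that after logging on to the same account several times in a row, the page starts asking for a graphical verification code to stop bots from signing in en masse. (The display code is in JavaScript: when you type the password it requests the verification image from the server, so editing the home page HTML should be able to suppress it.) In any case, automatic logon to t.qq.com from Java now works, and I'll post it shortly.
OP, please keep helping with how to automatically download the "colleges" data from t.qq.com.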
Posted: 2010-12-02
Last modified: 2010-12-02
yang02301 wrote: Very interesting. While breaking in to t.qq.com I found that after logging on to the same account several times in a row, the page starts asking for a graphical verification code to stop bots from signing in en masse. (The display code is in JavaScript: when you type the password it requests the verification image from the server, so editing the home page HTML should be able to suppress it.) In any case, automatic logon to t.qq.com from Java now works, and I'll post it shortly.
OP, please keep helping with how to automatically download the "colleges" data from t.qq.com.

Attached is the Java code that logs on to t.qq.com automatically. You need to change username and password. Keep at it!

    import java.io.IOException;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    import org.apache.http.Header;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.ClientProtocolException;
    import org.apache.http.client.ResponseHandler;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.impl.client.BasicResponseHandler;
    import org.apache.http.impl.client.DefaultHttpClient;

    public class QQNotify {
        private static HttpResponse response;
        private static DefaultHttpClient httpClient;

        public static void main(String[] args) {
            String username = "your username";
            String password = "your password";
            QQNotify notify = new QQNotify(username, password);
            if (true) {
                return;
            }
            // Left over from the RenRen example, kept commented out.
            // String code = new String(notify.notify("http://s.xnimg.cn/a13819/allunivlist.js"));
            // // Convert hexadecimal Unicode escapes
            // StringBuffer sb = new StringBuffer(code);
            // System.out.println(sb.toString());
            // int pos;
            // while ((pos = sb.indexOf("\\u")) > -1) {
            //     String tmp = sb.substring(pos, pos + 6);
            //     sb.replace(pos, pos + 6, Character.toString((char) Integer.parseInt(tmp.substring(2), 16)));
            // }
            // code = sb.toString();
            // System.out.println(code);

            // To see the effect of the code below, comment out the lines above
            // from "String code" to "System.out.println(code);".
            // Convert Unicode written in &#xxxxx; form
            // String code = new String(notify.notify("http://www.renren.com/GetDep.do?id=13003"));
            // StringBuffer sb = new StringBuffer(code);
            // int pos;
            // while ((pos = sb.indexOf("&#")) > -1) {
            //     String tmp = sb.substring(pos + 2, pos + 7);
            //     sb.replace(pos, pos + 8, Character.toString((char) Integer.parseInt(tmp, 10)));
            // }
            // code = sb.toString();
            // System.out.println(code);
        }

        public QQNotify(String userName, String password) {
            int i;
            httpClient = new DefaultHttpClient();
            Header[] headers;
            String url, sb, verifyCode;

            // Step 1: get verify code
            url = "http://ptlogin2.qq.com/check?uin=@" + userName + "&appid=46000101&r=0.617148618189815";
            sb = notify(url);
            if (sb == null) {
                System.out.println("No response from the check request. Program abort!");
                return;
            }
            i = sb.indexOf("'", 19);
            verifyCode = sb.substring(18, i).toUpperCase();
            System.out.println(sb);
            System.out.println("Verify Code = '" + verifyCode + "'");
            if (verifyCode.length() > 4) {
                System.out.println("It seems you need to enter the graphical verify code manually.");
                System.out.println("Wait a few minutes and try again.");
                System.out.println("Program abort!");
                return;
            }

            // Step 2: logon
            //
            // '!UAK' -> '67E5A3B52AE29D6FC6FAFB1587F8D8F3'
            //
            // String str = MD5_3(password) + "!UAK";
            // System.out.println("str = " + str);
            // System.out.println("MD5 = " + MD5(str));
            String str = MD5_3(password) + verifyCode;
            url = "http://ptlogin2.qq.com/login?u=@" + userName + "&p=" + MD5(str) + "&verifycode=" + verifyCode
                    + "&low_login_enable=1&low_login_hour=720&aid=46000101&u1=http%3A%2F%2Ft.qq.com&ptredirect=1&h=1&from_ui=1&dumy=&fp=loginerroralert";
            sb = notify(url);
            System.out.println(sb);

            // Debug only: change to true to dump the raw response headers.
            if (false) {
                response = getMethod(url);
                System.out.println(response.getStatusLine()); // returns 302
                headers = response.getAllHeaders();
                for (i = 0; i < headers.length; i++) {
                    Header header = headers[i];
                    System.out.println(header.getName() + ": " + header.getValue());
                }
                System.out.println("-----------------------------");
            }

            // Judging by the responses seen in this thread, a first argument of '0' means success.
            if (sb == null || !sb.trim().startsWith("ptuiCB('0'")) {
                System.out.println("Logon to '" + userName + "' @ t.qq.com failed.");
                return;
            }
            System.out.println("Already logged on to '" + userName + "' @ t.qq.com successfully.");
            System.out.println("Next you need to redirect to http://t.qq.com/setting_edu.php and grab the college data.");
            System.out.println("Good luck!");

            // Read the redirect address
            // String redirectUrl = response.getFirstHeader("Location").getValue();
            // See what comes back after the redirect
            // response = getMethod(redirectUrl); // defined below
            // System.out.println(response.getStatusLine()); // HTTP/1.1 200 OK
            // Read the home page content now that we are logged in
            // System.out.println(readHtml("http://www.renren.com/home"));
        }

        // Fetch the content of the given page as a string (null on failure)
        public String notify(String url) {
            HttpGet get = new HttpGet(url);
            ResponseHandler<String> responseHandler = new BasicResponseHandler();
            String txt = null;
            try {
                txt = httpClient.execute(get, responseHandler);
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                get.abort();
            }
            return txt;
        }

        // Send a POST request and return the response. The caller builds the
        // HttpPost, including its parameters, before passing it in.
        public HttpResponse postMethod(HttpPost post) {
            HttpResponse resp = null;
            try {
                resp = httpClient.execute(post);
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                // abort() releases the connection, so only the status line and headers remain usable
                post.abort();
            }
            return resp;
        }

        // Send a GET request and return the response
        public HttpResponse getMethod(String url) {
            HttpGet get = new HttpGet(url);
            HttpResponse resp = null;
            try {
                resp = httpClient.execute(get);
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                // abort() releases the connection, so only the status line and headers remain usable
                get.abort();
            }
            return resp;
        }

        // MD5 applied three times (rounds two and three hash the raw digest), as upper-case hex
        private String MD5_3(String plainText) {
            try {
                MessageDigest md = MessageDigest.getInstance("MD5");
                byte[] b = md.digest(plainText.getBytes()); // first time
                b = md.digest(b);                           // second time
                b = md.digest(b);                           // third time
                return toUpperHex(b);
            } catch (NoSuchAlgorithmException e) {
                e.printStackTrace();
                return "";
            }
        }

        // Single MD5 as upper-case hex (32 characters)
        private String MD5(String plainText) {
            try {
                MessageDigest md = MessageDigest.getInstance("MD5");
                return toUpperHex(md.digest(plainText.getBytes()));
            } catch (NoSuchAlgorithmException e) {
                e.printStackTrace();
                return "";
            }
        }

        // Hex-encode a digest, two characters per byte
        private static String toUpperHex(byte[] b) {
            StringBuilder buf = new StringBuilder();
            for (int offset = 0; offset < b.length; offset++) {
                int i = b[offset] & 0xFF;
                if (i < 16) {
                    buf.append("0");
                }
                buf.append(Integer.toHexString(i));
            }
            return buf.toString().toUpperCase();
        }
    }
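The login request answers with a ptuiCB(...) call rather than an HTTP error, so the program has to inspect that text to know whether the logon worked. The sketch below splits the quoted arguments and checks the first one. It assumes, based only on the two responses shown in this thread (first argument '3' with a wrong-password message, first argument '0' with a success message), that '0' means success; the class and method names are hypothetical and the sample messages are shortened to ASCII.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading ptuiCB('..','..',...) responses.
class PtuiCallback {
    private static final Pattern QUOTED = Pattern.compile("'([^']*)'");

    /** Returns the single-quoted arguments in order, or an empty list if this is not a ptuiCB call. */
    static List<String> parseArgs(String response) {
        List<String> args = new ArrayList<>();
        if (response == null || !response.trim().startsWith("ptuiCB(")) {
            return args;
        }
        Matcher m = QUOTED.matcher(response);
        while (m.find()) {
            args.add(m.group(1));
        }
        return args;
    }

    /** Assumption: a first argument of "0" signals a successful logon. */
    static boolean isSuccess(String response) {
        List<String> args = parseArgs(response);
        return !args.isEmpty() && "0".equals(args.get(0));
    }

    public static void main(String[] args) {
        System.out.println(isSuccess("ptuiCB('0','0','http://t.qq.com','1','ok');"));
        System.out.println(isSuccess("ptuiCB('3','0','','0','wrong password');"));
    }
}
```

A check like this would let the program stop before printing a success message when the password hash was rejected.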
Posted: 2010-12-02
It's done: the program can now log on automatically, save the cookies, follow the redirect URL, and fetch the "colleges" list. Fetching the "departments" list isn't done yet, but it should be very easy. Thanks to the OP for broadening my thinking!
When sending the GET for the "colleges" list, take care to build request headers that pretend to be a browser.
OP, could you also put together a simple example of logging on with the graphical verification code that QQ uses?
The source code is attached below. The newly added parts haven't been tidied up yet and are a bit messy, so please bear with me.

    import java.io.IOException;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    import org.apache.http.HttpResponse;
    import org.apache.http.client.ClientProtocolException;
    import org.apache.http.client.ResponseHandler;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.cookie.Cookie;
    import org.apache.http.impl.client.BasicResponseHandler;
    import org.apache.http.impl.client.DefaultHttpClient;

    public class QQNotify {
        private static HttpResponse response;
        private static DefaultHttpClient httpClient;
        private static Map<String, String> cookies = new HashMap<String, String>();

        public static void main(String[] args) {
            String username = "your username";
            String password = "your password";
            QQNotify notify = new QQNotify(username, password);
            if (true) {
                return;
            }
            // Left over from the RenRen example, kept commented out.
            // String code = new String(notify.notify("http://s.xnimg.cn/a13819/allunivlist.js"));
            // // Convert hexadecimal Unicode escapes
            // StringBuffer sb = new StringBuffer(code);
            // System.out.println(sb.toString());
            // int pos;
            // while ((pos = sb.indexOf("\\u")) > -1) {
            //     String tmp = sb.substring(pos, pos + 6);
            //     sb.replace(pos, pos + 6, Character.toString((char) Integer.parseInt(tmp.substring(2), 16)));
            // }
            // code = sb.toString();
            // System.out.println(code);

            // To see the effect of the code below, comment out the lines above
            // from "String code" to "System.out.println(code);".
            // Convert Unicode written in &#xxxxx; form
            // String code = new String(notify.notify("http://www.renren.com/GetDep.do?id=13003"));
            // StringBuffer sb = new StringBuffer(code);
            // int pos;
            // while ((pos = sb.indexOf("&#")) > -1) {
            //     String tmp = sb.substring(pos + 2, pos + 7);
            //     sb.replace(pos, pos + 8, Character.toString((char) Integer.parseInt(tmp, 10)));
            // }
            // code = sb.toString();
            // System.out.println(code);
        }

        public QQNotify(String userName, String password) {
            int i;
            httpClient = new DefaultHttpClient();
            String url, sb, verifyCode;
            cookies.clear();

            // Step 1: get verify code
            url = "http://ptlogin2.qq.com/check?uin=@" + userName + "&appid=46000101&r=0.617148618189815";
            sb = notify(url);
            if (sb == null) {
                System.out.println("No response from the check request. Program abort!");
                return;
            }
            SaveCookies(httpClient.getCookieStore().getCookies());
            i = sb.indexOf("'", 19);
            verifyCode = sb.substring(18, i).toUpperCase();
            System.out.println(sb);
            System.out.println("Verify Code = '" + verifyCode + "'");
            if (verifyCode.length() > 4) {
                System.out.println("It seems you need to enter the graphical verify code manually.");
                System.out.println("Wait a few minutes and try again.");
                System.out.println("Program abort!");
                return;
            }

            // Step 2: logon
            //
            // '!UAK' -> '67E5A3B52AE29D6FC6FAFB1587F8D8F3'
            //
            // String str = MD5_3(password) + "!UAK";
            // System.out.println("str = " + str);
            // System.out.println("MD5 = " + MD5(str));
            String str = MD5_3(password) + verifyCode;
            url = "http://ptlogin2.qq.com/login?u=@" + userName + "&p=" + MD5(str) + "&verifycode=" + verifyCode
                    + "&low_login_enable=1&low_login_hour=720&aid=46000101&u1=http%3A%2F%2Ft.qq.com&ptredirect=1&h=1&from_ui=1&dumy=&fp=loginerroralert";
            sb = notify(url);
            System.out.println(sb);
            SaveCookies(httpClient.getCookieStore().getCookies());
            PrintCookies();
            /* Sample output from two runs (the success message means "Login successful!"):
            ptuiCB('0','0','http://t.qq.com','1','登录成功!');
            -------- Cookies begin ---------
            Exception in thread "main" java.lang.NullPointerException
             0 : [ptvfsession] = 'a56b05373bffaf65643dbe875a1c9614226d1789c91ddd39134c5289878b087b3f8fd21670efcc430d111b63fa41274f'
             1 : [ptcz] = '06aa93cefb0fec33c298f13fecadb5792b7f7816adb11a5e9423e42cd4456115'
             2 : [skey] = '@na9wdcELd'
             3 : [pt2gguin] = 'o1093457233'
             4 : [lskey] = '00010000a1ac49b4a67ea43dde8d6985bb353584846c27cd4a57d63889c082d45da5540b1f78dc6c9f972099'
             5 : [luin] = 'o1093457233'
             6 : [uin] = 'o1093457233'
             7 : [ptuserinfo] = '6864726976653230'
             8 : [ptisp] = ''
            -------- Cookies end ---------
            -------- Cookies begin ---------
             0 : [ptvfsession] = 'cbebb4c13f69aaca9dabea361c77de60d0fb02bd9991902a7dc5abd486770613746651e4bbd99faebef2b77466e4649b'
             1 : [ptcz] = 'bf9bd2ac71844eae57221a750a7f5321f4e12bdcb0d7178d654160d175da7f3a'
             2 : [skey] = '@na9wdcELd'
             3 : [pt2gguin] = 'o1093457233'
             4 : [lskey] = '00010000035cf86f252e61d9e8f07aa2c39335e2890f01a2863caaffdb4d9e1aa64f2064ec518ccd9772d333'
             5 : [luin] = 'o1093457233'
             6 : [uin] = 'o1093457233'
             7 : [ptuserinfo] = '6864726976653230'
             8 : [ptisp] = ''
            -------- Cookies end ---------
            */

            // Judging by the sample response above, a first argument of '0' means success.
            if (sb == null || !sb.trim().startsWith("ptuiCB('0'")) {
                System.out.println("Logon to '" + userName + "' @ t.qq.com failed.");
                return;
            }

            // Now get the school list for a country/city.
            // Sample request captured from a browser:
            //   GET /asyn/schoolist.php?type=4&key=%E4%B8%AD%E5%9B%BD_%E5%8C%97%E4%BA%AC&letter=& HTTP/1.1
            //   Accept: */*
            //   Accept-Language: en-us
            //   Referer: http://t.qq.com/setting_edu.php
            //   Accept-Encoding: gzip, deflate
            //   User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; QQPinyin 730; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0)
            //   Host: t.qq.com
            //   Connection: Keep-Alive
            //   Cookie: ptui_loginuin2=hdrive20; mb_reg_from=8; pgv_pvid=9250536308; pgv_flv=10.0; pgv_r_cookie=10113040085471; pt2gguin=o1093457233; ptcz=3ac05d30f3e337e94f4fe93999002a23d31779fc45e7bb8f7d86b7a2e34e548d; o_cookie=1093457233; luin=o1093457233; lskey=00010000a90b1dff0aad30b3b96d163e6118418db84c6da83a219311bcb2ff34758e77cb1e3f57228e0d8522; pgv_info=ssid=s4051841450; verifysession=h0050e18f0403630ce623631bd8e1f0f51760865ce2107df6a4bbd1e10521919986ace31226351179618b6c20640a91a959; ptisp=; uin=o1093457233; skey=@na9wdcELd

            // String redirectUrl = "http://t.qq.com/setting_edu.php";
            // key = 中国_北京 (China_Beijing)
            String redirectUrl = "http://t.qq.com/asyn/schoolist.php?type=4&key=%E4%B8%AD%E5%9B%BD_%E5%8C%97%E4%BA%AC&letter=&";
            // key = 美国 (United States)
            // String redirectUrl = "http://t.qq.com/asyn/schoolist.php?type=4&key=%E7%BE%8E%E5%9B%BD&letter=&";
            HttpGet get = new HttpGet(redirectUrl);
            get.setHeader("Accept", "*/*");
            get.setHeader("Accept-Language", "en-us");
            get.setHeader("Referer", "http://t.qq.com/setting_edu.php");
            // Accept-Encoding is deliberately not sent: BasicResponseHandler does not decompress gzip bodies.
            get.setHeader("User-Agent", "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; QQPinyin 730; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0)");
            get.setHeader("Host", "t.qq.com");
            get.setHeader("Connection", "Keep-Alive");
            try {
                sb = httpClient.execute(get, new BasicResponseHandler());
            } catch (IOException ex) {
                Logger.getLogger(QQNotify.class.getName()).log(Level.SEVERE, null, ex);
                return;
            }
            // sb = notify(redirectUrl);
            System.out.println(sb);

            // Print every school name found in a title="..." attribute
            String regex2 = "title=\"(.*?)\">";
            Pattern pattern2 = Pattern.compile(regex2);
            Matcher matcher2 = pattern2.matcher(sb);
            while (matcher2.find()) {
                System.out.println(matcher2.group(1));
            }
            System.out.println("Already logged on to '" + userName + "' @ t.qq.com successfully.");
            System.out.println("Next you need to redirect to http://t.qq.com/setting_edu.php to grab the college data.");
            System.out.println("Good luck!");

            // Read the redirect address
            // String redirectUrl = response.getFirstHeader("Location").getValue();
            // See what comes back after the redirect
            // response = getMethod(redirectUrl); // defined below
            // System.out.println(response.getStatusLine()); // HTTP/1.1 200 OK
            // Read the home page content now that we are logged in
            // System.out.println(readHtml("http://www.renren.com/home"));
        }

        // Fetch the content of the given page as a string (null on failure)
        public String notify(String url) {
            HttpGet get = new HttpGet(url);
            // get.setHeader("User-Agent", "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; QQPinyin 730; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0)");
            ResponseHandler<String> responseHandler = new BasicResponseHandler();
            String txt = null;
            try {
                txt = httpClient.execute(get, responseHandler);
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                get.abort();
            }
            return txt;
        }

        // Send a POST request and return the response. The caller builds the
        // HttpPost, including its parameters, before passing it in.
        public HttpResponse postMethod(HttpPost post) {
            HttpResponse resp = null;
            try {
                resp = httpClient.execute(post);
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                // abort() releases the connection, so only the status line and headers remain usable
                post.abort();
            }
            return resp;
        }

        // Send a GET request and return the response
        public HttpResponse getMethod(String url) {
            HttpGet get = new HttpGet(url);
            HttpResponse resp = null;
            try {
                resp = httpClient.execute(get);
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                // abort() releases the connection, so only the status line and headers remain usable
                get.abort();
            }
            return resp;
        }

        // MD5 applied three times (rounds two and three hash the raw digest), as upper-case hex
        private String MD5_3(String plainText) {
            try {
                MessageDigest md = MessageDigest.getInstance("MD5");
                byte[] b = md.digest(plainText.getBytes()); // first time
                b = md.digest(b);                           // second time
                b = md.digest(b);                           // third time
                return toUpperHex(b);
            } catch (NoSuchAlgorithmException e) {
                e.printStackTrace();
                return "";
            }
        }

        // Single MD5 as upper-case hex (32 characters)
        private String MD5(String plainText) {
            try {
                MessageDigest md = MessageDigest.getInstance("MD5");
                return toUpperHex(md.digest(plainText.getBytes()));
            } catch (NoSuchAlgorithmException e) {
                e.printStackTrace();
                return "";
            }
        }

        // Hex-encode a digest, two characters per byte
        private static String toUpperHex(byte[] b) {
            StringBuilder buf = new StringBuilder();
            for (int offset = 0; offset < b.length; offset++) {
                int i = b[offset] & 0xFF;
                if (i < 16) {
                    buf.append("0");
                }
                buf.append(Integer.toHexString(i));
            }
            return buf.toString().toUpperCase();
        }

        // Copy the client's current cookies into the name -> value map
        private void SaveCookies(List<Cookie> cs) {
            if (cs.isEmpty()) {
                System.out.println("None");
            } else {
                for (int i = 0; i < cs.size(); i++) {
                    cookies.put(cs.get(i).getName(), cs.get(i).getValue());
                }
            }
        }

        // Print every saved cookie as "index : [name] = 'value'"
        private void PrintCookies() {
            int i = 0;
            System.out.println("-------- Cookies begin ---------");
            for (Map.Entry<String, String> m : cookies.entrySet()) {
                System.out.println(" " + i++ + " : [" + m.getKey() + "] = '" + m.getValue() + "'");
            }
            System.out.println("-------- Cookies end ---------");
        }
    }
    // Get Canada School List
    // http://t.qq.com/asyn/schoolist.php?type=4&key=%E5%8A%A0%E6%8B%BF%E5%A4%A7&letter=&
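The school-list URLs in the post are hard-coded with percent-encoded keys. As a small sketch of how they could be generated for any country or city, the helper below builds the URL with URLEncoder and pulls names out of title="..." attributes using the same regular expression as the post. The class and method names are hypothetical, and the HTML fragment used in main is invented for illustration, since the real response markup is not shown in the thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper around the schoolist.php endpoint used in the post.
class SchoolList {
    private static final Pattern TITLE = Pattern.compile("title=\"(.*?)\">");

    /** Builds the list URL for a key such as "China_Beijing" written in Chinese. */
    static String buildUrl(String key) {
        try {
            return "http://t.qq.com/asyn/schoolist.php?type=4&key="
                    + URLEncoder.encode(key, "UTF-8") + "&letter=&";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Collects the value of each title="..." attribute, in document order. */
    static List<String> extractTitles(String html) {
        List<String> titles = new ArrayList<>();
        Matcher m = TITLE.matcher(html);
        while (m.find()) {
            titles.add(m.group(1));
        }
        return titles;
    }

    public static void main(String[] args) {
        // The key is the Chinese for China_Beijing, written with Unicode escapes.
        System.out.println(buildUrl("\u4E2D\u56FD_\u5317\u4EAC"));
        // Invented markup, only to show the regex at work.
        System.out.println(extractTitles("<a title=\"Foo\">Foo</a><a title=\"Bar\">Bar</a>"));
    }
}
```

URLEncoder leaves the underscore unescaped, so the generated URL matches the China_Beijing URL used in the post.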
Posted: 2010-12-02
yang02301 wrote: It's done: the program can now log on automatically, save the cookies, follow the redirect URL, and fetch the "colleges" list. Fetching the "departments" list isn't done yet, but it should be very easy. Thanks to the OP for broadening my thinking! When sending the GET for the "colleges" list, take care to build request headers that pretend to be a browser. OP, could you also put together a simple example of logging on with the graphical verification code that QQ uses? The source code is attached below. The newly added parts haven't been tidied up yet and are a bit messy, so please bear with me.

Impressive. With that in place, fetching the departments of each college is the easy part. It basically comes down to GET and POST requests.
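Posted: 2010-12-02
Last modified: 2010-12-02
Could one of the experts add a piece of code that imports the JSON source data from http://mat1.gtimg.com/www/mb/js/mi.City_100831.js into a Java variable?
http://mat1.gtimg.com/www/mb/js/mi.City_100831.js is encoded in UTF-8. Thanks in advance!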
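One way to approach this request, as a rough sketch only: read the script as UTF-8 text, then cut out the object literal between the first '{' and the last '}' so it can be handed to a JSON parser. The layout of mi.City_100831.js is not shown in the thread, so the assumption that it contains a single object literal is unverified, and the class and method names are hypothetical. The standard library has no JSON parser, so the result here stays a String.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: loads a UTF-8 script and isolates its object literal.
class JsPayloadLoader {
    /** Reads the whole stream as UTF-8 text and closes it. */
    static String readUtf8(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    /** Returns the text from the first '{' to the last '}', or null if there is none. */
    static String extractObjectLiteral(String js) {
        int start = js.indexOf('{');
        int end = js.lastIndexOf('}');
        if (start < 0 || end < start) {
            return null;
        }
        return js.substring(start, end + 1);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the downloaded script; the real file's layout is an assumption.
        String sample = "var demo = {\"a\":[\"x\",\"y\"]};";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(extractObjectLiteral(readUtf8(in)));
    }
}
```

For the real file, the input stream could come from `new java.net.URL(address).openStream()` or from an HttpClient response entity. Turning the extracted text into Java maps and lists would need a JSON library of your choice, and it only works if the literal is strict JSON rather than looser JavaScript syntax.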
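Posted: 2010-12-02
yang02301 wrote: Could one of the experts add a piece of code that imports the JSON source data from http://mat1.gtimg.com/www/mb/js/mi.City_100831.js into a Java variable? Thanks in advance!

I'm not familiar with loading JSON data into Java variables. I meant to work on it last time, but other things got in the way.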