Web Scraping in Java

huoming550


Blog category: WEB2.0

Java Web .net

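I. Retrieving response header information

Steps:

1. Create and initialize a URL object.

2. Declare a URLConnection object and obtain it by calling the URL object's openConnection() method.

3. Call the URLConnection object's connect() method to connect to the server.

4. Read the response header fields from the URLConnection object (getHeaderFields(), getHeaderField(key)).

5. Use the URLConnection object's convenience methods (getContentType(), getContentLength(), and so on) to read individual values.

Example: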

    String urlName = "http://www....com";
    try {
        URL url = new URL(urlName);
        URLConnection connection = url.openConnection();
        connection.connect();

        // print header fields (for HTTP, the status line appears under a null key)
        Map<String, List<String>> headers = connection.getHeaderFields();
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            String key = entry.getKey();
            for (String value : entry.getValue()) {
                System.out.println(key + ": " + value);
            }
        }

        // print convenience functions
        System.out.println("------------------------");
        System.out.println("getContentType:" + connection.getContentType());
        System.out.println("getContentLength:" + connection.getContentLength());
        System.out.println("getContentEncoding:" + connection.getContentEncoding());
        System.out.println("getDate:" + connection.getDate());
        System.out.println("getExpiration:" + connection.getExpiration());
        System.out.println("getLastModified:" + connection.getLastModified());
        System.out.println("------------------------");

        // print first ten lines of contents; try-with-resources closes the stream
        try (Scanner in = new Scanner(connection.getInputStream())) {
            for (int n = 1; in.hasNextLine() && n <= 10; n++) {
                System.out.println(in.nextLine());
            }
            if (in.hasNextLine()) {
                System.out.println("...");
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
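The map returned by getHeaderFields() and the long values returned by getDate(), getExpiration() and getLastModified() are a little awkward to use directly. The following is a minimal sketch, not part of the original example, of two helpers for them. The class name HeaderLookup and both method names are hypothetical. It assumes the usual HttpURLConnection behaviour: the status line is stored under a null key, and the date methods return 0 when the header is absent.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only.
final class HeaderLookup {

    /** Returns the first value of the named header, ignoring case, or null if it is absent. */
    static String firstHeader(Map<String, List<String>> headers, String name) {
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            String key = entry.getKey();
            // The HTTP status line is stored under a null key, so skip it.
            if (key != null && key.equalsIgnoreCase(name) && !entry.getValue().isEmpty()) {
                return entry.getValue().get(0);
            }
        }
        return null;
    }

    /** Turns a millisecond timestamp from getDate() and friends into readable text. */
    static String describeMillis(long millis) {
        // 0 means the server did not send the header.
        return millis == 0 ? "unknown" : Instant.ofEpochMilli(millis).toString();
    }

    public static void main(String[] args) {
        // A hand-built map standing in for connection.getHeaderFields().
        Map<String, List<String>> headers = new HashMap<>();
        headers.put(null, Arrays.asList("HTTP/1.1 200 OK"));
        headers.put("Content-Type", Arrays.asList("text/html; charset=UTF-8"));

        System.out.println(firstHeader(headers, "content-type"));
        System.out.println(describeMillis(0L));
    }
}
```

With a live connection, the same helpers would be called as firstHeader(connection.getHeaderFields(), "Content-Type") and describeMillis(connection.getLastModified()).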


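II. Requests with parameters

       By default, a connection provides only an input stream for reading from the server and no output stream for writing. To obtain an output stream (for example, to submit data to a web server), call: connection.setDoOutput(true);

Example: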

    String urlName = "……";
    Map<String, String> paras = new HashMap<String, String>();
    paras.put("flightway", "Single");
    String result;
    try {
        result = doPost(urlName, paras);
        System.out.println(result);
    } catch (IOException e) {
        e.printStackTrace();
    }

    public static String doPost(String urlString,
            Map<String, String> nameValuePairs) throws IOException {
        URL url = new URL(urlString);
        URLConnection connection = url.openConnection();
        // enable the output stream so the parameters can be written as the request body
        connection.setDoOutput(true);

        // try-with-resources closes the writer, which flushes the request body
        try (PrintWriter out = new PrintWriter(connection.getOutputStream())) {
            boolean first = true;
            for (Map.Entry<String, String> pair : nameValuePairs.entrySet()) {
                if (first)
                    first = false;
                else
                    out.print('&');
                // encode both name and value with the charset the server expects
                // (GB2312 here; UTF-8 is the other common choice)
                out.print(URLEncoder.encode(pair.getKey(), "GB2312"));
                out.print('=');
                out.print(URLEncoder.encode(pair.getValue(), "GB2312"));
            }
        }

        Scanner in;
        try {
            in = new Scanner(connection.getInputStream());
        } catch (IOException e) {
            // on an HTTP error status, read the error page instead if one is available
            if (!(connection instanceof HttpURLConnection))
                throw e;
            InputStream err = ((HttpURLConnection) connection).getErrorStream();
            if (err == null)
                throw e;
            in = new Scanner(err);
        }

        StringBuilder response = new StringBuilder();
        while (in.hasNextLine()) {
            response.append(in.nextLine());
            response.append("\n");
        }
        in.close();
        return response.toString();
    }
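The body that doPost writes is an ordinary URL-encoded form string, so it can be built and inspected without opening a connection. The sketch below separates that step out. The class name FormBody, the method names, and the "city" parameter are hypothetical, and UTF-8 is used only for the demonstration; the charset should match whatever the target server expects.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper class, for illustration only.
final class FormBody {

    /** Joins the pairs as name=value&name=value, URL-encoding every name and value. */
    static String encode(Map<String, String> params, Charset charset) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> pair : params.entrySet()) {
            body.add(escape(pair.getKey(), charset) + "=" + escape(pair.getValue(), charset));
        }
        return body.toString();
    }

    private static String escape(String text, Charset charset) {
        try {
            return URLEncoder.encode(text, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Not expected: the name comes from a Charset that already exists.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("flightway", "Single");
        params.put("city", "New York"); // hypothetical second parameter
        System.out.println(encode(params, StandardCharsets.UTF_8));
        // prints flightway=Single&city=New+York
    }
}
```

Building the string first also makes it easy to log the exact request body when a server rejects a submission.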

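Notes:

import java.io.*;
import java.net.*;
import java.util.*;

// huc is an HttpURLConnection, obtained by casting the result of url.openConnection()
huc.setDoOutput(true);

// use the POST method
huc.setRequestMethod("POST");

huc.setRequestProperty("user-agent", "mozilla/4.7 [en] (win98; i)");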
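The settings in the notes can be gathered into one helper that prepares an HttpURLConnection for a form POST. This is a rough sketch under stated assumptions: the class name PostSetup, the method preparePost, the example.invalid URL, the user-agent string, and the timeout values are all hypothetical, and the Content-Type header is an addition not present in the notes. Nothing is sent until the output or input stream is opened.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class, for illustration only.
final class PostSetup {

    /** Opens (but does not connect) an HTTP connection configured for a form POST. */
    static HttpURLConnection preparePost(String urlString, String userAgent) {
        try {
            HttpURLConnection huc =
                    (HttpURLConnection) URI.create(urlString).toURL().openConnection();
            huc.setDoOutput(true);        // allow writing a request body
            huc.setRequestMethod("POST"); // send it as POST
            huc.setRequestProperty("User-Agent", userAgent);
            // Assumed extras: declare the body format and avoid hanging forever.
            huc.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            huc.setConnectTimeout(5000);
            huc.setReadTimeout(10000);
            return huc;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // example.invalid is a placeholder host; no network traffic happens here.
        HttpURLConnection huc = preparePost("http://example.invalid/search", "demo-agent/1.0");
        System.out.println(huc.getRequestMethod() + " " + huc.getURL());
    }
}
```

The returned connection can then be used the same way as in doPost: write the encoded parameters to getOutputStream() and read the reply from getInputStream().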


2008-11-26 12:28