Java: HttpClient only ever fetches the first page when scraping a paginated site

Page being scraped: http://moni.10jqka.com.cn/120313901

1. I saw that the POST request carries three parameters, so I set them up like this:

NameValuePair[] data = {
        new NameValuePair("endDate", "2012-06-04"),
        new NameValuePair("page", "1"),
        new NameValuePair("startDate", "2012-04-20")
};
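
One way to rule out a mistake on the client side is to look at the form body that these three parameters produce. The following is a minimal sketch using only the JDK, not HttpClient itself: the class FormBodySketch and its methods are hypothetical names introduced for illustration, and it assumes the server expects an ordinary application/x-www-form-urlencoded body in UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original code.
class FormBodySketch {

    // The three parameters from the question, with only "page" varying.
    static Map<String, String> paramsForPage(int page) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("endDate", "2012-06-04");
        params.put("page", String.valueOf(page));
        params.put("startDate", "2012-04-20");
        return params;
    }

    // Encodes the parameters as an application/x-www-form-urlencoded body.
    static String encode(Map<String, String> params) {
        try {
            StringBuilder sb = new StringBuilder();
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
            return sb.toString();
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode(paramsForPage(1)));
        System.out.println(encode(paramsForPage(2)));
    }
}
```

If the two printed lines differ only in the page value, the body is being built as intended, and the next things to compare against the browser's request are the parameter names and the cookie. These are things to check, not a diagnosis.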

2. POST

The POST URL turned out to be: http://moni.10jqka.com.cn/mncg/index/jyjl/120313901

I then call my getPostPage() method:

String content = fetch.getPostPage(
                "http://moni.10jqka.com.cn/mncg/index/jyjl/120313901", data,
                cookie);

public String getPostPage(String postUrl, NameValuePair[] data, String cookie) throws Exception {

        PostMethod method = null;
        String contentStr = null;
        try
        {
            method = new PostMethod(postUrl);
            method.addRequestHeader("User-Agent", USER_AGENT);
            method.addRequestHeader("Content-Type", CONTENT_TYPE);
            method.addRequestHeader("Host", "moni.10jqka.com.cn");
            method.addRequestHeader("X-Requested-With", "XMLHttpRequest");
            method.addRequestHeader("Referer","http://moni.10jqka.com.cn/120313901");
            method.getParams().setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
            method.addRequestHeader("Cookie", cookie);

            method.setRequestBody(data);
            int statusCode = client.executeMethod(method);
            // HttpClient does not automatically follow redirects (301 or 302)
            // for requests such as POST and PUT, so handle them here
            if (statusCode == HttpStatus.SC_MOVED_PERMANENTLY
                    || statusCode == HttpStatus.SC_MOVED_TEMPORARILY)
            {
                // Read the redirect target from the response headers
                Header locationHeader = method.getResponseHeader("location");
                String location = null;
                if (locationHeader != null)
                {
                    location = locationHeader.getValue();
                    System.out
                            .println("The page was redirected to:" + location);
                } else
                {
                    System.err.println("Location field value is null.");
                }
                return "";
            }
            else
            {
                // Decode the raw bytes as GBK (assumes the server responds in GBK, as the original call implied)
                contentStr = new String(method.getResponseBody(), "GBK");
                System.out.println(contentStr);
            }
        } catch (Exception e)
        {
            log.error("POST to " + postUrl + " failed", e);
        } finally
        {
            if (method != null)
                method.releaseConnection();
        }
        return contentStr;
    }

3. I printed the result, and the content was fetched successfully.

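However, when I set the page parameter in the POST to 2, the response still contains the page=1 content. I can't figure out why; any help would be appreciated.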
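
To confirm that the two responses are truly identical rather than just similar-looking, the bodies returned for page=1 and page=2 can be fingerprinted and compared. This is a minimal sketch using only the JDK; the class ResponseFingerprint and its methods are hypothetical names, and the strings in main are placeholders standing in for the values returned by getPostPage().

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical diagnostic helper; not part of the original code.
class ResponseFingerprint {

    // Returns the SHA-256 digest of the body as a lowercase hex string.
    static String sha256Hex(String body) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(body.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True when both response bodies have the same fingerprint.
    static boolean sameContent(String a, String b) {
        return sha256Hex(a).equals(sha256Hex(b));
    }

    public static void main(String[] args) {
        // Placeholders: in practice these would be the strings returned
        // by getPostPage() for page=1 and page=2.
        String page1 = "<table>rows for page 1</table>";
        String page2 = "<table>rows for page 1</table>";
        System.out.println("identical: " + sameContent(page1, page2));
    }
}
```

Identical fingerprints would suggest the server is ignoring the page value as sent. Differing fingerprints would mean the pages do differ somewhere, and the difference is worth locating before changing the request.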