gaojin

Jsoup

Blog category: Other

Use selector-syntax to find elements

Problem

You want to find or manipulate elements using a CSS-like or jQuery-like selector syntax.

Solution

Use the Element.select(String selector) and Elements.select(String selector) methods:

import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");

Elements links = doc.select("a[href]");          // a with href
Elements pngs = doc.select("img[src$=.png]");    // img with src ending .png

Element masthead = doc.select("div.masthead").first();  // div with class=masthead

Elements resultLinks = doc.select("h3.r > a");   // a that is a direct child of h3.r

Description

jsoup elements support a CSS-like (or jQuery-like) selector syntax for finding matching elements, which allows very powerful and robust queries.

The select method is available in a Document, Element, or in Elements. It is contextual, so you can filter by selecting from a specific element, or by chaining select calls.

select returns a list of the matching elements (as an Elements object), which provides a range of methods to extract and manipulate the results.
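To make the "contextual" behaviour concrete, here is a minimal sketch that first narrows to one block and then selects links only inside it. The sample markup, the class name ContextualSelectDemo, and the helper contentHrefs are invented for this illustration and are not part of jsoup or the original cookbook.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class ContextualSelectDemo {
    // Hypothetical sample markup, invented for this illustration.
    static final String HTML =
          "<div class='masthead'><a href='/home'>Home</a></div>"
        + "<div class='content'>"
        + "<a href='/a'>A</a><a href='/b'>B</a><a name='x'>no href</a>"
        + "</div>";

    /** Collects the href values of links found inside div.content only. */
    static List<String> contentHrefs(String html) {
        Document doc = Jsoup.parse(html, "http://example.com/");

        // First select call: narrow the context to the content block.
        Elements content = doc.select("div.content");

        // Second, chained select call: only links with an href inside that block.
        Elements links = content.select("a[href]");

        List<String> hrefs = new ArrayList<>();
        for (Element link : links) {
            hrefs.add(link.attr("href"));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        System.out.println(contentHrefs(HTML));
    }
}
```

The masthead link and the anchor without an href are both left out: the first because it lies outside the selected context, the second because it does not match a[href].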

Selector overview

tagname: find elements by tag, e.g. a
ns|tag: find elements by tag in a namespace, e.g. fb|name finds <fb:name> elements
#id: find elements by ID, e.g. #logo
.class: find elements by class name, e.g. .masthead
[attribute]: elements with attribute, e.g. [href]
[^attr]: elements with an attribute name prefix, e.g. [^data-] finds elements with HTML5 dataset attributes
[attr=value]: elements with attribute value, e.g. [width=500]
[attr^=value], [attr$=value], [attr*=value]: elements with attributes that start with, end with, or contain the value, e.g. [href*=/path/]
[attr~=regex]: elements with attribute values that match the regular expression; e.g. img[src~=(?i)\.(png|jpe?g)]
*: all elements, e.g. *
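The basic selectors listed above can be tried against a small document. The following is only a sketch: the markup, the class name SelectorOverviewDemo, and the count helper are assumptions made for this example.

```java
import org.jsoup.Jsoup;

class SelectorOverviewDemo {
    // Hypothetical sample markup, invented for this illustration.
    static final String HTML =
          "<div id='logo' data-role='brand'><img src='logo.PNG' width='500'></div>"
        + "<p class='masthead'>Title</p>"
        + "<a href='/path/page.html'>link</a>"
        + "<img src='photo.jpeg'><img src='anim.gif'>";

    /** Parses the markup and returns how many elements match the selector. */
    static int count(String html, String selector) {
        return Jsoup.parse(html).select(selector).size();
    }

    public static void main(String[] args) {
        String[] selectors = {
            "img",                           // by tag name
            "#logo",                         // by ID
            ".masthead",                     // by class name
            "[href]",                        // has the attribute
            "[^data-]",                      // attribute name starts with data-
            "[width=500]",                   // exact attribute value
            "[href*=/path/]",                // attribute value contains /path/
            "img[src~=(?i)\\.(png|jpe?g)]"   // attribute value matches the regex
        };
        for (String selector : selectors) {
            System.out.println(selector + " -> " + count(HTML, selector));
        }
    }
}
```

Note that the backslash in the regular expression has to be doubled inside a Java string literal.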

Selector combinations

el#id: elements with ID, e.g. div#logo
el.class: elements with class, e.g. div.masthead
el[attr]: elements with attribute, e.g. a[href]
Any combination, e.g. a[href].highlight
ancestor child: child elements that descend from ancestor, e.g. .body p finds p elements anywhere under a block with class "body"
parent > child: child elements that descend directly from parent, e.g. div.content > p finds p elements that are direct children of div.content; and body > * finds the direct children of the body tag
siblingA + siblingB: finds sibling B element immediately preceded by sibling A, e.g. div.head + div
siblingA ~ siblingX: finds sibling X element preceded by sibling A, e.g. h1 ~ p
el, el, el: group multiple selectors, find unique elements that match any of the selectors; e.g. div.masthead, div.logo
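The difference between the descendant, child, and sibling combinators is easiest to see side by side. This is a small sketch under stated assumptions: the markup, the class name CombinatorDemo, and the texts helper are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

class CombinatorDemo {
    // Hypothetical sample markup, invented for this illustration.
    static final String HTML =
          "<div class='body'>"
        + "<h1>Heading</h1>"
        + "<p>first</p>"
        + "<section><p>nested</p></section>"
        + "<p>last</p>"
        + "</div>"
        + "<div class='head'>head</div><div>after head</div>";

    /** Returns the text of every element matching the selector, in document order. */
    static List<String> texts(String html, String selector) {
        List<String> result = new ArrayList<>();
        for (Element el : Jsoup.parse(html).select(selector)) {
            result.add(el.text());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(texts(HTML, ".body p"));        // any p under .body
        System.out.println(texts(HTML, ".body > p"));      // only direct p children
        System.out.println(texts(HTML, "h1 + p"));         // p immediately after h1
        System.out.println(texts(HTML, "h1 ~ p"));         // any later sibling p of h1
        System.out.println(texts(HTML, "div.head + div")); // div right after div.head
        System.out.println(texts(HTML, "h1, section"));    // grouped selectors
    }
}
```

In this markup the nested paragraph is matched by the descendant selector but not by the child or sibling selectors, since it sits one level deeper inside section.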

Pseudo selectors

:lt(n): find elements whose sibling index (i.e. its position in the DOM tree relative to its parent) is less than n; e.g. td:lt(3)
:gt(n): find elements whose sibling index is greater than n; e.g. div p:gt(2)
:eq(n): find elements whose sibling index is equal to n; e.g. form input:eq(1)
:has(selector): find elements that contain elements matching the selector; e.g. div:has(p)
:not(selector): find elements that do not match the selector; e.g. div:not(.logo)
:contains(text): find elements that contain the given text. The search is case-insensitive; e.g. p:contains(jsoup)
:containsOwn(text): find elements that directly contain the given text
:matches(regex): find elements whose text matches the specified regular expression; e.g. div:matches((?i)login)
:matchesOwn(regex): find elements whose own text matches the specified regular expression
Note that the above indexed pseudo-selectors are 0-based, that is, the first element is at index 0, the second at 1, etc.
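The pseudo selectors, including the 0-based indexing, can be illustrated with a short sketch. The markup, the class name PseudoSelectorDemo, and the helpers texts and count are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

class PseudoSelectorDemo {
    // Hypothetical sample markup, invented for this illustration.
    static final String HTML =
          "<table><tr><td>a</td><td>b</td><td>c</td><td>d</td></tr></table>"
        + "<div class='logo'><p>Powered by jsoup</p></div>"
        + "<div><span>Login here</span></div>"
        + "<div>plain</div>";

    /** Returns the text of every element matching the selector, in document order. */
    static List<String> texts(String html, String selector) {
        List<String> result = new ArrayList<>();
        for (Element el : Jsoup.parse(html).select(selector)) {
            result.add(el.text());
        }
        return result;
    }

    /** Returns how many elements match the selector. */
    static int count(String html, String selector) {
        return Jsoup.parse(html).select(selector).size();
    }

    public static void main(String[] args) {
        System.out.println(texts(HTML, "td:lt(2)"));   // cells at sibling index 0 and 1
        System.out.println(texts(HTML, "td:eq(0)"));   // the first cell (index 0)
        System.out.println(texts(HTML, "td:gt(2)"));   // cells after sibling index 2
        System.out.println(count(HTML, "div:has(p)"));
        System.out.println(count(HTML, "div:not(.logo)"));
        System.out.println(count(HTML, "p:contains(JSOUP)"));          // case-insensitive
        System.out.println(count(HTML, "div:containsOwn(plain)"));
        System.out.println(count(HTML, "div:matches((?i)login)"));     // text of div and descendants
        System.out.println(count(HTML, "div:matchesOwn((?i)login)"));  // own text of div only
    }
}
```

The last two selectors show the difference between the plain and the Own variants: the word "Login" sits inside a span, so it counts as text of the enclosing div for :matches but not for :matchesOwn.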

See the Selector API reference for the full supported list and details.

http://jsoup.org/cookbook/extracting-data/selector-syntax


2014-01-13 14:30