Htmlparser Filter: A Brief Summary (Reposted)

李丹.杭州

1. Logical relations: AND, OR, NOT
AndFilter()
          Creates a new instance of an AndFilter.
AndFilter(NodeFilter[] predicates)
          Creates an AndFilter that accepts nodes acceptable to all given filters.
AndFilter(NodeFilter left, NodeFilter right)
          Creates an AndFilter that accepts nodes acceptable to both filters.

OrFilter()
          Creates a new instance of an OrFilter.
OrFilter(NodeFilter[] predicates)
          Creates an OrFilter that accepts nodes acceptable to any of the given filters.
OrFilter(NodeFilter left, NodeFilter right)
          Creates an OrFilter that accepts nodes acceptable to either filter.

NotFilter()
          Creates a new instance of a NotFilter.
NotFilter(NodeFilter predicate)
          Creates a NotFilter that accepts nodes not acceptable to the predicate.
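The AND / OR / NOT combinators listed above all follow the same composition pattern: each wraps one or more filters and is itself a filter. The sketch below illustrates only those semantics. It does not use the HTMLParser classes: `Predicate<T>` stands in for `NodeFilter`, a plain `String` stands in for a node, and the names `FilterLogicSketch`, `and`, `or` and `not` are hypothetical.

```java
import java.util.function.Predicate;

// Stand-in sketch: Predicate<T> plays the role of NodeFilter.
// This is NOT the HTMLParser API, only a model of the combinator semantics.
final class FilterLogicSketch {

    // Like AndFilter(NodeFilter[]): accept only if every predicate accepts.
    @SafeVarargs
    static <T> Predicate<T> and(Predicate<T>... predicates) {
        return node -> {
            for (Predicate<T> p : predicates) {
                if (!p.test(node)) {
                    return false;
                }
            }
            return true;
        };
    }

    // Like OrFilter(NodeFilter[]): accept if any predicate accepts.
    @SafeVarargs
    static <T> Predicate<T> or(Predicate<T>... predicates) {
        return node -> {
            for (Predicate<T> p : predicates) {
                if (p.test(node)) {
                    return true;
                }
            }
            return false;
        };
    }

    // Like a NOT filter: accept exactly the nodes the wrapped predicate rejects.
    static <T> Predicate<T> not(Predicate<T> predicate) {
        return predicate.negate();
    }

    public static void main(String[] args) {
        // Hypothetical "filters" over raw tag text, for illustration only.
        Predicate<String> isDiv = s -> s.startsWith("<div");
        Predicate<String> hasClass = s -> s.contains("class=");

        System.out.println(and(isDiv, hasClass).test("<div class=\"nav\">")); // true
        System.out.println(and(isDiv, hasClass).test("<div>"));               // false
        System.out.println(or(isDiv, hasClass).test("<p class=\"x\">"));      // true
        System.out.println(not(isDiv).test("<p>"));                           // true
    }
}
```

Because every combinator returns a filter, they nest freely, for example `and(isDiv, not(hasClass))`. The real filter classes are combined in the same way through their constructors.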

2. Content

StringFilter: simple, with limited functionality; for more complex matching, use RegexFilter (regular expressions).

StringFilter()
          Creates a new instance of StringFilter that accepts all string nodes.
StringFilter(String pattern)
          Creates a StringFilter that accepts text nodes containing a string.
StringFilter(String pattern, boolean sensitive)
          Creates a StringFilter that accepts text nodes containing a string.
StringFilter(String pattern, boolean sensitive, Locale locale)
          Creates a StringFilter that accepts text nodes containing a string.

RegexFilter()
          Creates a new instance of RegexFilter that accepts string nodes matching the regular expression ".*" using the FIND strategy.
RegexFilter(String pattern)
          Creates a new instance of RegexFilter that accepts string nodes matching a regular expression using the FIND strategy.
RegexFilter(String pattern, int strategy)
          Creates a new instance of RegexFilter that accepts string nodes matching a regular expression.
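To make the difference between the two content filters concrete, here is a small sketch using only `java.util.regex`, not the HTMLParser classes. It assumes that the `strategy` argument of RegexFilter selects between whole-string matching, prefix matching and the FIND behaviour named above, mirroring `Matcher.matches()`, `Matcher.lookingAt()` and `Matcher.find()`; check the Javadoc of your library version for the actual constants. The class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Stand-in sketch of the matching rules; this is NOT the HTMLParser API.
final class TextMatchSketch {

    // Substring test in the spirit of StringFilter(pattern, sensitive, locale):
    // when not case sensitive, both sides are upper-cased with the given locale.
    static boolean containsText(String text, String pattern,
                                boolean caseSensitive, Locale locale) {
        if (caseSensitive) {
            return text.contains(pattern);
        }
        return text.toUpperCase(locale).contains(pattern.toUpperCase(locale));
    }

    // The regex must match the entire text.
    static boolean matchesWhole(String text, String regex) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    // The regex must match at the start of the text.
    static boolean matchesPrefix(String text, String regex) {
        return Pattern.compile(regex).matcher(text).lookingAt();
    }

    // The regex may match anywhere in the text (the FIND behaviour).
    static boolean findsAnywhere(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "price: 42 USD";
        System.out.println(containsText(text, "usd", false, Locale.ENGLISH)); // true
        System.out.println(containsText(text, "usd", true, Locale.ENGLISH));  // false
        System.out.println(findsAnywhere(text, "\\d+"));                      // true
        System.out.println(matchesPrefix(text, "\\d+"));                      // false
        System.out.println(matchesWhole(text, "\\d+"));                       // false
    }
}
```

A plain substring test is enough for fixed keywords; a regular expression is worth the extra cost only when the text has structure, such as digits or dates.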

3. Tags

TagNameFilter(): filters by tag name: div, img, ...

NodeClassFilter(): filters by tag class: LinkTag.class, ...

HasAttributeFilter(): filters by attribute: HasAttributeFilter("class", "className")

LinkRegexFilter(): matches links with a regular expression

TagNameFilter()
          Creates a new instance of TagNameFilter.
TagNameFilter(String name)
          Creates a TagNameFilter that accepts tags with the given name.

NodeClassFilter()
          Creates a NodeClassFilter that accepts Html tags.
NodeClassFilter(Class cls)
          Creates a NodeClassFilter that accepts tags of the given class.

HasAttributeFilter()
          Creates a new instance of HasAttributeFilter.
HasAttributeFilter(String attribute)
          Creates a new instance of HasAttributeFilter that accepts tags with the given attribute.
HasAttributeFilter(String attribute, String value)
          Creates a new instance of HasAttributeFilter that accepts tags with the given attribute and value.

LinkRegexFilter(String regexPattern)
          Creates a LinkRegexFilter that accepts LinkTag nodes containing a URL that matches the supplied regex pattern.
LinkRegexFilter(String regexPattern, boolean caseSensitive)
          Creates a LinkRegexFilter that accepts LinkTag nodes containing a URL that matches the supplied regex pattern.

LinkStringFilter(String pattern)
          Creates a LinkStringFilter that accepts LinkTag nodes containing a URL that matches the supplied pattern.
LinkStringFilter(String pattern, boolean caseSensitive)
          Creates a LinkStringFilter that accepts LinkTag nodes containing a URL that matches the supplied pattern.
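The tag-level filters differ only in which part of a tag they inspect: its name, its attributes, or its link URL. The sketch below models that with a hypothetical `Tag` class instead of the library's own node types, so it runs without HTMLParser on the classpath. The case-insensitive handling of tag and attribute names is an assumption of this sketch, not a statement about the library.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Stand-in sketch: Tag is a hypothetical model of an HTML tag.
// This is NOT the HTMLParser API.
final class TagFilterSketch {

    static final class Tag {
        final String name;
        final Map<String, String> attributes = new HashMap<>();

        Tag(String name) {
            this.name = name;
        }

        // Fluent helper so tags can be built in one expression.
        Tag attr(String key, String value) {
            attributes.put(key.toLowerCase(Locale.ROOT), value);
            return this;
        }
    }

    // In the spirit of TagNameFilter(name): accept tags with the given name.
    static Predicate<Tag> tagName(String name) {
        return tag -> tag.name.equalsIgnoreCase(name);
    }

    // In the spirit of HasAttributeFilter(attribute): the attribute must exist.
    static Predicate<Tag> hasAttribute(String attribute) {
        return tag -> tag.attributes.containsKey(attribute.toLowerCase(Locale.ROOT));
    }

    // In the spirit of HasAttributeFilter(attribute, value): the value must also be equal.
    static Predicate<Tag> hasAttribute(String attribute, String value) {
        return tag -> value.equals(tag.attributes.get(attribute.toLowerCase(Locale.ROOT)));
    }

    // In the spirit of LinkRegexFilter(regex): only <a> tags whose href matches somewhere.
    static Predicate<Tag> linkRegex(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return tag -> tag.name.equalsIgnoreCase("a")
                && tag.attributes.containsKey("href")
                && pattern.matcher(tag.attributes.get("href")).find();
    }

    public static void main(String[] args) {
        Tag link = new Tag("a")
                .attr("href", "http://example.com/news/1.html")
                .attr("class", "title");

        System.out.println(tagName("A").test(link));                      // true
        System.out.println(hasAttribute("class", "title").test(link));    // true
        System.out.println(hasAttribute("id").test(link));                // false
        System.out.println(linkRegex("/news/\\d+\\.html$").test(link));   // true
    }
}
```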

4. Hierarchical relations

HasParentFilter()
          Creates a new instance of HasParentFilter.
HasParentFilter(NodeFilter filter)
          Creates a new instance of HasParentFilter that accepts nodes with the direct parent acceptable to the filter.
HasParentFilter(NodeFilter filter, boolean recursive)
          Creates a new instance of HasParentFilter that accepts nodes with a parent acceptable to the filter.

HasChildFilter()
          Creates a new instance of a HasChildFilter.
HasChildFilter(NodeFilter filter)
          Creates a new instance of HasChildFilter that accepts nodes with a direct child acceptable to the filter.
HasChildFilter(NodeFilter filter, boolean recursive)
          Creates a new instance of HasChildFilter that accepts nodes with a child acceptable to the filter.
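The `recursive` flag in the constructors above decides whether only the direct parent or child is checked, or the whole chain of ancestors or descendants. A minimal sketch of that distinction follows, using a hypothetical `Node` tree rather than the library's node classes; the names `HierarchySketch`, `hasParent` and `hasChild` are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Stand-in sketch: Node is a hypothetical tree node.
// This is NOT the HTMLParser API.
final class HierarchySketch {

    static final class Node {
        final String name;
        final Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
            if (parent != null) {
                parent.children.add(this); // register with the parent
            }
        }
    }

    // Accept nodes whose direct parent (or, if recursive, any ancestor) passes the filter.
    static Predicate<Node> hasParent(Predicate<Node> filter, boolean recursive) {
        return node -> {
            Node current = node.parent;
            while (current != null) {
                if (filter.test(current)) {
                    return true;
                }
                if (!recursive) {
                    return false; // only the direct parent counts
                }
                current = current.parent;
            }
            return false;
        };
    }

    // Accept nodes with a direct child (or, if recursive, any descendant) passing the filter.
    static Predicate<Node> hasChild(Predicate<Node> filter, boolean recursive) {
        return node -> {
            for (Node child : node.children) {
                if (filter.test(child)) {
                    return true;
                }
                if (recursive && hasChild(filter, true).test(child)) {
                    return true;
                }
            }
            return false;
        };
    }

    public static void main(String[] args) {
        // Tree: div > p > span
        Node div = new Node("div", null);
        Node p = new Node("p", div);
        Node span = new Node("span", p);

        Predicate<Node> isDiv = n -> n.name.equals("div");
        Predicate<Node> isSpan = n -> n.name.equals("span");

        System.out.println(hasParent(isDiv, false).test(span)); // false
        System.out.println(hasParent(isDiv, true).test(span));  // true
        System.out.println(hasChild(isSpan, false).test(div));  // false
        System.out.println(hasChild(isSpan, true).test(div));   // true
    }
}
```

The non-recursive form is the stricter one, so it suits cases where the page structure is known exactly; the recursive form tolerates extra wrapper elements between the two nodes.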
