Simple regexp usage

xushaoxun


        //split
        String s = "hello world i am 23 years";
        String[] ss = s.split("\\s");

        //replace (String is immutable, so keep the returned value)
        String hashed = s.replaceAll("\\s", "#");
        String firstDigit = s.replaceFirst("\\d", "#");

        //greedy (default) & reluctant
        String greedy = "\\d*";
        String lazy = "\\d*?";

        //Pattern & Matcher
        String phone = "my phone: 13352135478. Hers is 15984563215. call us later.";
        Pattern p = Pattern.compile("\\d{11}");
        Matcher m = p.matcher(phone);
        while(m.find()) {
            System.out.printf("find: %s, start: %d, end: %d\n", m.group(), m.start(), m.end());
        }

        //match the whole input against the regexp with Pattern.matches(regex, input)
        boolean wholeMatch = Pattern.matches("\\d{4}-\\d{7}", "0592-6103687");

        //groups
        String tel = "my tel is 0592-6103625. call me at 12:00";
        Pattern p2 = Pattern.compile("(\\d{4})-(\\d{7})");
        Matcher m2 = p2.matcher(tel);
        while(m2.find()) {
            int count = m2.groupCount();
            for(int i=0; i<=count; i++) {
                System.out.printf("group: %s. start: %d, end: %d\n", m2.group(i), m2.start(i), m2.end(i));
            }
        }

        //static method
        Pattern.matches("\\d{4}", "1125f");     //false
        Pattern.matches("\\d{4}", "1125");      //true

        //scanner
        Scanner scanner = new Scanner("xushaoxun@gmail.com mail me if you have time");
        System.out.println(scanner.next("(\\w+)@(\\w+)\\.(\\w{3})"));

Regular Expression

Character Classes

[abc]	a, b, or c (simple class)
[^abc]	Any character except a, b, or c (negation)
[a-zA-Z]	a through z, or A through Z, inclusive (range)
[a-d[m-p]]	a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]	d, e, or f (intersection)
[a-z&&[^m-p]]	a through z, and not m through p: [a-lq-z] (subtraction)
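
The union, intersection and subtraction forms in the table above are easy to misread, so here is a minimal sketch that checks one character against each of them. The class name CharClassDemo and its helper are made up for illustration; only Pattern.matches comes from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class CharClassDemo {
    // Returns true when the whole input matches the given character class.
    public static boolean matches(String charClass, String input) {
        return Pattern.matches(charClass, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[a-d[m-p]]", "n"));    // true: n is in m-p (union)
        System.out.println(matches("[a-z&&[def]]", "e"));  // true: e is in both sets (intersection)
        System.out.println(matches("[a-z&&[^m-p]]", "n")); // false: m-p is subtracted
        System.out.println(matches("[^abc]", "a"));        // false: a is excluded (negation)
    }
}
```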

Predefined Character Classes

.	Any character (may or may not match line terminators)
\d	A digit: [0-9]
\D	A non-digit: [^0-9]
\s	A whitespace character: [ \t\n\x0B\f\r]
\S	A non-whitespace character: [^\s]
\w	A word character: [a-zA-Z_0-9]
\W	A non-word character: [^\w]

Quantifiers

Greedy	Reluctant	Possessive	Meaning
X?	X??	X?+	X, once or not at all
X*	X*?	X*+	X, zero or more times
X+	X+?	X++	X, one or more times
X{n}	X{n}?	X{n}+	X, exactly n times
X{n,}	X{n,}?	X{n,}+	X, at least n times
X{n,m}	X{n,m}?	X{n,m}+	X, at least n but not more than m times
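
The three columns differ only in how much input the quantifier takes and whether it gives any back. A small sketch, assuming a made-up helper findAll in a made-up class QuantifierDemo, runs the same input through a greedy, a reluctant and a possessive version of .*foo:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class QuantifierDemo {
    // Collects every match that find() reports for the regex in the input.
    public static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* takes everything, then backs off until "foo" fits.
        System.out.println(findAll(".*foo", input));  // [xfooxxxxxxfoo]
        // Reluctant: .*? takes as little as possible before each "foo".
        System.out.println(findAll(".*?foo", input)); // [xfoo, xxxxxxfoo]
        // Possessive: .*+ takes everything and never backs off, so "foo" cannot match.
        System.out.println(findAll(".*+foo", input)); // []
    }
}
```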

Capturing Groups

In the expression ((A)(B(C))), for example, there are four such groups, numbered by their opening parentheses from left to right:

1. ((A)(B(C)))
2. (A)
3. (B(C))
4. (C)

There is also a special group, group 0, which always represents the entire expression.
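
To see the numbering in practice, the sketch below matches the input "ABC" against ((A)(B(C))) and lists group 0 through groupCount(). The class GroupDemo and its groups helper are illustrative names only.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class GroupDemo {
    // Returns groups 0..groupCount() of a full match, or an empty array if there is no match.
    public static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Group 0 is the whole match; groups 1-4 follow the opening parentheses.
        System.out.println(Arrays.toString(groups("((A)(B(C)))", "ABC"))); // [ABC, ABC, A, BC, C]
    }
}
```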

Group methods

public int start(int group)

public int end(int group)

public String group(int group)

Backreferences

The part of the input that matches a capturing group is saved in memory for later recall via a backreference. A backreference is written in the regular expression as a backslash (\) followed by a digit giving the number of the group to recall. For example, to match any two digits followed by the exact same two digits, use (\d\d)\1 as the regular expression.
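
A short sketch of the (\d\d)\1 case, with the backslashes doubled for a Java string literal. BackrefDemo and repeatsPair are made-up names.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class BackrefDemo {
    // True when the input is two digits followed by the same two digits again.
    public static boolean repeatsPair(String input) {
        return Pattern.matches("(\\d\\d)\\1", input);
    }

    public static void main(String[] args) {
        System.out.println(repeatsPair("1212")); // true: "12" then "12" again
        System.out.println(repeatsPair("1234")); // false: "34" is not what group 1 captured
    }
}
```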

Boundary Matchers

^	The beginning of a line
$	The end of a line
\b	A word boundary
\B	A non-word boundary
\A	The beginning of the input
\G	The end of the previous match
\Z	The end of the input but for the final terminator, if any
\z	The end of the input
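
A few of these matchers are easier to tell apart with concrete inputs. The sketch below, using a made-up BoundaryDemo class with a count helper, compares \b, ^ with and without Pattern.MULTILINE, and \Z against \z.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class BoundaryDemo {
    // Counts how many times find() succeeds for the pattern in the input.
    public static int count(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // \b: only the standalone word "cat" counts, not "concat" or "cats".
        System.out.println(count(Pattern.compile("\\bcat\\b"), "cat concat cats")); // 1
        // ^ matches only at the start of the input unless MULTILINE is set.
        System.out.println(count(Pattern.compile("^\\w+"), "one\ntwo"));                    // 1
        System.out.println(count(Pattern.compile("^\\w+", Pattern.MULTILINE), "one\ntwo")); // 2
        // \Z tolerates a final line terminator, \z does not.
        System.out.println(Pattern.compile("end\\Z").matcher("end\n").find()); // true
        System.out.println(Pattern.compile("end\\z").matcher("end\n").find()); // false
    }
}
```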

Pattern class

static methods

Pattern.matches(String regex, CharSequence input);

Pattern.compile(String regex, int flags);

instance methods

Matcher matcher = pattern.matcher(CharSequence input);

pattern.split(CharSequence input);

java.lang.String equivalence

str.matches(regex);

String[] parts = str.split(regex);

str.replace(target, replacement); //target is a literal character sequence, not a regex
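
As a rough illustration of the Pattern methods and their String counterparts, the sketch below compiles a pattern with a flag, splits with a precompiled pattern, and contrasts String.replace (literal target) with String.replaceAll (regex). The class PatternDemo and its constants are invented for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class PatternDemo {
    // Comma with optional surrounding whitespace, compiled once and reused.
    public static final Pattern COMMA = Pattern.compile("\\s*,\\s*");
    // The flags argument of Pattern.compile, here case-insensitive matching.
    public static final Pattern HELLO = Pattern.compile("hello", Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        System.out.println(Arrays.toString(COMMA.split("a , b,c")));     // [a, b, c]
        System.out.println(Arrays.toString("a , b,c".split("\\s*,\\s*"))); // same result via String
        System.out.println(HELLO.matcher("HeLLo").matches());             // true
        // replace treats "." as a literal dot; replaceAll treats it as "any character".
        System.out.println("a.b".replace(".", "-"));    // a-b
        System.out.println("a.b".replaceAll(".", "-")); // ---
    }
}
```

Compiling the pattern once is mainly worthwhile when the same expression is applied many times; the String methods are shorter for one-off use.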

Matcher class

Index Methods

Index methods provide useful index values that show precisely where the match was found in the input string:

public int start() : Returns the start index of the previous match.
public int start(int group) : Returns the start index of the subsequence captured by the given group during the previous match operation.
public int end() : Returns the offset after the last character matched.
public int end(int group) : Returns the offset after the last character of the subsequence captured by the given group during the previous match operation.

Study Methods

Study methods review the input string and return a boolean indicating whether or not the pattern is found.

public boolean lookingAt() : Attempts to match the input sequence, starting at the beginning of the region, against the pattern.
public boolean find() : Attempts to find the next subsequence of the input sequence that matches the pattern.
public boolean find(int start) : Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
public boolean matches() : Attempts to match the entire region against the pattern.
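
Of the four study methods, find(int start) is the one the examples further down do not cover. A minimal sketch, assuming a made-up helper findFrom in a made-up class FindFromDemo:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
class FindFromDemo {
    // Returns the first match at or after the given index, or null if there is none.
    public static String findFrom(String regex, String input, int start) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(start) ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "ab12cd345";
        System.out.println(findFrom("\\d+", input, 0)); // 12
        System.out.println(findFrom("\\d+", input, 3)); // 2 (the search starts in the middle of "12")
        System.out.println(findFrom("\\d+", input, 4)); // 345
    }
}
```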

Replacement Methods

Replacement methods are useful methods for replacing text in an input string.

public Matcher appendReplacement(StringBuffer sb, String replacement) : Implements a non-terminal append-and-replace step.
public StringBuffer appendTail(StringBuffer sb) : Implements a terminal append-and-replace step.
public String replaceAll(String replacement) : Replaces every subsequence of the input sequence that matches the pattern with the given replacement string.
public String replaceFirst(String replacement) : Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string.
public static String quoteReplacement(String s) : Returns a literal replacement String for the specified String. The returned String works as a literal replacement s in the appendReplacement method of the Matcher class: its characters are treated as a literal sequence, and backslashes ('\') and dollar signs ('$') are given no special meaning.
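
In a replacement string, $n refers to capturing group n, which is why quoteReplacement exists. The sketch below shows both sides: group references used on purpose, and a replacement containing '$' that must be quoted. ReplacementDemo and its two helpers are illustrative names.

```java
import java.util.regex.Matcher;

// Hypothetical demo class, not part of the original post.
class ReplacementDemo {
    // Reorders yyyy-MM-dd into dd/MM/yyyy using the group references $1..$3.
    public static String reorder(String date) {
        return date.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    // Replaces every run of digits with the replacement taken literally.
    public static String literal(String input, String replacement) {
        return input.replaceAll("\\d+", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(reorder("2008-06-23"));      // 23/06/2008
        System.out.println(literal("price: 10", "$5")); // price: $5
        try {
            // Unquoted, "$5" is read as a reference to group 5, which does not exist here.
            "price: 10".replaceAll("\\d+", "$5");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("no group 5 in the pattern");
        }
    }
}
```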

java.lang.String equivalence

str.replaceFirst(regex, replacement);

str.replaceAll(regex, replacement);

Some examples:

Use start() and end()

Pattern pat = Pattern.compile(regex);
Matcher mat = pat.matcher(input);
while (mat.find()) {
    System.out.println(mat.start());
    System.out.println(mat.end());
}

Use matches() and lookingAt()

String input = "fooooooooooooo";
String regex = "foo";
Pattern pat = Pattern.compile(regex);
Matcher mat = pat.matcher(input);
System.out.println(mat.lookingAt()); //true: the input starts with "foo"
System.out.println(mat.matches()); //false: the whole input is not exactly "foo"

Use appendReplacement() and appendTail()

Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("a cat");
StringBuffer sb = new StringBuffer();
while (m.find()) {
    m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb); //prints "a dog", just like "a cat".replaceAll("cat", "dog")

2008-06-23 10:37