leonzhx

1.  Regular expressions are used to specify string patterns. You can use regular expressions whenever you need to locate strings that match a particular pattern.

 

2.  The following are a few straightforward constructs of RE syntax:

    1)  A character class is a set of character alternatives, enclosed in brackets, such as [Jj], [0-9], [A-Za-z], or [^0-9]. Here the - denotes a range (all characters whose Unicode values fall between the two bounds), and ^ denotes the complement (all characters except those specified).

    2)  To include a - inside a character class, make it the first or last item. To include a ], make it the first item. To include a ^, put it anywhere but the beginning. You only need to escape [ and \.

    3)  There are many predefined character classes such as \d (digits) or \p{Sc} (Unicode currency symbol).

    4)  Most characters match themselves.

    5)  The . symbol matches any character (except possibly line terminators, depending on flag settings).

    6)  Use \ as an escape character, for example, \. matches a period and \\ matches a backslash.

    7)  ^ and $ match the beginning and end of the input, respectively (or of each line when the MULTILINE flag is set).

    8)  If X and Y are regular expressions, then XY means “any match for X followed by a match for Y”. X | Y means “any match for X or Y”.

    9)  You can apply quantifiers X+ (1 or more), X* (0 or more), and X? (0 or 1) to an expression X.

    10)  By default, a quantifier matches the largest possible repetition that makes the overall match succeed (a greedy match). You can modify that behavior with the suffixes ? (reluctant, or stingy match: match the smallest repetition count) and + (possessive match: match the largest count even if that makes the overall match fail). For example, the string cab matches [a-z]*ab but not [a-z]*+ab. In the first case, the expression [a-z]* only matches the character c, so that the characters ab match the remainder of the pattern. But the possessive version [a-z]*+ matches the characters cab, leaving the remainder of the pattern unmatched.

    11)  You can use groups to define subexpressions. Enclose the groups in ( ), for example, ([+-]?)([0-9]+). You can then ask the pattern matcher to return the match of each group or to refer back to a group with \n where n is the group number, starting with \1.
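The quantifier modes and back references described above can be tried out with a small sketch like the following. The class name RegexSyntaxDemo, the helper firstGroup, and the sample inputs are made up for illustration; only the java.util.regex calls come from the standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSyntaxDemo {
    // Hypothetical helper: text captured by group 1 of the first match, or null if none.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Greedy (default): [a-z]* gives characters back so that "ab" can still match.
        System.out.println("cab".matches("[a-z]*ab"));        // true
        // Possessive: [a-z]*+ consumes "cab" and never gives characters back.
        System.out.println("cab".matches("[a-z]*+ab"));       // false

        // Greedy versus reluctant on the same input.
        System.out.println(firstGroup("<(.+)>", "<a><b>"));   // a><b
        System.out.println(firstGroup("<(.+?)>", "<a><b>"));  // a

        // Back reference: \1 must repeat whatever group 1 matched.
        System.out.println("abab".matches("([a-z]+)\\1"));    // true
        System.out.println("abac".matches("([a-z]+)\\1"));    // false
    }
}
```

Note that the backslash of \1 has to be doubled inside a Java string literal.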



 


 
  

3.  For more information on the regular expression syntax, consult the API documentation for the Pattern class or the book Mastering Regular Expressions by Jeffrey E. F. Friedl (O’Reilly and Associates, 2006).

 

4.  The simplest use for a regular expression is to test whether a particular string matches it, by calling the String.matches method. That method invokes Pattern.matches(String regex, CharSequence input), which is a shortcut for the following code:

 

Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
return matcher.matches();

 

The input of the matcher is an object of any class that implements the CharSequence interface, such as a String, StringBuilder, or CharBuffer.
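As a minimal sketch of the three equivalent ways to test a whole input, assuming a made-up pattern for signed integers and a hypothetical helper isInteger:

```java
import java.util.regex.Pattern;

class MatchesDemo {
    // Compiling once and reusing the Pattern avoids recompiling on every call.
    static final Pattern INTEGER = Pattern.compile("[+-]?[0-9]+");

    // Hypothetical helper: accepts any CharSequence, not just String.
    static boolean isInteger(CharSequence input) {
        return INTEGER.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println("-42".matches("[+-]?[0-9]+"));           // true
        System.out.println(Pattern.matches("[+-]?[0-9]+", "4x2"));  // false
        System.out.println(isInteger(new StringBuilder("+7")));     // true
    }
}
```

Both String.matches and Pattern.matches compile the pattern on each call, so a precompiled Pattern is usually preferable when the same expression is used repeatedly.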

 

5.  When compiling the pattern, you can set one or more flags:

 

Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

 The following six flags are among those supported:

    1)  CASE_INSENSITIVE: Match characters independently of the letter case. By default, this flag takes only US ASCII characters into account.

    2)  UNICODE_CASE: When used in combination with CASE_INSENSITIVE, use Unicode letter case for matching.

    3)  MULTILINE: ^ and $ match the beginning and end of a line, not the entire input.

    4)  UNIX_LINES: Only recognize '\n' as a line terminator when matching ^ and $ in multiline mode.

    5)  DOTALL: Make the . symbol match all characters, including line terminators.

    6)  CANON_EQ: Take canonical equivalence of Unicode characters into account. For example, u followed by ¨ (diaeresis) matches ü.
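The effect of some of these flags can be observed with a sketch like the one below. The class FlagsDemo, the helper found, and the sample text are illustrative assumptions; the flag constants are the ones listed above.

```java
import java.util.regex.Pattern;

class FlagsDemo {
    // Hypothetical helper: true if the pattern, compiled with the given flags, occurs in the input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "first\nSecond";

        // Without flags, ^ and $ anchor the whole input and case matters.
        System.out.println(found("^second$", 0, text));                       // false
        System.out.println(found("^second$",
                Pattern.MULTILINE | Pattern.CASE_INSENSITIVE, text));         // true

        // By default . does not match the line terminator between the two words.
        System.out.println(found("first.Second", 0, text));                   // false
        System.out.println(found("first.Second", Pattern.DOTALL, text));      // true

        // CASE_INSENSITIVE alone only folds US ASCII letters (\u00e9 is é, \u00c9 is É).
        System.out.println(found("\u00e9", Pattern.CASE_INSENSITIVE, "\u00c9"));  // false
        System.out.println(found("\u00e9",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE, "\u00c9"));  // true
    }
}
```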

 

6.  If the regular expression contains groups, the Matcher object can reveal the group boundaries. The methods

 

int start(int groupIndex)
int end(int groupIndex)

yield the starting index and the past-the-end index of a particular group. You can simply extract the matched string by calling

 

String group(int groupIndex)

Group 0 is the entire match (which is the entire input when you call matches); the group index for the first actual group is 1. Call the groupCount method to get the total group count. Nested groups are ordered by the opening parentheses.
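To make the group numbering concrete, here is a small sketch using a time-of-day pattern with nested groups. The pattern, the class GroupsDemo, and the helper groups are illustrative choices, not part of the notes above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupsDemo {
    // Group 1 is the whole hh:mm part, group 2 the hours, group 3 the minutes.
    static final Pattern TIME = Pattern.compile("((1?[0-9]):([0-5][0-9]))[ap]m");

    // Hypothetical helper: all groups (index 0 included), or an empty array if there is no match.
    static String[] groups(String input) {
        Matcher m = TIME.matcher(input);
        if (!m.matches()) return new String[0];
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        Matcher m = TIME.matcher("11:59am");
        if (m.matches()) {
            System.out.println(m.groupCount());                // 3
            for (int i = 0; i <= m.groupCount(); i++) {
                // Prints each group with its start index and past-the-end index.
                System.out.println(i + ": " + m.group(i)
                        + " [" + m.start(i) + ", " + m.end(i) + ")");
            }
        }
    }
}
```

For the input 11:59am, groups 0 to 3 are 11:59am, 11:59, 11, and 59, following the order of the opening parentheses.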

 

7.  Usually, you don’t want to match the entire input against a regular expression, but to find one or more matching substrings in the input. Use the find method of the Matcher class to find the next match. If it returns true, use the start and end methods to find the extent of the match.

while (matcher.find())
{
   int start = matcher.start();
   int end = matcher.end();
   String match = input.substring(start, end);
   . . .
}
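A self-contained version of this loop might look as follows. The class FindDemo and the helper findAll are hypothetical names, and the sample input is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Hypothetical helper: collects every non-overlapping match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            // start() and end() delimit the current match within the input.
            matches.add(input.substring(matcher.start(), matcher.end()));
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("[0-9]+", "a1b22c333"));  // [1, 22, 333]
    }
}
```

Calling matcher.group() inside the loop yields the same substring as the explicit substring call.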

 

8.  The replaceAll method of the Matcher class replaces all occurrences of a regular expression with a replacement string:

Pattern pattern = Pattern.compile("[0-9]+");
Matcher matcher = pattern.matcher(input);
String output = matcher.replaceAll("#");

 

The replacement string can contain references to the groups in the pattern: $n is replaced with the nth group. Use \$ to include a $ character in the replacement text. If you have a string that may contain $ and \, and you don’t want them to be interpreted as group replacements, call matcher.replaceAll(Matcher.quoteReplacement(str)). The replaceFirst method replaces only the first occurrence of the pattern.
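The replacement rules can be sketched like this. The class ReplaceDemo, its helper methods, and the sample strings are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceDemo {
    // Hypothetical helper: swaps the two sides of every name=number pair using $1 and $2.
    static String swapPairs(String input) {
        return Pattern.compile("([a-z]+)=([0-9]+)").matcher(input).replaceAll("$2=$1");
    }

    // Hypothetical helper: replaces digit runs with the replacement taken literally.
    static String replaceLiterally(String input, String replacement) {
        return Pattern.compile("[0-9]+").matcher(input)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(swapPairs("a=1, b=22"));                 // 1=a, 22=b
        // \$ in the replacement text produces a literal dollar sign.
        System.out.println(Pattern.compile("[0-9]+")
                .matcher("cost 15").replaceAll("\\$"));             // cost $
        // quoteReplacement keeps $ and \ from being interpreted.
        System.out.println(replaceLiterally("cost 15", "$1\\x"));   // cost $1\x
        // replaceFirst touches only the first occurrence.
        System.out.println(Pattern.compile("[0-9]+")
                .matcher("1 2 3").replaceFirst("#"));               // # 2 3
    }
}
```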

 

9.  The Pattern class has a split method that splits an input into an array of strings, using the regular expression matches as boundaries.
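As a brief sketch, assuming a comma-separated input with optional surrounding whitespace (the class SplitDemo and the helper tokens are made-up names):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Commas, together with any whitespace around them, act as the boundaries.
    static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Hypothetical helper: splits the input at every match of COMMA.
    static String[] tokens(CharSequence input) {
        return COMMA.split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("a , b,c")));  // [a, b, c]
    }
}
```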
