`
ytfei
  • 浏览: 90564 次
社区版块
存档分类
最新评论

正则表达式笔记(二)

阅读更多

转:http://www.regular-expressions.info/refadv.html

Regular Expression Advanced Syntax Reference

Grouping and Backreferences Syntax Description Example Modifiers Syntax Description Example Atomic Grouping and Possessive Quantifiers Syntax Description Example Lookaround Syntax Description Example Continuing from The Previous Match Syntax Description Example Conditionals Syntax Description Example Comments Syntax Description Example
(regex) Round brackets group the regex between them. They capture the text matched by the regex inside them that can be reused in a backreference, and they allow you to apply regex operators to the entire grouped regex. (abc){3} matches abcabcabc . First group matches abc .
(?:regex) Non-capturing parentheses group the regex so you can apply regex operators, but do not capture anything and do not create backreferences. (?:abc){3} matches abcabcabc . No groups.
\1 through \9 Substituted with the text matched between the 1st through 9th pair of capturing parentheses. Some regex flavors allow more than 9 backreferences. (abc|def)=\1 matches abc=abc or def=def , but not abc=def or def=abc .
(?i) Turn on case insensitivity for the remainder of the regular expression. (Older regex flavors may turn it on for the entire regex.) te(?i)st matches teST but not TEST .
(?-i) Turn off case insensitivity for the remainder of the regular expression. (?i)te(?-i)st matches TEst but not TEST .
(?s) Turn on "dot matches newline" for the remainder of the regular expression. (Older regex flavors may turn it on for the entire regex.)  
(?-s) Turn off "dot matches newline" for the remainder of the regular expression.  
(?m) Caret and dollar match after and before newlines for the remainder of the regular expression. (Older regex flavors may apply this to the entire regex.)  
(?-m) Caret and dollar only match at the start and end of the string for the remainder of the regular expression.  
(?x) Turn on free-spacing mode to ignore whitespace between regex tokens, and allow # comments.  
(?-x) Turn off free-spacing mode.  
(?i-sm) Turns on the options "i" and "m", and turns off "s" for the remainder of the regular expression. (Older regex flavors may apply this to the entire regex.)  
(?i-sm:regex) Matches the regex inside the span with the options "i" and "m" turned on, and "s" turned off. (?i:te)st matches TEst but not TEST .
(?>regex) Atomic groups prevent the regex engine from backtracking back into the group (forcing the group to discard part of its match) after a match has been found for the group. Backtracking can occur inside the group before it has matched completely, and the engine can backtrack past the entire group, discarding its match entirely. Eliminating needless backtracking provides a speed increase. Atomic grouping is often indispensable when nesting quantifiers to prevent a catastrophic amount of backtracking as the engine needlessly tries pointless permutations of the nested quantifiers. x(?>\w+)x is more efficient than x\w+x if the second x cannot be matched.
?+ , *+ , ++ and {m,n}+ Possessive quantifiers are a limited yet syntactically cleaner alternative to atomic grouping. Only available in a few regex flavors. They behave as normal greedy quantifiers, except that they will not give up part of their match for backtracking. x++ is identical to (?>x+)
(?=regex) Zero-width positive lookahead. Matches at a position where the pattern inside the lookahead can be matched. Matches only the position. It does not consume any characters or expand the match. In a pattern like one(?=two)three , both two and three have to match at the position where the match of one ends. t(?=s) matches the second t in streets .
(?!regex) Zero-width negative lookahead. Identical to positive lookahead, except that the overall match will only succeed if the regex inside the lookahead fails to match. t(?!s) matches the first t in streets .
(?<=regex) Zero-width positive lookbehind. Matches at a position if the pattern inside the lookahead can be matched ending at that position (i.e. to the left of that position). Depending on the regex flavor you're using, you may not be able to use quantifiers and/or alternation inside lookbehind. (?<=s)t matches the first t in streets .
(?<!regex) Zero-width negative lookbehind. Matches at a position if the pattern inside the lookahead cannot be matched ending at that position. (?<!s)t matches the second t in streets .
\G Matches at the position where the previous match ended, or the position where the current match attempt started (depending on the tool or regex flavor). Matches at the start of the string during the first match attempt. \G[a-z] first matches a , then matches b and then fails to match in ab_cd .
(?(?=regex)then|else) If the lookahead succeeds, the "then" part must match for the overall regex to match. If the lookahead fails, the "else" part must match for the overall regex to match. Not just positive lookahead, but all four lookarounds can be used. Note that the lookahead is zero-width, so the "then" and "else" parts need to match and consume the part of the text matched by the lookahead as well. (?(?<=a)b|c) matches the second b and the first c in babxcac
(?(1)then|else) If the first capturing group took part in the match attempt thus far, the "then" part must match for the overall regex to match. If the first capturing group did not take part in the match, the "else" part must match for the overall regex to match. (a)?(?(1)b|c) matches ab , the first c and the second c in babxcac
(?#comment) Everything between (?# and ) is ignored by the regex engine. a(?#foobar)b matches ab
分享到:
评论

相关推荐

    正则表达式笔记归纳

    #### 二、正则表达式的语法元素 正则表达式由一系列字符和特殊符号组成,用以匹配字符串中的特定模式。下面是一些基本的语法元素: 1. **字面量**:直接匹配一个或多个字符。 - 示例:`/xx/` 表示匹配字符串“xx...

    6正则表达式笔记[借鉴].pdf

    正则表达式是一种强大的文本处理工具,用于在字符串中进行模式匹配、搜索、替换和提取信息。它在软件开发中广泛应用于数据验证、文本处理、输入校验等场景。正则表达式通过特殊的语法和运算符,允许程序员构建灵活且...

    [小小明]Python正则表达式全套笔记v0.3(1.8万字干货)

    Python正则表达式全套笔记v0.3 本文档是小小明个人笔记,涵盖了正则表达式的各个方面,包括各种模式、分组、断言、匹配、查找、替换和切割等。文档中提供了详细的正则匹配规则表,涵盖了基本字符规则、预定义字符集...

    正则表达式笔记

    正则表达式是一种强大的文本处理工具,用于在字符串中进行模式匹配、查找和替换操作。在Java中,正则表达式主要涉及到`java.lang.String`、`java.util.regex.Pattern`和`java.util.regex.Matcher`这三个类。下面我们...

    java正则表达式学习笔记

    #### 二、Java正则表达式基本语法 在Java中使用正则表达式前,需要了解一些基本的语法符号: - **特殊字符**:`^` 表示字符串的开始;`$` 表示字符串的结束;`.` 表示任意单个字符。 - **量词**:`*` 表示前面的...

    Python正则表达式笔记

    Python正则表达式笔记 正则表达式是 Python 中的一种强大工具,用于匹配和处理字符串。下面是 Python 正则表达式笔记中的一些重要知识点: 1. 导入模块:在使用正则表达式之前,需要导入 re 模块。import re 2. ...

    正则表达式学习笔记

    ### 正则表达式学习笔记 #### 一、正则表达式概述 正则表达式是一种强有力的模式匹配工具,广泛应用于各种编程语言中,用于文本处理。正则表达式允许用户定义复杂的查找模式,这对于数据验证、搜索和替换操作特别...

    javascript正则表达式学习笔记

    这篇学习笔记将深入探讨JavaScript正则表达式的概念、语法和实际应用。 一、正则表达式基础 1. 创建正则表达式: - 字面量表示法:`/pattern/flags` - 构造函数:`new RegExp('pattern', 'flags')` 2. 常见的...

    正则表达式笔记.docx

    python正则表达式笔记

    js正则表达式笔记,直接运行

    标题与描述中的“js正则表达式笔记,直接运行”明确指出这是一份关于JavaScript正则表达式的笔记,其中包含了可以直接执行的代码示例。正则表达式在编程中是一种非常强大的工具,用于处理字符串模式匹配、搜索和替换...

    超经典正则表达式测试工具

    4. **学习正则的学习笔记**:可能包含一份详细的正则表达式学习资料,涵盖了基础概念如元字符、量词、字符类等,以及高级特性如分组、后向引用、预查等,方便用户系统学习。 5. **正则表达式参考手册**:可能提供了...

    基于java的开发源码-java多线程反射泛型及正则表达式学习笔记和源码.zip

    基于java的开发源码-java多线程反射泛型及正则表达式学习笔记和源码.zip 基于java的开发源码-java多线程反射泛型及正则表达式学习笔记和源码.zip 基于java的开发源码-java多线程反射泛型及正则表达式学习笔记和源码....

    正则表达式大全笔记总结

    ### 正则表达式大全笔记总结 #### 一、引言 正则表达式是一种用于匹配字符串的强大工具,在数据处理、文本分析等场景下极为常见。本文将对几个常见的正则表达式应用场景进行总结,包括中国电话号码验证、邮政编码...

    正则表达式系统教程.RAR

    正则表达式是一种强大的文本处理工具,用于在字符串中进行模式匹配和搜索替换。它在编程、数据分析、文本挖掘等领域有着广泛的应用。本教程旨在帮助你深入理解和熟练掌握正则表达式,通过学习,你可以有效地查找、...

    python正则表达式详解笔记,python正则表达式教学.doc

    #### 二、Python中的正则表达式基础 在Python中使用正则表达式时,需要导入`re`模块。这个模块提供了所有必要的正则表达式功能。 - **转义字符问题**:在正则表达式中,`\`常被用作转义字符。这意味着,如果你想要...

    正则表达式笔记+源码!!!!!!

    正则表达式(Regular Expression,简称regex)是用于匹配字符串的一种模式,广泛应用于文本处理、数据验证、搜索替换等场景。在IT行业中,熟练掌握正则表达式是提高工作效率的关键技能之一。 首先,我们来看看...

    JS正则表达式入门笔记实例

    这篇入门笔记实例将带你深入了解正则表达式的使用。 1. **正则表达式基础** - **模式定义**:正则表达式由特殊字符(元字符)和普通字符组成,用于描述文本模式。 - **创建方式**:可以使用`/pattern/flags`或`...

Global site tag (gtag.js) - Google Analytics