Regular expressions (“regexps”) match strings.

When a match is successful, the return value is the position of the first matching character.

    /abc/ =~ "abc"
    → 0

An if construct will count a successful match as true.

    puts 'match' if /abc/ =~ "abc"
    → match

The matching substring can be anywhere in the string.

    /abc/ =~ "cbaabc"
    → 3

When the string doesn’t match, the result is nil.

    /abc/ =~ "ab!c"
    → nil

There may be more than one match in the string. Matching always returns the index of the first match.

    /abc/ =~ "abc and abc"
    → 0

Case matters.

    /cow/ =~ "Cow"
    → nil

The regular expression doesn’t have to be on the left.

    "foofarah" =~ /foo/
    → 0
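
Because =~ returns either an index or nil, it can be used anywhere Ruby expects a true-or-false value. The following is a minimal sketch, assuming a made-up array of strings; select is an ordinary Array method and is not part of the text above. Note that a match at position 0 still counts as true, since only nil and false are false in Ruby.

```ruby
# Hypothetical sample data; any array of strings would do.
lines = ["cow says moo", "Cow says moo", "duck says quack"]

# =~ returns an index (truthy, even when it is 0) or nil (falsy),
# so it can serve directly as the condition for select.
matching = lines.select { |line| line =~ /cow/ }

# Record where "says" starts in each line.
positions = lines.map { |line| line =~ /says/ }

p matching   # ["cow says moo"]
p positions  # [4, 4, 5]
```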
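10.1 Special Characters

You can anchor the match to the beginning of the string with ^ (the caret character, sometimes called “hat”). Strictly speaking, Ruby’s ^ matches at the beginning of any line within the string; \A matches only at the beginning of the string itself.

    /^abc/ =~ "!abc"
    → nil

You can also anchor the match to the end of the string with a dollar sign character, often abbreviated “dollar.” (Likewise, $ matches at the end of any line; \z matches only at the end of the whole string.) Special characters like the caret and dollar are what make regular expressions more powerful than something like "string".include?("ing").

    /abc$/ =~ "abc!"
    → nil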

    \d   Any digit
    \D   Any character except a digit
    \s   “Whitespace”: space, tab, carriage return, form feed, or newline (line feed)
    \S   Anything except whitespace
    \w   A “word character”: [A-Za-z0-9_]
    \W   Any character except a word character

Figure 10.1: Character Classes
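
The caret and dollar anchors behave differently once a string contains more than one line. This sketch, which assumes a made-up two-line string, compares them with Ruby’s \A and \z anchors, which match only at the very start and very end of the string.

```ruby
text = "first line\nabc"   # hypothetical two-line string

line_start   = /^abc/  =~ text   # ^ also matches just after a newline
string_start = /\Aabc/ =~ text   # \A matches only at index 0

line_end   = /line$/  =~ text    # $ also matches just before a newline
string_end = /line\z/ =~ text    # \z matches only at the very end

p line_start    # 11
p string_start  # nil
p line_end      # 6
p string_end    # nil
```

When a pattern is meant to check a whole string, for example when validating input, \A and \z are usually the safer choice.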

A period (“dot”) matches any character (except, by default, a newline; see Section 10.5).

    /a.c/ =~ "does abc match?"
    → 5

The asterisk character (“star”) matches any number of occurrences of the character preceding it.

    /ab*c/ =~ "does abbbbc match?"
    → 5

“Any number” includes zero.

    /ab*c/ =~ "does ac match?"
    → 5

Frequently, you’ll want to match one or more occurrences but not zero. That’s done with the plus character.

    /ab+c/ =~ "does ac match?"
    → nil

The question mark character matches zero or one occurrence but not more than one.

    /ab?c/ =~ "does ac match?"
    → 5

Special characters can be combined. The combination of a dot and star is used to match any number of any kind of character.

    /a.*b/ =~ "a ! b ! i j k b"
    → 0

To match any one of a set of characters (a “character class”), enclose them within square brackets.

    /[0123456789]+/ =~ "number 55"
    → 7

Character classes containing alphabetically ordered runs of characters can be abbreviated with the dash.

    /[0-9][a-f]/ =~ "5f"
    → 0

Within brackets, characters like the dot, plus, and star are not special.

    /[.]/ =~ "b"
    → nil

Outside of brackets, special characters can be stripped of their powers by “escaping” them with a backslash.

    /\[a\]\+/ =~ "[a]+"
    → 0

To include open and close brackets inside of brackets, escape them with a backslash. This expression matches any sequence of one or more characters, all of which must be either [, ], or =. (The two anchors ensure that there are no characters before or after the matching characters.)

    /^[\[=\]]+$/ =~ '=]=[='
    → 0

Putting a caret at the beginning of a character class causes the set to contain all characters except the ones listed.

    /[^ab]/ =~ "z"
    → 0

Some character classes are so common they’re given abbreviations. \d is the same character class as [0-9]. Other characters can be added to the abbreviation, in which case brackets are needed. See Figure 10.1 for a complete list of abbreviations.

    /=\d=[x\d]=/ =~ "=5=x="
    → 0
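
Anchors, ranges, and abbreviations are often combined into a small validity check. The following is only a sketch: the TIME pattern and the time_like? helper are invented for illustration, and the pattern is deliberately loose (it accepts "29:00", for instance).

```ruby
# Hypothetical pattern for strings shaped like "09:30".
# [0-2] and [0-5] restrict the leading digits; \d accepts any digit.
TIME = /^[0-2]\d:[0-5]\d$/

# Hypothetical helper: true if the string looks like a time, false otherwise.
def time_like?(string)
  !!(TIME =~ string)
end

p time_like?("09:30")   # true
p time_like?("9:30")    # false (the hour needs two digits)
p time_like?("09:75")   # false (minutes must start with 0-5)
p time_like?("09:30!")  # false (the $ anchor rejects trailing characters)
p time_like?("29:00")   # true  (the sketch does not check the hour range)
```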
10.2 Grouping and Alternatives

Parentheses can group sequences of characters so that special characters apply to the whole sequence.

    /(ab)+/ =~ "ababab"
    → 0

Special characters can appear within groups. Here, the group containing one a and any number of b’s is repeated one or more times.

    /(ab*)+/ =~ "aababbabbb"
    → 0

The vertical bar character is used to allow alternatives. Here, either a or b matches.

    /a|b/ =~ "a"
    → 0

A vertical bar divides the regular expression into two smaller regular expressions. A match means that either the entire left regexp matches or the entire right one does.

    /^Fine birds|cows ate\.$/ =~ "Fine birds ate seeds."
    → 0

This regular expression does not mean “Match either 'Fine birds ate.' or 'Fine cows ate.'” It actually matches either a string beginning with "Fine birds" or one ending in "cows ate."

This regular expression matches only the two alternate sentences, not the infinite number of possibilities the previous example’s regexp does.

    /^Fine (birds|cows) ate\.$/ =~ "Fine birds ate seeds."
    → nil
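
The difference between the two alternation patterns is easiest to see by running both against the same inputs. This sketch uses the two regular expressions from the text; the sample sentences and the variable names loose and strict are made up for illustration.

```ruby
loose  = /^Fine birds|cows ate\.$/         # alternation splits the whole regexp
strict = /^Fine (birds|cows) ate\.$/       # alternation confined to the group

# Hypothetical sample sentences.
samples = [
  "Fine birds ate.",
  "Fine cows ate.",
  "Fine birds ate seeds.",
  "All the cows ate.",
]

loose_hits  = samples.select { |s| loose =~ s }
strict_hits = samples.select { |s| strict =~ s }

p loose_hits.size   # 4
p strict_hits       # ["Fine birds ate.", "Fine cows ate."]
```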
10.3 Taking Strings Apart

Like the =~ operator, match returns nil if there’s no match. If there is, it returns a MatchData object. You can pull information out of that object.

    re = /(\w+), (\w+), or (\w+)/
    s = 'Without a Bob, ox, or bin!'
    match = re.match(s)
    → #<MatchData:0x323c44>

A MatchData is indexable. Its zeroth element is the entire match.

    match[0]
    → "Bob, ox, or bin"

Each following element stores the result of what a group matched, counting from left to right.

    match[1]
    → "Bob"

Groups are often used to pull apart strings and construct new ones.

    "#{match[3]} and #{match[1]}"
    → "bin and Bob"

pre_match returns any portion of the string before the part that matched.

    match.pre_match
    → "Without a "

post_match returns any portion of the string after the part that matched. match.pre_match, match[0], and match.post_match can be added together to reconstruct the original string.

    match.post_match
    → "!"
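
The reconstruction claim can be checked directly. The sketch below reuses re and s from the text and also guards against the nil that match returns on failure; the captures method and the failing input string are additions not covered above.

```ruby
re = /(\w+), (\w+), or (\w+)/
s  = 'Without a Bob, ox, or bin!'
m  = re.match(s)

# The three pieces add back up to the original string.
rebuilt = m.pre_match + m[0] + m.post_match
p rebuilt == s   # true

# captures returns just the groups, without the whole match at index 0.
groups = m.captures
p groups         # ["Bob", "ox", "bin"]

# match returns nil on failure, so check before indexing.
miss = re.match("no list here")   # hypothetical non-matching input
p miss.nil?      # true
```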

The plus and star special characters are greedy: they match as many characters as they can. Expect that to catch you by surprise sometimes.

    str = "a bee in my bonnet"
    /a.*b/.match(str)[0]
    → "a bee in my b"

You can make plus and star match as few characters as they can by suffixing them with a question mark.

    /a.*?b/.match(str)[0]
    → "a b"

You can use a regular expression to slice a string. The result is the first substring that matches the regular expression.

    "has 5 and 3"[/\d+/]
    → "5"
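
Greediness matters most when a delimiter appears more than once. This sketch, assuming a made-up string with two quoted words, slices it with a greedy and a non-greedy pattern. String#scan, which returns every match rather than only the first, is not covered in the text.

```ruby
text = 'say "hi" and "bye"'   # hypothetical input

greedy = text[/".+"/]    # runs from the first quote to the last one
lazy   = text[/".+?"/]   # stops at the nearest closing quote

p greedy   # "\"hi\" and \"bye\""
p lazy     # "\"hi\""

# scan collects every non-overlapping match.
all = text.scan(/".+?"/)
p all      # ["\"hi\"", "\"bye\""]
```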
10.4 Variables Behind the Scenes

Both =~ and match set some variables. All begin with $. Each parenthesized group gets its own number, from $1 up through $9 (higher numbers work too, though they are rarely needed). You might expect $0 to name the entire string that matched, but it’s already used for something else: the name of the program being executed.

    re = /(\w+), (\w+), or (\w+)/
    s = 'Without a Bob, ox, or bin!'
    re =~ s
    [$1, $2, $3]
    → ["Bob", "ox", "bin"]

$& is the equivalent of match[0].

    $&
    → "Bob, ox, or bin"

These two variables are used to store the string before the match and the string after the match. (The first is a backward quote / backtick; the second a normal quote.)

    $` + $'
    → "Without a !"

These variables are probably most often used to immediately do something with a string that’s “equal enough” to some pattern. Like this:

    if name =~ /(.+), (.+)/
      name = "#{$2} #{$1}"
    end
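
The name-swapping idiom can be wrapped in a method so that strings that do not fit the pattern pass through unchanged. This is a sketch only; the first_last helper and the sample names are invented for illustration.

```ruby
# Hypothetical helper: turn "Last, First" into "First Last".
# Strings without a comma-and-space separator are returned as they are.
def first_last(name)
  if name =~ /(.+), (.+)/
    "#{$2} #{$1}"   # $1 and $2 were set by the successful match above
  else
    name
  end
end

p first_last("Lovelace, Ada")   # "Ada Lovelace"
p first_last("Cher")            # "Cher"
```

Because the match variables are overwritten by every subsequent match, it is safest to use them immediately, as the method does.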
10.5 Regular Expression Options

Normally, the period in a regular expression does not match the end-of-line character. Therefore, .* or .+ matches won’t span lines.

    /a.*b/ =~ "az\nzb"
    → nil

Adding the m (multiline) option makes a period match end-of-line characters, so the regular expression match can span lines.

    /a.*b/m =~ "az\nzb"
    → 0

This is a far too annoying way to do a case-insensitive match.

    /[cC][aA][tT]/ =~ "Cat"
    → 0

The i (insensitive) option is a better way.

    /cat/i =~ "Cat"
    → 0
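
Options can be combined by writing both letters after the closing slash. The sketch below assumes a made-up two-line string and shows that each option fixes only its own problem.

```ruby
text = "The Cat\nsat on the MAT"   # hypothetical two-line string

plain  = /cat.*mat/   =~ text   # case differs, and . stops at "\n"
only_i = /cat.*mat/i  =~ text   # case is ignored, but . still stops at "\n"
only_m = /Cat.*MAT/m  =~ text   # . spans lines; the case matches literally
both   = /cat.*mat/mi =~ text   # both options together

p plain    # nil
p only_i   # nil
p only_m   # 4
p both     # 4
```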