Ruby Regular Expressions

notreally

Reposted

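A Ruby regular expression literal is delimited by slashes: the pattern goes between the two "/" characters. The literal evaluates to a Regexp object.
General rules:
/a/ matches the character a.
/\?/ matches a literal ?. Special characters that must be escaped with a backslash include ^, $, ?, ., /, \, [, ], {, }, (, ), +, * and |.
. matches any single character except a newline; for example, /a./ matches ab and ac.
/[ab]c/ matches ac and bc. Square brackets define a character class, which may contain ranges, for example /[a-z]/ or /[a-zA-Z0-9]/.
/[^a-zA-Z0-9]/ matches any single character that is not in those ranges.
/[\d]/ matches any digit, /[\w]/ matches any letter, digit, or underscore, and /[\s]/ matches a whitespace character, including space, tab, and newline. The brackets are optional here: /\d/ is equivalent to /[\d]/.
/[\D]/, /[\W]/, and /[\S]/ are the negations of the classes above.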
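The basic rules above can be checked quickly in a script. The following is a minimal sketch; the sample strings and the `checks` table are made up for illustration, and it assumes Ruby 2.4 or later for `Regexp#match?`.

```ruby
# Each entry pairs a pattern with a sample string and whether it should match.
# The sample strings are illustrative; they do not come from the article.
checks = [
  [/a/,            "cat",        true],   # literal character
  [/\?/,           "why?",       true],   # escaped special character
  [/a./,           "ab",         true],   # dot matches any one character
  [/a./,           "a",          false],  # dot needs a character to consume
  [/[ab]c/,        "bc",         true],   # character class
  [/[^a-zA-Z0-9]/, "abc123",     false],  # nothing outside the ranges
  [/[^a-zA-Z0-9]/, "abc 123",    true],   # the space is outside the ranges
  [/\d/,           "room 7",     true],   # digit
  [/\W/,           "snake_case", false],  # underscore is a word character
  [/\S/,           " \t\n",      false]   # only whitespace present
]

# match? returns true or false without creating a MatchData object.
results = checks.map { |re, str, _| re.match?(str) }

checks.each_with_index do |(re, str, expected), i|
  puts "#{re.inspect} vs #{str.inspect}: #{results[i]} (expected #{expected})"
end
```

Note that none of these patterns is anchored, so each one succeeds as soon as it matches anywhere in the string.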

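Advanced rules:
? means zero or one of the preceding element. /Mrs?\.?/ matches "Mr", "Mrs", "Mr.", and "Mrs.".
* means zero or more of the preceding element. /Hello*/ matches "Hell", "Hello", and "Hellooo"; because the pattern is unanchored, it also finds a match inside "HelloJavaeye".
+ means one or more of the preceding element. /a+c/ matches "ac" and "aaac" but not "abc"; to match strings such as "abc" or "abbdrec", use /a.+c/.
/\d{3}/ matches exactly 3 digits.
/\d{1,10}/ matches 1 to 10 digits.
/\d{3,}/ matches 3 or more digits.
/([A-Z]\d){5}/ matches five consecutive pairs of an uppercase letter followed by a digit, such as "A1B2C3D4E5". To match one uppercase letter followed by 4 digits, use /[A-Z]\d{4}/.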
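To see exactly what each quantifier accepts, it helps to anchor the pattern to the whole string. The sketch below does that with `\A` and `\z`, which are not covered in the text above and are added here only so that a partial match cannot hide a mistake. The variable names and sample strings are illustrative.

```ruby
# \A and \z anchor each pattern to the start and end of the string
# (an addition for this sketch; the article's patterns are unanchored).
title    = /\AMrs?\.?\z/
hello    = /\AHello*\z/
a_plus   = /\Aa+c\z/
three    = /\A\d{3}\z/
one_ten  = /\A\d{1,10}\z/
at_least = /\A\d{3,}\z/
pairs    = /\A([A-Z]\d){5}\z/

quantifier_results = {
  titles:        %w[Mr Mrs Mr. Mrs.].all? { |s| title.match?(s) },
  hell:          hello.match?("Hell"),        # * allows zero "o"
  hellooo:       hello.match?("Hellooo"),     # ...or many
  aaac:          a_plus.match?("aaac"),       # + repeats only the "a"
  abc:           a_plus.match?("abc"),        # "b" is not allowed by a+c
  three_digits:  three.match?("123"),
  eleven_digits: one_ten.match?("1" * 11),    # above the upper bound
  two_digits:    at_least.match?("12"),       # below the lower bound
  letter_digit:  pairs.match?("A1B2C3D4E5"),  # five letter-digit pairs
  letter_4_dig:  pairs.match?("A1234")        # not five pairs
}

quantifier_results.each { |name, ok| puts "#{name}: #{ok}" }
```

A quantifier always applies to the single element right before it, which is why `a+c` rejects "abc" and why the group in `([A-Z]\d){5}` repeats as a unit.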

Both String and Regexp support two matching methods, =~ and match. In irb:
>> "The alphabet starts with abc" =~ /abc/
=> 25
>> /abc/.match("The alphabet starts with abc.")
=> #<MatchData:0x1b0d88>

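As shown, when there is a match, =~ returns the index at which the match starts, while match returns a MatchData object. Both return nil when there is no match.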
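Since both methods return nil on failure, their results can be used directly as conditions. A minimal sketch, reusing the sentence from the irb session; the pattern /xyz/ and the variable names are illustrative:

```ruby
text = "The alphabet starts with abc"

# =~ returns the Integer offset of the first match, or nil.
hit_index  = text =~ /abc/
miss_index = text =~ /xyz/

# match returns a MatchData object, or nil. It works from either side.
hit  = /abc/.match(text)
same = text.match(/abc/)
miss = /xyz/.match(text)

# nil is falsy, so the result can guard any code that needs the match.
if hit
  puts "matched #{hit[0].inspect} at offset #{hit.begin(0)}"
end
puts "no match for /xyz/" if miss.nil?
```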

A MatchData object gives access to the text matched by each subexpression. Consider the following example:

Suppose we have the string: Peel,Emma,Mrs.,talented amateur
The fields are, in order: last name, first name, title, occupation
The regular expression is: /[A-Za-z]+,[A-Za-z]+,Mrs?\./

In irb:
>> /[A-Za-z]+,[A-Za-z]+,Mrs?\./.match("Peel,Emma,Mrs.,talented amateur")
=> #<MatchData:0x401f0a6c>
Suppose, however, that we only want to extract the last name and the title from the matched string. The regular expression can then be written as:

/([A-Za-z]+),[A-Za-z]+,(Mrs?\.)/

Note the parenthesized parts ([A-Za-z]+) and (Mrs?\.). Run the following code:
>> /([A-Za-z]+),[A-Za-z]+,(Mrs?\.)/.match("Peel,Emma,Mrs.,talented amateur")
=> #<MatchData:0x401e0a7c>
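>> puts "Dear #{$2} #{$1},"
Dear Mrs. Peel,
=> nil
Each expression inside () is a subexpression (a capture group). After a successful match, $1 and $2 hold the text captured by the first and second groups.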

The following code has the same effect as the above:
m = /([A-Za-z]+),[A-Za-z]+,(Mrs?\.)/.match("Peel,Emma,Mrs.,talented amateur")
puts "Dear #{m[2]} #{m[1]},"

Here m[0] returns the string matched by the entire expression.

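The captures array does not include m[0], so its indices are shifted by one. For n >= 1, the following are equivalent:
m[n] == m.captures[n - 1]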
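The relationship between m[n] and captures can be confirmed on the name example. A small sketch; the variable names are illustrative:

```ruby
m = /([A-Za-z]+),[A-Za-z]+,(Mrs?\.)/.match("Peel,Emma,Mrs.,talented amateur")

whole = m[0]        # text matched by the entire pattern
caps  = m.captures  # only the groups, so caps[0] corresponds to m[1]

greeting = "Dear #{m[2]} #{m[1]},"
puts greeting

# to_a returns the whole match followed by every capture.
puts m.to_a.inspect
puts caps.inspect
```

The first name "Emma" is matched but not captured, because the middle [A-Za-z]+ is not wrapped in parentheses.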

MatchData offers several other related methods, illustrated by the following code:

string = "My phone number is (123) 555-1234."
# Literal parentheses must be escaped; the unescaped ones are capture groups.
phone_re = /\((\d{3})\)\s+(\d{3})-(\d{4})/
m = phone_re.match(string)
print "The part of the string before the part that matched was: "
puts m.pre_match
print "The part of the string after the part that matched was: "
puts m.post_match
print "The second capture began at character "
puts m.begin(2)
print "The third capture ended at character "
puts m.end(3)

Output:

The part of the string before the part that matched was: My phone number is
The part of the string after the part that matched was: .
The second capture began at character 25
The third capture ended at character 33
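The offsets returned by begin and end are zero-based, and end points one past the last matched character, so the two form a half-open range. The sketch below uses them to slice the original string; the pattern escapes the literal parentheses around the area code, and the variable names are illustrative.

```ruby
string = "My phone number is (123) 555-1234."
m = /\((\d{3})\)\s+(\d{3})-(\d{4})/.match(string)

# pre_match, the match itself, and post_match reassemble the original string.
rebuilt = m.pre_match + m[0] + m.post_match

# begin(n)...end(n) selects exactly the text of capture n.
second = string[m.begin(2)...m.end(2)]

# offset(n) returns both positions at once as [begin, end].
offsets = m.offset(3)

puts rebuilt == string
puts second
puts offsets.inspect
```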


2008-05-08 18:23