10.1.2 Character Classes
Individual literal characters can be combined into character classes by placing them within square brackets. A character class matches any one character that is contained within it. Thus, the regular expression /[abc]/ matches any one of the letters a, b, or c. Negated character classes can also be defined -- these match any character except those contained within the brackets. A negated character class is specified by placing a caret (^) as the first character inside the left bracket. The regexp /[^abc]/ matches any one character other than a, b, or c. Character classes can use a hyphen to indicate a range of characters. To match any one lowercase character from the Latin alphabet, use /[a-z]/, and to match any letter or digit from the Latin alphabet, use /[a-zA-Z0-9]/.
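The class forms described above can be tried out with a short sketch like the following. The variable names are illustrative only and are not part of the original text; RegExp.prototype.test() and String.prototype.match() are standard methods.

```javascript
// Illustrative names; only the patterns themselves come from the text above.
const anyOfAbc = /[abc]/;            // any one of a, b, or c
const noneOfAbc = /[^abc]/;          // any one character other than a, b, or c
const lowercase = /[a-z]/;           // any one lowercase Latin letter
const alphanumeric = /[a-zA-Z0-9]/;  // any one Latin letter or digit

// test() reports whether the pattern matches anywhere in the string.
console.log(anyOfAbc.test("xyzb"));     // true: "b" is in the class
console.log(anyOfAbc.test("xyz"));      // false: no a, b, or c present
console.log(noneOfAbc.test("abcabc"));  // false: every character is a, b, or c
console.log(noneOfAbc.test("abcd"));    // true: "d" is outside the class
console.log(lowercase.test("ABC"));     // false: uppercase only
console.log(alphanumeric.test("-_-"));  // false: no letter or digit

// With the g flag, match() returns every single-character match.
const alphanumericMatches = "R2-D2".match(/[a-zA-Z0-9]/g);
console.log(alphanumericMatches);       // ["R", "2", "D", "2"]
```

Note that a class always consumes exactly one character per match, however many characters are listed between the brackets.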
Because certain character classes are commonly used, the JavaScript regular expression syntax includes special characters and escape sequences to represent these common classes. For example, \s matches the space character, the tab character, and any other Unicode whitespace character, and \S matches any character that is not Unicode whitespace. Table 10-2 lists these characters and summarizes character class syntax. (Note that several of these character class escape sequences match only ASCII characters and have not been extended to work with Unicode characters. You can explicitly define your own Unicode character classes; for example, /[\u0400-\u04FF]/ matches any one Cyrillic character.)
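As a rough illustration of the difference between the Unicode-aware and ASCII-only escapes, the sketch below tests a few non-ASCII characters. The sample strings and variable names are assumptions chosen for the example, and the patterns are used without any flags.

```javascript
// Illustrative names and sample strings; patterns are used without flags.
const whitespace = /\s/;
const nonWhitespace = /\S/;
const cyrillic = /[\u0400-\u04FF]/;  // both ends of the range need the \u prefix

// \s is Unicode-aware: it matches a tab and a no-break space (U+00A0).
console.log(whitespace.test("a\tb"));       // true
console.log(whitespace.test("a\u00A0b"));   // true
console.log(nonWhitespace.test(" \t\n"));   // false: whitespace only

// \w and \d are ASCII-only.
console.log(/\w/.test("\u00E9"));           // false: e with acute accent is not [a-zA-Z0-9_]
console.log(/\d/.test("\u0663"));           // false: Arabic-Indic digit three is not [0-9]

// An explicit range covers the Cyrillic block.
console.log(cyrillic.test("\u041F\u0440\u0438\u0432\u0435\u0442"));  // true: a Cyrillic word
console.log(cyrillic.test("hello"));        // false
```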
Table 10-2. Regular expression character classes
Character | Matches
[...] | Any one character between the brackets.
[^...] | Any one character not between the brackets.
. | Any character except newline or another Unicode line terminator.
\w | Any ASCII word character. Equivalent to [a-zA-Z0-9_].
\W | Any character that is not an ASCII word character. Equivalent to [^a-zA-Z0-9_].
\s | Any Unicode whitespace character.
\S | Any character that is not Unicode whitespace. Note that \w and \S are not the same thing.
\d | Any ASCII digit. Equivalent to [0-9].
\D | Any character other than an ASCII digit. Equivalent to [^0-9].
[\b] | A literal backspace (special case).
Note that the special character class escapes can be used within square brackets. \s matches any whitespace character and \d matches any digit, so /[\s\d]/ matches any one whitespace character or digit. Note that there is one special case. As we'll see later, the \b escape has a special meaning. When used within a character class, however, it represents the backspace character. Thus, to represent a backspace character literally in a regular expression, use the character class with one element: /[\b]/.
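A minimal sketch of both points, with illustrative variable names that are not from the original text. Inside a JavaScript string literal, "\b" denotes the backspace character (U+0008), which is what /[\b]/ is expected to match.

```javascript
// Illustrative names; the patterns are the ones discussed above.
const spaceOrDigit = /[\s\d]/;  // class escapes combined inside brackets
const backspace = /[\b]/;       // inside a class, \b means backspace (U+0008)

console.log(spaceOrDigit.test("abc"));   // false: no whitespace or digit
console.log(spaceOrDigit.test("abc1"));  // true: "1" is a digit
console.log(spaceOrDigit.test("a b"));   // true: the space is whitespace

// In a string literal, "\b" is also the backspace character.
console.log(backspace.test("a\bb"));     // true
console.log(backspace.test("a b"));      // false: no backspace in the string

// Outside a class, \b is a different escape and does not match backspace itself.
console.log(/a\bb/.test("a\bb"));        // false
```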
10.1.3 Repetition
With the regular expression syntax we have learned so far, we can describe a two-digit number as /\d\d/ and a four-digit number as /\d\d\d\d/. But we don't have any way to describe, for example, a number that can have any number of digits or a string of three letters followed by an optional digit. These more complex patterns use regular expression syntax that specifies how many times an element of a regular expression may be repeated.
The characters that specify repetition always follow the pattern to which they are being applied. Because certain types of repetition are quite commonly used, there are special characters to represent these cases. For example, + matches one or more occurrences of the previous pattern. Table 10-3 summarizes the repetition syntax. The following lines show some examples:
/\d{2,4}/ // Match between two and four digits
/\w{3}\d?/ // Match exactly three word characters and an optional digit
/\s+java\s+/ // Match "java" with one or more whitespace characters before and after
/[^"]*/ // Match zero or more non-quote characters
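The four patterns above can be exercised against sample input as in this sketch. The input strings and variable names are made up for illustration; match() without the g flag returns the first match, whose text is at index 0 of the result.

```javascript
// Sample inputs and names are illustrative; the patterns come from the lines above.

// {2,4} is greedy: it takes up to four digits and skips the lone "1".
const digitRuns = "1 22 333 55555".match(/\d{2,4}/g);
console.log(digitRuns);                                // ["22", "333", "5555"]

// Three word characters, then a digit only if one is present.
const withDigit = "abc1".match(/\w{3}\d?/)[0];
const withoutDigit = "abcd".match(/\w{3}\d?/)[0];
console.log(withDigit);                                // "abc1"
console.log(withoutDigit);                             // "abc"

// "java" must have whitespace on both sides.
console.log(/\s+java\s+/.test("I like java a lot"));   // true
console.log(/\s+java\s+/.test("javascript"));          // false

// Zero or more non-quote characters: the match stops at the first quote.
const beforeQuote = 'say "hi"'.match(/[^"]*/)[0];
console.log(beforeQuote);                              // "say " (with trailing space)
```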
Table 10-3. Regular expression repetition characters
Character | Meaning
{n,m} | Match the previous item at least n times but no more than m times.
{n,} | Match the previous item n or more times.
{n} | Match exactly n occurrences of the previous item.
? | Match zero or one occurrences of the previous item. That is, the previous item is optional. Equivalent to {0,1}.
+ | Match one or more occurrences of the previous item. Equivalent to {1,}.
* | Match zero or more occurrences of the previous item. Equivalent to {0,}.
Be careful when using the * and ? repetition characters. Since these characters may match zero instances of whatever precedes them, they are allowed to match nothing. For example, the regular expression /a*/ actually matches the string "bbbb", because the string contains zero occurrences of the letter a!
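The empty match can be observed directly. In this sketch (variable names are illustrative), /a*/ succeeds on "bbbb" with a zero-length match at the start of the string, while /a+/ requires at least one a and fails.

```javascript
// Illustrative names; the pattern /a*/ and the string "bbbb" come from the text above.
const starResult = "bbbb".match(/a*/);
console.log(/a*/.test("bbbb"));    // true: zero occurrences of "a" still counts
console.log(starResult[0].length); // 0: the matched text is the empty string
console.log(starResult.index);     // 0: the empty match is found at the start

// Requiring at least one occurrence avoids the empty match.
const plusResult = "bbbb".match(/a+/);
console.log(plusResult);           // null: no "a" anywhere in the string
```

When a pattern should only succeed if something is actually present, + or {1,} is usually the safer choice.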