Understanding the difference between current-line addressing in ed and global-line addressing in sed is very important. In ed you use addressing to expand the number of lines that are the object of a command; in sed, you use addressing to restrict the number of lines affected by a command.
command [options] script filename
sed -f scriptfile inputfile
$ sed '
> s/ MA/, Massachusetts/
> s/ PA/, Pennsylvania/
> s/ CA/, California/' list
The -n option suppresses the automatic output. When specifying this option, each instruction intended to produce output must contain a print command, p.
sed -n -e 's/MA/Massachusetts/p' list
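A minimal sketch of the difference, assuming a hypothetical three-line address list fed on standard input in place of the file list:

```shell
# Hypothetical sample data standing in for the file "list".
input='John Daggett, Boston MA
Alice Ford, Erie PA
Sal Carpenter, Tulsa OK'

# Without -n, every line is printed, whether or not it was changed.
all=$(printf '%s\n' "$input" | sed 's/ MA/, Massachusetts/')

# With -n and the p flag, only lines where the substitution succeeded are printed.
changed=$(printf '%s\n' "$input" | sed -n 's/ MA/, Massachusetts/p')

printf '%s\n' "$all"
printf '%s\n' '--'
printf '%s\n' "$changed"
```

The first command prints all three lines; the second prints only the Boston line.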
awk -v var=value 'instruction' inputfile
.
Matches any single character except newline. In awk, dot can match newline also.
*
Matches any number (including zero) of the single character (including a character specified by a regular expression) that immediately precedes it.
[...]
Matches any one of the class of characters enclosed between the brackets. A circumflex (^) as first character inside brackets reverses the match to all characters except newline and those listed in the class. In awk, newline will also match. A hyphen (-) is used to indicate a range of characters. The close bracket (]) as the first character in class is a member of the class. All other metacharacters lose their meaning when specified as members of a class.
^
First character of regular expression, matches the beginning of the line. Matches the beginning of a string in awk, even if the string contains embedded newlines.
$
As last character of regular expression, matches the end of the line. Matches the end of a string in awk, even if the string contains embedded newlines.
\{n,m\}
Matches a range of occurrences of the single character (including a character specified by a regular expression) that immediately precedes it. \{n\} will match exactly n occurrences, \{n,\} will match at least n occurrences, and \{n,m\} will match any number of occurrences between n and m. (sed and grep only, may not be in some very old versions.)
\
Escapes the special character that follows
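The star and the interval expression can be compared with a small sketch; the word list is made up for illustration:

```shell
# Hypothetical test words: "a", a varying number of "b"s, then "c".
words='ac
abc
abbc
abbbbc'

# b* matches zero or more b's, so all four lines match.
star=$(printf '%s\n' "$words" | grep -c 'ab*c')

# b\{2,3\} requires two or three b's between a and c; only "abbc" qualifies.
interval=$(printf '%s\n' "$words" | grep 'ab\{2,3\}c')

printf '%s\n' "$star" "$interval"
```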
Extended Metacharacters (egrep and awk):
+
Matches one or more occurrences of the preceding regular expression.
?
Matches zero or one occurrences of the preceding regular expression.
|
Specifies that either the preceding or following regular expression can be matched (alternation).
()
Groups regular expressions.
{n,m}
Matches a range of occurrences of the single character (including a character specified by a regular expression) that immediately precedes it. {n} will match exactly n occurrences, {n,} will match at least n occurrences, and {n,m} will match any number of occurrences between n and m. (POSIX egrep and POSIX awk, not in traditional egrep or awk.)
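A short sketch of the extended set, using grep -E (the POSIX spelling of egrep) on an invented word list:

```shell
# Hypothetical sample words.
words='company
companies
compan
color
colour'

# Alternation inside a group: "compan" followed by "y" or "ies".
alt=$(printf '%s\n' "$words" | grep -E '^compan(y|ies)$')

# ? makes the preceding "u" optional.
opt=$(printf '%s\n' "$words" | grep -E '^colou?r$')

printf '%s\n' "$alt" "$opt"
```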
Inside square brackets, the standard metacharacters lose their meaning.
Special Characters in Character Classes
\ Escapes any special character (awk only)
- Indicates a range when not in the first or last position.
^ Indicates a reverse match only when in the first position.
The close bracket (]) is interpreted as a member of the class if it occurs as the first character in the class (or as the first character after a circumflex). The hyphen loses its special meaning within a class if it is the first or last character.
In awk, you could also use the backslash to escape the hyphen or close bracket wherever either one occurs in the range, but the syntax is messier.
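The positional rules can be checked with a class that contains both a close bracket and a hyphen; the test strings are made up:

```shell
# In []-] the ] comes first and the - comes last, so both are literal members.
out=$(printf '%s\n' 'a]b' 'a-b' 'axb' | grep 'a[]-]b')
printf '%s\n' "$out"
```

Only a]b and a-b are printed; axb does not match.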
POSIX distinguishes two kinds of regular expressions: Basic Regular Expressions (BREs), which are the kind used by grep and sed, and Extended Regular Expressions (EREs), which are the kind used by egrep and awk.
Character classes. A POSIX character class consists of keywords bracketed by [: and :]. The keywords describe different classes of characters such as alphabetic characters, control characters, and so on (see Table 3.3).
[:alnum:] Alphanumeric characters
[:alpha:] Alphabetic characters
[:blank:] Space and tab characters
[:cntrl:] Control characters
[:digit:] Numeric characters
[:graph:] Printable and visible (non-space) characters
[:lower:] Lowercase characters
[:print:] Printable characters (includes whitespace)
[:punct:] Punctuation characters
[:space:] Whitespace characters
[:upper:] Uppercase characters
[:xdigit:] Hexadecimal digits
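Note that a POSIX class is written inside a bracket expression, so the brackets are doubled. A small sketch with invented input:

```shell
# Hypothetical sample lines.
lines='abc
a1c
A C'

# [[:digit:]] matches any line containing a digit.
digits=$(printf '%s\n' "$lines" | grep '[[:digit:]]')

# An uppercase letter followed by whitespace; count the matching lines.
upper=$(printf '%s\n' "$lines" | grep -c '[[:upper:]][[:space:]]')

printf '%s\n' "$digits" "$upper"
```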
Collating symbols. A collating symbol is a multicharacter sequence that should be treated as a unit. It consists of the characters bracketed by [. and .].
Equivalence classes. An equivalence class lists a set of characters that should be considered equivalent, such as e and è. It consists of a named element from the locale, bracketed by [= and =].
The vertical bar (|) metacharacter, part of the extended set of metacharacters, allows you to specify a union of regular expressions.
compan(y|ies)
$ egrep "(^| )[\"[{(]*book[]})\"?\!.,;:'s]*( |$)" bookwords
This file tests for book in various places, such as
book at the beginning of a line or
at the end of a line book
as well as the plural books and
"book of the year award"
to look for a line with the word "book"
A GREAT book!
A great book? No.
told them about (the books) until it
Here are the books that you requested
Yes, it is a good book for children
amazing that it was called a "harmful book" when
once you get to the end of the book, you can't believe
Some programs provide a special metacharacter for matching a string at the beginning of a word, \<, and one for matching a string at the end of a word, \>. Used as a pair, they can match a string only when it is a complete word.
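A sketch of the word-boundary pair, assuming a grep that supports \< and \> (GNU grep does; this is an extension, not part of POSIX BREs):

```shell
# Only the first line contains "book" as a complete word.
out=$(printf '%s\n' 'a book here' 'two books' 'bookish' | grep '\<book\>')
printf '%s\n' "$out"
```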
$ gres '"[^"]*"' '00' sampleLine
.Se 00 "Appendix"
1........5
5........10
10.......20
100......200
$ sed 's/\([0-9][0-9]*\)\.\{5,\}\([0-9][0-9]*\)/\1-\2/' sample
1-5
5-10
10-20
100-200
This mistake is simply a problem of the order of the commands in the script: each command operates on the pattern space as the preceding commands left it.
Sed also maintains a second temporary buffer called the hold space. You can copy the contents of the pattern space to the hold space and retrieve them later.
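As a minimal sketch of the hold space, the following swaps two lines using h (copy pattern space to hold space) and G (append hold space to pattern space); the input is invented:

```shell
out=$(printf '%s\n' first second | sed -n '
1h
2{
G
p
}')
printf '%s\n' "$out"
```

Line 1 is saved and not printed; on line 2 the saved line is appended after it, so "second" comes out before "first".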
A sed command can specify zero, one, or two addresses. An address can be a regular expression describing a pattern, a line number, or a line addressing symbol.
If no address is specified, then the command is applied to each line.
If there is only one address, the command is applied to any line matching the address.
If two comma-separated addresses are specified, the command is performed on the first line matching the first address and all succeeding lines up to and including a line matching the second address.
If an address is followed by an exclamation mark (!), the command is applied to all lines that do not match the address.
The line number refers to an internal line count maintained by sed. This counter is not reset for multiple input files.
Similarly, the input stream has only one last line. It can be specified using the addressing symbol $.
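The continuous line count can be seen with two small files; the file names and contents below are hypothetical, created in a temporary directory:

```shell
dir=$(mktemp -d)
printf '%s\n' a1 a2 > "$dir/a"
printf '%s\n' b1 b2 > "$dir/b"

# Line 3 of the combined input is the first line of the second file.
third=$(sed -n '3p' "$dir/a" "$dir/b")

# $ addresses the last line of the whole input, not of each file.
last=$(sed -n '$p' "$dir/a" "$dir/b")

printf '%s\n' "$third" "$last"
rm -r "$dir"
```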
E.g.,
d
1d
$d
/^$/d
/^\.TS/,/^\.TE/d
50,$d
1,/^$/d #This example deletes from the first line up to the first blank line.
An exclamation mark (!) following an address reverses the sense of the match. For instance, the following script deletes all lines except those inside tbl input:
/^\.TS/,/^\.TE/!d
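A sketch contrasting the range with its negation, run on a made-up five-line document:

```shell
# Hypothetical document with one tbl block.
doc='intro
.TS
cell
.TE
outro'

# Delete the block itself.
outside=$(printf '%s\n' "$doc" | sed '/^\.TS/,/^\.TE/d')

# Delete everything except the block.
inside=$(printf '%s\n' "$doc" | sed '/^\.TS/,/^\.TE/!d')

printf '%s\n' "$outside" '--' "$inside"
```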
Braces ({}) are used in sed to nest one address inside another or to apply multiple commands at the same address.
/^\.TS/,/^\.TE/{
/^$/d #to delete blank lines only inside blocks of tbl input
s/^\.ps 10/.ps 8/
s/^\.vs 12/.vs 10/
}
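To see that the grouped commands touch only lines inside the block, here is a reduced version of the script run on invented input that has a .ps request and a blank line both outside and inside the block:

```shell
doc='.ps 10

.TS
.ps 10

data
.TE'

out=$(printf '%s\n' "$doc" | sed '/^\.TS/,/^\.TE/{
/^$/d
s/^\.ps 10/.ps 8/
}')
printf '%s\n' "$out"
```

The first two lines pass through unchanged; inside the block the blank line is removed and .ps 10 becomes .ps 8.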
/---/!s/--/\\(em/g
If you find a line containing three consecutive hyphens, don't apply the edit. On all other lines, the substitute command will be applied.
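A quick check of that command on two made-up lines:

```shell
# The first line has a double hyphen; the second has three hyphens and is left alone.
out=$(printf '%s\n' 'wait--what' '---' | sed '/---/!s/--/\\(em/g')
printf '%s\n' "$out"
```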
Substitution
[address]s/pattern/replacement/flags
n A number (1 to 512) indicating that a replacement should be made for only the nth occurrence of the pattern.
g Make changes globally on all occurrences in the pattern space. Normally only the first occurrence is replaced.
p Print the contents of the pattern space.
w file
Write the contents of the pattern space to file.
The substitute command is applied to the lines matching the address. If no address is specified, it is applied to all lines that match the pattern. If a regular expression is supplied as an address, and no pattern is specified, the substitute command matches what is matched by the address.
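The flags and the empty-pattern shorthand can be sketched as follows; the sample strings are invented:

```shell
line='a-a-a-a'

# No flag: only the first occurrence is replaced.
first=$(printf '%s\n' "$line" | sed 's/a/X/')

# Numeric flag: only the third occurrence is replaced.
third=$(printf '%s\n' "$line" | sed 's/a/X/3')

# g flag: every occurrence is replaced.
all=$(printf '%s\n' "$line" | sed 's/a/X/g')

# Empty pattern: s// reuses the regular expression given as the address.
reuse=$(printf '%s\n' 'foo bar' 'bar' | sed '/foo/s//baz/')

printf '%s\n' "$first" "$third" "$all" "$reuse"
```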
In the replacement section, only the following characters have special meaning:
& Replaced by the string matched by the regular expression.
\n Matches the nth substring previously specified in the pattern using "\(" and "\)".
\ Used to escape the ampersand, the backslash, and the substitution command's delimiter. In addition, it can be used to escape the newline and create a multiline replacement string.
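A sketch of the three cases, with made-up input strings:

```shell
# & stands for the whole match.
amp=$(printf '%s\n' 'UNIX' | sed 's/UNIX/(&)/')

# \1 and \2 refer to the first and second \( \) groups.
swap=$(printf '%s\n' 'first:second' | sed 's/\(.*\):\(.*\)/\2:\1/')

# \& is a literal ampersand.
lit=$(printf '%s\n' 'R and D' | sed 's/ and / \& /')

printf '%s\n' "$amp" "$swap" "$lit"
```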
#! /bin/sh
# Turn each unique .XX line into a sed substitute command of the form
# /^\.XX /s/text/text/
grep "^\.XX" "$@" | sort -u |
sed '
s/^\.XX \(.*\)$/\/^\\.XX \/s\/\1\/\1\//'
Delete
The delete command is also a command that can change the flow of control in a script. That is because once it is executed, no further commands are executed on the "empty" pattern space.
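The effect on control flow can be observed by placing a command after d; the input here is invented:

```shell
# The s command never runs on the deleted line, because d ends the cycle.
out=$(printf '%s\n' keep drop keep | sed '
/drop/d
s/$/ (seen)/')
printf '%s\n' "$out"
```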
Append, Insert, and Change
[line-address]a\
text
[line-address]i\
text
[address]c\
text
The insert command places the supplied text before the current line in the pattern space.
The append command places it after the current line.
The change command replaces the content of the pattern space with the supplied text.
The text must begin on the next line. To input multiple lines of text, each successive line must end with a backslash, with the exception of the very last line.
E.g.,
/<Larry's Address>/i\
4600 Cross Court \
French Lick, IN
The append and insert commands can be applied only to a single line address, not a range of lines. The change command, however, can address a range of lines. In this case, it replaces all addressed lines with a single copy of the text. In other words, it deletes each line in the range but the supplied text is output only once.
E.g.,
/^From /,/^$/c\
<Mail Header Removed>
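Running that command on a small made-up message shows the three header lines collapsing into a single copy of the text:

```shell
# Hypothetical mail message: two header lines, a blank line, then the body.
mail='From alice
Subject: hi

body'

out=$(printf '%s\n' "$mail" | sed '/^From /,/^$/c\
<Mail Header Removed>')
printf '%s\n' "$out"
```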
The insert and append commands do not affect the contents of the pattern space. The supplied text will not match any address in subsequent commands in the script, nor can those commands affect the text (unlike text produced by the s command, which stays in the pattern space). No matter what changes occur to alter the pattern space, the supplied text will still be output appropriately. Also, the supplied text does not affect sed's internal line counter (nor do the s and d commands).
$ cat data
line1
$ sed '1{
i\
before line1
a\
after line1
s/line1/subline1\nsubline2/g
s/line/ /g}' data
before line1
sub 1
sub 2
after line1
Print line number
[address]=
The next command (n) outputs the contents of the pattern space and then reads the next line of input without returning to the top of the script.
[address]n
E.g.,
/^\.H1/{
n
/^$/d
}
Match any line beginning with the string '.H1', then print that line and read in the next line. If that line is blank, delete it.
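A sketch of that script on invented input, with one blank line directly after the heading and another further down:

```shell
doc='.H1 "Title"

text

more'

out=$(printf '%s\n' "$doc" | sed '/^\.H1/{
n
/^$/d
}')
printf '%s\n' "$out"
```

Only the blank line that immediately follows .H1 is removed; the later one is kept.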
The quit command (q) causes sed to stop reading new input lines (and stop sending them to the output). (timesaver)
[address]q
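Two short sketches of q, one with a line-number address and one with a pattern address; the input is made up:

```shell
# Print the first two lines, then stop reading.
head2=$(printf '%s\n' 1 2 3 4 5 | sed 2q)

# Print up to and including the first line that matches, then stop.
upto=$(printf '%s\n' a stop b | sed '/stop/q')

printf '%s\n' "$head2" "$upto"
```

The addressed line is still printed before sed quits, since q does not suppress the automatic output.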