HANDY ONE-LINE SCRIPTS FOR AWK                               30 April 2008
Compiled by Eric Pement - eric [at] pement.org               version 0.27

Latest version of this file (in English) is usually at:
   http://www.pement.org/awk/awk1line.txt

This file will also be available in other languages:
   Chinese  - http://ximix.org/translation/awk1line_zh-CN.txt   

USAGE:

   Unix: awk '/pattern/ {print "$1"}'    # standard Unix shells
DOS/Win: awk '/pattern/ {print "$1"}'    # compiled with DJGPP, Cygwin
         awk "/pattern/ {print \"$1\"}"  # GnuWin32, UnxUtils, Mingw

Note that the DJGPP compilation (for DOS or Windows-32) permits an awk
script to follow Unix quoting syntax '/like/ {"this"}'. HOWEVER, if the
command interpreter is CMD.EXE or COMMAND.COM, single quotes will not
protect the redirection arrows (<, >) nor do they protect pipes (|).
These are special symbols which require "double quotes" to protect them
from interpretation as operating system directives. If the command
interpreter is bash, ksh or another Unix shell, then single and double
quotes will follow the standard Unix usage.

Users of MS-DOS or Microsoft Windows must remember that the percent
sign (%) is used to indicate environment variables, so this symbol must
be doubled (%%) to yield a single percent sign visible to awk.

If a script will not need to be quoted in Unix, DOS, or CMD, then I
normally omit the quote marks. If an example is peculiar to GNU awk,
the command 'gawk' will be used. Please notify me if you find errors or
new commands to add to this list (total length under 65 characters). I
usually try to put the shortest script first. To conserve space, I
normally use '1' instead of '{print}' to print each line. Either one
will work.
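
As a minimal sketch (the two-line sample input and the variable names
are invented for illustration), the following checks in a Unix shell
that '1' and '{print}' produce the same output:

```shell
# Sample input is invented for illustration.
input='alpha
beta'

# A pattern with no action defaults to {print}; the constant 1 is always true.
short=$(printf '%s\n' "$input" | awk '1')
long=$(printf '%s\n' "$input" | awk '{print}')

[ "$short" = "$long" ] && echo "identical output"
```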

FILE SPACING:

 # double space a file
 awk '1;{print ""}'
 awk 'BEGIN{ORS="\n\n"};1'

 # double space a file which already has blank lines in it. Output file
 # should contain no more than one blank line between lines of text.
 # NOTE: On Unix systems, DOS lines which have only CRLF (\r\n) are
 # often treated as non-blank, and thus 'NF' alone will return TRUE.
 awk 'NF{print $0 "\n"}'
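
The CRLF caveat above can be illustrated with a small sketch. The
sample data and the helper name "sample" are invented; the second
command strips the trailing \r before testing NF, which is one possible
workaround rather than part of the original entry:

```shell
# Three CRLF-terminated lines; the middle one holds only "\r".
# The sample data is invented for illustration.
sample() { printf 'a\r\n\r\nb\r\n'; }

# NF alone: the lone "\r" counts as a field, so all 3 lines pass.
with_cr=$(sample | awk 'NF{n++}; END{print n+0}')

# Strip the trailing "\r" first, then test NF: only 2 lines pass.
without_cr=$(sample | awk '{sub(/\r$/,"")}; NF{n++}; END{print n+0}')

echo "$with_cr $without_cr"
```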

 # triple space a file
 awk '1;{print "\n"}'

NUMBERING AND CALCULATIONS:

 # precede each line by its line number FOR THAT FILE (left alignment).
 # Using a tab (\t) instead of space will preserve margins.
 awk '{print FNR "\t" $0}' files*

 # precede each line by its line number FOR ALL FILES TOGETHER, with tab.
 awk '{print NR "\t" $0}' files*

 # number each line of a file (number on left, right-aligned)
 # Double the percent signs if typing from the DOS command prompt.
 awk '{printf("%5d : %s\n", NR,$0)}'

 # number each line of file, but only print numbers if line is not blank
 # Remember caveats about Unix treatment of \r (mentioned above)
 awk 'NF{$0=++a " :" $0};1'
 awk '{print (NF? ++a " :" :"") $0}'

 # count lines (emulates "wc -l")
 awk 'END{print NR}'

 # print the sums of the fields of every line
 awk '{s=0; for (i=1; i<=NF; i++) s=s+$i; print s}'

 # add all fields in all lines and print the sum
 awk '{for (i=1; i<=NF; i++) s=s+$i}; END{print s}'

 # print every line after replacing each field with its absolute value
 awk '{for (i=1; i<=NF; i++) if ($i < 0) $i = -$i; print }'
 awk '{for (i=1; i<=NF; i++) $i = ($i < 0) ? -$i : $i; print }'

 # print the total number of fields ("words") in all lines
 awk '{ total = total + NF }; END {print total}' file

 # print the total number of lines that contain "Beth"
 awk '/Beth/{n++}; END {print n+0}' file

 # print the largest first field and the line that contains it
 # NOTE: this compares the values of field #1 (as numbers or as
 # strings); it does not measure the length of the field.
 awk '$1 > max {max=$1; maxline=$0}; END{ print max, maxline}'

 # print the number of fields in each line, followed by the line
 awk '{ print NF ":" $0 } '

 # print the last field of each line
 awk '{ print $NF }'

 # print the last field of the last line
 awk '{ field = $NF }; END{ print field }'

 # print every line with more than 4 fields
 awk 'NF > 4'

 # print every line where the value of the last field is > 4
 awk '$NF > 4'

STRING CREATION:

 # create a string of a specific length (e.g., generate 513 spaces)
 awk 'BEGIN{while (a++<513) s=s " "; print s}'

 # insert a string of specific length at a certain character position
 # Example: insert 49 spaces after column #6 of each input line.
 gawk --re-interval 'BEGIN{while(a++<49)s=s " "};{sub(/^.{6}/,"&" s)};1'

ARRAY CREATION:

 # These next 2 entries are not one-line scripts, but the technique
 # is so handy that it merits inclusion here.
 
 # create an array named "month", indexed by numbers, so that month[1]
 # is 'Jan', month[2] is 'Feb', month[3] is 'Mar' and so on.
 split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", month, " ")

 # create an array named "mdigit", indexed by strings, so that
 # mdigit["Jan"] is 1, mdigit["Feb"] is 2, etc. Requires "month" array
 for (i=1; i<=12; i++) mdigit[month[i]] = i
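
The two statements above are fragments and need a surrounding program.
A minimal sketch, assuming they are placed in a BEGIN block so that the
arrays are built once before any input is read (the sample dates and
the MM-DD output format are invented for illustration):

```shell
# Both arrays are built once, in BEGIN; the sample dates are invented.
out=$(printf '%s\n' 'Mar 14' 'Dec 25' | awk '
BEGIN {
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", month, " ")
    for (i = 1; i <= 12; i++) mdigit[month[i]] = i
}
# Look up the month number by name and print a zero-padded MM-DD date.
{ printf "%02d-%02d\n", mdigit[$1], $2 }')
echo "$out"
```

With this input the script prints 03-14 and 12-25, one per line.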

TEXT CONVERSION AND SUBSTITUTION:

 # IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format
 awk '{sub(/\r$/,"")};1'   # assumes EACH line ends with Ctrl-M

 # IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format
 awk '{sub(/$/,"\r")};1'

 # IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format
 awk 1

 # IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format
 # Cannot be done with DOS versions of awk, other than gawk:
 gawk -v BINMODE="w" '1' infile >outfile

 # Use "tr" instead.
 tr -d \r <infile >outfile            # GNU tr version 1.22 or higher

 # delete leading whitespace (spaces, tabs) from front of each line
 # aligns all text flush left
 awk '{sub(/^[ \t]+/, "")};1'

 # delete trailing whitespace (spaces, tabs) from end of each line
 awk '{sub(/[ \t]+$/, "")};1'

 # delete BOTH leading and trailing whitespace from each line
 awk '{gsub(/^[ \t]+|[ \t]+$/,"")};1'
 awk '{$1=$1};1'           # also removes extra space between fields

 # insert 5 blank spaces at beginning of each line (make page offset)
 awk '{sub(/^/, "     ")};1'

 # align all text flush right on a 79-column width
 awk '{printf "%79s\n", $0}' file*

 # center all text on a 79-character width
 awk '{l=length();s=int((79-l)/2); printf "%"(s+l)"s\n",$0}' file*

 # substitute (find and replace) "foo" with "bar" on each line
 awk '{sub(/foo/,"bar")}; 1'           # replace only 1st instance
 gawk '{$0=gensub(/foo/,"bar",4)}; 1'  # replace only 4th instance
 awk '{gsub(/foo/,"bar")}; 1'          # replace ALL instances in a line

 # substitute "foo" with "bar" ONLY for lines which contain "baz"
 awk '/baz/{gsub(/foo/, "bar")}; 1'

 # substitute "foo" with "bar" EXCEPT for lines which contain "baz"
 awk '!/baz/{gsub(/foo/, "bar")}; 1'

 # change "scarlet" or "ruby" or "puce" to "red"
 awk '{gsub(/scarlet|ruby|puce/, "red")}; 1'

 # reverse order of lines (emulates "tac")
 awk '{a[i++]=$0} END {for (j=i-1; j>=0;) print a[j--] }' file*

 # if a line ends with a backslash, append the next line to it (fails if
 # there are multiple lines ending with backslash...)
 awk '/\\$/ {sub(/\\$/,""); getline t; print $0 t; next}; 1' file*
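
One possible way around the limitation noted above is to keep reading
while the joined line still ends with a backslash. This is a sketch,
not one of the original entries, and the sample input is invented:

```shell
# Sample input is invented: two consecutive lines end with a backslash.
out=$(printf '%s\n' 'one \' 'two \' 'three' 'four' |
  awk '{while (sub(/\\$/,"") && (getline t)>0) $0=$0 t}; 1')
echo "$out"
```

Here the first three input lines are joined into "one two three" and
the fourth line is printed unchanged.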

 # print and sort the login names of all users
 awk -F ":" '{print $1 | "sort" }' /etc/passwd

 # print the first 2 fields, in opposite order, of every line
 awk '{print $2, $1}' file

 # switch the first 2 fields of every line
 awk '{temp = $1; $1 = $2; $2 = temp; print}' file

 # print every line, deleting the second field of that line
 awk '{ $2 = ""; print }'

 # print in reverse order the fields of every line
 awk '{for (i=NF; i>0; i--) printf("%s ",$i);print ""}' file

 # concatenate every 5 lines of input, using a comma separator
 # between lines
 awk 'ORS=NR%5?",":"\n"' file
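
A short sketch of what this script does when the number of input lines
is not a multiple of 5 (the seven-line sample input is invented for
illustration):

```shell
# Seven invented input lines: one full group of 5 and a partial group of 2.
out=$(printf '%s\n' 1 2 3 4 5 6 7 | awk 'ORS=NR%5?",":"\n"')
echo "$out"
```

This prints "1,2,3,4,5" on the first line and "6,7," on the second.
The final partial group ends with the comma separator and awk itself
writes no closing newline; the newline seen here comes from echo.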

SELECTIVE PRINTING OF CERTAIN LINES:

 # print first 10 lines of file (emulates behavior of "head")
 awk 'NR < 11'

 # print first line of file (emulates "head -1")
 awk 'NR>1{exit};1'

 # print the last 2 lines of a file (emulates "tail -2")
 awk '{y=x "\n" $0; x=$0};END{print y}'

 # print the last line of a file (emulates "tail -1")
 awk 'END{print}'

 # print only lines which match regular expression (emulates "grep")
 awk '/regex/'

 # print only lines which do NOT match regex (emulates "grep -v")
 awk '!/regex/'

 # print any line where field #5 is equal to "abc123"
 awk '$5 == "abc123"'

 # print only those lines where field #5 is NOT equal to "abc123"
 # This will also print lines which have less than 5 fields.
 awk '$5 != "abc123"'
 awk '!($5 == "abc123")'

 # matching a field against a regular expression
 awk '$7  ~ /^[a-f]/'    # print line if field #7 matches regex
 awk '$7 !~ /^[a-f]/'    # print line if field #7 does NOT match regex

 # print the line immediately before a regex, but not the line
 # containing the regex
 awk '/regex/{print x};{x=$0}'
 awk '/regex/{print (NR==1 ? "match on line 1" : x)};{x=$0}'

 # print the line immediately after a regex, but not the line
 # containing the regex
 awk '/regex/{getline;print}'
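
The "line before" and "line after" scripts can be compared on the same
data. In this sketch the four-line input and the pattern "target" are
invented for illustration:

```shell
# Invented sample input with one matching line.
input='first
second
target
last'

# x always holds the previous line, so it is printed when the match occurs.
before=$(printf '%s\n' "$input" | awk '/target/{print x};{x=$0}')

# getline replaces $0 with the following line, which is then printed.
after=$(printf '%s\n' "$input" | awk '/target/{getline;print}')

echo "before: $before"
echo "after:  $after"
```

Note that if the match falls on the last line of input, getline has
nothing to read and the second script prints the matching line itself.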

 # grep for AAA and BBB and CCC (in any order on the same line)
 awk '/AAA/ && /BBB/ && /CCC/'

 # grep for AAA and BBB and CCC (in that order)
 awk '/AAA.*BBB.*CCC/'

 # print only lines of 65 characters or longer
 awk 'length > 64'

 # print only lines of less than 65 characters
 awk 'length < 65'

 # print section of file from regular expression to end of file
 awk '/regex/,0'
 awk '/regex/,EOF'

 # print section of file based on line numbers (lines 8-12, inclusive)
 awk 'NR==8,NR==12'

 # print line number 52
 awk 'NR==52'
 awk 'NR==52 {print;exit}'          # more efficient on large files

 # print section of file between two regular expressions (inclusive)
 awk '/Iowa/,/Montana/'             # case sensitive

SELECTIVE DELETION OF CERTAIN LINES:

 # delete ALL blank lines from a file (same as "grep '.' ")
 awk NF
 awk '/./'

 # remove duplicate, consecutive lines (emulates "uniq")
 # Compares as strings; 'a !~ $0' would treat each line as a regex.
 awk 'NR==1 || $0 != a ""; {a=$0}'

 # remove duplicate, nonconsecutive lines
 awk '!a[$0]++'                     # most concise script
 awk '!($0 in a){a[$0];print}'      # most efficient script
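
A small sketch showing that both scripts keep the first occurrence of
each line and preserve the original order (the sample input and the
variable names are invented for illustration):

```shell
# Invented sample input with nonconsecutive repeats.
input='a
b
a
c
b'

# a[$0]++ is 0 (false) the first time a line is seen, so !a[$0]++ is true.
concise=$(printf '%s\n' "$input" | awk '!a[$0]++')

# "in" tests for the key without creating it; a[$0] then records the line.
efficient=$(printf '%s\n' "$input" | awk '!($0 in a){a[$0];print}')

echo "$concise"
```

Both variables end up holding the lines a, b and c, in that order.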

CREDITS AND THANKS:

Special thanks to the late Peter S. Tillier (U.K.) for helping me with
the first release of this FAQ file, and to Daniel Jana, Yisu Dong, and
others for their suggestions and corrections.

For additional syntax instructions, including the way to apply editing
commands from a disk file instead of the command line, consult:

  "sed & awk, 2nd Edition," by Dale Dougherty and Arnold Robbins
  (O'Reilly, 1997)

  "UNIX Text Processing," by Dale Dougherty and Tim O'Reilly (Hayden
  Books, 1987)

  "GAWK: Effective awk Programming," 3d edition, by Arnold D. Robbins
  (O'Reilly, 2003) or at http://www.gnu.org/software/gawk/manual/

To fully exploit the power of awk, one must understand "regular
expressions." For detailed discussion of regular expressions, see
"Mastering Regular Expressions, 3d edition" by Jeffrey Friedl (O'Reilly,
2006).

The info and manual ("man") pages on Unix systems may be helpful (try
"man awk", "man nawk", "man gawk", "man regexp", or the section on
regular expressions in "man ed").

USE OF '\t' IN awk SCRIPTS: For clarity in documentation, I have used
'\t' to indicate a tab character (0x09) in the scripts.  All versions of
awk should recognize this abbreviation.

#---end of file---