Linux系统通过shell提供了大量方便的工具,如:awk、grep、sort、more、less、tail等等,方便程序员或者从事数据分析的人员对一些小文件的快速分析,掌握好这些工具,可以极大地提高简单数据分析的效率。
一、awk常用技巧和方法
-
文件每行按第二列去重并打印第二列不同的值及出现次数:
awk -F"\t" '{a[$2]+=1}END{for(x in a) print x"\t"a[x]}' a
-
求两个文件中第一列数据的交集:
假设文件名为a和b,文件每行用\t分隔
awk -F"\t" 'ARGIND==1{a[$1]=1}ARGIND==2{if($1 in a) print}' a b
其中:
-F表示指定每行的分隔符
ARGIND==1表示处理第一个文件
END
二、grep常用技巧和方法
-
把文件中包含china的行打印出来:
grep "china" a
把文件中以china字符串起始的行打印出来:
grep "^china" a
-
把文件中出现在abc和xx之间的字符串打印出来:
grep -o -P "(?<=abc).*?(?=xx)" a
END
四、组合用法
-
统计accesslog日志中的各ip出现次数,并按次数从小到大排序:
awk '{a[$1]+=1}END{for(x in a) print x,a[x]}' file | sort +1rg -2 > file.output
-
文件中包含有china的行数:
grep "china" file | wc -l
-
查看线上实时滚动日志中带有warning的日志:
tail -f file.log | grep "warning"
END
五、其它常用shell脚本
-
对一个100w行的文件A随机取100行:
cat A | shuf -n 100
-
在/home目录下面查询以.log结尾的文件并统计文件总大小:
find /home -name *.log | xargs du -s | awk '{print;sum+=$1}END{print sum}'
删除/home目录下面的所有.svn文件:
find /home -name .svn | xargs rm -rf
相关推荐
Practical Guide to Linux Commands, Editors, and Shell Programming, A, 4th Edition By Mark G. Sobell, Matthew Helmke Published Nov 9, 2017 by Addison-Wesley Professional. The Most Useful Tutorial and...
Shell Programming in Unix, Linux and OS X is a thoroughly updated revision of Kochan and Wood’s classic Unix Shell Programming tutorial. Following the methodology of the original text, the book ...
* **User Interaction:** Teaches how to write Bash scripts that interact with users or other shell features, allowing for dynamic and interactive command-line applications. * **Automation:** Enables ...
“This book is a very useful tool for anyone who wants to ‘look under the hood’ so to speak, and really start putting the power of Linux to work. What I find particularly frustrating about man pages...
This cheat sheet provides a comprehensive overview of essential shell commands, constructs, and utilities commonly used in UNIX/Linux environments. It is designed to serve as a quick reference guide ...
How to be efficient at the command line by using aliases, tab completion, and your shell history. How to schedule and automate jobs using cron. How to switch users and run processes as others. ...
MEGA can be used with either a graphical user interface (useful for visual exploration of data and results) or a command-line interface (useful for batch or scripted execution). The graphical user ...
Shell scripting is a powerful tool in Linux for automating tasks and managing system operations. It involves writing scripts using a shell (like Bash or Sh) to execute commands and perform complex ...
** In the context of Unix and shell programming, a command refers to an instruction or a set of instructions that are executed by the shell or another program. Commands can be simple, like `ls` for ...
Linux Pocket Guide provides an organized learning path to help you gain mastery of the most useful and important commands. Whether you’re a novice who needs to get up to speed on Linux or an ...
总之,"useful-script"项目提供了一个学习和借鉴Shell脚本实战经验的机会,特别是关于并行处理的部分,这对于任何在Linux或Unix环境中工作的IT专业人士来说都是非常有价值的。通过对这些脚本的深入学习,我们可以...
This section covers various hardware utilities and tools useful for system administrators: - **Hardware Utilities**: - **MAKEDEV Script**: A script used to create device files. - **mknod Command**:...
预计写一系列linux系统相关的知识总结 Linux Command Shell script programming Linux useful skills Windows Batch Deploy server on linux
DOS内部命令 用于退出当前的命令处理器(COMMAND.COM) 恢复前一个命令处理器。 Ctrl+d 跟exit一样效果,表中止本次操作。 logout 当csh时可用来退出,其他shell不可用。 clear 清屏,清除(之前的内容并未删除,只是...
Clone with many useful features. The Midnight Commander comes with mouse support on xterms and optionally on the Linux console. The Midnight Commander is a directory browsing tool which bears a ...
Ubuntu is one of the most popular Linux distributions, widely used by developers and system administrators for its stability, security, and robust package management system. This cheat sheet provides ...
在Linux和Unix-like操作系统中,Bash(Bourne-Again SHell)是一种广泛使用的命令行解释器。它允许用户通过终端与系统交互,执行各种任务。为了提高效率,Bash支持创建别名,即将复杂的命令或命令序列简化为一个简短...