Topic: Splitting and Merging Files on Linux
Posted: 2014-08-26
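Linux provides many shell commands that are concise yet very powerful. Today I (散仙) would like to share how to split files on Linux and how to merge them again. At the end of the post I also note down a few other useful commands.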
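The most useful command for splitting files is split. Here is its usage, taken from the man page:
<pre name="code" class="java">
NAME
       split - split a file into pieces

SYNOPSIS
       split [OPTION]... [INPUT [PREFIX]]

DESCRIPTION
       Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
       size is 1000 lines, and default PREFIX is ‘x’. With no INPUT, or when
       INPUT is -, read standard input.

       Mandatory arguments to long options are mandatory for short options too.

       -a, --suffix-length=N
              use suffixes of length N (default 2)

       -b, --bytes=SIZE
              put SIZE bytes per output file

       -C, --line-bytes=SIZE
              put at most SIZE bytes of lines per output file

       -d, --numeric-suffixes
              use numeric suffixes instead of alphabetic

       -l, --lines=NUMBER
              put NUMBER lines per output file

       --verbose
              print a diagnostic just before each output file is opened

       --help display this help and exit

       --version
              output version information and exit

       SIZE may be (or may be an integer optionally followed by) one of
       following: KB 1000, K 1024, MB 1000*1000, M 1024*1024, and so on for
       G, T, P, E, Z, Y.

AUTHOR
       Written by Torbjorn Granlund and Richard M. Stallman.

REPORTING BUGS
       Report split bugs to bug-coreutils@gnu.org
       GNU coreutils home page: <http://www.gnu.org/software/coreutils/>
       General help using GNU software: <http://www.gnu.org/gethelp/>
       Report split translation bugs to <http://translationproject.org/team/>

COPYRIGHT
       Copyright © 2010 Free Software Foundation, Inc. License GPLv3+: GNU
       GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
       This is free software: you are free to change and redistribute it.
       There is NO WARRANTY, to the extent permitted by law.

SEE ALSO
       The full documentation for split is maintained as a Texinfo manual.
       If the info and split programs are properly installed at your site, the command :</pre>
(1) Split by line count: split -l 2000 FILE PREFIX
(2) Split by size: split -b 20M FILE PREFIX
Here FILE is the file to split and PREFIX is the prefix given to each piece.
(3) split -l 2482 ../BLM/BLM.txt -d -a 4 BLM_
This splits BLM.txt into several smaller files of 2482 lines each (-l 2482). The pieces are prefixed with BLM_, use numeric rather than alphabetic suffixes (-d), and the suffix is four digits long (-a 4), so they are named BLM_0000, BLM_0001, and so on.

That covers splitting. Now let's look at how to merge several files. For an important large file or data file, you can compute an MD5 checksum before splitting and compute it again after merging. If the two checksums differ, the data was damaged somewhere along the way, for example during transfer, and the pieces can be transferred again:
<pre name="code" class="java">[search@h1 823]$ md5sum a.txt
2dbf68d4aba8dbe6a485293f8464be64  a.txt
[search@h1 823]$ </pre>
Merge with the cat command:
cat *.txt >> total.txt
The shell expands *.txt in sorted name order, so the files are concatenated in that order. Note that if total.txt already exists it matches *.txt itself, and >> appends to it rather than replacing it, so delete it first or write the result to a name the pattern does not match. For pieces produced by split, match on the prefix instead, for example cat BLM_* > BLM.txt.

Finally, a few more useful commands. How do you deduplicate, count, and sort the IP addresses in a log file?
<pre name="code" class="java">cat test.txt | awk '{print $1}' | sort | uniq -c | sort -rn</pre>
Here awk prints the first field of each line (assumed to be the IP address), sort puts identical addresses next to each other so that uniq -c can collapse and count them, and the trailing sort -rn, added here to cover the sorting step, lists the most frequent addresses first.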
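The split, checksum, and merge steps described in this post can be tried end to end with a small script. This is only a sketch under some assumptions: it generates its own 10000-line sample file with seq in a temporary directory instead of using real data, and the names a.txt, part_, and merged.txt are placeholders chosen for illustration. It relies on GNU coreutils (split -d, md5sum).

```shell
#!/bin/sh
# Round trip: split a file, merge the pieces, and verify the MD5 checksum.
set -eu

# Work in a scratch directory so nothing else matches the globs below.
workdir=$(mktemp -d)
cd "$workdir"

# Sample data standing in for the real file: 10000 lines.
seq 1 10000 > a.txt

# Checksum of the original, taken before splitting.
orig_sum=$(md5sum < a.txt | awk '{print $1}')

# 2000 lines per piece, numeric 4-digit suffixes: part_0000, part_0001, ...
split -l 2000 -d -a 4 a.txt part_

# The shell expands part_* in sorted order, which is the order split
# wrote the pieces. The output name is one the glob cannot match.
cat part_* > merged.txt

# Checksum of the merged file; it should equal the original's.
merged_sum=$(md5sum < merged.txt | awk '{print $1}')

if [ "$orig_sum" = "$merged_sum" ]; then
    echo "checksums match"
else
    echo "checksum mismatch: transfer the pieces again" >&2
    exit 1
fi
```

Writing the merged result to a name outside the part_* pattern avoids the situation where the output file is also picked up as one of the inputs.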