Linux performance tuning

stephen80


I have been studying the following tools:
top (including the load average)
vmstat 2 3
iostat
mpstat
ps
pidstat -d 2
pidstat -r
lsof
strace
sar
sar -o sart 10 5000 >/dev/null 2>&1 &
oprofile
ps -el | awk '{ if ($6 > 0) { print $0 } }'

Two good books:
Linux Debugging and Performance Tuning: Tips and Techniques
Performance Tuning for Linux Servers (May 2005)

sudo opcontrol --init
sudo opcontrol --no-vmlinux --event=CPU_CLK_UNHALTED:600000 --image=ad_uni
sudo opcontrol --start
./ad_unittest
opcontrol --dump
opreport --long-filenames --symbols --exclude-dependent --demangle=smart

sudo opcontrol --reset

/var/lib/oprofile/oprofiled.log

1        16.6667 apsara::a_IstreamBuffer::bad()
1        16.6667 apsara::cangjie::CangjieDefinitionDatabase::Register(apsara::cangjie::Message const&)
1        16.6667 apsara::security::SHA1::transform()
1        16.6667 boost::detail::sp_counted_base::weak_release()
1        16.6667 std::_Rb_tree<std::string, std::pair<std::string const, apsara::Any>, std::_Select1st<std::pair<std::string const, apsara::Any> >, std::less<std::string>, std::allocator<std::pair<std::string const, apsara::Any> > >::_M_rightmost()

http://oprofile.sourceforge.net/examples/

Call-graph profiling unsupported on this kernel/hardware.

env.AppendUnique(CCFLAGS = '-pg')
env.AppendUnique(LINKFLAGS = '-pg')
env.AppendUnique(CCFLAGS ='-fno-inline ')
gprof -p ext >myrep

Finding the most suspicious application: Quick Q&A
1. Find the top CPU/memory consumer
This one is easy: the top command is all you need.
#>top
Tasks: 156 total,   5 running, 150 sleeping,   0 stopped,   1 zombie
Cpu(s): 32.7%us, 4.0%sy, 0.0%ni, 63.0%id, 0.2%wa, 0.2%hi, 0.0%si, 0.0%st
Mem:   2074056k total, 2018904k used,    55152k free,   205852k buffers
Swap: 2000052k total,        0k used, 2000052k free, 1155496k cached

PID USER      PR NI VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
20820 yhe       20   0 2641m 61m 16m S   54 3.0   1:31.32 Picasa3.exe
6908 yhe       20   0 334m 145m 31m R    8 7.2   5:26.29 firefox
18716 yhe       20   0 98652 56m 9652 S    6 2.8   1:13.62 wish8.5

2. Find the most disk-hungry process
This takes a bit more effort and needs pidstat. The run below reports every 3 seconds.

#>pidstat 3 -d
Linux 2.6.24-23-generic (yhe-laptop)     03/11/2009

03:12:15 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s Command
03:12:18 PM      4575      0.00      6.62      0.00 kjournald
03:12:18 PM     20233    683.44   3197.35      0.00 insight3.exe

03:12:18 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s Command
03:12:21 PM      4575      0.00      3.99      0.00 kjournald
03:12:21 PM     20233      0.00   4433.22      0.00 insight3.exe
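
When pidstat runs for a while, the per-interval reports get long, so it can help to average them per process. The following is only a sketch: disk_hogs is a made-up helper name, and it assumes the column layout shown above (PID, kB_rd/s, kB_wr/s, kB_ccwr/s, Command as the last five columns). Newer sysstat versions add columns, in which case the offsets would need adjusting.

```shell
# disk_hogs is a hypothetical helper, not part of sysstat.
# Reads `pidstat -d` text on stdin and prints, per process:
#   PID Command avg_kB_rd/s avg_kB_wr/s
# sorted by average write rate, highest first.
# Assumes the last five columns are: PID kB_rd/s kB_wr/s kB_ccwr/s Command.
disk_hogs() {
    awk '
        $1 == "Average:" { next }            # skip the summary rows
        NF >= 5 && $(NF-4) ~ /^[0-9]+$/ {    # data rows have a numeric PID
            key = $(NF-4) " " $NF
            rd[key] += $(NF-3)
            wr[key] += $(NF-2)
            n[key]++
        }
        END {
            for (k in n)
                printf "%s %.1f %.1f\n", k, rd[k] / n[k], wr[k] / n[k]
        }
    ' | sort -k4,4nr
}
```

A run might look like: pidstat -d 3 5 | disk_hogs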

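3. Find the process using the most network bandwidth
This cannot be done in one step, so split it into two.
a) Use iptraf to monitor network traffic and find the IP address and port using the most bandwidth. sudo iftop -N -P -B may be more convenient, though.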
#>sudo iftop -N -P -B
yhe-laptop.local:55539            => 58.211.78.242:80                   502B    329B    480B

                                  <=                                   12.5KB 10.4KB 17.8KB
A tip: pressing o freezes the current sort order. The three rightmost columns are the average rates over the last 2, 10 and 40 seconds.

b) Use netstat -t -u -p to find the TCP or UDP process that owns this port.
#>netstat -t -u -p
tcp        0      0 yhe-laptop.local:55539 58.211.78.242:www       ESTABLISHED 24257/firefox
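
Step b) can be scripted so that you only pass in the port found by iftop. This is a rough sketch: port_owner is a made-up name, and it assumes the netstat layout shown above, with the local address in the fourth column and PID/program in the last one.

```shell
# port_owner is a hypothetical helper, not part of net-tools.
# Reads `netstat -t -u -p` text on stdin and prints the PID/program
# of every socket whose local port equals the first argument.
port_owner() {
    awk -v port="$1" '
        $1 ~ /^(tcp|udp)/ {
            # The local address is host:port; the port is the last piece.
            n = split($4, parts, ":")
            if (parts[n] == port) print $NF
        }
    ' | sort -u
}
```

A run might look like: sudo netstat -t -u -p | port_owner 55539. Without -n, netstat may print well-known ports as service names, so either pass the name or add -n.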

4. Dump core from a running process
That is what gcore is for:
# gcore $PID

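5. Process parent-child relationships
Here is a way to show the process hierarchy. The command is very long. Both it and the previous tip come from http://blog.chinaunix.net/u/19651/showart_362880.html:
ps -eo user,pid,ppid,%cpu,%mem,vsz,rss,tty,stat,start,time,wchan,command --forest
I think ps -eo user,pid,ppid,tty,stat,wchan,command --forest is enough.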
root      6402     1 ?        Ss   -      /usr/sbin/gdm
root      6405 6402 ?        S    -       \_ /usr/sbin/gdm
root      6407 6405 tty7     RLs+ -           \_ /usr/bin/X :0 -br -audit 0 -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7
yhe       6773 6405 ?        Ssl -           \_ /usr/bin/gnome-session
yhe       6859 6773 ?        Z    -               \_ [fcitx] <defunct>
yhe       6867 6773 ?        Ss   -               \_ /usr/bin/seahorse-agent --execute /usr/bin/gno.
yhe       6872 6773 ?        Sl   -               \_ gnome-settings-daemon
yhe       6876 6872 ?        Sl   -               |   \_ /usr/bin/pulseaudio --log-target=syslog
yhe       6879 6876 ?        S    -               |       \_ /usr/lib/pulseaudio/pulse/gconf-helper
yhe       6896 6773 ?        S    -               \_ metacity --sm-client-id ...
yhe       6897 6773 ?        S    -               \_ nautilus --sm-config-prefix /nautilus-igAnc9/
yhe       6898 6773 ?        S    -               \_ gnome-panel --sm-config-prefix /gnome-panel-1s3
yhe       6906 6773 ?        S    -               \_ update-notifier --sm-config-prefix /update-noti
yhe       6910 6773 ?        Sl   -               \_ stardict --sm-config-prefix /stardict-FFPTD9/ -
yhe       6913 6773 ?        S    -               \_ bluetooth-applet --singleton
yhe       6917 6773 ?        S    -               \_ tracker-applet
yhe       6919 6773 ?        Sl   -               \_ /usr/lib/evolution/2.22/evolution-alarm-notify
yhe       6921 6773 ?        SNl -               \_ trackerd
yhe       6924 6773 ?        S    -               \_ python /usr/share/system-config-printer/apple
yhe       6925 6773 ?        S    -               \_ nm-applet --sm-disable

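Kernel resource and load analysis
1. Interrupts
Find the most frequent interrupt source right now, sampling once per second for 100 samples. sar is one of the tools in the sysstat package.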
#>sar -I ALL 1 100
04:11:42 PM      INTR    intr/s
04:11:43 PM         0    532.35
04:11:43 PM         1      0.00
04:11:43 PM         2      0.00
04:11:43 PM         3      0.00
04:11:43 PM         4      0.00
04:11:43 PM         5      0.00
04:11:43 PM         6      0.00
04:11:43 PM         7      0.00
04:11:43 PM         8      0.00
04:11:43 PM         9      0.00
04:11:43 PM        10      0.00
04:11:43 PM        11      0.00
04:11:43 PM        12      0.00
04:11:43 PM        13      0.00
04:11:43 PM        14      0.00
04:11:43 PM        15      0.00
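
With 100 samples of every interrupt line, picking out the busiest source by eye gets tedious. Below is a small sketch that totals intr/s per interrupt and prints the largest. busiest_irq is a made-up name, and the script assumes the interrupt number sits in the column just left of intr/s, as in the output above.

```shell
# busiest_irq is a hypothetical helper, not part of sysstat.
# Reads `sar -I ALL` text on stdin and prints the interrupt number with
# the highest total intr/s, followed by that total.
busiest_irq() {
    awk '
        $1 == "Average:" { next }             # skip the summary rows
        {
            # Header row: remember which column holds intr/s.
            for (i = 1; i <= NF; i++)
                if ($i == "intr/s") { col = i; next }
        }
        # Data row: the interrupt number is just left of the rate.
        # The "sum" row is skipped so it cannot win.
        col > 1 && $col ~ /^[0-9.]+$/ && $(col - 1) != "sum" {
            total[$(col - 1)] += $col
        }
        END { for (irq in total) printf "%s %.2f\n", irq, total[irq] }
    ' | sort -k2,2nr | head -n 1
}
```

A run might look like: sar -I ALL 1 100 | busiest_irq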

2. CPU time breakdown
$mpstat 1
04:37:31 PM CPU   %user   %nice    %sys %iowait    %irq   %soft %steal   %idle    intr/s
04:37:32 PM all    0.97    0.00    0.00    0.00    0.00    0.00    0.00   99.03    241.00
04:37:33 PM all    1.43    0.00    0.00    0.00    0.00    0.00    0.00   98.57    230.00
04:37:34 PM all    1.44    0.00    0.00    0.00    0.00    0.00    0.00   98.56    238.00
04:37:35 PM all    0.95    0.00    0.00    0.00    0.00    0.00    0.00   99.05    209.00
04:37:36 PM all    1.96    0.00    1.96    0.00    0.00    0.00    0.00   96.08    355.00
04:37:37 PM all   12.94    0.00    6.47    0.00    0.00    0.00    0.00   80.60    364.00
04:37:38 PM all   11.71    0.00    5.37    0.00    0.00    0.00    0.00   82.93    488.00
04:37:39 PM all   11.65    0.00    6.31    0.00    0.00    0.00    0.00   82.04    431.00
04:37:40 PM all   14.35    0.00    7.18    0.00    0.00    0.00    0.00   78.47    393.00
04:37:41 PM all   10.24    0.00    8.78    0.00    0.00    0.00    0.00   80.98    298.00
04:37:42 PM all   16.75    0.00    6.70    0.00    0.00    0.48    0.00   76.08    404.00
04:37:43 PM all   11.16    0.00    9.30    1.40    0.00    0.00    0.00   78.14    441.00
04:37:44 PM all   32.70    0.00    9.95    0.00    0.47    0.47    0.00   56.40    511.00
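
To reduce a run like this to one number, you can average the non-idle share over all samples. This is only a sketch under assumptions: avg_busy is a made-up name, the input contains only the "all" rows (no -P ALL), and the %idle column is located from the header row because its position differs between sysstat versions.

```shell
# avg_busy is a hypothetical helper, not part of sysstat.
# Reads `mpstat <interval>` text on stdin and prints the average of
# (100 - %idle) over all sample rows.
avg_busy() {
    awk '
        $1 == "Average:" { next }             # skip the summary row
        {
            # Header row: remember which column holds %idle.
            for (i = 1; i <= NF; i++)
                if ($i == "%idle") { col = i; next }
        }
        col && $col ~ /^[0-9.]+$/ { busy += 100 - $col; n++ }
        END { if (n) printf "%.2f\n", busy / n }
    '
}
```

A run might look like: mpstat 1 10 | avg_busy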

3. Kernel slab allocator memory usage

The -s option selects the sort criterion.

$slabtop -d 1 # refresh once per second
Active / Total Objects (% used)    : 426706 / 590817 (72.2%)
Active / Total Slabs (% used)      : 32772 / 32772 (100.0%)
Active / Total Caches (% used)     : 53 / 62 (85.5%)
Active / Total Size (% used)       : 110749.59K / 127158.79K (87.1%)
Minimum / Average / Maximum Object : 0.01K / 0.21K / 8.00K

OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
183960 49946 27%    0.05K   2520       73     10080K buffer_head
174048 162193 93%    0.48K 21756        8     87024K ext3_inode_cache
127530 127503 99%    0.13K   4251       30     17004K dentry
22269 13321 59%    0.29K   1713       13      6852K radix_tree_node
14280 14280 100%    0.05K    168       85       672K sysfs_dir_cache
11868 10915 91%    0.09K    258       46      1032K vm_area_struct
9984   6365 63%    0.06K    156       64       624K kmalloc-64
7560   6745 89%    0.19K    360       21      1440K kmalloc-192
6656   6642 99%    0.01K     13      512        52K kmalloc-8
5632   4090 72%    0.02K     22      256        88K kmalloc-16
5632   5022 89%    0.02K     22      256        88K anon_vma
3258   3230 99%    0.43K    362        9      1448K shmem_inode_cache
3091   3043 98%    0.34K    281       11      1124K proc_inode_cach

4. Virtual memory and disk load
vmstat 1 reports memory, swap, block I/O and CPU load once per second. Another useful form is:
~$ vmstat -s 1
      2074056 K total memory
      2002148 K used memory
      1244268 K active memory
       592188 K inactive memory
        71908 K free memory
        87896 K buffer memory
      1034620 K swap cache
      2000052 K total swap
        19824 K used swap
      1980228 K free swap
      1366692 non-nice user cpu ticks
        12341 nice user cpu ticks
       375984 system cpu ticks
     18453503 idle cpu ticks
       137706 IO-wait cpu ticks
         4124 IRQ cpu ticks
         7955 softirq cpu ticks
            0 stolen cpu ticks
      3495648 pages paged in
      6360532 pages paged out
           32 pages swapped in
         4960 pages swapped out
     42464919 interrupts
    180937216 CPU context switches
   1236748696 boot time
       100148 forks
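
The vmstat -s counters are easy to post-process because each line is a number followed by a label. As a small sketch (mem_free_pct is a made-up name), the following prints free memory as a percentage of total memory. Note that free memory does not include buffers and cache, so a low value alone does not mean the machine is short of memory.

```shell
# mem_free_pct is a hypothetical helper, not part of procps.
# Reads `vmstat -s` text on stdin and prints free memory as a
# percentage of total memory, with one decimal place.
mem_free_pct() {
    awk '
        /total memory/ { total = $1 }
        /free memory/  { free = $1 }
        END { if (total > 0) printf "%.1f\n", 100 * free / total }
    '
}
```

A run might look like: vmstat -s | mem_free_pct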

5. System I/O load
The key tool is iostat: -d for disks, -n for NFS, -k to report in kilobytes, and -x for extended statistics.
$iostat -k -d 1 -x
Linux 2.6.24-23-generic (yhe-laptop)     03/12/2009

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await svctm %util
sda               0.48    10.40    2.21    5.76    35.31    64.66    25.08     0.20   24.87   1.82   1.45

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await svctm %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
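
Keep in mind that the first iostat report covers the time since boot, and only the later ones cover each interval. To flag busy devices automatically, a sketch like the one below may do. busy_disks is a made-up name, the default threshold of 80 is an arbitrary assumption, and the %util column is located from the header row.

```shell
# busy_disks is a hypothetical helper, not part of sysstat.
# Reads `iostat -d -x` text on stdin and prints "device %util" for every
# row whose %util is at or above the threshold (first argument, default 80).
busy_disks() {
    awk -v limit="${1:-80}" '
        $1 == "Device:" || $1 == "Device" {
            # Header row: remember which column holds %util.
            for (i = 1; i <= NF; i++)
                if ($i == "%util") col = i
            next
        }
        col && NF >= col && $col ~ /^[0-9.]+$/ && ($col + 0) >= (limit + 0) {
            print $1, $col
        }
    '
}
```

A run might look like: iostat -k -d -x 1 | busy_disks 90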

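6. sar, the all-in-one tool
Basic usage is sar 1 100 option: one sample per second, 100 samples. It can report: I/O (-b), paging (-B), task creation rate (-c), block devices (-d), interrupts (-I), network (-n), run queue (-q), CPU (-P), memory resources (-r), memory load (-R), inodes (-v), swap load (-W), scheduling load (-w), and TTY load (-y).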

$sar 1 10 -R
Linux 2.6.24-23-generic (yhe-laptop)     03/12/2009

05:07:31 PM   frmpg/s   bufpg/s   campg/s
05:07:32 PM     -4.00      1.00     -1.00
05:07:33 PM      0.00      0.00      0.99
05:07:34 PM      0.00      1.00     -1.00
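
Process behavior analysis
The first tool to mention is pidstat. Use -p to select one process, or leave it out to find the active processes. It mainly reports disk usage (-d), page faults and memory usage (-r), CPU usage (-u), and context switches (-w). You can also use -T CHILD to monitor a process together with all of its children. For everything else, see the man page.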
$pidstat 1 -r
Linux 2.6.24-23-generic (yhe-laptop)     03/12/2009

03:26:35 PM       PID minflt/s majflt/s     VSZ    RSS   %MEM Command
03:26:36 PM      4752      1.90      0.00    1836    556   0.03 portmap
03:26:36 PM      6897      0.95      0.00 134332 63824   3.08 nautilus
03:26:36 PM     24257   1698.10      0.00 463172 183700   8.86 firefox
03:26:36 PM     29655      2.86      0.00    7928   4136   0.20 bash
03:26:36 PM     32035    302.86      0.00    2016    756   0.04 pidstat

$pidstat 1 -w
03:27:17 PM       PID   cswch/s nvcswch/s Command
03:27:18 PM         3      2.00      0.00 migration/0
03:27:18 PM         4      6.00      0.00 ksoftirqd/0
03:27:18 PM         6      2.00      0.00 migration/1
03:27:18 PM         7     10.00      0.00 ksoftirqd/1
03:27:18 PM         9      1.00      0.00 events/0
03:27:18 PM        10      1.00      0.00 events/1
03:27:18 PM      1578      7.00      0.00 ata/0
03:27:18 PM      1581      8.00      0.00 ata/1
03:27:18 PM      2450     14.00      0.00 scsi_eh_3
03:27:18 PM      5243     12.00      4.00 kondemand/0
03:27:18 PM      6016      1.00      0.00 ntpd
03:27:18 PM      6033      1.00      0.00 dhcdbd
03:27:18 PM      6182      9.00      0.00 hald-addon-stor
03:27:18 PM      6407     42.00      2.00 Xorg
03:27:18 PM      6863      4.00      0.00 fcitx
03:27:18 PM      6872      2.00      0.00 gnome-settings-
03:27:18 PM      6892      2.00      0.00 gnome-screensav
03:27:18 PM      6897     10.00      0.00 nautilus
03:27:18 PM      6910      7.00      0.00 stardict
03:27:18 PM      6925      2.00      0.00 nm-applet
03:27:18 PM      6935      2.00      0.00 gnome-power-man
03:27:18 PM      7104      1.00      0.00 cpufreq-applet
03:27:18 PM      7110      1.00      3.00 gvfsd-trash
03:27:18 PM      7121      3.00      0.00 netspeed_applet
03:27:18 PM      7693     26.00      0.00 xchat
03:27:18 PM     18716     18.00     22.00 wish8.5
03:27:18 PM     24257    166.00      6.00 firefox
03:27:18 PM     26607     21.00      0.00 bandwidthd
03:27:18 PM     26608     23.00      0.00 bandwidthd
03:27:18 PM     26609     23.00      0.00 bandwidthd
03:27:18 PM     26610     23.00      0.00 bandwidthd
03:27:18 PM     27102      1.00      0.00 soffice.bin
03:27:18 PM     29651     48.00      0.00 gnome-terminal
03:27:18 PM     32056      1.00     33.00 pidstat

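Another important tool is strace, which traces a process's system calls and can also summarize them:
$strace -p 24257 -c    # attach to an already-running process and count its system calls over a period of time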
Process 24257 attached - interrupt to quit
Process 24257 detached
% time     seconds usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
73.97    0.004445           3      1541           poll
13.26    0.000797           0      9515           gettimeofday
12.76    0.000767           0      2028      1542 read
0.00    0.000000           0         1           restart_syscall
0.00    0.000000           0       230           write
0.00    0.000000           0         2           select
0.00    0.000000           0         1           writev
0.00    0.000000           0        35         1 futex
0.00    0.000000           0         2         2 inotify_add_watch
------ ----------- ----------- --------- --------- ----------------
100.00    0.006009                 13355      1545 total
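
In the summary above, read fails 1542 times out of 2028 calls, which is easy to miss in a longer table. As a rough sketch (syscall_error_rates is a made-up name), the following lists only the system calls that reported errors, with their failure rate. It relies on the layout shown above, where rows with an errors value have six fields and rows without have five.

```shell
# syscall_error_rates is a hypothetical helper, not part of strace.
# Reads `strace -c` summary text on stdin and prints, for each syscall
# that reported errors: name errors/calls percentage
syscall_error_rates() {
    awk '
        # Rows with an errors value have 6 fields:
        #   %time seconds usecs/call calls errors syscall
        NF == 6 && $1 ~ /^[0-9.]+$/ && $6 != "total" && $4 > 0 {
            printf "%s %d/%d %.1f%%\n", $6, $5, $4, 100 * $5 / $4
        }
    '
}
```

Since strace -c writes its summary to stderr, a run might look like: strace -c -p 24257 2>&1 | syscall_error_rates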

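A few more notes: -T shows the time spent in each system call, -v prints the full gory details, and -e roughly selects which class of system calls to trace, for example -e trace=ipc. See the man page for more.

Next comes ltrace, which traces library calls. It is the counterpart of strace with much the same usage, including -T and -c, so the details are not listed here.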

From: http://blog.chinaunix.net/u2/79526/showart_1879044.html


2010-01-19 15:23