1 Installing perf
[root@www.cndba.cn ~]# yum install perf
Loaded plugins: product-id, refresh-packagekit, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Setting up Install Process
cndba.cn                                                  | 4.1 kB     00:00 ...
cndba.cn/primary_db                                       | 3.1 MB     00:00 ...
Resolving Dependencies
--> Running transaction check
---> Package perf.x86_64 0:2.6.32-573.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

========================================================================================================
 Package          Arch             Version                    Repository          Size
========================================================================================================
Installing:
 perf             x86_64           2.6.32-573.el6             cndba.cn            4.0 M

Transaction Summary
========================================================================================================
Install       1 Package(s)

Total download size: 4.0 M
Installed size: 2.2 M
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Warning: RPMDB altered outside of yum.
  Installing : perf-2.6.32-573.el6.x86_64                                   1/1
  Verifying  : perf-2.6.32-573.el6.x86_64                                   1/1

Installed:
  perf.x86_64 0:2.6.32-573.el6

Complete!
[root@www.cndba.cn ~]#
2 Command overview
NAME
       perf - Performance analysis tools for Linux

SYNOPSIS
       perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

OPTIONS
       --debug
           Setup debug variable (see list below) in value range (0, 10). Use like:
               --debug verbose      # sets verbose = 1
               --debug verbose=2    # sets verbose = 2

           List of debug variables allowed to set:
               verbose          - general debug messages
               ordered-events   - ordered events object debug messages
               data-convert     - data convert command debug messages

       --buildid-dir
           Setup buildid cache directory. It has higher priority than buildid.dir config file option.

DESCRIPTION
       Performance counters for Linux are a new kernel-based subsystem that provide a framework for all things performance analysis. It covers hardware level (CPU/PMU, Performance Monitoring Unit) features and software features (software counters, tracepoints) as well.

SEE ALSO
       perf-stat(1), perf-top(1), perf-record(1), perf-report(1), perf-list(1)

Perf is a performance profiling tool that ships in the Linux kernel source tree. It works by sampling performance events, and it can profile both processor-level metrics and operating-system-level metrics.
It is commonly used to find performance bottlenecks and to locate hot code.
In the version shown here, perf is a toolset of 23 subcommands. The five most commonly used are:
perf-list
perf-stat
perf-top
perf-record
perf-report
[root@www.cndba.cn ~]# perf

 usage: perf [--version] [--help] COMMAND [ARGS]

 The most commonly used perf commands are:
   annotate        Read perf.data (created by perf record) and display annotated code
   archive         Create archive with object files with build-ids found in perf.data file
   bench           General framework for benchmark suites
   buildid-cache   Manage build-id cache.
   buildid-list    List the buildids in a perf.data file
   diff            Read perf.data files and display the differential profile
   evlist          List the event names in a perf.data file
   inject          Filter to augment the events stream with additional information
   kmem            Tool to trace/measure kernel memory(slab) properties
   kvm             Tool to trace/measure kvm guest os
   list            List all symbolic event types
   lock            Analyze lock events
   mem             Profile memory accesses
   record          Run a command and record its profile into perf.data
   report          Read perf.data (created by perf record) and display the profile
   sched           Tool to trace/measure scheduler properties (latencies)
   script          Read perf.data (created by perf record) and display trace output
   stat            Run a command and gather performance counter statistics
   test            Runs sanity tests.
   timechart       Tool to visualize total system behavior during a workload
   top             System profiling tool.
   trace           strace inspired tool
   probe           Define new dynamic tracepoints

 See 'perf help COMMAND' for more information on a specific command.
2.1 Perf list
perf list shows the performance events that perf supports, both software and hardware.
NAME
       perf-list - List all symbolic event types

SYNOPSIS
       perf list [hw|sw|cache|tracepoint|pmu|event_glob]

DESCRIPTION
       This command displays the symbolic event types which can be selected in the various perf commands with the -e option.

[root@www.cndba.cn ~]# perf list sw
  cpu-clock                                          [Software event]
  task-clock                                         [Software event]
  page-faults OR faults                              [Software event]
  context-switches OR cs                             [Software event]
  cpu-migrations OR migrations                       [Software event]
  minor-faults                                       [Software event]
  major-faults                                       [Software event]
  alignment-faults                                   [Software event]
  emulation-faults                                   [Software event]
  dummy                                              [Software event]
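Each line of the perf list output has the form "name [OR alias] [type]", so it is easy to turn into structured data when a script needs to check whether an event is available. The following is a minimal Python sketch; the function name parse_perf_list and the embedded SAMPLE text (copied from the listing above) are illustrative assumptions, not part of perf.

```python
import re

# A few lines copied from the `perf list sw` output shown above.
SAMPLE = """\
  cpu-clock                                          [Software event]
  task-clock                                         [Software event]
  page-faults OR faults                              [Software event]
  context-switches OR cs                             [Software event]
"""

# "<name> [OR <alias>]   [<event type>]"
LINE_RE = re.compile(r"^\s*(?P<names>.+?)\s+\[(?P<kind>[^\]]+)\]\s*$")


def parse_perf_list(text):
    """Return one dict per event line: primary name, aliases, event type."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank lines and anything that is not an event row
        names = [n.strip() for n in match.group("names").split(" OR ")]
        events.append(
            {"name": names[0], "aliases": names[1:], "kind": match.group("kind")}
        )
    return events


events = parse_perf_list(SAMPLE)
for event in events:
    print(event["name"], event["aliases"], event["kind"])
```

Either the primary name or an alias can be passed to -e, so a script could accept both when matching user input.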
2.2 Perf top
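For a given performance event (CPU cycles by default), perf top shows the functions or instructions that consume the most.
NAME
       perf-top - System profiling tool.

SYNOPSIS
       perf top [-e <EVENT> | --event=EVENT] [<options>]

DESCRIPTION
       This command generates and displays a performance counter profile in real time.

perf top is mainly used to analyze, in real time, how hot each function is for a given performance event. It quickly locates hot functions, whether they are application functions, module functions, or kernel functions, and it can even pinpoint hot instructions. The default performance event is CPU cycles.
Output format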
[root@www.cndba.cn /]# perf top
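Column 1: the share of the performance event attributed to the symbol; by default, its share of CPU cycles.
Column 2: the DSO (Dynamic Shared Object) that contains the symbol, which can be an application, the kernel, a shared library, or a module.
Column 3: the DSO type. [.] means the symbol belongs to a user-space ELF file (an executable or a shared library). [k] means the symbol belongs to the kernel or a module.
Column 4: the symbol name. Some symbols cannot be resolved to a function name and are shown as addresses.
Press ? to see more help.
Common interactive commands
h: show help.
UP/DOWN/PGUP/PGDN/SPACE: move up and down, and page.
a: annotate the current symbol. Shows annotated assembly with the sample percentage of each instruction.
d: filter out all symbols that do not belong to this DSO. Handy for viewing symbols of the same category.
P: save the current view to perf.hist.N.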
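Common command-line options
-e <event>: the performance event to analyze.
-p <pid>: Profile events on existing Process ID (comma separated list). Only the target process and the threads it creates are analyzed.
-k <path>: Path to vmlinux. Required for annotation functionality. This is the path to a kernel image with a symbol table.
-K: hide symbols that belong to the kernel or modules.
-U: hide symbols that belong to user-space programs.
-d <n>: refresh interval of the display. The default is 2s, because perf top reads performance data from the mmap'ed memory region every 2s by default.
-G: show the call graph of each function.
    perf top -G [fractal]: path percentages are relative and add up to 100%; the call order is listed bottom-up.
    perf top -G graph: path percentages are absolute and add up to the overhead of the function itself.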
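The difference between the fractal and graph modes of -G is only a matter of normalization, which a small worked example makes concrete. The Python sketch below is illustrative: the function name callchain_percentages, the call paths, and the sample counts are made-up values, not output from perf.

```python
def callchain_percentages(symbol_overhead, path_samples):
    """Compute per-path percentages for one symbol in both display modes.

    symbol_overhead: the symbol's overall overhead in percent (column 1).
    path_samples: mapping of call path -> number of samples on that path.
    """
    total = sum(path_samples.values())
    # fractal: relative to the symbol itself, so the paths add up to 100%.
    fractal = {path: 100.0 * n / total for path, n in path_samples.items()}
    # graph: absolute, so the paths add up to the symbol's own overhead.
    graph = {path: symbol_overhead * n / total for path, n in path_samples.items()}
    return fractal, graph


# Hypothetical numbers: kfree accounts for 40% of all samples and is
# reached through two different call paths.
fractal, graph = callchain_percentages(
    40.0,
    {"main -> parse -> kfree": 300, "main -> cleanup -> kfree": 100},
)
print(fractal)  # relative shares of the two paths
print(graph)    # absolute shares of the two paths
```

With these numbers the first path is 75% in fractal mode and 30% in graph mode: the same samples, divided by a different total.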
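Examples
# perf top                         # default configuration
# perf top -g                      # show the call graph
# perf top -e cycles               # select the performance event
# perf top -p 5694,5908            # show CPU cycle usage of these two processes
# perf top -s comm,pid,symbol      # show the process name and PID that called each symbol
# perf top --comms nginx,top       # show only symbols that belong to the given processes
# perf top --symbols kfree         # show only the given symbol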
2.3 Perf stat
perf stat gives a performance overview of a given program.
NAME
       perf-stat - Run a command and gather performance counter statistics

SYNOPSIS
       perf stat [-e <EVENT> | --event=EVENT] [-a] <command>
       perf stat [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]

DESCRIPTION
       This command runs a command and gathers performance counter statistics from it.

Output format
[oracle@www.cndba.cn ~]$ perf stat lsnrctl

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 25-OCT-2016 23:28:13

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Welcome to LSNRCTL, type "help" for information.

LSNRCTL> start
Starting /u01/app/oracle/product/12.1.0/db_1/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 12.1.0.2.0 - Production
System parameter file is /u01/app/oracle/product/12.1.0/db_1/network/admin/listener.ora
Log messages written to /u01/app/oracle/diag/tnslsnr/Dave/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=Dave)(PORT=1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=Dave)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date                25-OCT-2016 23:28:16
Uptime                    0 days 0 hr. 0 min. 5 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/12.1.0/db_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/Dave/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=Dave)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
The listener supports no services
The command completed successfully
LSNRCTL> exit

 Performance counter stats for 'lsnrctl':

         40.052786      task-clock (msec)         #    0.004 CPUs utilized
               100      context-switches          #    0.002 M/sec
                 0      cpu-migrations            #    0.000 K/sec
             6,821      page-faults               #    0.170 M/sec
   <not supported>      cycles
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
   <not supported>      instructions
   <not supported>      branches
   <not supported>      branch-misses

      10.956606982 seconds time elapsed

[oracle@www.cndba.cn ~]$
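The output contains the elapsed time of lsnrctl and statistics for 10 performance events. On this machine the six hardware events are reported as <not supported>, so only the four software counters carry values.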
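When the counters need to be processed further, the human-readable summary can be parsed with a regular expression. The following Python sketch is one way to do it, assuming the layout shown in the listing above (a value or <not supported>, followed by the event name); parse_perf_stat and SAMPLE are illustrative names, and other perf versions may format the summary differently. Note that perf stat writes this summary to standard error.

```python
import re

# Counter summary copied from the lsnrctl run above (two of the six
# unsupported hardware events are kept for brevity).
SAMPLE = """\
 Performance counter stats for 'lsnrctl':

         40.052786      task-clock (msec)         #    0.004 CPUs utilized
               100      context-switches          #    0.002 M/sec
                 0      cpu-migrations            #    0.000 K/sec
             6,821      page-faults               #    0.170 M/sec
   <not supported>      cycles
   <not supported>      instructions

      10.956606982 seconds time elapsed
"""

# A counter row is "<value> <event-name> ..."; the value is either a
# number with optional thousands separators or the literal <not supported>.
COUNTER_RE = re.compile(
    r"^\s*(?P<value><not supported>|[\d,.]+)\s+(?P<event>[\w-]+)"
)


def parse_perf_stat(text):
    """Return ({event: float or None}, elapsed_seconds)."""
    counters = {}
    elapsed = None
    for line in text.splitlines():
        match = COUNTER_RE.match(line)
        if not match:
            continue  # header and blank lines
        value, event = match.group("value"), match.group("event")
        if event == "seconds":  # the "... seconds time elapsed" line
            elapsed = float(value)
        elif value == "<not supported>":
            counters[event] = None
        else:
            counters[event] = float(value.replace(",", ""))
    return counters, elapsed


counters, elapsed = parse_perf_stat(SAMPLE)
print(counters)
print(elapsed)
```

Unsupported events are kept in the result as None rather than dropped, so a caller can tell "not measured" apart from "measured as zero".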
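task-clock: the processor time the task actually consumed, in ms. CPUs utilized = task-clock / time elapsed, which is the CPU utilization. In the run above, 40.05 ms / 10.957 s ≈ 0.004.
context-switches: the number of context switches.
cpu-migrations: the number of processor migrations. To keep the load balanced across processors, Linux moves a task from one CPU to another under certain conditions.
page-faults: the number of page faults. A page fault is raised when the requested page has not been allocated yet, when the page is not in memory, or when the page is in memory but the mapping between its physical and virtual addresses has not been established. An access that violates the page permissions also raises a page fault. A TLB miss by itself does not: it becomes a page fault only if the subsequent page-table lookup finds no valid mapping.
cycles: the number of processor cycles consumed.
stalled-cycles-frontend: not covered here.
stalled-cycles-backend: not covered here.
instructions: the number of instructions executed. IPC (instructions per cycle) = instructions / cycles, the average number of instructions executed per CPU cycle.
branches: the number of branch instructions encountered. branch-misses is the number of mispredicted branches.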
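The derived figures in the comment column can be reproduced by hand. The Python sketch below recomputes them from the raw values of the lsnrctl run; the function name derived_metrics is an illustrative assumption, and dividing the event counts by task-clock time (rather than by elapsed time) is inferred from the sample numbers, which it matches.

```python
def derived_metrics(task_clock_ms, elapsed_s, context_switches, page_faults,
                    instructions=None, cycles=None):
    """Recompute the derived columns of a perf stat summary."""
    cpu_seconds = task_clock_ms / 1000.0
    metrics = {
        # CPUs utilized = task-clock / time elapsed
        "cpus_utilized": cpu_seconds / elapsed_s,
        # Rates are events per second of task-clock (CPU) time.
        "context_switches_per_sec": context_switches / cpu_seconds,
        "page_faults_per_sec": page_faults / cpu_seconds,
    }
    # IPC needs both hardware counters; they were <not supported> above.
    if instructions is not None and cycles:
        metrics["ipc"] = instructions / cycles
    return metrics


# Raw values taken from the lsnrctl run shown earlier.
m = derived_metrics(
    task_clock_ms=40.052786,
    elapsed_s=10.956606982,
    context_switches=100,
    page_faults=6821,
)
print(round(m["cpus_utilized"], 3))                   # CPUs utilized
print(round(m["context_switches_per_sec"] / 1e6, 3))  # M/sec
print(round(m["page_faults_per_sec"] / 1e6, 3))       # M/sec
```

The three printed values round to 0.004, 0.002, and 0.17, matching the "CPUs utilized" and "M/sec" figures in the listing.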
Common options
-p: stat events on existing process id (comma separated list). Only the target process and the threads it creates are analyzed.
-a: system-wide collection from all CPUs.
-r: repeat command and print average + stddev (max: 100). Runs the command repeatedly and averages the results.
-C: Count only on the list of CPUs provided (comma separated list).
-v: be more verbose (show counter open errors, etc).
-n: null run - don't start any counters. Only the execution time of the task is shown.
-x SEP: the separator to use between output columns.
-o file: the output file; --append selects append mode.
--pre <cmd>: a program to run before the target program.
--post <cmd>: a program to run after the target program.
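Examples
Run the program 10 times and report the averages together with the deviation relative to the mean:
# perf stat -r 10 ls > /dev/null
Show more detailed information:
# perf stat -v ls > /dev/null
Show only the execution time of the task, without performance counters:
# perf stat -n ls > /dev/null
Report each CPU separately:
# perf stat -a -A ls > /dev/null
Count how many system calls the ls command makes (on newer kernels this tracepoint is named raw_syscalls:sys_enter; check perf list tracepoint for the name on your system):
# perf stat -e syscalls:sys_enter ls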
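To make the "average plus deviation" that -r reports more concrete, the Python sketch below computes the mean of several runs and the deviation as a percentage of the mean. The run times are hypothetical values chosen for easy arithmetic. Two ratios are shown because the exact statistic is an assumption here: the text above describes it as standard deviation over mean, while my understanding is that recent perf sources divide the standard error of the mean by the mean.

```python
import math
import statistics


def summarize_runs(values):
    """Mean of repeated measurements plus relative deviation in percent."""
    mean = statistics.mean(values)
    stddev = statistics.stdev(values)        # sample standard deviation (n - 1)
    stderr = stddev / math.sqrt(len(values)) # standard error of the mean
    return {
        "mean": mean,
        "stddev_pct": 100.0 * stddev / mean,
        "stderr_pct": 100.0 * stderr / mean,
    }


# Hypothetical elapsed times (seconds) of three repeated runs.
summary = summarize_runs([1.0, 2.0, 3.0])
print(summary)
```

Whichever ratio a given perf version prints, a large percentage means the runs disagree with each other and more repetitions (a higher -r) are needed before the averages can be trusted.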
2.4 Perf record
perf record collects samples and writes them to a data file.
The data file can then be analyzed with another tool (perf-report); the result is similar to that of perf-top.
NAME
       perf-record - Run a command and record its profile into perf.data

SYNOPSIS
       perf record [-e <EVENT> | --event=EVENT] [-l] [-a] <command>
       perf record [-e <EVENT> | --event=EVENT] [-l] [-a] -- <command> [<options>]

DESCRIPTION
       This command runs a command and gathers a performance counter profile from it, into perf.data - without displaying anything. This file can then be inspected later on, using perf report.
Common options
-e: Select the PMU event.
-a: System-wide collection from all CPUs.
-p: Record events on existing process ID (comma separated list).
-A: Append to the output file to do incremental profiling.
-f: Overwrite existing data file.
-o: Output file name.
-g: Do call-graph (stack chain/backtrace) recording.
-C: Collect samples only on the list of CPUs provided.
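When perf record is driven from a script, it helps to assemble the argument list in one place. The Python sketch below combines only options from the list above (-o, -e, -a, -p, -C, -g); build_perf_record_cmd is a hypothetical helper, and the PID and output path are taken from the example in this section. It only builds and prints the command and does not run perf.

```python
import shlex


def build_perf_record_cmd(output, event=None, pids=None, cpus=None,
                          system_wide=False, call_graph=False, command=None):
    """Assemble a perf record argument list from the options listed above."""
    cmd = ["perf", "record", "-o", output]
    if event:
        cmd += ["-e", event]
    if system_wide:
        cmd.append("-a")
    if pids:
        # -p takes a comma separated list of process IDs
        cmd += ["-p", ",".join(str(p) for p in pids)]
    if cpus:
        # -C takes a comma separated list of CPUs
        cmd += ["-C", ",".join(str(c) for c in cpus)]
    if call_graph:
        cmd.append("-g")
    if command:
        # "--" separates perf's own options from the profiled command
        cmd += ["--"] + list(command)
    return cmd


cmd = build_perf_record_cmd("/tmp/dave.log", pids=[5309], call_graph=True)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run as is; keeping it a list rather than a string avoids shell quoting problems when the profiled command has arguments.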
Examples
Record performance data for the dbw process, stopping with Ctrl-C:
[root@www.cndba.cn ~]# pgrep dbw
5309
[root@www.cndba.cn ~]# perf record -p 5309 -o /tmp/dave.log
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.009 MB /tmp/dave.log (~409 samples) ]
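The closing "Captured and wrote" line states the size of the data file, its path, and the approximate number of samples, which is useful to check before running a report on a near-empty file. The Python sketch below extracts those three fields; parse_record_summary is an illustrative name, and the pattern assumes the exact wording shown in the transcript above.

```python
import re

SUMMARY_RE = re.compile(
    r"Captured and wrote (?P<mb>[\d.]+) MB (?P<path>\S+) "
    r"\(~?(?P<samples>\d+) samples\)"
)


def parse_record_summary(line):
    """Extract size, path, and sample count from perf record's last line."""
    match = SUMMARY_RE.search(line)
    if not match:
        return None
    return {
        "size_mb": float(match.group("mb")),
        "path": match.group("path"),
        "samples": int(match.group("samples")),
    }


# The summary line from the recording session above.
info = parse_record_summary(
    "[ perf record: Captured and wrote 0.009 MB /tmp/dave.log (~409 samples) ]"
)
print(info)
```

A sample count in the low hundreds, as here, gives only a coarse profile; recording for longer yields more stable percentages in the report.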
2.5 Perf report
perf report reads the data file created by perf record and presents the hot-spot analysis.
NAME
       perf-report - Read perf.data (created by perf record) and display the profile

SYNOPSIS
       perf report [-i <file> | --input=file]

DESCRIPTION
       This command displays the performance counter profile information recorded via perf record.

Examples
[root@www.cndba.cn ~]# perf report -i /tmp/dave.log
Copyright notice: this is an original article by the author and may not be reproduced without the author's permission.