Introduction
To build good software, whether the goal is throughput, resource control, or latency, you first need to know how to measure it. Many free tools are available for profiling and benchmarking your application.
In this post, I show how to use perf and hotspot on Linux to profile a C++ application and visualize the results.
Figure: hotspot for Linux perf
Figure: gprof2dot for Linux perf
Figure: gprof2dot for gprof
Terminology
- instrumentation profiling
  - you insert code hooks that explicitly record metrics
- sampling profiling
  - the profiler periodically samples the running program, so no hooks are needed in the code
- profiling
  - visualizing your program
    - function call stack
    - function execution time
- benchmark
  - timing your program
  - used to understand how long your program needs to run a task
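To make "instrumentation profiling" and "benchmark" from the list above concrete, here is a minimal sketch in C++. The names `ScopedTimer`, `hook_registry`, `sum_of_squares`, and `benchmark` are made up for illustration and do not come from any of the tools in this post. The hook is a plain RAII timer that you place by hand at the top of a function; the benchmark helper only times a whole task.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical registry: call count and accumulated time per label.
struct HookStats {
    std::uint64_t calls = 0;
    std::chrono::nanoseconds total{0};
};

inline std::map<std::string, HookStats>& hook_registry() {
    static std::map<std::string, HookStats> registry;
    return registry;
}

// Instrumentation hook: records how long the enclosing scope took.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        HookStats& stats = hook_registry()[label_];
        stats.calls += 1;
        stats.total += std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Example workload with an explicit hook inserted by hand.
std::uint64_t sum_of_squares(std::uint64_t n) {
    ScopedTimer timer("sum_of_squares");
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) total += i * i;
    return total;
}

// Benchmark: wall-clock time needed to run `task` `repeat` times.
template <typename Task>
std::chrono::nanoseconds benchmark(Task task, int repeat) {
    assert(repeat > 0);
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeat; ++i) task();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start);
}
```

Calling `benchmark([] { sum_of_squares(1000); }, 5)` from `main` returns the total time for the task, while `hook_registry()` holds the per-function numbers collected by the hooks. A sampling profiler such as perf needs neither of these additions to the source.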
Tools
╔════════════╦════════════════════════════╦══════════════════════════════════════════════════════╗
║ Tool name  ║ Type                       ║ Comment                                              ║
╠════════════╬════════════════════════════╬══════════════════════════════════════════════════════╣
║ gprof      ║ sampling + instrumentation ║ misses key events; not suited to micro-optimization  ║
║ perf       ║ sampling profiler          ║ good; counts cache misses and branch misses          ║
║ VTune      ║ sampling profiler          ║ created by Intel                                     ║
║ DTrace     ║ profiler                   ║                                                      ║
║ valgrind   ║ instrumentation profiler   ║ slow                                                 ║
║ callgrind  ║ instrumentation profiler   ║ too intrusive; does not catch I/O slowness or jitter ║
║ Optick     ║ instrumentation profiler   ║ good for profiling game applications                 ║
║ gperftools ║ benchmark                  ║ not representative of a realistic environment        ║
║ hotspot    ║ profiler reader            ║ reads the output file of perf record                 ║
╚════════════╩════════════════════════════╩══════════════════════════════════════════════════════╝
perf, hotspot [4], [5]
Installation
    # install through the OS repository
    $ sudo apt install linux-tools-$(uname -r) linux-tools-generic

    # alternatively, build perf locally
    $ sudo apt install flex bison libelf-dev libunwind-dev libaudit-dev libslang2-dev libdw-dev
    $ git clone https://github.com/torvalds/linux --depth=1
    $ cd linux/tools/perf/
    $ make
    $ make install
    $ sudo cp perf /usr/bin
    $ perf

    # perf needs sufficient permissions
    $ sudo su # as root
    $ sysctl -w kernel.perf_event_paranoid=-1
    $ echo 0 > /proc/sys/kernel/kptr_restrict
    $ exit
Usage --- perf
    # build your app
    $ g++ your_app.cpp -g3 -o your_app

    # record, report and annotate
    $ perf record ./your_app
    $ ls
    your_app your_app.cpp perf.data
    $ perf report
    $ perf annotate

    # stat
    $ perf stat ./your_app
             861.62 msec task-clock        #    0.999 CPUs utilized
                 67      context-switches  #   77.761 /sec
                  0      cpu-migrations    #    0.000 /sec
                139      page-faults       #  161.325 /sec
      4,393,074,476      cycles            #    5.099 GHz
     14,250,101,192      instructions      #    3.24  insn per cycle
      2,259,534,632      branches          #    2.622 G/sec
         15,829,588      branch-misses     #    0.70% of all branches

        0.862198714 seconds time elapsed

        0.857723000 seconds user
        0.004026000 seconds sys
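If you want to see the `branch-misses` line of `perf stat` react to your code, a small workload like the following sketch can help. The function names (`make_data`, `sum_above`, `run_workload`) and the constants are hypothetical and chosen only for illustration. The idea is that the `if` inside the loop is easy to predict when the input is sorted and hard to predict when it is not. Whether the counter actually differs depends on your compiler and flags: with optimization enabled, the compiler may replace the branch with branchless code, in which case both runs look the same.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical input: deterministic pseudo-random values in [0, 255].
std::vector<int> make_data(std::size_t n, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    std::vector<int> data(n);
    for (int& v : data) v = dist(gen);
    return data;
}

// Sums the elements >= threshold. The `if` is the branch whose
// predictability depends on the order of `data`.
std::int64_t sum_above(const std::vector<int>& data, int threshold) {
    std::int64_t sum = 0;
    for (int v : data) {
        if (v >= threshold) sum += v;
    }
    return sum;
}

// Runs the same workload `rounds` times, optionally on sorted input.
std::int64_t run_workload(std::size_t n, int rounds, bool sorted) {
    assert(rounds > 0);
    std::vector<int> data = make_data(n);
    if (sorted) std::sort(data.begin(), data.end());
    std::int64_t total = 0;
    for (int r = 0; r < rounds; ++r) total += sum_above(data, 128);
    return total;
}
```

One way to use it is to call `run_workload` from `main` with `sorted` taken from a command-line argument, then compare the two runs under `perf stat -e branch-misses`. The returned sum is the same in both cases, so any difference in the counter comes from branch prediction rather than from the amount of work.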
Usage --- hotspot
    # install hotspot
    $ sudo apt-get install hotspot

    # usage --- generate perf output
    $ perf record --call-graph dwarf <your application>
    $ hotspot ./perf.data
gprof [1]
Installation and usage --- gprof
    # installation for gprof on Ubuntu
    $ sudo apt-get install binutils

    # build your app
    $ g++ your_app.cpp -pg -o your_app

    # run app
    $ ./your_app

    # view result
    $ ls -hal
    gmon.out your_app.cpp your_app
    $ gprof ./your_app | grep -v std | grep -v static | grep -v cxx
    index % time    self  children    called     name
                                                     <spontaneous>
    [1]    100.0    0.00    0.01                 main [1]
                    0.00    0.01       1/1           Demo_Word_Ladder() [3]
                    0.00    0.00       1/1           CmdOpts<main::Opts>::parse(int, char const* const*) [444]
                    0.00    0.00       1/2           main::Opts::~Opts() [437]
    -----------------------------------------------
    -----------------------------------------------
                    0.00    0.01       1/1           main [1]
    [3]    100.0    0.00    0.01       1         Demo_Word_Ladder() [3]
                    0.00    0.01       1/1           Word_Ladder_Solution::Run_Test(int, int) [4]
                    0.00    0.00       1/1           Word_Ladder_Solution::Word_Ladder_Solution() [440]
                    0.00    0.00       1/1           Word_Ladder_Solution::~Word_Ladder_Solution() [441]
    -----------------------------------------------
                    0.00    0.01       1/1           Demo_Word_Ladder() [3]
    [4]    100.0    0.00    0.01       1         Word_Ladder_Solution::Run_Test(int, int) [4]
    -----------------------------------------------
                    0.00    0.01     200/200         Word_Ladder_Solution::Run_Test(int, int) [4]
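The "called" column is the part of the gprof call graph that is easiest to misread. In a caller or callee line, `a/b` means that `a` of the `b` total calls to the callee went through that caller. The last `200/200` line in the output above appears to be such a caller line: some function, whose own line was removed by the `grep -v` filters, was called 200 times, all of them from `Run_Test`. The sketch below reproduces that shape with hypothetical stand-in functions (`solve_once`, `run_test`, `demo`); it is not the author's Word Ladder code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counter so the call count can be checked without a profiler.
std::uint64_t& solve_calls() {
    static std::uint64_t calls = 0;
    return calls;
}

// Callee: a small amount of arithmetic per call.
std::uint64_t solve_once(std::uint64_t seed) {
    ++solve_calls();
    std::uint64_t x = seed + 1;
    for (int i = 0; i < 1000; ++i) {
        // Unsigned arithmetic wraps around, which is fine for this filler work.
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;
    }
    return x;
}

// Caller: invokes solve_once `iterations` times.
std::uint64_t run_test(int iterations) {
    assert(iterations > 0);
    std::uint64_t acc = 0;
    for (int i = 0; i < iterations; ++i) {
        acc ^= solve_once(static_cast<std::uint64_t>(i));
    }
    return acc;
}

// Top-level entry, called once, mirroring the single call in the output above.
std::uint64_t demo() { return run_test(200); }
```

If `demo()` is called once from `main` and the program is built with `-pg` and without inlining, the call graph would be expected to show `1/1` between `demo` and `run_test` and `200/200` between `run_test` and `solve_once`.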
gprof2dot [2], [3]
Installation and usage --- gprof2dot, gprof, perf
    # installation (the dot command used below is part of Graphviz)
    $ pip install gprof2dot

    # view image --- perf
    $ perf script | c++filt | gprof2dot -w -f perf | dot -Tpng -o output.png

    # view image --- gprof
    $ gprof ./your_app | gprof2dot -w | dot -Tpng -o output_gprof.png
References
[1] D. (2020, October 7). Profiling with gprof. YouTube. https://www.youtube.com/watch?v=re79V7hNiBY
[2] J. (n.d.). GitHub - jrfonseca/gprof2dot: Converts profiling output to a dot graph. GitHub. https://github.com/jrfonseca/gprof2dot
[3] Hide long description of function while profiling with gprof2dot. (n.d.). Stack Overflow. https://stackoverflow.com/a/30457325/2358836
[4] Neutrino’s Blog: 在 Linux 上使用 Perf 做效能分析(入門篇) [Performance analysis with Perf on Linux (introduction)]. (n.d.). https://tigercosmos.xyz/post/2020/08/system/perf-basic/
[5] K. (n.d.). GitHub - KDAB/hotspot: The Linux perf GUI for performance analysis. GitHub. https://github.com/KDAB/hotspot#debian--ubuntu



