A path to topdown counters for AMD Zen4
A recent Phoronix article described new performance counters being added to the Linux kernel. This seemed intriguing to me as a way to get analysis that previously had only done on Intel however it required a few steps.
The first step was getting a Linux 6.2 kernel installed. The procedure was straightforward where I followed these instructions to install the “mainline” tool and then the graphical UI to install and boot into a new kernel.
However, after installing the kernel when I invoked perf, I got
WARNING: perf not found for kernel 6.2.0-060200
You may need to install the following packages for this specific kernel:
linux-tools-6.2.0-060200-generic
linux-cloud-tools-6.2.0-060200-generic
You may also want to install one of the following packages to keep up to date:
linux-tools-generic
linux-cloud-tools-generic
and this was not available from the standard repository.
So the next step was finding and following instructions for downloading kernel sources and building perf. This was straightforward and including a “make” in the tools/perf directory.
At this point, trying “perf -a –topdown” does not work yet, but if I do a “perf list”, I can see a set of counters such as these:
backend_bound_group:
backend_bound_cpu
[Fraction of dispatch slots that remained unused because of stalls not related to the memory subsystem]
backend_bound_memory
[Fraction of dispatch slots that remained unused because of stalls due to the memory subsystem]
Looking at these a little further
mev@sacramento:~/source/linux-6.2/tools/perf$ ./perf stat --event=backend_bound_cpu
event syntax error: 'backend_bound_cpu'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf stat [
I can’t immediately pick these because they seem to be an event group that I can see further when I list the -v –detail option on perf-list:
ist of pre-defined events (to be used in -e or -M):
Metric Groups:
PipelineL2:
backend_bound_cpu
[backend_bound * (1 - d_ratio(ex_no_retire.load_not_complete, ex_no_retire.not_complete))]
backend_bound_group:
backend_bound_cpu
[backend_bound * (1 - d_ratio(ex_no_retire.load_not_complete, ex_no_retire.not_complete))]
However, using the “-M” option to perf stat will give me the underlying counters for this metric:
mev@sacramento:~/source/linux-6.2/tools/perf$ ./perf stat -M backend_bound_cpu ~/hello
hello world
Performance counter stats for '/home/mev/hello':
1,234,099 ex_no_retire.load_not_complete # 2.7 % backend_bound_cpu
5,366,366 de_no_dispatch_per_slot.backend_stalls
1,336,714 ex_no_retire.not_complete
2,551,447 ls_not_halted_cyc
0.001079579 seconds time elapsed
0.001128000 seconds user
0.000000000 seconds sys
So success overall and believe I can now get these metrics from an AMD Zen4 core.
Comments
A path to topdown counters for AMD Zen4 — No Comments
HTML tags allowed in your comment: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>