topdown tool – adding support for level 3 backend, rusage – Performance analysis, tools and experiments

I have updated the topdown wrapper script to add some additional cache statistics for level 3 backend related information. I have also added a -x option to display some rusage information.

The following illustration uses a command that collects level 3 information for the scimark2 benchmark.

 ./wspy/topdown -l 3 -x -o topdown.txt phoronix-test-suite batch-run scimark2

The output in topdown.txt is as follows:

on_cpu         0.112
elapsed        86.737
utime          77.504
stime          77.504
nvcsw          413 (79.73%)
nivcsw         105 (20.27%)
inblock        8
inblock        952
retire         0.510
ms_uops                0.002
speculation    0.058
branch_misses          72.66%
machine_clears         27.34%
frontend       0.027
idq_uops_delivered_0   0.005
icache_stall               0.001
itlb_misses                0.000
idq_uops_delivered_1   0.011
idq_uops_delivered_2   0.018
idq_uops_delivered_3   0.020
dsb_ops                    56.38%
backend        0.405
resource_stalls.sb     0.001
stalls_ldm_pending     0.555
l2_refs                    0.013
l2_misses                  0.007
l2_miss_ratio              55.80%
l3_refs                    0.001
l3_misses                  0.001
l3_miss_ratio              39.04%
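
The first rows of this output (utime, stime, nvcsw, nivcsw, inblock) carry the same names as fields of the getrusage(2) structure. As a rough sketch of how such numbers can be gathered, and not the wrapper's actual code, the Python below runs a child command and reads its resource usage. The helper names run_with_rusage and csw_share are made up for illustration. The percentages printed beside nvcsw and nivcsw are consistent with each count's share of the total context switches (413 of 518 is 79.73%), which is what csw_share computes.

```python
import resource
import subprocess
import sys
import time

# Fields copied from the rusage structure (Unix only).
_FIELDS = ("ru_utime", "ru_stime", "ru_nvcsw", "ru_nivcsw", "ru_inblock", "ru_oublock")


def run_with_rusage(argv):
    """Run argv as a child process; return elapsed time plus child rusage deltas.

    Hypothetical helper: RUSAGE_CHILDREN accumulates over every child that has
    been waited for, so the values before the run are subtracted afterwards.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=False)
    elapsed = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    stats = {"elapsed": elapsed}
    for field in _FIELDS:
        # Strip the "ru_" prefix so keys match the names in the output above.
        stats[field[3:]] = getattr(after, field) - getattr(before, field)
    return stats


def csw_share(nvcsw, nivcsw):
    """Return voluntary and involuntary context switches as percentages of the total."""
    total = nvcsw + nivcsw
    if total == 0:
        return 0.0, 0.0
    return 100.0 * nvcsw / total, 100.0 * nivcsw / total


if __name__ == "__main__":
    stats = run_with_rusage([sys.executable, "-c", "pass"])
    for name, value in stats.items():
        print(f"{name:<10} {value}")
    vol, invol = csw_share(413, 105)
    print(f"voluntary {vol:.2f}%  involuntary {invol:.2f}%")
```

Taking the difference of two RUSAGE_CHILDREN readings keeps the numbers tied to the one command that was run, at the cost of assuming no other child exits in between.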

A brief explanation using this output

The on_cpu ratio is derived from the elapsed time together with the user and system time. It is not quite 12.5% (one fully busy CPU out of eight) for the single-threaded scimark2 because the run also includes some periods where the Phoronix Test Suite is idle.
The l2 and l3 statistics are the number of references and misses relative to the number of cycles. This always understates the cost somewhat: l2_refs of 0.013 means 13 references per 1000 cycles, so if a reference takes 11 cycles, we are really talking about 13 × 11 = 143 cycles of l2 reference time per 1000 cycles. Reporting per cycle, however, keeps these values in the same units as the other metrics.
The l2 and l3 miss ratios are the ratio of misses to references. Overall, the output shows that a moderate level of l2 and l3 misses is still contributing to the backend-stall nature of this benchmark.
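
The arithmetic behind these three points can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the wrapper's own code: the function names are hypothetical, the on_cpu formula (CPU time divided by elapsed time times the CPU count) is inferred from the description above, and the 11-cycle latency is just the example figure used in the text.

```python
def on_cpu_ratio(utime, stime, elapsed, ncpus):
    """Fraction of total CPU capacity used (assumed formula, see lead-in)."""
    return (utime + stime) / (elapsed * ncpus)


def busy_cycles_per_1000(events_per_cycle, latency_cycles):
    """Cycles spent on an event per 1000 cycles, given an assumed per-event latency."""
    return events_per_cycle * 1000 * latency_cycles


def miss_ratio(misses, refs):
    """Misses as a fraction of references; 0.0 when there were no references."""
    return misses / refs if refs else 0.0


if __name__ == "__main__":
    # One thread busy for the whole run on eight CPUs gives the 12.5% ceiling.
    print(f"ideal on_cpu: {on_cpu_ratio(10.0, 0.0, 10.0, 8):.3f}")

    # l2_refs = 0.013 per cycle with an assumed 11-cycle reference: about 143.
    print(f"l2 cycles per 1000: {busy_cycles_per_1000(0.013, 11):.0f}")

    # From the rounded values printed above; the tool reports 55.80%,
    # presumably computed from the unrounded counts.
    print(f"l2 miss ratio: {100 * miss_ratio(0.007, 0.013):.1f}%")
```

Because the printed l2 and l3 values are rounded to three decimals, ratios recomputed from them only approximate the percentages the tool reports.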

Overall, this combination of metrics gives a quick-and-dirty overview of a workload run, from which one can then dig deeper for specifics.
