Experiment – virtualization and performance counters
During a recent look at likwid-perfctr, the performance counters looked wrong in several respects:
- core-to-core differences in what should be a symmetric benchmark, even though wspy results showed the processes were balanced
- run-to-run differences, with vastly different cycle counts
- implausible absolute values, such as far too few CPU cycles
So while the runs were good at showing the format of the tool, the data just looked wrong.
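One rough way to put a number on the core-to-core symptom is the ratio between the largest and smallest per-core cycle count. The sketch below is only an illustration: `core_spread` is a hypothetical helper, and it assumes the counts have already been pulled out of the tool output as plain "core cycles" pairs, one per line.

```bash
#!/bin/bash
# Hypothetical helper (not part of likwid or wspy): reads "core_id cycle_count"
# pairs on stdin and prints max/min of the cycle counts with two decimals.
# A symmetric, well-balanced benchmark should give a value close to 1.00.
core_spread() {
    awk '
        NR == 1 || $2 < min { min = $2 }
        NR == 1 || $2 > max { max = $2 }
        END {
            # No input, or a zero/negative count, makes the ratio meaningless.
            if (NR == 0 || min <= 0) { print "n/a"; exit }
            printf "%.2f\n", max / min
        }'
}
```

For example, `printf '0 100\n1 200\n' | core_spread` reports a spread of 2.00, meaning one core counted twice as many cycles as the other.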
My hypothesis was that the virtual performance counters were not being tabulated correctly via the MSR interface. In that experiment, the system was booted under the Xen hypervisor with the vpmu=1 parameter. This seemed to let the “perf” tool report results, but perhaps not likwid-perfctr, which uses the msr kernel module.
For this experiment, I reran the previous experiment, which ran each predefined group, on the same system booted on bare metal.
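    #!/bin/bash
    # List the predefined groups, skip the two header lines, keep the group name column
    likwid-perfctr -a | tail -n +3 | awk '{ print $1 }' |
    while read group
    do
        # One c-ray run per group, measured on cores 0-7, output written to its own file
        likwid-perfctr --output cray_2${group}.txt -f -c 0-7 -g ${group} phoronix-test-suite batch-run c-ray
    done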
This time the results make a lot more sense without the anomalies listed above.
- UOPS_RETIRE UOPs retirement
- FLOPS_AVX Packed AVX MFLOP/s
- TLB_DATA L2 data TLB miss rate/ratio
- CACHES Cache bandwidth in MBytes/s
- CYCLE_ACTIVITY Cycle Activities
- CLOCK Power and Energy consumption
- L3 L3 cache bandwidth in MBytes/s
- BRANCH Branch prediction miss rate/ratio
- UOPS UOPs execution info
- TLB_INSTR L1 Instruction TLB miss rate/ratio
- RECOVERY Recovery duration
- L2CACHE L2 cache miss rate/ratio
- UOPS_ISSUE UOPs issuing
- L2 L2 cache bandwidth in MBytes/s
- ENERGY Power and Energy consumption
- FALSE_SHARE False sharing
- DATA Load to store ratio
- L3CACHE L3 cache miss rate/ratio
- ICACHE Instruction cache miss rate/ratio
- UOPS_EXEC UOPs execution
While further experiments might compare the vpmu interface for perf_event_open(2)-based tools like perf, this experiment suggests avoiding tools based on the msr device, like likwid-perfctr, on virtualized systems.
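A measurement script could guard against this case before it collects any counters. The following is a minimal sketch, not a definitive test: it assumes an x86 Linux system where the kernel lists a `hypervisor` CPU flag in /proc/cpuinfo when running under a hypervisor, and both function names are hypothetical. I have not verified how every Xen configuration reports that flag, so treat a negative result with some caution.

```bash
#!/bin/bash
# Succeeds when the first "flags" line of a cpuinfo-style file contains the
# "hypervisor" flag. The path argument defaults to /proc/cpuinfo.
has_hypervisor_flag() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    grep -m1 '^flags' "$cpuinfo" | grep -qw hypervisor
}

# Prints a warning on stderr and returns 1 if the system looks virtualized,
# so a caller can decide to skip tools that read counters through the msr device.
warn_if_virtualized() {
    if has_hypervisor_flag "$1"; then
        echo "hypervisor detected: msr-based counter readings may be unreliable" >&2
        return 1
    fi
    return 0
}
```

In the loop above, a line such as `warn_if_virtualized || exit 1` before the first likwid-perfctr call would stop the run early on a virtualized boot.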