Description - phoronix/compress-pbzip2

This test measures the time needed to compress a file (a .tar package of the Linux kernel source code) using BZIP2 compression.

Metrics (Intel) - phoronix/compress-pbzip2
sh - pid 2426
	On_CPU   0.956
	On_Core  7.645
	IPC      0.913
	Retire   0.451	(45.1%)
	FrontEnd 0.098	(9.8%)
	Spec     0.300	(30.0%)
	Backend  0.151	(15.1%)
	Elapsed   9.71
	Procs    14
	Maxrss   740K
	Minflt   191115
	Majflt   0
	Inblock  0
	Oublock  0
	Msgsnd   0
	Msgrcv   0
	Nsignals 0
	Nvcsw    212	(21.7%)
	Nivcsw   763
	Utime    73.876767
	Stime    0.357475
	Start    743780.58
	Finish   743790.29

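The voluntary context switches (21.7% of all context switches) and the On_CPU of about 96% suggest some waiting, possibly on I/O, even though Inblock and Oublock are both zero. Overall, the workload is scheduled on the processor most of the time. The share of slots lost to bad speculation is very high (30.0%) and the IPC (0.913) is slightly on the low side.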
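The On_Core, On_CPU and Nvcsw percentage values in the listing appear to be derived from the raw counters. The following is a minimal Python sketch of that derivation, assuming On_Core is total CPU time divided by elapsed time, On_CPU is On_Core divided by the number of logical CPUs, and the Nvcsw percentage is the voluntary share of all context switches. The count of 8 logical CPUs is an assumption: it is not stated in the report and is chosen only because it reproduces the reported On_CPU.

```python
# Values copied from the Intel metrics listing.
utime = 73.876767   # Utime: user CPU seconds
stime = 0.357475    # Stime: system CPU seconds
elapsed = 9.71      # Elapsed: wall-clock seconds
nvcsw = 212         # Nvcsw: voluntary context switches
nivcsw = 763        # Nivcsw: involuntary context switches

# Assumption: 8 logical CPUs (not stated in the report).
ASSUMED_LOGICAL_CPUS = 8


def derived_metrics(utime, stime, elapsed, nvcsw, nivcsw, ncpus):
    """Return (on_core, on_cpu, voluntary_share) under the assumed definitions."""
    on_core = (utime + stime) / elapsed          # average number of busy cores
    on_cpu = on_core / ncpus                     # fraction of the machine in use
    voluntary_share = nvcsw / (nvcsw + nivcsw)   # voluntary / all context switches
    return on_core, on_cpu, voluntary_share


on_core, on_cpu, voluntary_share = derived_metrics(
    utime, stime, elapsed, nvcsw, nivcsw, ASSUMED_LOGICAL_CPUS
)
print(f"On_Core {on_core:.3f}")
print(f"On_CPU  {on_cpu:.3f}")
print(f"Nvcsw share {voluntary_share * 100:.1f}%")
```

Under these assumptions the sketch reproduces the reported 7.645, 0.956 and 21.7%.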

Metrics (AMD) - phoronix/compress-pbzip2
sh - pid 6298
	On_CPU   0.933
	On_Core  14.931
	IPC      0.928
	FrontCyc 0.130	(13.0%)
	BackCyc  0.129	(12.9%)
	Elapsed   5.35
	Procs    22
	Maxrss   763K
	Minflt   198517
	Majflt   0
	Inblock  0
	Oublock  0
	Msgsnd   0
	Msgrcv   0
	Nsignals 0
	Nvcsw    361	(4.3%)
	Nivcsw   8078
	Utime    79.398898
	Stime    0.480230
	Start    753843.64
	Finish   753848.99

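IPC is similar on AMD (0.928) and Intel (0.913). The AMD run finishes in 5.35 seconds versus 9.71 seconds on Intel, with an On_Core of 14.9 versus 7.6.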
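To put the two runs side by side, the short Python sketch below computes a few ratios from the listed values only. The dictionary layout and helper name are illustrative, and the ratios say nothing about clock frequency or core count, which the report does not give.

```python
# Values copied from the Intel and AMD metrics listings.
intel = {"elapsed": 9.71, "utime": 73.876767, "stime": 0.357475, "ipc": 0.913}
amd = {"elapsed": 5.35, "utime": 79.398898, "stime": 0.480230, "ipc": 0.928}


def cpu_seconds(run):
    """Total CPU time (user + system) consumed by the run."""
    return run["utime"] + run["stime"]


# How much sooner the AMD run finishes (wall-clock).
elapsed_ratio = intel["elapsed"] / amd["elapsed"]
# How much more CPU time the AMD run consumes in total.
cpu_time_ratio = cpu_seconds(amd) / cpu_seconds(intel)
# Relative IPC, AMD over Intel.
ipc_ratio = amd["ipc"] / intel["ipc"]

print(f"Elapsed ratio (Intel/AMD):  {elapsed_ratio:.2f}")
print(f"CPU time ratio (AMD/Intel): {cpu_time_ratio:.2f}")
print(f"IPC ratio (AMD/Intel):      {ipc_ratio:.3f}")
```

The AMD run completes roughly 1.8 times sooner while consuming slightly more total CPU time, which is consistent with the work being spread over more cores at a similar IPC.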

Process Tree - phoronix/compress-pbzip2
The process tree is simple.

    2426) sh
      2427) compress-pbzip2
        2428) pbzip2
        2429) pbzip2
        2430) pbzip2
        2431) pbzip2
        2432) pbzip2
        2433) pbzip2
        2434) pbzip2
        2435) pbzip2
        2436) pbzip2
        2437) pbzip2
        2438) pbzip2
        2439) pbzip2

About this graph
All cores are scheduled most of the time.

The IPC is somewhat low and noisy.

About this graph
A large share of slots is lost to bad speculation.

Topdown (Intel)
retire                     0.443
    ms_uops                0.002
speculation                0.296
    branch_misses          81.24%
    machine_clears         18.76%
frontend                   0.100
    idq_uops_delivered_0   0.023
    idq_uops_delivered_1   0.037
    idq_uops_delivered_2   0.054
    idq_uops_delivered_3   0.085
backend                    0.162
    resource_stalls.sb     0.030
    stalls_ldm_pending     0.236

Most of the bad speculation is due to branch mispredictions (81.24%), with machine clears accounting for the remainder (18.76%).
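The breakdown can be turned into fractions of all pipeline slots. The Python sketch below assumes the branch_misses and machine_clears percentages are shares of the speculation category (they sum to 100%, which supports this reading but does not confirm it) and that the four top-level categories partition the slots.

```python
# Top-level topdown fractions copied from the Intel listing.
topdown = {
    "retire": 0.443,
    "speculation": 0.296,
    "frontend": 0.100,
    "backend": 0.162,
}

# Assumption: these are shares of the speculation category, not of all slots.
speculation_split = {"branch_misses": 0.8124, "machine_clears": 0.1876}

# Sanity check: the top-level categories should sum to about 1.0,
# allowing for rounding in the reported values.
total = sum(topdown.values())

# Convert each share of the speculation category into a fraction of all slots.
slots = {
    name: topdown["speculation"] * share
    for name, share in speculation_split.items()
}

print(f"Top-level total: {total:.3f}")
for name, fraction in slots.items():
    print(f"{name:15s} {fraction:.3f} of all slots")
```

Under these assumptions, branch mispredictions cost about 24% of all slots and machine clears about 6%.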

Next steps: Understand why the branch prediction is poor.