Y-Cruncher is a multi-threaded Pi benchmark.
This benchmark is also described here. It claims the record for computing the most digits of pi.
Metrics (Intel) - phoronix/y-cruncher
sh - pid 18152
  On_CPU   0.897
  On_Core  7.173
  IPC      1.178
  Retire   0.480 (48.0%)
  FrontEnd 0.138 (13.8%)
  Spec     0.122 (12.2%)
  Backend  0.260 (26.0%)
  Elapsed  65.10
  Procs    21
  Maxrss   2565K
  Minflt   661484
  Majflt   0
  Inblock  0
  Oublock  976608
  Msgsnd   0
  Msgrcv   0
  Nsignals 0
  Nvcsw    31375 (94.5%)
  Nivcsw   1820
  Utime    465.602228
  Stime    1.364009
  Start    90263.07
  Finish   90328.17
About 94.5% of the program's context switches are voluntary and 976608 blocks are written out, so there is some I/O. Otherwise the main limiter is backend stalls, resulting in an IPC slightly over 1 and a low retirement rate.
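The derived figures in the Intel metrics can be cross-checked against the raw counters. This is a minimal sketch, not the tool's own code: the formulas for On_Core, On_CPU and the voluntary share are inferred from the numbers, and the 512-byte block size is an assumption based on how Linux getrusage() counts blocks.

```python
# Raw counters copied from the Intel metrics above.
utime, stime = 465.602228, 1.364009   # user / system CPU seconds
elapsed = 65.10                        # wall-clock seconds
nvcsw, nivcsw = 31375, 1820            # voluntary / involuntary context switches
oublock = 976608                       # blocks written out
on_cpu_reported = 0.897

# Assumption: On_Core is the average number of busy cores.
on_core = (utime + stime) / elapsed

# Assumption: On_CPU is On_Core divided by the virtual core count,
# so dividing the two recovers that count.
implied_vcpus = on_core / on_cpu_reported

# Assumption: the percentage after Nvcsw is the voluntary share
# of all context switches.
voluntary_share = nvcsw / (nvcsw + nivcsw)

# Assumption: blocks are 512 bytes each.
written_mb = oublock * 512 / 1e6

print(f"On_Core          {on_core:.3f}")
print(f"implied vCPUs    {implied_vcpus:.1f}")
print(f"voluntary share  {voluntary_share:.1%}")
print(f"written          {written_mb:.0f} MB")
```

Under these assumptions the computed values agree with the reported On_Core of 7.173 and the 94.5% voluntary share, the implied virtual core count comes out at about 8, and the output volume is roughly 500 MB.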
Metrics (AMD) - phoronix/y-cruncher
sh - pid 6382
  On_CPU   0.761
  On_Core  12.168
  IPC      0.842
  FrontCyc 0.008 (0.8%)
  BackCyc  0.014 (1.4%)
  Elapsed  64.46
  Procs    37
  Maxrss   2563K
  Minflt   661224
  Majflt   0
  Inblock  32
  Oublock  976608
  Msgsnd   0
  Msgrcv   0
  Nsignals 0
  Nvcsw    68925 (96.9%)
  Nivcsw   2198
  Utime    781.762266
  Stime    2.612581
  Start    159119.22
  Finish   159183.68
AMD IPC is lower, at 0.842 versus 1.178 on Intel.
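To put the two runs side by side, the sketch below computes a few ratios from the figures in the two metrics blocks. The dictionary layout and the helper name `cpu_seconds` are illustrative only. Core counts and clock speeds are not stated in the metrics, so this is a whole-run comparison rather than a per-core one.

```python
# Figures copied from the Intel and AMD metrics above.
intel = {"ipc": 1.178, "utime": 465.602228, "stime": 1.364009, "elapsed": 65.10}
amd = {"ipc": 0.842, "utime": 781.762266, "stime": 2.612581, "elapsed": 64.46}

def cpu_seconds(m):
    """Total CPU time (user + system) consumed by the run."""
    return m["utime"] + m["stime"]

ipc_ratio = amd["ipc"] / intel["ipc"]                  # AMD IPC relative to Intel
cpu_ratio = cpu_seconds(amd) / cpu_seconds(intel)      # AMD CPU time relative to Intel
elapsed_ratio = amd["elapsed"] / intel["elapsed"]      # AMD wall time relative to Intel

print(f"IPC ratio      {ipc_ratio:.3f}")
print(f"CPU-sec ratio  {cpu_ratio:.2f}")
print(f"elapsed ratio  {elapsed_ratio:.2f}")
```

The AMD run retires about 0.71 times as many instructions per cycle and consumes about 1.68 times the CPU seconds, yet finishes in almost the same wall-clock time.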
Process Tree - phoronix/y-cruncher
The program runs two processes per virtual core.
18152) sh
18153) y-cruncher
18154) y-cruncher
18156) sh
18157) 13-HSW ~ Airi
18158) 13-HSW ~ Airi
18159) 13-HSW ~ Airi
18160) 13-HSW ~ Airi
18161) 13-HSW ~ Airi
18162) 13-HSW ~ Airi
18163) 13-HSW ~ Airi
18164) 13-HSW ~ Airi
18165) 13-HSW ~ Airi
18166) 13-HSW ~ Airi
18167) 13-HSW ~ Airi
18168) 13-HSW ~ Airi
18169) 13-HSW ~ Airi
18170) 13-HSW ~ Airi
18171) 13-HSW ~ Airi
18172) 13-HSW ~ Airi
18155) sed
Some noise as these processes are scheduled on all cores.

The overall IPC is consistent and slightly over 1.
Backend stalls are the largest limiter.
retire 0.588
  ms_uops 0.001
speculation 0.004
  branch_misses 5.28%
  machine_clears 94.72%
frontend 0.137
  idq_uops_delivered_0 0.054
  icache_stall 0.009
  itlb_misses 0.000
  idq_uops_delivered_1 0.060
  idq_uops_delivered_2 0.071
  idq_uops_delivered_3 0.088
  dsb_ops 55.17%
backend 0.271
  resource_stalls.sb 0.033
  stalls_ldm_pending 0.214
The overall retirement rate here (0.588) is higher than the 0.480 reported above, which also seems more consistent with the IPC. The breakdown shows a few frontend stalls (branch resteers?) and some memory stalls.
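The two top-level breakdowns can be compared directly. The sketch below takes the four level-1 fractions from the summary metrics near the top and from the detailed breakdown, checks that each set sums to 1, and shows where they differ. The variable names are illustrative, and treating the four categories as a complete partition of pipeline slots is an assumption.

```python
# Level-1 fractions from the Intel summary metrics near the top.
summary = {"retire": 0.480, "frontend": 0.138, "speculation": 0.122, "backend": 0.260}
# Level-1 fractions from the detailed breakdown.
detail = {"retire": 0.588, "frontend": 0.137, "speculation": 0.004, "backend": 0.271}

# Each set of fractions should account for all pipeline slots.
for name, data in (("summary", summary), ("detail", detail)):
    print(name, "total", round(sum(data.values()), 3))

# Per-category difference between the two measurements.
delta = {k: round(detail[k] - summary[k], 3) for k in summary}
print(delta)

# Largest non-retiring category in the detailed breakdown.
stalls = {k: v for k, v in detail.items() if k != "retire"}
largest_stall = max(stalls, key=stalls.get)
print("largest stall category:", largest_stall)
```

Frontend and backend agree to within about 0.01 between the two measurements. The gap in retirement rate is almost entirely offset by the speculation fraction, and backend remains the largest stall category in both.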
Next steps: None
