tscp – Performance analysis, tools and experiments

Description - phoronix/tscp

This is a performance test of TSCP, Tom Kerrigan’s Simple Chess Program, which has a built-in performance benchmark.

This test is single-threaded and runs in just over a second on my Intel system. Runs below were pinned to core 1.
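The write-up does not say how the runs were pinned. As a minimal sketch, assuming a Linux system, a process can restrict itself to one core before launching a benchmark. The helper name `pin_to_core` is hypothetical, and the fallback to the lowest allowed core is an illustrative choice, not part of the original setup.

```python
import os

def pin_to_core(preferred=1):
    """Pin the calling process to a single core.

    Returns the core actually chosen, or None when the platform does not
    expose sched_setaffinity (it is Linux-specific).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)      # cores this process may run on
    # Fall back to the lowest allowed core if the preferred one is unavailable.
    core = preferred if preferred in allowed else min(allowed)
    os.sched_setaffinity(0, {core})        # pid 0 means "the calling process"
    return core

core = pin_to_core(1)
print("pinned to core:", core)
```

Child processes inherit the affinity mask, so a benchmark started from this process afterwards would stay on the same core.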

Metrics (Intel) - phoronix/tscp

sh - pid 13617
	On_CPU   0.125
	On_Core  1.004
	IPC      1.799
	Retire   0.383	(38.3%)
	FrontEnd 0.323	(32.3%)
	Spec     0.192	(19.2%)
	Backend  0.102	(10.2%)
	Elapsed   1.27
	Procs    3
	Maxrss   10K
	Minflt   238
	Majflt   0
	Inblock  0
	Oublock  16
	Msgsnd   0
	Msgrcv   0
	Nsignals 0
	Nvcsw    18	(81.8%)
	Nivcsw   4
	Utime    1.273975
	Stime    0.000926
	Start    566559.11
	Finish   566560.38

During its brief run, the benchmark is scheduled on a core essentially 100% of the time (On_Core 1.004). Frontend stalls (32.3%) and bad speculation (19.2%) are the largest issues.
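As a quick sanity check on the Intel listing, the four level-1 topdown fractions should account for all pipeline slots. The sketch below only re-uses the numbers printed above. The dictionary layout and the 0.005 rounding tolerance are assumptions made for illustration.

```python
# Level-1 topdown fractions copied from the Intel metrics listing.
topdown = {"Retire": 0.383, "FrontEnd": 0.323, "Spec": 0.192, "Backend": 0.102}

# The four categories partition the pipeline slots, so they should sum to ~1.
total = sum(topdown.values())
assert abs(total - 1.0) < 0.005, total

# Rank the categories that represent lost work, largest first.
lost = sorted(
    ((frac, name) for name, frac in topdown.items() if name != "Retire"),
    reverse=True,
)
for frac, name in lost:
    print(f"{name:8s} {frac:.1%}")
```

Ranking the non-retiring categories this way puts FrontEnd first and Spec second, which matches the reading given in the text.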

Metrics (AMD) - phoronix/tscp

sh - pid 15958
	On_CPU   0.063
	On_Core  1.001
	IPC      1.519
	FrontCyc 0.022	(2.2%)
	BackCyc  0.345	(34.5%)
	Elapsed   1.62
	Procs    3
	Maxrss   10K
	Minflt   242
	Majflt   0
	Inblock  0
	Oublock  16
	Msgsnd   0
	Msgrcv   0
	Nsignals 0
	Nvcsw    18	(9.7%)
	Nivcsw   167
	Utime    1.621309
	Stime    0.000000
	Start    576771.03
	Finish   576772.65
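The percentage printed next to Nvcsw is not explained in the listings. It appears to be the voluntary share of all context switches, which is an inference from the numbers rather than something the tool output states. A small sketch that reproduces both values under that assumption:

```python
def voluntary_share(nvcsw, nivcsw):
    """Fraction of context switches that were voluntary."""
    return nvcsw / (nvcsw + nivcsw)

# Nvcsw / Nivcsw values copied from the two listings.
intel = voluntary_share(18, 4)
amd = voluntary_share(18, 167)

print(f"Intel {intel:.1%}, AMD {amd:.1%}")  # Intel 81.8%, AMD 9.7%
```

Both results match the listings. The AMD run has the same number of voluntary switches but many more involuntary ones (167 versus 4).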

Because my other charts sample only once per second, they don’t show much more detail than the lists above.

retire         0.378
	ms_uops                0.006
speculation    0.226
	branch_misses          95.66%
	machine_clears         4.34%
frontend       0.321
	idq_uops_delivered_0   0.079
	idq_uops_delivered_1   0.112
	idq_uops_delivered_2   0.196
	idq_uops_delivered_3   0.265
backend        0.075
	resource_stalls.sb     0.005
	stalls_ldm_pending     0.577

The topdown details show that the speculation losses come almost entirely from branch mispredictions (95.66%) rather than machine clears (4.34%). A fair number of cycles have outstanding memory requests (stalls_ldm_pending 0.577), but these are hidden behind frontend and speculation stalls, so backend is not a big factor.
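To put the speculation breakdown in terms of total pipeline slots, the two percentages can be multiplied by the speculation fraction. This sketch assumes that branch_misses and machine_clears are shares of the speculation category, which their sum of 100% suggests but the listing does not state.

```python
# Values copied from the topdown detail listing.
speculation = 0.226        # fraction of all slots lost to bad speculation
branch_share = 0.9566      # assumed: share of speculation due to branch misses
clear_share = 0.0434       # assumed: share of speculation due to machine clears

# Convert each share into a fraction of all pipeline slots.
branch_slots = speculation * branch_share
clear_slots = speculation * clear_share

print(f"branch mispredictions: {branch_slots:.3f} of all slots")
print(f"machine clears:        {clear_slots:.3f} of all slots")
```

Under that assumption, branch mispredictions alone cost roughly a fifth of all slots, while machine clears cost about one percent.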

Process Tree - phoronix/tscp
The process tree is simple:

   13617) sh
      13618) tscp
        13619) tscp

Overall, this looks like an extremely small toy benchmark.

Next steps: None