Description - phoronix/ffte

FFTE is a package by Daisuke Takahashi for computing discrete Fourier transforms of 1-, 2-, and 3-dimensional sequences of length (2^p)*(3^q)*(5^r).
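To make the length constraint concrete, here is a minimal sketch, not taken from FFTE itself, of a check for whether a sequence length has the form (2^p)*(3^q)*(5^r). The helper name `is_ffte_length` is hypothetical.

```python
def is_ffte_length(n: int) -> bool:
    """Return True if n == 2**p * 3**q * 5**r for some non-negative p, q, r.

    Illustrative helper only; the name and interface are assumptions,
    not part of the FFTE package.
    """
    if n < 1:
        return False
    # Divide out every factor of 2, 3 and 5; anything left over is a
    # prime factor outside the supported set.
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime
    return n == 1
```

For example, 360 = (2^3)*(3^2)*(5^1) passes the check, while 14 = 2*7 does not.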

The test runs quickly, in about 5 seconds. It may start processes on multiple cores, but otherwise it behaves as a single-threaded workload. All tests were run pinned to core 1.

Metrics (Intel) - phoronix/ffte
sh - pid 19980
	On_CPU   0.125
	On_Core  1.000
	IPC      2.833
	Retire   0.710	(71.0%)
	FrontEnd 0.013	(1.3%)
	Spec     0.001	(0.1%)
	Backend  0.276	(27.6%)
	Elapsed   3.52
	Procs    4
	Maxrss   10K
	Minflt   328
	Majflt   0
	Inblock  0
	Oublock  16
	Msgsnd   0
	Msgrcv   0
	Nsignals 0
	Nvcsw    25	(69.4%)
	Nivcsw   11
	Utime    3.519370
	Stime    0.000646
	Start    657764.29
	Finish   657767.81

Elapsed time is 3.5 seconds with On_Core at 100%. IPC is high (2.83), and backend stalls (27.6%) are the largest source of lost slots. The 25 voluntary context switches are presumably I/O related.

Metrics (AMD) - phoronix/ffte
phoronix-test-s - pid 27370
	On_CPU   0.042
	On_Core  0.677
	IPC      3.586
	FrontCyc 0.026	(2.6%)
	BackCyc  0.041	(4.1%)
	Elapsed  13.13
	Procs    288
	Maxrss   36K
	Minflt   59927
	Majflt   0
	Inblock  0
	Oublock  640
	Msgsnd   0
	Msgrcv   0
	Nsignals 0
	Nvcsw    3033	(59.5%)
	Nivcsw   2065
	Utime    8.650602
	Stime    0.243819
	Start    667869.77
	Finish   667882.90

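The AMD system shows an even higher IPC (3.586 versus 2.833 on Intel). Note that this block was collected at the phoronix-test-s process (288 processes) rather than at the sh subtree (4 processes), so Elapsed and On_Core are not directly comparable with the Intel run.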
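The derived fields in both metric blocks appear consistent with simple ratios of the raw counters. The sketch below is an assumption about how they relate, not the measurement tool's documented definition: On_Core is taken as (Utime + Stime) / Elapsed, On_CPU as On_Core divided by the core count, and the Nvcsw percentage as the voluntary share of all context switches. The core counts (8 for Intel, 16 for AMD) are inferred from the reported ratios and are not stated in the source; the function name `derived_metrics` is hypothetical.

```python
def derived_metrics(utime, stime, elapsed, nvcsw, nivcsw, ncores):
    """Recompute On_Core, On_CPU and the voluntary context-switch share.

    The formulas are assumptions that happen to reproduce the reported
    values; they are not taken from the tool's documentation.
    """
    on_core = (utime + stime) / elapsed      # CPU time per second of wall time
    on_cpu = on_core / ncores                # same, spread over all cores
    vol_share = nvcsw / (nvcsw + nivcsw)     # voluntary / total context switches
    return on_core, on_cpu, vol_share


# Raw values copied from the two metric blocks; ncores is inferred, not given.
intel = derived_metrics(3.519370, 0.000646, 3.52, 25, 11, ncores=8)
amd = derived_metrics(8.650602, 0.243819, 13.13, 3033, 2065, ncores=16)

for name, values in (("Intel", intel), ("AMD", amd)):
    on_core, on_cpu, vol_share = values
    print(f"{name}: On_Core={on_core:.3f} On_CPU={on_cpu:.3f} Nvcsw share={vol_share:.1%}")
```

Under these assumptions the recomputed values match the reported ones to three decimal places (On_Core 1.000 and 0.677, On_CPU 0.125 and 0.042, Nvcsw share 69.4% and 59.5%).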

Process Tree - phoronix/ffte
The process tree is simple.

    19980) sh
      19981) ffte
        19982) ffte
        19983) speed1d


On_CPU goes to 100%; most of the noise is due to the very short run time of this benchmark.


The IPC is consistently high.


There is a high retire rate and some backend stalls.

Topdown (Intel)
retire         0.688
    ms_uops                0.004
speculation    0.006
    branch_misses          42.02%
    machine_clears         57.98%
frontend       0.027
    idq_uops_delivered_0   0.007
    idq_uops_delivered_1   0.012
    idq_uops_delivered_2   0.015
    idq_uops_delivered_3   0.023
backend        0.280
    resource_stalls.sb     0.003
    stalls_ldm_pending     0.506

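Backend stalls appear to be memory related: stalls_ldm_pending (stalls with a load pending) is 0.506, while store-buffer stalls (resource_stalls.sb) are only 0.003.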
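As a quick sanity check on the topdown table above, the four top-level categories should account for nearly all pipeline slots. The following sketch sums them and picks out the largest non-retiring category. The dictionary simply restates the reported values; the variable names are illustrative.

```python
# Top-level topdown fractions copied from the Intel table above.
topdown = {
    "retire": 0.688,
    "speculation": 0.006,
    "frontend": 0.027,
    "backend": 0.280,
}

# The categories partition pipeline slots, so the sum should be close to 1
# (small deviations come from rounding and sampling).
total = sum(topdown.values())

# Largest category other than useful work (retire).
dominant_stall = max((k for k in topdown if k != "retire"), key=topdown.get)

print(f"sum of categories: {total:.3f}")
print(f"largest non-retiring category: {dominant_stall}")
```

The sum lands within rounding error of 1, and backend is by far the largest non-retiring category, in line with the observation above.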

Overall, a tiny toy benchmark.

Next steps: None