Description - phoronix/cachebench

This is a performance test of CacheBench, which is part of LLCbench. CacheBench is designed to test the memory and cache bandwidth performance.

As described in the cachebench paper, cachebench strides through memory and performs operations on a vector of numbers, e.g.

// read
   for (i = 0; i < length; i++)
      sum += vector[i];

// write
   for (i = 0; i < length; i++)
      vector[i] = sum++;

// modify
   for (i = 0; i < length; i++)
      vector[i]++;

Phoronix times these loops over a range of vector sizes and then prints the result from the last iteration (reported in MB/s in the output further down).
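To make the timing idea concrete, here is a minimal sketch of how one pass size could be timed and turned into an MB/s figure. This is not the cachebench or Phoronix source; read_bandwidth_mbs and its parameters are names I made up for illustration, and I am assuming a C11 compiler with timespec_get.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Read kernel from the description above: sum every element. */
static double read_kernel(const double *vector, size_t length)
{
    double sum = 0.0;
    for (size_t i = 0; i < length; i++)
        sum += vector[i];
    return sum;
}

/* Hypothetical helper (not cachebench code): time `repeats` passes of the
   read kernel over `bytes` of memory and return MB/s, or -1.0 on failure. */
static double read_bandwidth_mbs(size_t bytes, int repeats)
{
    size_t length = bytes / sizeof(double);
    if (length == 0 || repeats <= 0)
        return -1.0;

    double *vector = malloc(length * sizeof *vector);
    if (vector == NULL)
        return -1.0;
    for (size_t i = 0; i < length; i++)
        vector[i] = 1.0;

    volatile double sink = 0.0;   /* every pass result is stored, so it counts as used */
    struct timespec t0, t1;

    timespec_get(&t0, TIME_UTC);  /* C11 wall clock; a monotonic clock would be better */
    for (int r = 0; r < repeats; r++) {
        vector[0] = (double)r;    /* each pass depends on r, so passes cannot be merged */
        sink += read_kernel(vector, length);
    }
    timespec_get(&t1, TIME_UTC);
    free(vector);

    double seconds = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (seconds <= 0.0)
        return -1.0;              /* too fast to measure at this clock resolution */
    return (double)length * sizeof(double) * repeats / seconds / 1e6;
}
```

Calling read_bandwidth_mbs for sizes 2^8, 2^9, ... up to 2^16 bytes would give a sweep of the kind described here, with the last call corresponding to the 64K case.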

However, as best I can see, the implementation of cachebench as well as the Phoronix test have three large issues:

  • Insufficient checks are made of the results, so gcc and other compilers can optimize away the computations.
  • The access patterns are very regular, i.e. incrementally striding through the vector, so prefetchers and other optimizations defeat the attempt to look at different cache sizes.
  • Phoronix invokes this test with a "-m16" option, indicating a memory size of 64K. That might have made sense when cachebench was written (~20 years ago), but today this is not much larger than L1 cache sizes.
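For comparison, one common way to sidestep the first two issues is a pointer chase over a randomly permuted index array: every load depends on the previous one, the order is irregular, and the final index is returned so the work has a visible result. The sketch below is my own illustration under those assumptions, not something cachebench or Phoronix does; build_random_cycle and chase are hypothetical names, and how well this hides from a particular prefetcher depends on the hardware.

```c
#include <stddef.h>
#include <stdlib.h>

/* Fill next[] with a single random cycle over 0..n-1 (Sattolo's algorithm),
   so following next[] visits every slot exactly once before returning to
   the start. rand() % i has a small modulo bias, acceptable for a sketch. */
static void build_random_cycle(size_t *next, size_t n, unsigned seed)
{
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    if (n < 2)
        return;

    srand(seed);
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;   /* j in [0, i), never i itself */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the cycle for `steps` hops. Each load needs the previous load's
   value, and the final index is returned so the loop cannot be discarded
   as long as the caller uses the result. */
static size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t idx = start;
    for (size_t s = 0; s < steps; s++)
        idx = next[idx];
    return idx;
}
```

To address the third issue as well, the array would need to be sized well beyond 64K so that the chase actually leaves L1; timing chase over a range of array sizes would then be expected to show steps at the cache boundaries.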

The net result is that I'm not sure this benchmark serves a purpose in comparing test results. As an example, the following is output I got, which doesn't make much logical sense (the values printed are an aggregate of cache operations, most of which are small enough to fit into L1, if they are not optimized away by hardware and software). Read is reported at about 3.3 GB/s, while Write and Read / Modify / Write come in at about 25 GB/s and 42 GB/s.

CacheBench:
    pts/cachebench-1.1.2 [Test: Read]
    Test 1 of 3
    Estimated Trial Run Count:    3
    Estimated Test Run-Time:      7 Minutes
    Estimated Time To Completion: 21 Minutes [05:52 CDT]
        Started Run 1 @ 05:31:51
        Started Run 2 @ 05:33:57
        Started Run 3 @ 05:36:04

    Test: Read:
        3289.2156434286
        3291.455884
        3282.0603952857

    Average: 3287.58 MB/s
    Deviation: 0.15%


CacheBench:
    pts/cachebench-1.1.2 [Test: Write]
    Test 2 of 3
    Estimated Trial Run Count:    3
    Estimated Test Run-Time:      7 Minutes
    Estimated Time To Completion: 14 Minutes [05:51 CDT]
        Started Run 1 @ 05:38:17
        Started Run 2 @ 05:40:23
        Started Run 3 @ 05:42:29

    Test: Write:
        25516.759954095
        25262.961700714
        25669.610336143

    Average: 25483.11 MB/s
    Deviation: 0.81%


CacheBench:
    pts/cachebench-1.1.2 [Test: Read / Modify / Write]
    Test 3 of 3
    Estimated Trial Run Count:    3
    Estimated Time To Completion: 7 Minutes [05:51 CDT]
        Started Run 1 @ 05:44:42
        Started Run 2 @ 05:46:48
        Started Run 3 @ 05:48:54

    Test: Read / Modify / Write:
        40692.982519286
        42455.34285481
        42416.419197952

    Average: 41854.91 MB/s
    Deviation: 2.40%