The memory allocation workload is a very simple micro-benchmark from the osbench suite. Its description reads, quite literally:

Allocate 1,000,000 small chunks of memory, and then free them. Each chunk is 4-128 bytes in size.

Running it is not much more exciting than that:

Benchmark: Allocate/free 1000000 memory chunks (4-128 bytes)...
80.971003 ns / alloc

Similarly, the code is very simple:

    const double t0 = get_time();

    for (int i = 0; i < NUM_ALLOCS; ++i) {
      const size_t memory_size = ((i % 32) + 1) * 4;
      s_addresses[i] = malloc(memory_size);
      ((char*)s_addresses[i])[0] = 1;
    }

    for (int i = 0; i < NUM_ALLOCS; ++i) {
      free(s_addresses[i]);
    }

    double dt = get_time() - t0;
    if (dt < best_time) {
      best_time = dt;
    }

This block is placed in a loop that is allowed to run for five seconds, and the best (lowest) time across all passes is reported.