Getting access to perf events on GitHub-hosted ubuntu-latest

Hi!

I am looking into adding benchmarks to my project. As wall-clock time has high variance, I would like to use perf events to count the CPU instructions executed instead. This should be much more stable.

However, my testing shows that perf events are not available on the ubuntu-latest runner:

$ sudo sh -c 'echo 0 >/proc/sys/kernel/perf_event_paranoid'
$ perf stat ls || true
Cargo.lock

 Performance counter stats for 'ls':

              1.05 msec task-clock                #    0.618 CPUs utilized          
                 0      context-switches          #    0.000 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               107      page-faults               #    0.102 M/sec                  
   <not supported>      cycles                                                      
   <not supported>      instructions  # sad octocat                                                
   <not supported>      branches                                                    
   <not supported>      branch-misses                                               

       0.001694797 seconds time elapsed

       0.001523000 seconds user
       0.000000000 seconds sys

(the above repo contains a couple of attempts to poke at perf events)
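For anyone who wants to run the same check on their own runner, here is a minimal sketch that scans `perf stat` output for counters reported as `<not supported>`. The function name `unsupported_events` and the `SAMPLE` text are hypothetical; the sample lines are copied from the output above, and the parsing assumes the human-readable layout shown there.

```python
import re
from typing import List

# Matches perf stat lines such as "   <not supported>      cycles".
_UNSUPPORTED = re.compile(r"^\s*<not supported>\s+(\S+)")


def unsupported_events(perf_output: str) -> List[str]:
    """Return the event names that perf stat reported as <not supported>."""
    events = []
    for line in perf_output.splitlines():
        match = _UNSUPPORTED.match(line)
        if match:
            events.append(match.group(1))
    return events


# A few lines taken from the perf stat output in this post.
SAMPLE = """\
              1.05 msec task-clock                #    0.618 CPUs utilized
                 0      context-switches          #    0.000 K/sec
   <not supported>      cycles
   <not supported>      instructions  # sad octocat
"""

if __name__ == "__main__":
    print(unsupported_events(SAMPLE))
```

A CI step could fail early, or fall back to another measurement method, whenever `instructions` appears in the returned list.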

Is there perhaps some magic config somewhere that I can tweak to fix this? Alternatively, is there a way to measure how many CPU cycles my program spends that does not rely on perf?

Not sure about perf, but one thing you can do to get CPU cycles, or at least simulated CPU cycles, is to use Cachegrind; it runs a simulated CPU and tells you the instruction count. SQLite uses this technique: https://sqlite.org/cpu.html#performance_measurement

I wrote an article about the technique with a lot more detail here: https://pythonspeed.com/articles/consistent-benchmarking-in-ci/
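As a rough illustration of the technique (not the exact code from the article), the sketch below runs a command under Cachegrind and pulls the instruction count out of the summary it prints on stderr. It assumes `valgrind` is installed and that the summary contains an `I refs:` line; the function names are hypothetical, and the program's own stderr output ends up in the same stream.

```python
import re
import subprocess
from typing import List

# Cachegrind's summary has a line like "==1234== I   refs:      1,234,567".
# The spacing varies between Valgrind versions, so match it loosely.
_I_REFS = re.compile(r"I\s+refs:\s+([\d,]+)")


def parse_instruction_count(cachegrind_stderr: str) -> int:
    """Extract the executed-instruction count from Cachegrind's summary."""
    match = _I_REFS.search(cachegrind_stderr)
    if match is None:
        raise ValueError("no 'I refs' line found in Cachegrind output")
    return int(match.group(1).replace(",", ""))


def cachegrind_instruction_count(command: List[str]) -> int:
    """Run `command` under Cachegrind and return its instruction count.

    Note: Cachegrind also writes a cachegrind.out.<pid> file to the
    current directory, which this sketch does not clean up.
    """
    result = subprocess.run(
        ["valgrind", "--tool=cachegrind", *command],
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    return parse_instruction_count(result.stderr)
```

For example, `cachegrind_instruction_count(["ls"])` would return the number of instructions that `ls` executed on the simulated CPU.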


Oh :wave:! Your post was exactly the thing that caused me to look into this more closely, thanks for writing it up :heart: Cachegrind is indeed a cool solution, but unfortunately it is way too slow for my use case :frowning:

That’s why I am looking at perf: like Cachegrind, it counts instructions, so it is much less noisy than wall-clock time. At the same time, it incurs almost no slow-down.
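To make the perf-based approach concrete, here is a minimal sketch of what I have in mind, assuming a machine where the hardware counters are actually available. It relies on the machine-readable output of `perf stat -x,`, where each line starts with the counter value, the unit, and the event name. The function names are hypothetical, and the command's own stderr output is mixed into the same stream.

```python
import subprocess
from typing import List


def parse_perf_csv(perf_stderr: str, event: str = "instructions") -> int:
    """Return the count for `event` from `perf stat -x,` output."""
    for line in perf_stderr.splitlines():
        fields = line.split(",")
        # Expected layout: value, unit, event name, ...
        if len(fields) < 3 or not fields[2].startswith(event):
            continue
        if not fields[0].isdigit():
            # e.g. "<not supported>" or "<not counted>"
            raise RuntimeError(f"{fields[2]} unavailable: {fields[0]}")
        return int(fields[0])
    raise ValueError(f"no line for event {event!r} in perf output")


def perf_instruction_count(command: List[str]) -> int:
    """Run `command` under perf stat and return the instructions counted."""
    result = subprocess.run(
        ["perf", "stat", "-x,", "-e", "instructions", "--", *command],
        stderr=subprocess.PIPE,
        text=True,
    )
    return parse_perf_csv(result.stderr)
```

On the ubuntu-latest runner this would raise the `RuntimeError`, since the counter comes back as `<not supported>` there.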
