TutorialLikwidPerf

Nichols A. Romero edited this page May 30, 2023 · 5 revisions

How to use LIKWID and perf

Running LIKWID with perf backend

Since LIKWID 4.3 it is possible to use Linux kernel's perf_event interface to measure the events and still benefit from the features of LIKWID (Performance groups, derived metrics and the MarkerAPI).

To configure LIKWID, set ACCESSMODE=perf_event in config.mk before compiling. Afterwards, just run make && make install. No root permissions are required for installation if the install destination is writable for the user.
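The configuration step can be scripted. The following is a minimal sketch, assuming GNU sed and assuming that config.mk assigns ACCESSMODE and PREFIX each on a single line starting with the variable name. The helper name set_config_var and the prefix $HOME/likwid are illustrative and not part of LIKWID.

```shell
# Hypothetical helper: rewrite a "NAME = value" or "NAME ?= value" line
# in a make configuration file so that it reads "NAME = <new value>".
# Assumes GNU sed (-i) and that the value contains no "|" character.
set_config_var() {
    file=$1 name=$2 value=$3
    sed -i "s|^${name} *?*=.*|${name} = ${value}|" "$file"
}

# Usage, from the top of the LIKWID source tree:
#   set_config_var config.mk ACCESSMODE perf_event
#   set_config_var config.mk PREFIX "$HOME/likwid"   # a user-writable destination
#   make && make install
```

Editing config.mk by hand works just as well; the helper only makes the change repeatable.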

Currently, perf_event does not support thermal readings, so temperature measurements do not work with the perf_event backend. There might also be other drawbacks wherever access to an MSR is required (likwid-features, likwid-setFrequencies, ...), see limitations.

How is counter access controlled?

The perf_event subsystem uses a single file to determine how much access to the counters is granted to the user. This file, /proc/sys/kernel/perf_event_paranoid, contains a single value ranging from -1 to 4. The vanilla Linux kernel only knows -1 to 2; additional values are introduced by the Linux distribution's patchset.

  • 2: allow only user-space measurements (default since Linux 4.6).
  • 1: allow both kernel and user measurements (default before Linux 4.6).
  • 0: allow access to CPU-specific data but not raw tracepoint samples.
  • -1: no restrictions.

The additional value 4 commonly means "no access at all".

For LIKWID, the values -1 to 2 can be used, with a feature set that shrinks as the value rises:

  • -1 to 2: All hardware-thread-local counters are supported (FIXC* and PMC*)
    • 1 to 2: The measurements are limited to the PID and its children
    • -1 to 0: The measurement contains all events happening on the specified hardware thread (*)
  • -1 to 0: Hardware-thread-local and Uncore counters are supported

(*) This is the mode LIKWID uses when running with direct or accessdaemon access mode.
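To see which of these cases applies on a given machine, the current value can be read and matched against the ranges listed above. This is a small sketch; the function name likwid_perf_scope is made up for illustration, and the descriptions only restate the list above.

```shell
# Map a perf_event_paranoid value to the LIKWID feature set listed above.
likwid_perf_scope() {
    case "$1" in
        -1|0) echo "core and Uncore counters; counts everything on the hardware thread" ;;
        1|2)  echo "core counters only (FIXC*, PMC*); limited to the PID and its children" ;;
        *)    echo "outside the range usable by LIKWID" ;;
    esac
}

# Report the current setting if the kernel exposes the file.
paranoid_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$paranoid_file" ]; then
    level=$(cat "$paranoid_file")
    echo "perf_event_paranoid=$level: $(likwid_perf_scope "$level")"
fi
```

An administrator can change the value at runtime with sysctl -w kernel.perf_event_paranoid=<value>.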

There are different methods to grant users, and even single applications (keyword: capabilities), the right to perform measurements, but this is out of scope for this page.

Why the execpid, perfflags and perfpid options?

When running LIKWID with perf_event, you can limit the measurements to some PID. Commonly, likwid-perfctr is used as a wrapper for the real application. If you want to limit the counting to the application, you can use the --execpid option.

If you want to measure some other application (not wrapped by likwid-perfctr), you can use the --perfpid <PID> option.

Usage example:

SHELL1

$ ./exec

SHELL2

$ pgrep exec
12345
$ grep -i cpus_allowed_list /proc/12345/status | awk '{print $2}'
0-7
$ likwid-perfctr -c 0-7 -g MEM --perfpid 12345 -S 10s
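The lookup of the allowed CPUs in the example can be wrapped in a small function so the steps are easy to repeat for another PID. This is a sketch; the function name cpus_allowed_list is illustrative, and the likwid-perfctr options are the ones used in the example.

```shell
# Print the CPU list a process may run on, read from a /proc/<PID>/status file.
cpus_allowed_list() {
    awk '/^Cpus_allowed_list:/ {print $2}' "$1"
}

# Usage (PID of the already running application, as in the example):
#   pid=12345
#   likwid-perfctr -c "$(cpus_allowed_list /proc/$pid/status)" -g MEM --perfpid "$pid" -S 10s

# Demonstration on the calling process, if /proc is available.
if [ -r /proc/self/status ]; then
    cpus_allowed_list /proc/self/status
fi
```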

With --perfflags <FLAGS> you can specify additional flags that are handed over to the perf_event backend (see the flags argument of the perf_event_open syscall).

Why do I get zero when measuring memory bandwidth and memory data volume?

In order to measure memory volume and memory bandwidth, LIKWID must be able to access the Uncore events via the performance counters from /sys/devices.

For example, on an AMD-based system, both /sys/devices/amd_df and /sys/devices/amd_l3 need to be available.

These are normally created by the Linux kernel at boot time if perf events support is compiled into the kernel, but sometimes the support is built as a kernel module instead and may not be loaded by default.

One can get a sense of whether the perf functionality is compiled into the kernel or available as a module by running grep -i perf_events /boot/config-$(uname -r). You will see something like this:

# Performance monitoring
#
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_PERF_EVENTS_INTEL_RAPL=m
CONFIG_PERF_EVENTS_INTEL_CSTATE=m
# CONFIG_PERF_EVENTS_AMD_POWER is not set
CONFIG_PERF_EVENTS_AMD_UNCORE=m
# end of Performance monitoring

In this example, taken from an Ubuntu Linux system with an AMD Zen3 processor, one can see that the perf events for the AMD Uncore counters are built as a module (=m). To get access to the Uncore counters, a system administrator will need to run:

insmod /lib/modules/$(uname -r)/kernel/arch/x86/events/amd/amd-uncore.ko

Finding the specific kernel module (*.ko) may involve some amount of guesswork. But after the module is loaded, the missing perf counters in /sys/devices/amd_df and /sys/devices/amd_l3 will be available, and the memory bandwidth and memory data volume measurements will yield non-zero results.
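Whether the two directories are present can be checked before and after loading the module. This is a minimal sketch; the helper name missing_pmus is made up, and the suggested modprobe amd_uncore assumes that the module name follows from the file name amd-uncore.ko shown above.

```shell
# Hypothetical helper: print the PMU directories that are missing below a
# sysfs root. First argument is the root, the remaining ones are PMU names.
missing_pmus() {
    root=$1
    shift
    for pmu in "$@"; do
        [ -d "$root/$pmu" ] || echo "$pmu"
    done
}

missing=$(missing_pmus /sys/devices amd_df amd_l3)
if [ -n "$missing" ]; then
    echo "Missing Uncore PMUs:" $missing
    echo "As root, try: modprobe amd_uncore   (module name is an assumption)"
else
    echo "amd_df and amd_l3 are present"
fi
```

On a non-AMD system both names are reported as missing, which is expected.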

Feature limitations

LIKWID does more than reading just the performance counter registers. There are other registers that contain useful information like the CPU turbo frequency limits, the state of prefetchers, Uncore frequency (Intel only), ...

Here is a list of features that don't work:

LIKWID specifies counter names so you can choose the counter in which an event is measured. Perf_event doesn't allow you to specify the counter, as it schedules the events more freely and decides itself where to count each event (at least I haven't found a way). Although you cannot specify more counters than are available on your system, as LIKWID limits this, perf_event might schedule them differently than you expect. For example, the instructions event of perf_event does not always use the fixed-purpose counter that counts only instructions but may use a general-purpose counter instead. If you have specified additional events that have to use the general-purpose counters, perf_event might multiplex the events on the general-purpose counters (commonly no problem, but in rare cases the counts might be off a little).

Using LIKWID's information with perf

The main command-line tool for perf_event is perf. It provides a set of predefined events, but it is also possible to add raw events. If you want to measure a specific event known by LIKWID, you can use the event data to add it as a raw event to perf.

When running likwid-perfctr -e or likwid-perfctr -E <searchstr> you get a list like this:

UOPS_RETIRED_ALL, 0xC2, 0x1, PMC
UOPS_RETIRED_CORE_ALL, 0xC2, 0x1, PMC
UOPS_RETIRED_RETIRE_SLOTS, 0xC2, 0x2, PMC

The first is the event name, followed by the event identifier and the umask. The final entry in a line defines the counter (group) that can be used for the event.

For events in the PMC group, you can use r<umask><eventid> as input to perf, so if we want to add UOPS_RETIRED_ALL, it looks like this: perf stat -e r01C2 ./a.out
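The conversion from a LIKWID event line to the r<umask><eventid> form can be automated. This is a sketch under the assumption that event id and umask each fit into two hex digits, as in the list above; the function name likwid_to_perf_raw is illustrative.

```shell
# Convert one line of the likwid-perfctr event list into a perf raw event code.
# Assumed input format: NAME, 0x<eventid>, 0x<umask>, <counter group>
likwid_to_perf_raw() {
    # Split on commas and blanks into name, event id, umask and counter group.
    IFS=', ' read -r _name eventid umask _group <<EOF
$1
EOF
    # perf expects the umask first, then the event id, two hex digits each.
    printf 'r%02X%02X\n' "$umask" "$eventid"
}

likwid_to_perf_raw "UOPS_RETIRED_ALL, 0xC2, 0x1, PMC"   # prints r01C2
```

The result can be passed straight to perf, for example perf stat -e "$(likwid_to_perf_raw "UOPS_RETIRED_ALL, 0xC2, 0x1, PMC")" ./a.out.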

You can probably add event options as well, but you have to get the bit index from sysfs. The data resides in /sys/devices/cpu/format/. LIKWID option names and perf option names are not identical, but the mapping is not difficult:

  • EDGEDETECT -> edge
  • THRESHOLD -> cmask
  • INVERT -> inv
  • ANYTHREAD -> any
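As an alternative to computing bit positions by hand, perf also accepts a symbolic form, cpu/event=...,umask=...,<option>=.../, where the option names are the file names in /sys/devices/cpu/format/. The sketch below builds such a string; the function name perf_event_spec is made up, and the PMU name cpu is an assumption that does not hold on every system (hybrid CPUs, for instance, expose differently named PMUs).

```shell
# Build a perf event string in the PMU/key=value,.../ form from an event id,
# a umask and optional extra terms such as edge=1 or cmask=0x1.
perf_event_spec() {
    eventid=$1 umask=$2
    shift 2
    spec="cpu/event=$eventid,umask=$umask"
    for term in "$@"; do
        spec="$spec,$term"
    done
    echo "$spec/"
}

# List the option names the kernel exposes for the core PMU, if present.
if [ -d /sys/devices/cpu/format ]; then
    ls /sys/devices/cpu/format
fi

perf_event_spec 0xC2 0x1 edge=1 cmask=0x1
# prints cpu/event=0xC2,umask=0x1,edge=1,cmask=0x1/
```

The resulting string is used like any other event name, for example perf stat -e "$(perf_event_spec 0xC2 0x1 edge=1 cmask=0x1)" ./a.out.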

I didn't find any documentation on how to add Uncore raw events.
