[BUG] problems with long region names (segfaults and errors) #551

Closed
melven opened this issue Sep 6, 2023 · 5 comments · Fixed by #552

@melven
Contributor

melven commented Sep 6, 2023

Describe the bug
With region names longer than ~100 characters, I get the error "not a valid region description: ...", sometimes followed by a segfault.

To Reproduce

  • Compile example (libraries listed after the source file so the link order is correct): gcc -fopenmp -DLIKWID_PERFMON -I/path/to/likwid/include -L/path/to/likwid/lib example.c -llikwid
  • Run example: likwid-perfctr -g MEM_DP -C 0 -m ./a.out

Please supply the output of the command with -d added to the command line:

  • With likwid 5.2.1, I sometimes get segfaults, sometimes it works for regions with shorter names:
> likwid-perfctr --version
likwid-perfctr -- Version 5.2.1 (commit: 233ab943543480cd46058b34616c174198ba0459)
> likwid-perfctr -g MEM_DP -C 0 -m ./a.out
--------------------------------------------------------------------------------
CPU name:       Intel(R) Xeon(R) Gold 6342 CPU @ 2.80GHz
CPU type:       Intel Icelake SP processor
CPU clock:      2.80 GHz
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Line 0:fooXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX-0
 not a valid region description: fooXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Segmentation fault (core dumped)
  • For likwid 5.2.2, the behavior is a bit different but still looks like a memory error (a write past the end of a string buffer?):
> likwid-perfctr --version
likwid-perfctr -- Version 5.2.2 (commit: 233ab943543480cd46058b34616c174198ba0459)
> likwid-perfctr -g MEM_DP -C 0 -m ./a.out
--------------------------------------------------------------------------------
CPU name:       Intel(R) Xeon(R) Gold 6342 CPU @ 2.80GHz
CPU type:       Intel Icelake SP processor
CPU clock:      2.80 GHz
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Line 0:fooXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX-0
 not a valid region description: fooXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Region , Group 1: MEM_DP
+-------------------+------------+
[...]
+-----------------------------------+--------------+

free(): double free detected in tcache 2
Aborted (core dumped)

Example code

#include <stdlib.h>
#include <stdio.h>
#include <omp.h>
// This block enables compilation of the code with and without LIKWID in place
#ifdef LIKWID_PERFMON
#include <likwid-marker.h>
#else
#define LIKWID_MARKER_INIT
#define LIKWID_MARKER_THREADINIT
#define LIKWID_MARKER_SWITCH
#define LIKWID_MARKER_REGISTER(regionTag)
#define LIKWID_MARKER_START(regionTag)
#define LIKWID_MARKER_STOP(regionTag)
#define LIKWID_MARKER_CLOSE
#define LIKWID_MARKER_GET(regionTag, nevents, events, time, count)
#endif

#define N 10000

int main(int argc, char* argv[])
{
    int i;
    double data[N];
    const char* longRegionName = "fooXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX";
    const char* shortRegionName = "foo";
    LIKWID_MARKER_INIT;

#pragma omp parallel
{
    LIKWID_MARKER_START(longRegionName);
    #pragma omp for
    for(i = 0; i < N; i++)
    {
        data[i] = omp_get_thread_num();
    }
    LIKWID_MARKER_STOP(longRegionName);
}

#pragma omp parallel
{
    LIKWID_MARKER_START(shortRegionName);
    #pragma omp for
    for(i = 0; i < N; i++)
    {
        data[i] = omp_get_thread_num();
    }
    LIKWID_MARKER_STOP(shortRegionName);
}
    LIKWID_MARKER_CLOSE;
    return 0;
}
@melven melven added the bug label Sep 6, 2023
@melven
Contributor Author

melven commented Sep 6, 2023

Hello from Cologne ;)

The problem is not urgent for me: as a workaround I just shorten my region names
(they are generated automatically from C++ template function names and are thus sometimes long)...

Best,
Melven
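
For anyone needing the same workaround, here is a minimal sketch of shortening an auto-generated name on the caller's side before handing it to the MarkerAPI. It is an illustration under assumptions, not part of LIKWID: the names SHORT_TAG_MAX, fnv1a32 and shorten_region_tag are hypothetical, and the cap of 64 characters is simply a value chosen to stay well below the ~100-character limit reported above. Long names keep their prefix and get an 8-digit FNV-1a hash of the full name appended, so that two long names sharing a prefix still map to different tags.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cap, chosen to stay well below the ~100-character limit. */
#define SHORT_TAG_MAX 64

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Writes a tag of at most SHORT_TAG_MAX characters into out.
 * Short names are copied unchanged; long names become
 * <first 55 characters>_<8 hex digits of the hash of the full name>. */
static void shorten_region_tag(const char *full, char out[SHORT_TAG_MAX + 1])
{
    assert(full != NULL && out != NULL);

    size_t len = strlen(full);
    if (len <= SHORT_TAG_MAX) {
        memcpy(out, full, len + 1);    /* includes the terminating NUL */
        return;
    }

    /* Reserve 9 characters for '_' plus the 8 hex digits. */
    int prefix_len = SHORT_TAG_MAX - 9;
    snprintf(out, SHORT_TAG_MAX + 1, "%.*s_%08x",
             prefix_len, full, (unsigned)fnv1a32(full));
}
```

The shortened tag would be computed once per region and the same buffer passed to both LIKWID_MARKER_START and LIKWID_MARKER_STOP. A 32-bit hash makes collisions unlikely, not impossible.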

@TomTheBear
Member

Hi Melven,

we also see this issue with others who try to integrate the MarkerAPI into a framework with auto-generated region names (e.g. OpenSYCL). The question is what a reasonable maximum length for region names would be. Or should there be no string length limit at all? What is your opinion?

The currently hardcoded limit is 100 characters.

Best,
Thomas

@melven
Contributor Author

melven commented Sep 6, 2023

Hi Thomas,

I suspect there is an additional bug: it should not segfault when the string is too long.
Would be fine for me if it takes the first 100 chars (just truncating the name).
If someone really has tag names that differ only in the 105th character, those regions would just appear as one...

Best,
Melven
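
To make the suggestion concrete, a bounded copy along these lines would truncate instead of writing past the buffer. This is only a sketch of the idea, not the code from LIKWID or from the linked fix: REGION_TAG_MAX and copy_region_tag are hypothetical names, and the value 100 is taken from the limit mentioned in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Limit taken from the discussion above; the macro name is hypothetical. */
#define REGION_TAG_MAX 100

/* Copies at most REGION_TAG_MAX characters of src into dst and always
 * NUL-terminates dst. Returns 1 if the tag was truncated, 0 otherwise. */
static int copy_region_tag(char dst[REGION_TAG_MAX + 1], const char *src)
{
    assert(dst != NULL && src != NULL);

    /* snprintf never writes more than the given size (including the NUL)
     * and returns the length the full string would have needed. */
    int needed = snprintf(dst, REGION_TAG_MAX + 1, "%s", src);
    return needed > REGION_TAG_MAX;
}
```

With this approach, two tags that share their first 100 characters end up with the same stored name, which is exactly the "appear as one region" trade-off described above.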

@TomTheBear
Member

Please check the linked commit/branch. It follows your suggestion and truncates the region tag if it is longer than 100 characters.

@melven
Contributor Author

melven commented Sep 8, 2023

Works fine in all my tests, thanks!

@TomTheBear TomTheBear linked a pull request Sep 8, 2023 that will close this issue