Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Sanitizers for CI and regular build targets #95

Closed
jserv opened this issue Jan 31, 2019 · 11 comments
Closed

Enable Sanitizers for CI and regular build targets #95

jserv opened this issue Jan 31, 2019 · 11 comments
Assignees
Labels
feature Outstanding features we should implement
Milestone

Comments

@jserv
Copy link
Member

jserv commented Jan 31, 2019

@wusyong noticed potential memory misbehavior, and we should resolve these problems by enabling sanitizers:

  • AddressSanitizer: detects addressability issues;
  • LeakSanitizer: detects memory leaks;
  • ThreadSanitizer: detects data races and deadlocks
  • MemorySanitizer: detects use of uninitialized memory

Expected output:

  1. New build options to enable sanitizers;
  2. Eliminate the issues pointed by sanitizers;
  3. Travis-CI validates issues with sanitizers;
@marktwtn
Copy link
Collaborator

marktwtn commented Feb 1, 2019

From @jserv in #94 (comment)

I rebuilt dcurl with AddressSanitizer, and make check aborted with the following messages:

==52602==ERROR: LeakSanitizer: detected memory leaks
...
SUMMARY: AddressSanitizer: 37866102 byte(s) leaked in 663 allocation(s).

@jserv jserv added the feature Outstanding features we should implement label Feb 11, 2019
@marktwtn
Copy link
Collaborator

marktwtn commented Feb 15, 2019

The MemorySanitizer can not be used since it is only supported in LLVM compiler,
as described in the Getting MemorySanitizer paragraph in the Wiki(Msan) page.

And we do not have to use LeakSanitizer since it had been integrated into AddressSanitizer and enabled by default.
The Wiki(LSan) page also gives the warning:

Be aware that the stand-alone mode is less well tested
compared to running LSan on top of ASan.

@marktwtn
Copy link
Collaborator

marktwtn commented Mar 3, 2019

After integrating Sanitizers with flags -fsanitize=[address|thread|undefined], it would generate error message when executing Python program with AddressSanitizer.

Error message:

==5001==ASan runtime does not come first in initial library list;
you should either link runtime to your application or manually preload it with LD_PRELOAD.

Although adding the environment variable LD_PRELOAD with libasan can remove the error message, but the memory leak detected by AddressSanitizer are mostly caused by Python itself.
In contrast with Python, the C source code has no problem at all, hence I decide to exclude the testing of *.py.

If there is no other advises, I will send a pull request tonight.

@jserv
Copy link
Member Author

jserv commented Mar 3, 2019

We can discard Python related tests.

@marktwtn
Copy link
Collaborator

marktwtn commented Mar 3, 2019

We can discard Python related tests.

Did you mean removing all the Python related file like tests/*.py and mk/python.mk or just excluding Python tests when using Sanitizers?

@jserv
Copy link
Member Author

jserv commented Mar 3, 2019

While introducing sanitizers, we shall only concentrate on dcurl internals. The reason why test cases exist is to validate dcurl implementations. Python-based tests just act arguably as the comparator.

@marktwtn
Copy link
Collaborator

marktwtn commented Mar 13, 2019

Undefined behaviour Sanitizer detected issue:

output:

src/curl.c:23:53: runtime error: left shift of negative value -1

source code:

static void _transform(int8_t state[]) {
    ......
    int8_t *from = state, *to = copy;
    ......
            to[i] = truthTable[from[aa] + (from[bb] << 2) + 5];
    ......
}

The possible value of from[bb] is 0, 1, -1.
The left-shift operation here is to substitute the operation of multiplying 4.
It works for 0, positive value and specific negative value like -1, -2, -4, -8, .......

For performance consideration, I think we can safely ignore this runtime error.


Update:
I change the source code from from[bb] << 2 to from[bb] * 4.
The generated assembly code has no difference at all.
I think we can modify the source code to use multiply operation and benefit from the compiler optimization.

@marktwtn
Copy link
Collaborator

marktwtn commented Mar 13, 2019

Thread Sanitizer detected issue:

output:

WARNING: ThreadSanitizer: data race (pid=17837)
  Write of size 4 at 0x7b9c00000360 by thread T2 (mutexes: write M6):
    #0 work_cb src/pow_avx.c:419 (test-dcurl+0x5007)
    #1 worker <null> (test-dcurl+0x5c23)

  Previous read of size 4 at 0x7b9c00000360 by thread T3:
    [failed to restore the stack]

  Location is heap block of size 12496 at 0x7b9c00000000 allocated by main thread:
    #0 malloc <null> (libtsan.so.0+0x2ae13)
    #1 PoWAVX_Context_Initialize src/pow_avx.c:566 (test-dcurl+0x57cf)
    #2 initializeImplContext src/implcontext.c:16 (test-dcurl+0x3ba6)
    #3 registerImplContext src/implcontext.c:11 (test-dcurl+0x3c54)
    #4 dcurl_init src/dcurl.c:66 (test-dcurl+0x39c3)
    #5 main tests/test-dcurl.c:54 (test-dcurl+0x2e24)

  Mutex M6 (0x7b9c00000000) created at:
    #0 pthread_mutex_init <null> (libtsan.so.0+0x2c5bd)
    #1 PowAVX src/pow_avx.c:493 (test-dcurl+0x52d1)
    #2 doThePoW src/implcontext.c:57 (test-dcurl+0x3dc5)
    #3 dcurl_entry src/dcurl.c:129 (test-dcurl+0x3ad9)
    #4 main tests/test-dcurl.c:55 (test-dcurl+0x2e3a)

  Thread T2 (tid=17840, running) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x2bcfe)
    #1 uv_thread_create <null> (test-dcurl+0xb810)
    #2 uv_once <null> (test-dcurl+0xba28)
    #3 doThePoW src/implcontext.c:57 (test-dcurl+0x3dc5)
    #4 dcurl_entry src/dcurl.c:129 (test-dcurl+0x3ad9)
    #5 main tests/test-dcurl.c:55 (test-dcurl+0x2e3a)

  Thread T3 (tid=17841, running) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x2bcfe)
    #1 uv_thread_create <null> (test-dcurl+0xb810)
    #2 uv_once <null> (test-dcurl+0xba28)
    #3 doThePoW src/implcontext.c:57 (test-dcurl+0x3dc5)
    #4 dcurl_entry src/dcurl.c:129 (test-dcurl+0x3ad9)
    #5 main tests/test-dcurl.c:55 (test-dcurl+0x2e3a)

SUMMARY: ThreadSanitizer: data race src/pow_avx.c:419 in work_cb

The data race occurred on a common variable stopPoW among the threads for determining whether to stop the thread or not.
Although we might fix it with pthread_mutex_lock and pthread_mutex_unlock, I think it is not necessary.

  • The frequent lock and unlock brings the overhead.
  • The program result is correct.
  • The disadvantage is it might makes the thread to execute some code one more time. But I think it is acceptable.

Update:
Perhaps we can use read-write lock instead.
The thread who finds the correct nonce value would write to the variable stopPoW, and the other threads are always reading the stopPoW to decide whether they should keep executing or not.

@jserv
Copy link
Member Author

jserv commented Mar 13, 2019

I am curious about libtuv internals. Does it implement something like Sleeping Read-Write Locks or Light-weight Locks?

@marktwtn
Copy link
Collaborator

I am curious about libtuv internals. Does it implement something like Sleeping Read-Write Locks or Light-weight Locks?

The locks of libtuv is the same as we use in Linux pthread.
It does not create any special lock and it just wraps the corresponding functions.

Hence it does offer the read-write lock.

@wusyong wusyong modified the milestones: Hemlock, Fumitory Mar 18, 2019
marktwtn added a commit to marktwtn/dcurl that referenced this issue Mar 21, 2019
Use libtuv read-write lock API to fix the data race
detected by thread Sanitizer.

Related DLTcollab#95.
marktwtn added a commit to marktwtn/dcurl that referenced this issue Mar 21, 2019
Use libtuv read-write lock API to avoid the data race
detected by thread Sanitizer.

Related DLTcollab#95.
@marktwtn
Copy link
Collaborator

The expected output has been achieved and the issue can be closed.

However, the expected output is slightly different since we focus on the buildkite CI rather than Travis-CI.

@jserv jserv closed this as completed Mar 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Outstanding features we should implement
Projects
None yet
Development

No branches or pull requests

3 participants