Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XOR benchmarks #8

Open
dgladkov opened this issue Oct 13, 2016 · 3 comments
Open

XOR benchmarks #8

dgladkov opened this issue Oct 13, 2016 · 3 comments

Comments

@dgladkov
Copy link
Collaborator

dgladkov commented Oct 13, 2016

Followup of discussion in 04289cc

Python 2 bytearray mutations

== run_benchmark_process_xor.py
max = 2147483543
34.8218421936
0.414654970169

Python 3 bytearray mutations

== run_benchmark_process_xor.py
max = 2147483543
33.529802142000335
0.39478560099996685

Python 3 iterating bytes

== run_benchmark_process_xor.py
max = 2147483543
32.42236607199993
0.39769275300022855

Python 2 numpy bitwise_xor

== run_benchmark_process_xor.py
max = 2147483543
27.3349659443
0.424569129944

Python 3 numpy bitwise_xor

== run_benchmark_process_xor.py
max = 2147483543
24.56664780200026
0.4790448710000419

Conclusions:

  • On real world example difference between bytearray mutations and map over bytes for Python 3 is not worth maintaining the separate code path.
  • numpy's bitwise_xor gives significant performance increase, however it requires manual key wrapping as bitwise_xor expects arrays of the same size

As numpy contains C extensions and may require compilation, I suggest adding numpy as an optional dependency and falling back to native Python bytearray mutations + xor if numpy is not installed.

@GreyCat
Copy link
Member

GreyCat commented Oct 13, 2016

Great comparison, thanks!

I have yet another humble suggestion - is it possible to submit something like bitwise_xor_cyclic to numpy, and then add here? That would be probably make it work even faster?

There's also this large battle for XOR at SO. I can't estimate how hard is it to include & maintain such code in our library and/or if it's worth that.

@dgladkov
Copy link
Collaborator Author

A lot of answers suggested in that SO post are impractical (writing manual SSE2 instructions, depending on lots of third party libraries), but the general idea is that the only way to gain meaningful performance improvement is to write native code. The best solutions for Python are cffi and Cython. cffi is a better solution for small functions, while Cython is better when you need complex C<->Python interop, so I picked cffi for my test. My current C implementation looks like this (I know only basic C so this might be suboptimal):

void xor_one(const char *data, const unsigned char key, char *result)
{
    unsigned long data_len = strlen(data);
    unsigned long idx;
    for (idx = 0; idx < data_len; idx++)
    {
        result[idx] = data[idx] ^ key;
    }
}

void xor_many(const char *data, const char *key, char *result)
{
    unsigned long data_len = strlen(data);
    unsigned long key_len = strlen(key);
    unsigned long idx;
    for (idx = 0; idx < data_len; idx++)
    {
        result[idx] = data[idx] ^ key[idx % key_len];
    }
}

Benchmark results:

Python 2

== run_benchmark_process_xor.py
max = 2144703287
26.3223340511
0.44394993782

Python 3

== run_benchmark_process_xor.py
max = 2144703287
23.45923235600094
0.4029908089996752

Pros:

  • User doesn't need to install full numpy package
  • Slightly faster than numpy

Cons:

  • Needs compilation and binary distributions for different platforms and architectures

Submitting bitwise_xor_cyclic to numpy doesn't look realistic, their bitwise_xor is is just one of their supported array operations, so it requires arrays of the same size in any case.

In my opinion performance improvements of cffi code are not worth the build complexity it adds.

@arekbulski
Copy link
Member

arekbulski commented Jan 18, 2018

Adding Cython as either dependency or runtime is being debated in kaitai-io/kaitai_struct#311 . If it gets added, reimplementing xor will be just a few-liner.

EDIT: Cython runtime was withdrawn.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants