Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added clang inline helper #30990

Merged
merged 4 commits into from
Feb 16, 2020
Merged

Added clang inline helper #30990

merged 4 commits into from
Feb 16, 2020

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Jan 13, 2020

gcc on macOS I think executes clang more often than not, and it appears we didn't have a inline helper defined for that. This is ported from CYTHON_INLINE which essentially does the same in cythonized files, save the static declaration

This might add a bit of build time to macOS (I think partially explains why it's a few minutes faster than the rest) but makes it more similar in performance to the others

Running the full benchmark suite to see where this may help

@jbrockmendel
Copy link
Member

makes sense, looking forward to the perf stats

@bashtage
Copy link
Contributor

Doesn't clang define __GNUC__?

@WillAyd
Copy link
Member Author

WillAyd commented Jan 14, 2020

From what I understand clang defines it but the semantics of the inline keyword are different

@bashtage
Copy link
Contributor

Have you checked whether binaries produced before this patch and with this patch are different? I dont' see why __attribute__ ((__unused__)) would make a difference to the compiler's choice to inline a function. I think it suppressed some warnings.

@WillAyd
Copy link
Member Author

WillAyd commented Jan 14, 2020

I tried running the entire benchmark suite but got stuck at 30% somewhere over night, so re-ran just for indexing this morning. Here are the results:

       before           after         ratio
     [8ff2ebd9]       [77a96005]
     <axis-cleanup>       <clang-inline>
-      5.05±0.2ms      4.31±0.03ms     0.85  indexing_engines.ObjectEngineIndexing.time_get_loc('non_monotonic')
-         179±2μs          151±7μs     0.85  indexing.NumericSeriesIndexing.time_getitem_scalar(<class 'pandas.core.indexes.numeric.Int64Index'>, 'nonunique_monotonic_inc')
-        24.4±1ms       17.1±0.5ms     0.70  indexing.NumericSeriesIndexing.time_loc_scalar(<class 'pandas.core.indexes.numeric.UInt64Index'>, 'nonunique_monotonic_inc')
-        44.4±3ms         26.0±2ms     0.59  indexing.InsertColumns.time_assign_with_setitem

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

So I think a measurable boost in some places

@gfyoung gfyoung added Internals Related to non-user accessible pandas implementation Performance Memory or execution speed performance labels Jan 15, 2020
@jbrockmendel
Copy link
Member

any firm conclusions about the perf impact?

@WillAyd
Copy link
Member Author

WillAyd commented Feb 12, 2020

Yea looks to give a boost to indexing operations. I didn't run the entire test suite because this inline helper is already generated by Cython so already used in anything but our custom C code

@WillAyd
Copy link
Member Author

WillAyd commented Feb 13, 2020

Planning to merge at the end of the day tomorrow unless any other comments @pandas-dev/pandas-core

@bashtage
Copy link
Contributor

That is part of this, but the MSVC inline could become static inline.

Do any of these compilers have different behavior for vendor-specific keywords rather than C99 static inline?

@bashtage
Copy link
Contributor

Should

https://github.com/pandas-dev/pandas/pull/30990/files#diff-b3aa9f22c2ca2e89ba79863c5701aafcL18

be first?

GNUC says: "If you are writing a header file to be included in ISO C90 programs, write inline instead of inline. See Alternate Keywords."

@WillAyd
Copy link
Member Author

WillAyd commented Feb 14, 2020

Not sure, but I don't know if worth straying from what Cython generates for this in any case

@jreback jreback added this to the 1.1 milestone Feb 15, 2020
@jreback
Copy link
Contributor

jreback commented Feb 15, 2020

i guess this can't hurt.

@WillAyd WillAyd merged commit 05ab8ba into pandas-dev:master Feb 16, 2020
@WillAyd WillAyd deleted the clang-inline branch February 16, 2020 18:41
roberthdevries pushed a commit to roberthdevries/pandas that referenced this pull request Mar 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation Performance Memory or execution speed performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants