PERF: implement scalar ops blockwise #29853

Merged
merged 20 commits into from
Dec 27, 2019

Conversation

jbrockmendel
Member

Similar to #28583, but going through BlockManager.apply.
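The idea can be illustrated with a rough sketch that uses only public pandas API, not the internal code path this PR touches. For a single-dtype frame, applying the operator once to the underlying 2D array should give the same values as looping over the columns. The helper names `columnwise_op` and `blockwise_op` are made up for this illustration.

```python
import operator

import numpy as np
import pandas as pd


def columnwise_op(df, scalar, op):
    # One operation per column: many small 1D operations.
    return pd.DataFrame(
        {col: op(df[col], scalar) for col in df.columns}, index=df.index
    )


def blockwise_op(df, scalar, op):
    # Assumes a single-dtype frame, so the data can be taken as one 2D
    # ndarray and the op applied in a single vectorized call.
    values = df.to_numpy()
    return pd.DataFrame(op(values, scalar), index=df.index, columns=df.columns)


df = pd.DataFrame(np.arange(12, dtype=np.int64).reshape(4, 3))
by_column = columnwise_op(df, 2, operator.floordiv)
by_block = blockwise_op(df, 2, operator.floordiv)
print(np.array_equal(by_column.to_numpy(), by_block.to_numpy()))
```

The blockwise version makes one call into NumPy regardless of the number of columns, which is where a speedup on wide frames would be expected to come from.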

@jbrockmendel jbrockmendel added the Numeric Operations (Arithmetic, Comparison, and Logical operations) and Performance (Memory or execution speed performance) labels Dec 3, 2019
@pep8speaks

pep8speaks commented Dec 21, 2019

Hello @jbrockmendel! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-12-27 16:35:39 UTC

@jbrockmendel
Member Author

Resolved the issue with test_expressions behaving unexpectedly.

Added an asv benchmark that times operations between a homogeneous-dtype DataFrame (20k rows, 100 columns) and a scalar. Not sure how many variants of these to add; it would be easy to go overboard.

       before           after         ratio
     [0cd388fd]       [1fc1e3ec]
     <cy30>           <back-to-arith>
-         110±4ms         59.6±1ms     0.54  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function floordiv>)
-         115±3ms         59.5±1ms     0.52  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function floordiv>)
-        88.2±1ms       44.1±0.7ms     0.50  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function floordiv>)
-        91.0±3ms       44.7±0.9ms     0.49  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function floordiv>)
-        94.8±3ms         46.5±1ms     0.49  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function floordiv>)
-        92.1±2ms       44.9±0.4ms     0.49  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function floordiv>)
-        93.9±4ms       42.2±0.2ms     0.45  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function floordiv>)
-        94.6±2ms         41.7±1ms     0.44  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function floordiv>)
-        78.0±4ms       28.6±0.4ms     0.37  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function pow>)
-        66.9±1ms       22.5±0.6ms     0.34  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function mod>)
-        67.8±2ms         22.1±1ms     0.33  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function mod>)
-        70.0±3ms       22.5±0.5ms     0.32  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function pow>)
-        67.3±2ms       20.8±0.6ms     0.31  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function mod>)
-        66.7±1ms         20.3±1ms     0.30  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function mod>)
-        64.9±1ms       19.3±0.5ms     0.30  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function mod>)
-      65.5±0.7ms       19.1±0.8ms     0.29  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function mod>)
-        73.2±1ms         18.2±1ms     0.25  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function mod>)
-        74.7±3ms         18.3±1ms     0.25  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function mod>)
-        60.3±1ms       9.87±0.1ms     0.16  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function pow>)
-        60.0±2ms       9.80±0.3ms     0.16  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function pow>)
-        59.4±3ms       9.65±0.4ms     0.16  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function pow>)
-        58.5±2ms       9.09±0.3ms     0.16  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function pow>)
-      42.6±0.9ms      3.96±0.09ms     0.09  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function xor>)
-      51.3±0.4ms       4.70±0.1ms     0.09  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function pow>)
-        52.7±2ms       4.68±0.1ms     0.09  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function pow>)
-      42.3±0.8ms       3.54±0.1ms     0.08  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function or_>)
-        43.4±2ms       3.36±0.2ms     0.08  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function and_>)
-        43.0±1ms       3.23±0.2ms     0.08  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function xor>)
-        44.9±1ms       3.36±0.2ms     0.07  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function or_>)
-      45.6±0.4ms       3.41±0.1ms     0.07  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function and_>)
-        49.8±1ms       3.24±0.1ms     0.07  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function mul>)
-      28.0±0.5ms       1.81±0.2ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function eq>)
-      48.8±0.7ms      3.15±0.07ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function truediv>)
-      50.1±0.5ms      3.15±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function mul>)
-      50.1±0.8ms       3.14±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function truediv>)
-        51.4±1ms       3.21±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function truediv>)
-      49.6±0.3ms      3.10±0.05ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function sub>)
-      49.5±0.9ms      3.08±0.06ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function mul>)
-        51.3±1ms       3.17±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function add>)
-        50.0±1ms      3.07±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function add>)
-      28.7±0.8ms       1.77±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function gt>)
-      50.4±0.8ms       3.06±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function add>)
-        29.1±1ms      1.76±0.03ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function ne>)
-        50.1±1ms       3.02±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function truediv>)
-      49.8±0.7ms      2.99±0.07ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function truediv>)
-        50.4±1ms       3.02±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function truediv>)
-        50.9±1ms       3.04±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function mul>)
-        50.8±1ms      3.03±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function truediv>)
-        50.8±1ms       3.03±0.3ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function sub>)
-      52.2±0.9ms      3.08±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function mul>)
-        54.0±1ms      3.17±0.09ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function add>)
-      51.1±0.7ms       2.98±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function add>)
-        50.8±1ms      2.96±0.06ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function sub>)
-        53.7±1ms      3.12±0.04ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function sub>)
-        52.7±2ms      3.05±0.08ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function truediv>)
-        51.2±1ms       2.96±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function sub>)
-        28.6±1ms      1.65±0.07ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function le>)
-      29.9±0.6ms       1.72±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function ge>)
-        29.7±2ms      1.69±0.04ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function lt>)
-      54.0±0.7ms       3.06±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function sub>)
-      55.1±0.9ms       3.12±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function add>)
-      50.6±0.8ms       2.86±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function add>)
-        59.1±2ms       3.31±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function sub>)
-        29.4±1ms      1.65±0.08ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function ge>)
-        53.5±1ms       2.97±0.1ms     0.06  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function mul>)
-        30.9±2ms       1.69±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function lt>)
-        29.2±2ms      1.59±0.09ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function eq>)
-      54.9±0.9ms       2.98±0.2ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function mul>)
-        54.2±1ms       2.93±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function add>)
-        54.5±1ms      2.92±0.08ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function sub>)
-        29.8±2ms       1.58±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function gt>)
-        58.2±3ms       3.08±0.2ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function mul>)
-        31.2±1ms       1.61±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function le>)
-        31.0±2ms       1.59±0.1ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 3.0, <built-in function ne>)
-        27.8±1ms      1.29±0.06ms     0.05  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function ge>)
-      27.4±0.8ms       1.22±0.1ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function le>)
-        28.0±1ms      1.17±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function le>)
-      27.8±0.6ms       1.16±0.1ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function lt>)
-      27.6±0.9ms      1.15±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function lt>)
-      27.8±0.4ms      1.16±0.07ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function le>)
-        29.8±2ms      1.24±0.06ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function ne>)
-      27.7±0.8ms      1.15±0.07ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function lt>)
-      27.4±0.6ms      1.14±0.08ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function le>)
-      28.5±0.8ms      1.18±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function ne>)
-      27.8±0.7ms      1.15±0.01ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function le>)
-      28.5±0.6ms      1.16±0.06ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function gt>)
-      27.1±0.8ms      1.11±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function lt>)
-      27.5±0.6ms      1.12±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function gt>)
-      27.4±0.4ms      1.11±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function ne>)
-      27.1±0.9ms      1.10±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function ge>)
-      28.6±0.4ms      1.14±0.06ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function eq>)
-      27.4±0.6ms      1.09±0.02ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function eq>)
-      28.2±0.7ms      1.12±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function ne>)
-      27.3±0.6ms      1.08±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function ge>)
-        28.1±1ms      1.12±0.02ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function gt>)
-        28.3±1ms      1.12±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function gt>)
-      28.1±0.8ms      1.11±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function lt>)
-      27.1±0.8ms      1.07±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function le>)
-        27.9±1ms      1.10±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function ge>)
-      28.2±0.5ms      1.09±0.06ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function lt>)
-      29.0±0.5ms      1.11±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function ne>)
-        29.0±2ms      1.11±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function gt>)
-      29.2±0.2ms      1.11±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function eq>)
-        28.2±2ms      1.07±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function ge>)
-        29.1±1ms      1.10±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function gt>)
-      29.5±0.2ms      1.11±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function eq>)
-        30.2±2ms      1.13±0.04ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 2, <built-in function ge>)
-      28.4±0.2ms      1.05±0.03ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 4, <built-in function eq>)
-        29.9±1ms      1.09±0.05ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function ne>)
-      29.1±0.6ms      1.03±0.02ms     0.04  binary_ops.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 5.0, <built-in function eq>)

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
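For readers unfamiliar with asv, a benchmark producing output like the above could look roughly as follows. The class name, method name, frame shape, dtypes, and scalars come from the output and the description; the body and the reduced operator list are assumptions, not the code added in this PR.

```python
import operator

import numpy as np
import pandas as pd


class IntFrameWithScalar:
    # asv runs the timing method once per combination of these parameters.
    # The operator list is an illustrative subset.
    params = [
        [np.float64, np.int64],
        [2, 3.0, 4, 5.0],
        [
            operator.add, operator.sub, operator.mul, operator.truediv,
            operator.floordiv, operator.pow, operator.mod,
            operator.eq, operator.ne, operator.gt, operator.ge,
            operator.lt, operator.le,
        ],
    ]
    param_names = ["dtype", "scalar", "op"]

    def setup(self, dtype, scalar, op):
        # Homogeneous-dtype frame: 20k rows, 100 columns.
        arr = np.random.randn(20000, 100)
        self.df = pd.DataFrame(arr.astype(dtype))

    def time_frame_op_with_scalar(self, dtype, scalar, op):
        op(self.df, scalar)


# Manual invocation outside asv, for a quick sanity check.
bench = IntFrameWithScalar()
bench.setup(np.int64, 2, operator.floordiv)
bench.time_frame_op_with_scalar(np.int64, 2, operator.floordiv)
```

In the output table, the ratio column is the "after" time divided by the "before" time, so values below 1 indicate a speedup.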

@jbrockmendel jbrockmendel changed the title from "WIP: implement scalar ops blockwise" to "PERF: implement scalar ops blockwise" Dec 21, 2019
Contributor

@jreback jreback left a comment

looks good. need to go through it again; some minor comments.

@@ -372,6 +373,10 @@ def dispatch_to_series(left, right, func, str_rep=None, axis=None):
    right = lib.item_from_zerodim(right)
    if lib.is_scalar(right) or np.ndim(right) == 0:

        array_op = get_array_op(func, str_rep=str_rep)
Contributor

can you add a comment here on what is going on

@@ -411,7 +411,10 @@ def apply(self, f: str, filter=None, **kwargs):
axis = obj._info_axis_number
kwargs[k] = obj.reindex(b_items, axis=axis, copy=align_copy)

applied = getattr(b, f)(**kwargs)
if callable(f):
Contributor

is this strictly necessary? meaning, I'd be happy to require only callables here (would require some changes)

Member Author

all of our existing usages pass strings here to get at Block methods. i think @WillAyd had a suggestion about re-working Block.apply to do str vs callable handling there; that should be its own PR

Contributor

k, yeah this whole section could use some TLC
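The str-vs-callable handling discussed in this thread can be sketched with a toy example. `ToyBlock` and `apply_to_blocks` are hypothetical stand-ins, not pandas classes, and the callable branch (`b.apply(f, **kwargs)`) is an assumption, since the diff excerpt above only shows the `if callable(f):` line.

```python
import numpy as np


class ToyBlock:
    """Hypothetical stand-in for a block holding one ndarray."""

    def __init__(self, values):
        self.values = values

    def apply(self, func, **kwargs):
        # Callable path: run an arbitrary function on the block's values.
        return ToyBlock(func(self.values, **kwargs))

    def astype(self, dtype):
        # Example of a named block method that a string can reach.
        return ToyBlock(self.values.astype(dtype))


def apply_to_blocks(blocks, f, **kwargs):
    result = []
    for b in blocks:
        if callable(f):
            # New-style usage: f is a function applied to the values.
            applied = b.apply(f, **kwargs)
        else:
            # Existing usage: f is the name of a block method.
            applied = getattr(b, f)(**kwargs)
        result.append(applied)
    return result


blocks = [ToyBlock(np.array([1, 2, 3])), ToyBlock(np.array([4.0, 5.0]))]
doubled = apply_to_blocks(blocks, lambda values, right: values * right, right=2)
as_float = apply_to_blocks(blocks, "astype", dtype=np.float64)
```

Requiring only callables, as suggested in the review, would remove the `else` branch but would mean converting every existing string-based caller.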

@@ -367,3 +368,13 @@ def fill_bool(x, left=None):
res_values = filler(res_values) # type: ignore

return res_values


def get_array_op(op, str_rep=None):
Contributor

can you add a doc-string explaining what this is doing
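As a rough idea of what a documented helper of this kind might look like, here is a simplified, self-contained sketch. The `toy_*` names are hypothetical, and the three-way split by operator name is an assumption for illustration; the real function's body is not shown in this thread.

```python
import operator
from functools import partial

import numpy as np


def toy_comparison_op(left, right, op):
    return op(left, right)


def toy_logical_op(left, right, op):
    return op(left, right)


def toy_arithmetic_op(left, right, op):
    return op(left, right)


def toy_get_array_op(op):
    """
    Return a function of (left, right) that applies ``op`` to array values.

    The operator is bound up front, so the returned function can be passed
    around and called once per array without re-dispatching on ``op``.
    """
    op_name = op.__name__.strip("_")
    if op_name in {"eq", "ne", "lt", "le", "gt", "ge"}:
        return partial(toy_comparison_op, op=op)
    if op_name in {"and", "or", "xor"}:
        return partial(toy_logical_op, op=op)
    return partial(toy_arithmetic_op, op=op)


array_op = toy_get_array_op(operator.add)
result = array_op(np.array([1, 2, 3]), right=2)
```

Binding the operator with `partial` is what lets the caller hand a single callable to something like `BlockManager.apply` together with `right=right`.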

block = self.make_block(values=nv, placement=[loc])
nbs.append(block)
return nbs

Contributor

could be an elif here and re-assign to result, just to make the flow more natural. alternatively, could make this into a method on BlockManager. but that's for follow-ups

array_op = get_array_op(func, str_rep=str_rep)
bm = left._data.apply(array_op, right=right)
return type(left)(bm)

Contributor

this could just be an if (as you are returning), e.g. change the following elif to an if, but NBD

@jreback jreback added this to the 1.0 milestone Dec 27, 2019
@jreback
Contributor

jreback commented Dec 27, 2019

ok a couple of minor comments, rebase and looks ok to go. I suspect you will be refactoring things after this is in anyhow.

@jbrockmendel
Member Author

rebased+green

@jreback jreback merged commit 23a4a51 into pandas-dev:master Dec 27, 2019
@jreback
Contributor

jreback commented Dec 27, 2019

thanks @jbrockmendel

@jbrockmendel
Member Author

Alright! This was a tough slog, thanks to all who helped along the way. Next up: dispatching for op(frame, series).

@jbrockmendel jbrockmendel deleted the back-to-arith branch December 27, 2019 20:34
AlexKirko pushed a commit to AlexKirko/pandas that referenced this pull request Dec 29, 2019
keechongtan added a commit to keechongtan/pandas that referenced this pull request Dec 29, 2019
…ndexing-1row-df

* upstream/master: (333 commits)
  CI: troubleshoot Web_and_Docs failing (pandas-dev#30534)
  WARN: Ignore NumbaPerformanceWarning in test suite (pandas-dev#30525)
  DEPR: camelCase in offsets, get_offset (pandas-dev#30340)
  PERF: implement scalar ops blockwise (pandas-dev#29853)
  DEPR: Remove Series.compress (pandas-dev#30514)
  ENH: Add numba engine for rolling apply (pandas-dev#30151)
  [ENH] Add to_markdown method (pandas-dev#30350)
  DEPR: Deprecate pandas.np module (pandas-dev#30386)
  ENH: Add ignore_index for df.drop_duplicates (pandas-dev#30405)
  BUG: The setting xrot=0 in DataFrame.hist() doesn't work with by and subplots pandas-dev#30288 (pandas-dev#30491)
  CI: Fix GBQ Tests (pandas-dev#30478)
  Bug groupby quantile listlike q and int columns (pandas-dev#30485)
  ENH: Add ignore_index for df.sort_values and series.sort_values (pandas-dev#30402)
  TYP: Typing hints in pandas/io/formats/{css,csvs}.py (pandas-dev#30398)
  BUG: raise on non-hashable Index name, closes pandas-dev#29069 (pandas-dev#30335)
  Replace "foo!r" to "repr(foo)" syntax pandas-dev#29886 (pandas-dev#30502)
  BUG: preserve EA dtype in transpose (pandas-dev#30091)
  BLD: add check to prevent tempita name error, clsoes pandas-dev#28836 (pandas-dev#30498)
  REF/TST: method-specific files for test_append (pandas-dev#30503)
  marked unused parameters (pandas-dev#30504)
  ...
@huitrouge

Next up: dispatching for op(frame, series).

Hi @jbrockmendel! is there already an issue regarding this where we could track the progress?

@jbrockmendel
Member Author

is there already an issue regarding this where we could track the progress?

no, but i can tell you the answer.

The four cases are: scalar, series(axis=0), series(axis=1), and frame. This PR handled the scalar case. Another PR handled the series(axis=1) case (except when that series is EA-backed). #32779 handles the frame case.
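For reference, the four cases correspond to the following user-facing operations. This is a small sketch with made-up data; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["a", "b"])
per_row = pd.Series([100, 200, 300], index=df.index)   # labels match df.index
per_column = pd.Series([10, 20], index=["a", "b"])     # labels match df.columns

with_scalar = df + 1                              # scalar
with_series_axis0 = df.add(per_row, axis=0)       # series(axis=0)
with_series_axis1 = df.add(per_column, axis=1)    # series(axis=1), same as df + per_column
with_frame = df + df                              # frame
```

Only the first of these, the scalar case, goes through the blockwise path added in this PR.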

@huitrouge

Thank you for your reply!

Another PR handled the series(axis=1) case (except when that series is EA-backed).

by "handled" you mean, that this has already been resolved? If so in which pandas-Version?

We are still seeing performance issues in "df - series" - cases.

e.g.

import pandas as pd

df = pd.DataFrame(index=['A'], columns=range(1000), data=1.0)
s = pd.Series(index=df.columns, data=1.0)

%timeit x = df - s

is much slower on 0.24.x all the way through to the pypy-available 1.0.3 than it was on 0.23.4
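Since `%timeit` only works inside IPython or Jupyter, the same measurement can be reproduced as a plain script with the standard-library `timeit` module. This is a sketch; the repeat count of 10 is an arbitrary choice, and absolute timings will depend on the machine and pandas version.

```python
import timeit

import pandas as pd

# Same objects as in the snippet above: one row, 1000 float columns.
df = pd.DataFrame(index=["A"], columns=range(1000), data=1.0)
s = pd.Series(index=df.columns, data=1.0)

n_runs = 10
elapsed = timeit.timeit(lambda: df - s, number=n_runs)
print(f"pandas {pd.__version__}: {elapsed / n_runs * 1e3:.3f} ms per loop")

result = df - s
```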

regards,
Malte

@jbrockmendel
Member Author

by "handled" you mean, that this has already been resolved? If so in which pandas-Version?

Yes, handled as in resolved. Not sure off the top of my head when that was. Before my caffeine I'd guess 60/40 that it made it into 1.0

We are still seeing performance issues in "df - series" - cases.

That case hasn't been addressed yet, will be next up after #32779. If you'd like to make a PR and improve it before I do, go for it.

@backbord

@jbrockmendel, here is a performance comparison of the different pandas versions using @huitrouge's benchmark:

version  %timeit output
0.23.4:  241 µs ± 2.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
0.24.2:  218 ms ± 3.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
0.25.0:  217 ms ± 1.24 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.0.0:   215 ms ± 2.41 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.0.2:   218 ms ± 3.11 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.0.3:   216 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

hope this helps.
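To put the table in relative terms, the slowdown versus 0.23.4 can be computed directly from the mean timings listed above (a small arithmetic sketch; the dictionary simply restates those means in seconds):

```python
# Mean timings from the table above, converted to seconds.
timings_seconds = {
    "0.23.4": 241e-6,
    "0.24.2": 218e-3,
    "0.25.0": 217e-3,
    "1.0.0": 215e-3,
    "1.0.2": 218e-3,
    "1.0.3": 216e-3,
}

baseline = timings_seconds["0.23.4"]
slowdown = {version: t / baseline for version, t in timings_seconds.items()}

for version, factor in slowdown.items():
    print(f"{version}: {factor:.0f}x relative to 0.23.4")
```

Every version from 0.24.2 onward comes out at roughly 900 times slower than 0.23.4 on this benchmark.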
