Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: add shortcut to Timedelta constructor #31070

Merged
merged 8 commits into from
Jan 24, 2020

Conversation

AlexKirko
Copy link
Member

@AlexKirko AlexKirko commented Jan 16, 2020

WIP until #30676 gets merged.

This implements a shortcut in the Timedelta constructor to cut down on processing if Timedelta is passed. We still need to check if any other args are passed. Then, if a Timedelta with no other kwargs was passed, we just return that same Timedelta.
A test is added to check that the Timedelta is still the same object.

@AlexKirko
Copy link
Member Author

AlexKirko commented Jan 16, 2020

Same as in the Timestamp shortcut, we could hoist the shortcut to the very top of __new__, but I don't think that's necessary. Putting it before error checks looks messy to me, and the performance improvement in any real code would be miniscule (as casting the object into its own class is unlikely to be a significant part of a program). When running timeit, the option with the shortcut at the very top provided about a 20% performance boost.

@WillAyd
Copy link
Member

WillAyd commented Jan 16, 2020

What kind of performance gain does this show?

@WillAyd WillAyd added Performance Memory or execution speed performance Timedelta Timedelta data type labels Jan 16, 2020
@AlexKirko
Copy link
Member Author

@WillAyd About 20% on 10 000 000 iterations.

# Test for #30543
expected = pd.Timedelta(np.timedelta64(1, "s"))
result = pd.Timedelta(expected)
assert result is expected
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this expectation - we don't do this for Timestamps so not sure should do it for Timedeltas as well but maybe @jschendel has thoughts

Copy link
Member Author

@AlexKirko AlexKirko Jan 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a test for the constructor identity too, in the Timestamp shortcut PR #30676. Sorry for posting the Timedelta one before that one gets merged. It's just that the discussion was mostly done there, so I thought I'd get on the Timedelta one too.
If it's inconvenient, then we can put this one on hold until #30676 gets merged, and then I can just copy the approach we settle on there.

def setup(self):
self.nptimedelta64 = np.timedelta64(3600)
self.dttimedelta = datetime.timedelta(seconds=3600)
self.td = Timedelta(3600, unit="s")
Copy link
Member Author

@AlexKirko AlexKirko Jan 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are here, let's add a benchmark for making Timedelta out of Timedelta, and I think we should also tweak existing tests. Currently, we benchmark both our method and np.timedelta64 and datetime.timedelta at the same time. We should move the initialization into setup, in my opinion. It's a bit nitpicky, of course, since both np.timedelta64 and datetime.timedelta are unlikely to change speed in the future, but still.

@jreback jreback added this to the 1.1 milestone Jan 24, 2020
@jreback jreback merged commit 11efca1 into pandas-dev:master Jan 24, 2020
@jreback
Copy link
Contributor

jreback commented Jan 24, 2020

thanks @AlexKirko

@AlexKirko AlexKirko deleted the perf-timedelta branch January 27, 2020 06:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Memory or execution speed performance Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants