Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: dt64 <-> dt64tz roundup #38622

Closed
jbrockmendel opened this issue Dec 21, 2020 · 7 comments · Fixed by #39258
Closed

BUG: dt64 <-> dt64tz roundup #38622

jbrockmendel opened this issue Dec 21, 2020 · 7 comments · Fixed by #39258
Labels
Datetime Datetime data dtype Deprecate Functionality to remove in pandas
Milestone

Comments

@jbrockmendel
Copy link
Member

xref #33401 which discusses astype, #24559 which discusses constructors. I think these need to be considered jointly.

Two big-picture questions.

  1. when we pass constructor(dt64values, tz=tz), do we interpret dt64values as UTC or wall-times?
  2. when we pass constructor(dt64values).astype("M8[ns, tz]")`, do we interpret dt64values as UTC or wall-times?

Let's check the current behavior on master for Timestamp, DatetimeIndex, and Series (see code block below if you want to check my work)

cls Timestamp DatetimeIndex Series
cls(dt64, dtype=dt64tz) UTC Wall UTC
cls(dt64).astype(dt64tz) N/A Wall UTC
cls(dt64tz).astype("M8[ns]") N/A UTC UTC

If we were starting fresh, I would probably lean towards making these all interpret dt64 as wall-times. But because these are already mostly-UTC, it would be an easier change to bring the DatetimeIndex behavior in line with the others.

Thoughts? cc @jorisvandenbossche @mroeschke @jreback

values = (3600 * 10**9 * np.arange(4)).astype("datetime64[ns]")
dt64 = values[-1]

tz = "US/Pacific"
dtype = pd.DatetimeTZDtype(tz=tz)

ts = pd.Timestamp(dt64, tz=tz)
dti = pd.DatetimeIndex(values, dtype=dtype)
ser = pd.Series(values, dtype=dtype)

if_wall = pd.DatetimeIndex(values).tz_localize(tz)
if_utc = pd.DatetimeIndex(values).tz_localize("UTC").tz_convert(tz)

assert ts == if_utc[-1]
assert (dti == if_wall).all()
assert (ser == if_utc).all()

dti_naive = pd.DatetimeIndex(values)
ser_naive = pd.Series(values)

res_dti = dti_naive.astype(dtype)
res_ser = ser.astype(dtype)

assert (res_dti == dti).all()
assert (res_ser == ser).all()

# astype from tzaware -> tznaive
assert (dti.astype(dti_naive.dtype) == dti.tz_convert("UTC").tz_localize(None)).all()
assert (ser_naive.astype(ser.dtype) == ser).all()
@jbrockmendel jbrockmendel added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 21, 2020
@mroeschke
Copy link
Member

mroeschke commented Dec 22, 2020

If we were starting fresh, I would probably lean towards making these all interpret dt64 as wall-times.

Agreed

Is there any possibility of just deprecating any naive -> tz aware (and vice versa) converstion with astype #33401 (comment)?

@jbrockmendel
Copy link
Member Author

Is there any possibility of just deprecating any naive -> tz aware (and vice versa) converstion with astype #33401 (comment)?

I agree that would be nice. The question is if we are OK with breaking frame.astype({col: tz_dtype}) without having a "do X instead" available.

@mroeschke
Copy link
Member

I'd be okay with that with just specifying "any tz_dtype conversion should use tz_localize/convert", but should probably hear from others

@jbrockmendel
Copy link
Member Author

I'd be okay with that with just specifying "any tz_dtype conversion should use tz_localize/convert", but should probably hear from others

agreed. if @jorisvandenbossche and @jreback are on board ill make a deprecation PR

@jbrockmendel
Copy link
Member Author

gentle ping @jorisvandenbossche @jreback

@jbrockmendel
Copy link
Member Author

@jreback gentle ping

@jreback
Copy link
Contributor

jreback commented Jan 10, 2021

I'd be okay with that with just specifying "any tz_dtype conversion should use tz_localize/convert", but should probably hear from others

agreed. if @jorisvandenbossche and @jreback are on board ill make a deprecation PR

I am +1 on this, even though generally pandas makes life easy by just doing the right thing. when there is a choice where its not completely obvious as is the case for localize/convert we should raise.

I think this ok for a 1.3 release.

@jreback jreback added Deprecate Functionality to remove in pandas Datetime Datetime data dtype and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 20, 2021
@jreback jreback added this to the 1.3 milestone Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants