REF/ENH: Constructors for DatetimeArray/TimedeltaArray #23493

Closed · wants to merge 20 commits

Changes from 12 commits
21 changes: 21 additions & 0 deletions pandas/core/arrays/datetimelike.py
@@ -888,6 +888,27 @@ def maybe_infer_freq(freq):
return freq, freq_infer


def maybe_define_freq(freq_infer, result):
"""
If appropriate, infer the frequency of the given Datetime/Timedelta Array
and pin it to the object at the end of the construction.

Parameters
----------
freq_infer : bool
result : DatetimeArray or TimedeltaArray

Notes
-----
This may alter `result` in-place; it should only ever be called
from __new__/__init__.
"""
if freq_infer:
inferred = result.inferred_freq
if inferred:
result.freq = frequencies.to_offset(inferred)

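For illustration, a minimal sketch of the behaviour this helper pins down, using the public DatetimeIndex constructor (which already accepts freq='infer'); the array classes in this PR are assumed to end up behaving the same way:

    import pandas as pd

    # Regularly spaced values, but no explicit freq: with freq='infer' the
    # constructor infers the frequency and pins it on the result, which is
    # what maybe_define_freq does at the end of __new__.
    dti = pd.DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03'],
                           freq='infer')
    print(dti.freq)  # <Day>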

def validate_tz_from_dtype(dtype, tz):
"""
If the given dtype is a DatetimeTZDtype, extract the implied
46 changes: 37 additions & 9 deletions pandas/core/arrays/datetimes.py
@@ -119,7 +119,8 @@ def wrapper(self, other):
if isinstance(other, list):
# FIXME: This can break for object-dtype with mixed types
other = type(self)(other)
elif not isinstance(other, (np.ndarray, ABCIndexClass, ABCSeries)):
elif not isinstance(other, (np.ndarray, ABCIndexClass, ABCSeries,
DatetimeArrayMixin)):
# Following Timestamp convention, __eq__ is all-False
# and __ne__ is all True, others raise TypeError.
return ops.invalid_comparison(self, other, op)
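A hedged sketch of the "Timestamp convention" referenced in the comment above, exercised through DatetimeIndex, which currently inherits this wrapper:

    import pandas as pd

    dti = pd.date_range('2018-01-01', periods=3)

    # An unrelated object is not an error: per the wrapper above, __eq__ is
    # all-False and __ne__ is all-True.
    print(dti == object())  # [False False False]
    print(dti != object())  # [ True  True  True]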
@@ -170,6 +171,8 @@ class DatetimeArrayMixin(dtl.DatetimeLikeArrayMixin):
# Constructors

_attributes = ["freq", "tz"]
_freq = None
_tz = None

@classmethod
def _simple_new(cls, values, freq=None, tz=None, **kwargs):
@@ -193,11 +196,16 @@ def _simple_new(cls, values, freq=None, tz=None, **kwargs):
result._tz = timezones.tz_standardize(tz)
return result

def __new__(cls, values, freq=None, tz=None, dtype=None):
def __new__(cls, values, freq=None, tz=None, dtype=None, copy=False):
Member

BTW, if you want to simplify this already a bit more, you could rename the current __new__ to __init__ (you might need some dummy __init__/__new__ on Index overriding it to make sure it works with the current inheritance), and then we can already merge the __init__ with _simple_new

if isinstance(values, (list, tuple)) or is_object_dtype(values):
values = cls._from_sequence(values, copy=copy)
# TODO: Can we set copy=False here to avoid re-copying?
Contributor

IIUC, then yes you're OK setting copy=False here. By definition, the conversion to datetime64[ns] will involve a copy.

Member

Further question: it is not (yet) possible to simply remove this case? (eventually we should not call the DatetimeArray constructor with an array-like of scalars)

Member Author

it is not (yet) possible to simply remove this case?

Not if we want to share the extant arithmetic tests (which we do)

(eventually we should not call the DatetimeArray constructor with an array-like of scalars)

I don't share this opinion, would prefer to delay this discussion until it is absolutely necessary.

Member

I don't share this opinion,

Then please raise this in the appropriate issue, as we have been discussing this before (I think it is #23212, although there is probably some more scattered discussion on other related PRs)

would prefer to delay this discussion until it is absolutely necessary.

It is here that you are redesigning the constructors for the array refactor, IIUC, so if there is a time we should discuss it, it is now I think?

Not if we want to share the extant arithmetic tests (which we do)

Can you clarify this a little bit? At what point do the arithmetic tests need to deal with array of objects?
E.g. for boxing the constructed values into Series/Index/Array, a properly dtyped array can be used there?

Member Author

Can you clarify this a little bit? At what point do the arithmetic tests need to deal with array of objects?

The pertinent word here is "extant". Many of the tests in tests/arithmetic pass a list into tm.box_expected or klass.

Contributor

Ignoring the tests for a moment, I thought we were all on board with the goal of the DatetimelikeArray.__init__ being no inference and no copy.

Back to the tests, it looks like you can add an entry to box_expected for DatetimeArray to return expected = DatetimeArray._from_sequence(expected)?

Member Author

Ignoring the tests for a moment, I thought we were all on board with the goal of the DatetimelikeArray.__init__ being no inference and no copy.

My comment to Joris below about mothballing this conversation applies. But short answer is no: I did not get on board with that.

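To keep the thread readable, here is this editor's toy sketch of the constructor layering being debated; the method names come from this diff, but the class is purely illustrative and not pandas code:

    import numpy as np

    class ToyDatetimeArray:
        """Toy model of the layering: _simple_new / __new__ / _from_sequence."""

        @classmethod
        def _simple_new(cls, values):
            # Trusts its input: already a datetime64[ns] ndarray, no checks.
            assert values.dtype == 'M8[ns]'
            obj = object.__new__(cls)
            obj._data = values
            return obj

        @classmethod
        def _from_sequence(cls, scalars):
            # Full inference from arbitrary scalars (strings, datetimes, ...).
            values = np.array(scalars, dtype='M8[ns]')
            return cls._simple_new(values)

        def __new__(cls, values):
            # Light coercion: lists/tuples go through _from_sequence, i8 is
            # reinterpreted, anything else must already be datetime64[ns].
            if isinstance(values, (list, tuple)):
                return cls._from_sequence(values)
            values = np.asarray(values)
            if values.dtype == np.dtype('i8'):
                values = values.view('M8[ns]')
            return cls._simple_new(values)

    arr = ToyDatetimeArray(['2018-01-01', '2018-01-02'])
    print(arr._data.dtype)  # datetime64[ns]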

if tz is None and hasattr(values, 'tz'):
# e.g. DatetimeIndex
# e.g. DatetimeArray, DatetimeIndex
tz = values.tz

# TODO: what about if freq == 'infer'?
Member

then we should also get the freq from the values as a "cheap" inference? Or would there be cases where an inferred frequency can be different from the actual frequency?

Member Author

then we should also get the freq from the values as a "cheap" inference?

That's what I'm thinking, yah

if freq is None and hasattr(values, "freq"):
# i.e. DatetimeArray, DatetimeIndex
freq = values.freq
@@ -207,26 +215,46 @@ def __new__(cls, values, freq=None, tz=None, dtype=None):
# if dtype has an embedded tz, capture it
tz = dtl.validate_tz_from_dtype(dtype, tz)

if isinstance(values, DatetimeArrayMixin):
if lib.is_scalar(values):
raise ValueError('{cls}() must be called with a '
'collection of some kind, {data} was passed'
.format(cls=cls.__name__, data=repr(values)))
elif isinstance(values, ABCSeries):
Member

I would get out the _values, and then treat that the same as directly passing a DatetimeIndex/DatetimeArray ?

Member Author

I'll see if there is a graceful way to do this in the next pass (if I ever manage to catch up with all these comments!)

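A rough sketch of the reviewer's suggestion, labelled hypothetical since the diff below keeps a separate Series branch: pull the underlying values out first and let the Index/Array branch handle the rest.

    import pandas as pd

    ser = pd.Series(pd.date_range('2018-01-01', periods=3, tz='US/Eastern'))

    # ._values keeps the timezone, so the tz/i8 extraction could in principle
    # be shared with the DatetimeIndex/DatetimeArray branch below.
    underlying = ser._values
    print(underlying.tz)  # US/Eastern (pytz timezone)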
# extract nanosecond unix timestamps
if tz is None:
# TODO: Try to do this in just one place
tz = values.dt.tz
values = np.array(values.view('i8'))
elif isinstance(values, DatetimeArrayMixin):
Contributor

DatetimeArrayMixin -> cls?

Member

And you don't need to get the tz in this case?

Member Author

DatetimeArrayMixin -> cls?

No. For the moment we are still using inheritance, so this would mess up for DatetimeIndex == DatetimeArray. When we change to composition this check will have to become isinstance(values, (DatetimeArray, ABCDatetimeIndex))

# extract nanosecond unix timestamps
values = values.asi8

if values.dtype == 'i8':
values = values.view('M8[ns]')

assert isinstance(values, np.ndarray), type(values)
assert is_datetime64_dtype(values) # not yet assured nanosecond
values = conversion.ensure_datetime64ns(values, copy=False)
values = conversion.ensure_datetime64ns(values, copy=copy)

result = cls._simple_new(values, freq=freq, tz=tz)
if freq_infer:
inferred = result.inferred_freq
if inferred:
result.freq = to_offset(inferred)
dtl.maybe_define_freq(freq_infer, result)

# NB: Among other things not yet ported from the DatetimeIndex
# constructor, this does not call _deepcopy_if_needed
return result

@classmethod
def _from_sequence(cls, scalars, dtype=None, copy=False):
# list, tuple, or object-dtype ndarray/Index
Contributor

why do you need to turn into an object array here? to_datetime handles all of these cases

Member Author

You're right we could make do without it. I like doing this explicitly because to_datetime is already overloaded and circular.

Contributor

this is horribly inefficient and unnecessary

Member Author

If we don't do it here, to_datetime is going to do this. It may be unnecessary, but it is not horribly inefficient. What is a code smell is the circularity involved in calling to_datetime.

Contributor

then just call array_to_datetime and don’t force the conversion to array

Contributor

So is the root problem (referenced in your "circularity" comment, and down below in TimedeltaIndex.__new__) that to_datetime / to_timedelta returns an Index instead of an EA?

Could we have the public to_datetime just be a simple

array = _to_datetime(...)
return DatetimeIndex(array)

so the internal _to_datetime returns the array?

Member Author

So is the root problem (referenced in your "circularity" comment, and down below in TimedeltaIndex.__new__) that to_datetime / to_timedelta returns an Index instead of an EA?

It's not the fact that it's an Index so much as that it is a circular dependency. I think I can resolve this in an upcoming commit.

Looking through to_datetime and _convert_listlike_datetimes, I don't see a conversion to ndarray[object].

_convert_listlike_datetimes calls ensure_object.

Sorry, to_datetime has an intermediate datetime64[ns] -> object -> datetime64[ns] conversion? That seems unnecessary.

Not sure what you're referring to. As implemented _from_sequence is specifically for list, tuple, or object-dtype NDArray/Index. datetime64-dtype goes through a different path.

Contributor

_convert_listlike_datetimes calls ensure_object.

That's after an

    # these are shortcutable
    if is_datetime64tz_dtype(arg):
        if not isinstance(arg, DatetimeIndex):
            return DatetimeIndex(arg, tz=tz, name=name)
        if tz == 'utc':
            arg = arg.tz_convert(None).tz_localize(tz)
        return arg

    elif is_datetime64_ns_dtype(arg):
        if box and not isinstance(arg, DatetimeIndex):
            try:
                return DatetimeIndex(arg, tz=tz, name=name)
            except ValueError:
                pass

        return arg

So those both avoid conversion to object.

Contributor

ExtensionArray._from_sequence is for any sequence of scalar objects, including a ndarray with a specialized type like datetime64[ns]. It'll be used, for example, in factorize(ExtensionArray).

Member Author

@TomAugspurger thank you for clarifying; I was under the mistaken impression that it was specifically list/tuple/object-dtype.

Are there any restrictions on kwargs that can be added to it? In particular I'm thinking of freq and tz

values = np.array(scalars, dtype=np.object_, copy=copy)

# TODO: See if we can decrease circularity
from pandas.core.tools.datetimes import to_datetime
values = to_datetime(values)

# pass dtype to constructor in order to convert timezone if necessary
return cls(values, dtype=dtype)

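For context on the circularity concern, a small illustration of what the call above delegates to; this is existing public behaviour, not new code in this PR:

    import numpy as np
    import pandas as pd

    scalars = ['2018-01-01', '2018-01-02', None]
    values = np.array(scalars, dtype=np.object_)

    # to_datetime hands back an Index (DatetimeIndex) whose backing data is
    # already datetime64[ns]; the constructor above then re-wraps it.
    result = pd.to_datetime(values)
    print(type(result).__name__, result.values.dtype)
    # DatetimeIndex datetime64[ns]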
@classmethod
def _generate_range(cls, start, end, periods, freq, tz=None,
normalize=False, ambiguous='raise', closed=None):
4 changes: 3 additions & 1 deletion pandas/core/arrays/period.py
@@ -165,7 +165,9 @@ class PeriodArray(dtl.DatetimeLikeArrayMixin, ExtensionArray):

# --------------------------------------------------------------------
# Constructors
def __init__(self, values, freq=None, copy=False):
def __init__(self, values, freq=None, dtype=None, copy=False):
freq = dtl.validate_dtype_freq(dtype, freq)

if freq is not None:
freq = Period._maybe_convert_freq(freq)

65 changes: 51 additions & 14 deletions pandas/core/arrays/timedeltas.py
@@ -11,15 +11,14 @@
from pandas import compat

from pandas.core.dtypes.common import (
_TD_DTYPE, is_list_like)
_TD_DTYPE, is_list_like, is_object_dtype, is_timedelta64_dtype)
from pandas.core.dtypes.generic import ABCSeries
from pandas.core.dtypes.missing import isna

import pandas.core.common as com
from pandas.core.algorithms import checked_add_with_arr

from pandas.tseries.offsets import Tick
from pandas.tseries.frequencies import to_offset

from . import datetimelike as dtl

@@ -112,9 +111,7 @@ def dtype(self):

@classmethod
def _simple_new(cls, values, freq=None, dtype=_TD_DTYPE):
# `dtype` is passed by _shallow_copy in corner cases, should always
# be timedelta64[ns] if present
assert dtype == _TD_DTYPE
_require_m8ns_dtype(dtype)
assert isinstance(values, np.ndarray), type(values)

if values.dtype == 'i8':
Expand All @@ -127,22 +124,44 @@ def _simple_new(cls, values, freq=None, dtype=_TD_DTYPE):
result._freq = freq
return result

def __new__(cls, values, freq=None):
def __new__(cls, values, freq=None, dtype=_TD_DTYPE, copy=False):
_require_m8ns_dtype(dtype)

if isinstance(values, (list, tuple)) or is_object_dtype(values):
values = cls._from_sequence(values, copy=copy)._data
Member

you cannot return here directly?

# TODO: can we set copy=False to avoid re-copying?

freq, freq_infer = dtl.maybe_infer_freq(freq)

values = np.array(values, copy=False)
if values.dtype == np.object_:
values = array_to_timedelta64(values)
if isinstance(values, TimedeltaArrayMixin):
if freq is None and values.freq is not None:
freq = values.freq
freq_infer = False
values = values._data

result = cls._simple_new(values, freq=freq)
if freq_infer:
inferred = result.inferred_freq
if inferred:
result.freq = to_offset(inferred)
values = np.array(values, copy=copy)

if values.dtype == 'i8':
pass
elif not is_timedelta64_dtype(values):
raise TypeError(values.dtype)
elif values.dtype != _TD_DTYPE:
# i.e. non-nano unit
# TODO: use tslibs.conversion func? watch out for overflows
values = values.astype(_TD_DTYPE)

result = cls._simple_new(values, freq=freq)
dtl.maybe_define_freq(freq_infer, result)
return result

@classmethod
def _from_sequence(cls, scalars, dtype=_TD_DTYPE, copy=False):
# list, tuple, or object-dtype ndarray/Index
values = np.array(scalars, dtype=np.object_, copy=copy)
Contributor

same

Member Author

This doesn't call to_timedelta, so this does require that we pass an object array.

Contributor

then it should
let’s not reinvent the wheel

Member Author

No, it shouldn't. to_timedelta will just end up calling array_to_timedelta64 like this does, but only after doing a bunch of unnecessary dtype checks.

Member Author

Besides, this is what TimedeltaIndex.__new__ currently calls


result = array_to_timedelta64(values)
return cls(result, dtype=dtype)

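For context on the exchange above: for an object array of scalars, the public to_timedelta ends at the same timedelta64[ns] values, which is the equivalence both sides are appealing to.

    import numpy as np
    import pandas as pd

    scalars = np.array(['1 days', '02:00:00', np.nan], dtype=np.object_)

    result = pd.to_timedelta(scalars)   # TimedeltaIndex
    print(result.values.dtype)          # timedelta64[ns]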
@classmethod
def _generate_range(cls, start, end, periods, freq, closed=None):

@@ -413,3 +432,21 @@ def _generate_regular_range(start, end, periods, offset):

data = np.arange(b, e, stride, dtype=np.int64)
return data


def _require_m8ns_dtype(dtype):
"""
`dtype` is included in the constructor signature for consistency with
DatetimeArray and PeriodArray, but only timedelta64[ns] is considered
valid. Raise if anything else is passed.

Parameters
----------
dtype : dtype

Raises
------
ValueError
"""
if dtype != _TD_DTYPE:
Contributor

AssertionError no?

Member Author

When called from _simple_new this is internal so AssertionError would make sense, but it is also called from __new__ so is in principle user-facing.

Either way I need to add tests for this.

Contributor

well, this should never happen; all conversions should be before this

so it should assert

Member Author

dtype is part of the signature of TimedeltaArray.__new__, which is/will be user-facing. If the user passes the wrong dtype, it's a ValueError.

Contributor

no, my point is there should be (and I think currently there already is) conversion

if it’s wrong at this point it’s not a user error but an incorrect path taken

Member Author

My point is that this check function is called two times, one of which is the very first thing in TimedeltaArray.__new__.

Member

Apart from the discussion above, is it worth having a 15 line function (including docstrings :-)), for a 2-liner used in two places?
I would maybe simply leave it in place how it was; I think reading something like assert dtype == _TD_DTYPE in TimedeltaArray._simple_new is clearer than calling into a helper function

Member Author

Reasonable. But hey, it's a nice docstring.

raise ValueError("Only timedelta64[ns] dtype is valid.", dtype)
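An illustrative failure mode for the check above; the import path is internal and specific to this PR's layout, so treat it as a sketch:

    import numpy as np
    from pandas.core.arrays.timedeltas import TimedeltaArrayMixin

    # Anything other than timedelta64[ns] in the dtype argument is rejected
    # up front, both in __new__ and in _simple_new.
    TimedeltaArrayMixin(np.arange(3, dtype='i8'), dtype='m8[s]')
    # ValueError: ('Only timedelta64[ns] dtype is valid.', dtype('<m8[s]'))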
49 changes: 24 additions & 25 deletions pandas/core/indexes/datetimes.py
@@ -182,7 +182,6 @@ class DatetimeIndex(DatetimeArrayMixin, DatelikeOps, TimelikeOps,

"""
_resolution = cache_readonly(DatetimeArrayMixin._resolution.fget)
_shallow_copy = Index._shallow_copy

_typ = 'datetimeindex'
_join_precedence = 10
@@ -199,10 +198,11 @@ def _join_i8_wrapper(joinf, **kwargs):

_engine_type = libindex.DatetimeEngine

tz = None
_tz = None
Member Author

A tz property is defined below

Member

Why is this change needed? (just curious to understand)

Member Author

I am fairly certain that the version currently here is a mistake, since it is overridden by the property definition of tz below. _freq and _tz are set in _simple_new, so I think the idea of also defining them on the class is to make introspecting what attributes exist easy.

_freq = None
_comparables = ['name', 'freqstr', 'tz']
_attributes = ['name', 'freq', 'tz']
timetuple = None
Member Author

This is another we'll-need-this-later (when moving from inheritance to composition)

Contributor

I'm curious, what is this? Is it intended to be public? It's not present in the public API for 0.23.4

Member Author

The stdlib datetime.datetime comparison methods check if this attribute exists and if so return NotImplemented, otherwise raise TypeError for non-datetime objects


# define my properties & methods for delegation
_bool_ops = ['is_month_start', 'is_month_end',
@@ -226,6 +226,9 @@ def _join_i8_wrapper(joinf, **kwargs):
_timezone = cache_readonly(DatetimeArrayMixin._timezone.fget)
is_normalized = cache_readonly(DatetimeArrayMixin.is_normalized.fget)

# --------------------------------------------------------------------
# Constructors

def __new__(cls, data=None,
freq=None, start=None, end=None, periods=None, tz=None,
normalize=False, closed=None, ambiguous='raise',
@@ -254,13 +257,13 @@ def __new__(cls, data=None,
result.name = name
return result

if not isinstance(data, (np.ndarray, Index, ABCSeries,
DatetimeArrayMixin)):
if is_scalar(data):
raise ValueError('DatetimeIndex() must be called with a '
Contributor

Ah, this is kinda an API change (raising a TypeError instead of a ValueError).

Seems fine from a consistency P.O.V., but deserves a release note in the API breaking changes section.

Member

I think currently all public cases raise ValueError, so keeping it on that would not give an API change?
(I agree that TypeError is slightly more appropriate though)

Member Author

My preference would have been to keep it a ValueError and change to a TypeError in a separate PR, but here we are... will add a note in Breaking Changes.

'collection of some kind, %s was passed'
% repr(data))
# other iterable of some kind
if is_scalar(data):
raise ValueError('{cls}() must be called with a '
'collection of some kind, {data} was passed'
.format(cls=cls.__name__, data=repr(data)))

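A quick illustration of the branch above, keeping the ValueError that the thread settled on for now:

    import pandas as pd

    pd.DatetimeIndex(5)
    # ValueError: DatetimeIndex() must be called with a collection of
    # some kind, 5 was passed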
elif not isinstance(data, (np.ndarray, Index, ABCSeries,
DatetimeArrayMixin)):
if not isinstance(data, (list, tuple)):
data = list(data)
data = np.asarray(data, dtype='O')
@@ -280,16 +283,15 @@ def __new__(cls, data=None,
data = data.tz_localize(tz, ambiguous=ambiguous)
else:
# the tz's must match
if str(tz) != str(data.tz):
if not timezones.tz_compare(tz, data.tz):
msg = ('data is already tz-aware {0}, unable to '
'set specified tz: {1}')
raise TypeError(msg.format(data.tz, tz))

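Why tz_compare replaces the old str(tz) != str(data.tz) check: equivalent UTC implementations stringify differently, so the string comparison could wrongly report a mismatch.

    import pytz
    from dateutil.tz import tzutc

    # Both represent UTC, but the string forms differ; timezones.tz_compare
    # is meant to treat them as equal.
    print(str(pytz.utc), str(tzutc()))  # UTC tzutc()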
subarr = data.values

if freq is None:
freq = data.freq
verify_integrity = False
data = data._data
elif issubclass(data.dtype.type, np.datetime64):
if data.dtype != _NS_DTYPE:
data = conversion.ensure_datetime64ns(data)
Expand All @@ -298,14 +300,13 @@ def __new__(cls, data=None,
tz = timezones.maybe_get_tz(tz)
data = conversion.tz_localize_to_utc(data.view('i8'), tz,
ambiguous=ambiguous)
subarr = data.view(_NS_DTYPE)
else:
# must be integer dtype otherwise
# assume this data are epoch timestamps
if data.dtype != _INT64_DTYPE:
data = data.astype(np.int64, copy=False)
subarr = data.view(_NS_DTYPE)

subarr = data.view(_NS_DTYPE)
assert isinstance(subarr, np.ndarray), type(subarr)
assert subarr.dtype == 'M8[ns]', subarr.dtype

@@ -320,19 +321,9 @@ def __new__(cls, data=None,
if freq is not None and not freq_infer:
cls._validate_frequency(subarr, freq, ambiguous=ambiguous)

if freq_infer:
inferred = subarr.inferred_freq
if inferred:
subarr.freq = to_offset(inferred)

dtl.maybe_define_freq(freq_infer, subarr)
return subarr._deepcopy_if_needed(ref_to_data, copy)

def _convert_for_op(self, value):
""" Convert value to be insertable to ndarray """
if self._has_same_tz(value):
return _to_m8(value)
raise ValueError('Passed item and index have different timezone')

Member Author

Just moving this out of the Constructors section for readability

@classmethod
def _simple_new(cls, values, name=None, freq=None, tz=None,
dtype=None, **kwargs):
Expand All @@ -349,6 +340,8 @@ def _simple_new(cls, values, name=None, freq=None, tz=None,
result._reset_identity()
return result

# --------------------------------------------------------------------

@property
def _values(self):
# tz-naive -> ndarray
Expand Down Expand Up @@ -400,6 +393,12 @@ def _is_dates_only(self):
from pandas.io.formats.format import _is_dates_only
return _is_dates_only(self.values) and self.tz is None

def _convert_for_op(self, value):
""" Convert value to be insertable to ndarray """
if self._has_same_tz(value):
return _to_m8(value)
raise ValueError('Passed item and index have different timezone')

@property
def _formatter_func(self):
from pandas.io.formats.format import _get_format_datetime64