CLN: reorg type inference & introspection #13147

Closed
2 changes: 1 addition & 1 deletion ci/lint.sh
@@ -8,7 +8,7 @@ RET=0

if [ "$LINT" ]; then
echo "Linting"
for path in 'core' 'indexes' 'types' 'formats' 'io' 'stats' 'compat' 'sparse' 'tools' 'tseries' 'tests' 'computation' 'util'
for path in 'api' 'core' 'indexes' 'types' 'formats' 'io' 'stats' 'compat' 'sparse' 'tools' 'tseries' 'tests' 'computation' 'util'
do
echo "linting -> pandas/$path"
flake8 pandas/$path --filename '*.py'
22 changes: 21 additions & 1 deletion doc/source/whatsnew/v0.19.0.txt
@@ -10,6 +10,7 @@ users upgrade to this version.
Highlights include:

- :func:`merge_asof` for asof-style time-series joining, see :ref:`here <whatsnew_0190.enhancements.asof_merge>`
- pandas development API, see :ref:`here <whatsnew_0190.dev_api>`

.. contents:: What's new in v0.18.2
:local:
@@ -20,6 +21,25 @@ Highlights include:
New features
~~~~~~~~~~~~

.. _whatsnew_0190.dev_api:

pandas development API
^^^^^^^^^^^^^^^^^^^^^^

As part of making the pandas API more uniform and accessible in the future, we have created a standard
sub-package of pandas, ``pandas.api``, to hold public APIs. We are starting by exposing type
introspection functions in ``pandas.api.types``. More sub-packages and officially sanctioned APIs
will be published in future versions of pandas.

The following are now part of this API:

.. ipython:: python

import pprint
from pandas.api import types
funcs = [ f for f in dir(types) if not f.startswith('_') ]
pprint.pprint(funcs)

.. _whatsnew_0190.enhancements.asof_merge:

:func:`merge_asof` for asof-style time-series joining
@@ -227,7 +247,7 @@ Other enhancements
- Consistent with the Python API, ``pd.read_csv()`` will now interpret ``+inf`` as positive infinity (:issue:`13274`)
- The ``DataFrame`` constructor will now respect key ordering if a list of ``OrderedDict`` objects are passed in (:issue:`13304`)
- ``pd.read_html()`` has gained support for the ``decimal`` option (:issue:`12907`)
- A top-level function :func:`union_categorical` has been added for combining categoricals, see :ref:`Unioning Categoricals<categorical.union>` (:issue:`13361`)
- A function :func:`union_categorical` has been added for combining categoricals, see :ref:`Unioning Categoricals<categorical.union>` (:issue:`13361`)
- ``Series`` has gained the properties ``.is_monotonic``, ``.is_monotonic_increasing``, ``.is_monotonic_decreasing``, similar to ``Index`` (:issue:`13336`)

.. _whatsnew_0190.api:
2 changes: 1 addition & 1 deletion pandas/__init__.py
@@ -16,7 +16,7 @@

if missing_dependencies:
raise ImportError("Missing required dependencies {0}".format(missing_dependencies))

del hard_dependencies, dependency, missing_dependencies

# numpy compat
from pandas.compat.numpy import *
1 change: 1 addition & 0 deletions pandas/api/__init__.py
@@ -0,0 +1 @@
""" public toolkit API """
Empty file added pandas/api/tests/__init__.py
Empty file.
213 changes: 213 additions & 0 deletions pandas/api/tests/test_api.py
@@ -0,0 +1,213 @@
# -*- coding: utf-8 -*-

import pandas as pd
from pandas.core import common as com
from pandas import api
from pandas.api import types
from pandas.util import testing as tm

_multiprocess_can_split_ = True



@jorisvandenbossche I documented this as much as possible. There are some obvious candidates for deprecation at some point.

Furthermore, this is all now tested, so anything that changes the namespace will break the tests (at least we would then know about it and could take action).
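
To illustrate the idea of pinning a namespace, here is a minimal standalone sketch using only the standard library. The names `public_names` and `fake_api` are hypothetical stand-ins for illustration and are not part of pandas or of this PR; the real check is the `Base.check` method in the test file.

```python
import types


def public_names(namespace, ignored=None):
    """Return the sorted public (non-underscore) names of a namespace."""
    names = {name for name in dir(namespace) if not name.startswith('_')}
    return sorted(names - set(ignored or []))


# Hypothetical stand-in for a package such as pandas.api
fake_api = types.ModuleType('fake_api')
fake_api.types = types.ModuleType('fake_api.types')
fake_api.tests = types.ModuleType('fake_api.tests')
fake_api._private = object()  # underscore names are not part of the API

# The expected list is written out by hand, so any accidental addition
# or removal in the namespace makes the comparison fail.
assert public_names(fake_api) == ['tests', 'types']
assert public_names(fake_api, ignored=['tests']) == ['types']
```

Because the expected names are an explicit list, a new import leaking into the namespace fails the comparison rather than silently becoming public.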

class Base(object):

def check(self, namespace, expected, ignored=None):
# see which names are in the namespace, minus optional
# ignored ones
# compare vs the expected

result = sorted([f for f in dir(namespace) if not f.startswith('_')])
if ignored is not None:
result = sorted(list(set(result) - set(ignored)))

expected = sorted(expected)
tm.assert_almost_equal(result, expected)


class TestPDApi(Base, tm.TestCase):

# these are optionally imported based on testing
# & need to be ignored
ignored = ['tests', 'rpy', 'sandbox', 'locale']

# top-level sub-packages
lib = ['api', 'compat', 'computation', 'core',
'indexes', 'formats', 'pandas',
'test', 'tools', 'tseries',
'types', 'util', 'options', 'io']

# top-level packages that are c-imports, should rename to _*
# to avoid naming conflicts
lib_to_rename = ['algos', 'hashtable', 'tslib', 'msgpack', 'sparse',
'json', 'lib', 'index', 'parser']

# these are already deprecated; awaiting removal
deprecated_modules = ['ols', 'stats']

# misc
misc = ['IndexSlice', 'NaT']

# top-level classes
classes = ['Categorical', 'CategoricalIndex', 'DataFrame', 'DateOffset',
'DatetimeIndex', 'ExcelFile', 'ExcelWriter', 'Float64Index',
'Grouper', 'HDFStore', 'Index', 'Int64Index', 'MultiIndex',
'Period', 'PeriodIndex', 'RangeIndex',
'Series', 'SparseArray', 'SparseDataFrame',
'SparseSeries', 'TimeGrouper', 'Timedelta',
'TimedeltaIndex', 'Timestamp']

# these are already deprecated; awaiting removal
deprecated_classes = ['SparsePanel', 'TimeSeries', 'WidePanel',
'SparseTimeSeries']

# these should be deprecated in the future
deprecated_classes_in_future = ['Panel', 'Panel4D',
'SparseList', 'Term']

# these should be removed from top-level namespace
remove_classes_from_top_level_namespace = ['Expr']

# external modules exposed in pandas namespace
modules = ['np', 'datetime', 'datetools']

# top-level functions
funcs = ['bdate_range', 'concat', 'crosstab', 'cut',
'date_range', 'eval',
'factorize', 'get_dummies', 'get_store',
'infer_freq', 'isnull', 'lreshape',
'match', 'melt', 'notnull', 'offsets',
'merge', 'merge_ordered', 'merge_asof',
'period_range',
'pivot', 'pivot_table', 'plot_params', 'qcut',
'scatter_matrix',
'show_versions', 'timedelta_range', 'unique',
'value_counts', 'wide_to_long']

# top-level option funcs
funcs_option = ['reset_option', 'describe_option', 'get_option',
'option_context', 'set_option',
'set_eng_float_format']

# top-level read_* funcs
funcs_read = ['read_clipboard', 'read_csv', 'read_excel', 'read_fwf',
'read_gbq', 'read_hdf', 'read_html', 'read_json',
'read_msgpack', 'read_pickle', 'read_sas', 'read_sql',
'read_sql_query', 'read_sql_table', 'read_stata',
'read_table']

# top-level to_* funcs
funcs_to = ['to_datetime', 'to_msgpack',
'to_numeric', 'to_pickle', 'to_timedelta']

# these should be deprecated in the future
deprecated_funcs_in_future = ['pnow', 'groupby', 'info']

# these are already deprecated; awaiting removal
deprecated_funcs = ['ewma', 'ewmcorr', 'ewmcov', 'ewmstd', 'ewmvar',
'ewmvol', 'expanding_apply', 'expanding_corr',
'expanding_count', 'expanding_cov', 'expanding_kurt',
'expanding_max', 'expanding_mean', 'expanding_median',
'expanding_min', 'expanding_quantile',
'expanding_skew', 'expanding_std', 'expanding_sum',
'expanding_var', 'fama_macbeth', 'rolling_apply',
'rolling_corr', 'rolling_count', 'rolling_cov',
'rolling_kurt', 'rolling_max', 'rolling_mean',
'rolling_median', 'rolling_min', 'rolling_quantile',
'rolling_skew', 'rolling_std', 'rolling_sum',
'rolling_var', 'rolling_window', 'ordered_merge']

def test_api(self):

self.check(pd,
self.lib + self.lib_to_rename + self.misc +
self.modules + self.deprecated_modules +
self.classes + self.deprecated_classes +
self.deprecated_classes_in_future +
self.remove_classes_from_top_level_namespace +
self.funcs + self.funcs_option +
self.funcs_read + self.funcs_to +
self.deprecated_funcs +
self.deprecated_funcs_in_future,
self.ignored)


class TestApi(Base, tm.TestCase):

allowed = ['tests', 'types']

def test_api(self):

self.check(api, self.allowed)


class TestTypes(Base, tm.TestCase):

allowed = ['is_any_int_dtype', 'is_bool', 'is_bool_dtype',
'is_categorical', 'is_categorical_dtype', 'is_complex',
'is_complex_dtype', 'is_datetime64_any_dtype',
'is_datetime64_dtype', 'is_datetime64_ns_dtype',
'is_datetime64tz_dtype', 'is_datetimetz', 'is_dtype_equal',
'is_extension_type', 'is_float', 'is_float_dtype',
'is_floating_dtype', 'is_int64_dtype', 'is_integer',
'is_integer_dtype', 'is_number', 'is_numeric_dtype',
'is_object_dtype', 'is_scalar', 'is_sparse',
'is_string_dtype', 'is_timedelta64_dtype',
'is_timedelta64_ns_dtype',
'is_re', 'is_re_compilable',
'is_dict_like', 'is_iterator',
'is_list_like', 'is_hashable',
'is_named_tuple', 'is_sequence',
'pandas_dtype']

def test_types(self):

self.check(types, self.allowed)

def check_deprecation(self, fold, fnew):
with tm.assert_produces_warning(FutureWarning):
try:
result = fold('foo')
expected = fnew('foo')
self.assertEqual(result, expected)
except TypeError:
self.assertRaises(TypeError,
lambda: fnew('foo'))
except AttributeError:
self.assertRaises(AttributeError,
lambda: fnew('foo'))

def test_deprecation_core_common(self):

# test that we are in fact deprecating
# the pandas.core.common introspectors
for t in self.allowed:
self.check_deprecation(getattr(com, t), getattr(types, t))

def test_deprecation_core_common_moved(self):

# these are in pandas.types.common
l = ['is_datetime_arraylike',
'is_datetime_or_timedelta_dtype',
'is_datetimelike',
'is_datetimelike_v_numeric',
'is_datetimelike_v_object',
'is_datetimetz',
'is_int_or_datetime_dtype',
'is_period_arraylike',
'is_string_like',
'is_string_like_dtype']

from pandas.types import common as c
for t in l:
self.check_deprecation(getattr(com, t), getattr(c, t))

def test_removed_from_core_common(self):

for t in ['is_null_datelike_scalar',
'ensure_float']:
self.assertRaises(AttributeError, lambda: getattr(com, t))

if __name__ == '__main__':
import nose
nose.runmodule(argv=[__file__, '-vvs', '-x', '--pdb', '--pdb-failure'],
exit=False)
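
The deprecation tests above expect a name reached through the old location to emit a `FutureWarning` and still return the same result as the new location. A minimal sketch of such a forwarding shim follows, using only the standard library. The helper `deprecate_moved`, the simplified `is_list_like`, and the `old_pkg`/`new_pkg` locations are assumptions made for illustration; this is not the mechanism pandas itself uses.

```python
import warnings
from functools import wraps


def deprecate_moved(func, old_location, new_location):
    """Wrap func so calls through the old location warn, then forward."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            "{old}.{name} is deprecated; use {new}.{name} instead".format(
                old=old_location, new=new_location, name=func.__name__),
            FutureWarning, stacklevel=2)
        return func(*args, **kwargs)
    return wrapper


def is_list_like(obj):
    """Simplified stand-in, not the pandas implementation."""
    return hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes))


# Hypothetical old entry point that forwards to the new function
old_is_list_like = deprecate_moved(
    is_list_like, 'old_pkg.common', 'new_pkg.types')

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    result = old_is_list_like('foo')

# Same result as the new function, plus a FutureWarning on the old path
assert result == is_list_like('foo')
assert any(issubclass(w.category, FutureWarning) for w in caught)
```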
4 changes: 4 additions & 0 deletions pandas/api/types/__init__.py
@@ -0,0 +1,4 @@
""" public toolkit API """

from pandas.types.api import * # noqa
del np # noqa
3 changes: 2 additions & 1 deletion pandas/compat/numpy/function.py
@@ -21,7 +21,8 @@
from numpy import ndarray
from pandas.util.validators import (validate_args, validate_kwargs,
validate_args_and_kwargs)
from pandas.core.common import is_bool, is_integer, UnsupportedFunctionCall
from pandas.core.common import UnsupportedFunctionCall
from pandas.types.common import is_integer, is_bool
from pandas.compat import OrderedDict


8 changes: 4 additions & 4 deletions pandas/computation/ops.py
@@ -7,11 +7,11 @@

import numpy as np

from pandas.types.common import is_list_like, is_scalar
import pandas as pd
from pandas.compat import PY3, string_types, text_type
import pandas.core.common as com
from pandas.formats.printing import pprint_thing, pprint_thing_encoded
import pandas.lib as lib
from pandas.core.base import StringMixin
from pandas.computation.common import _ensure_decoded, _result_type_many
from pandas.computation.scope import _DEFAULT_GLOBALS
@@ -100,7 +100,7 @@ def update(self, value):

@property
def isscalar(self):
return lib.isscalar(self._value)
return is_scalar(self._value)

@property
def type(self):
@@ -229,7 +229,7 @@ def _in(x, y):
try:
return x.isin(y)
except AttributeError:
if com.is_list_like(x):
if is_list_like(x):
try:
return y.isin(x)
except AttributeError:
@@ -244,7 +244,7 @@ def _not_in(x, y):
try:
return ~x.isin(y)
except AttributeError:
if com.is_list_like(x):
if is_list_like(x):
try:
return ~y.isin(x)
except AttributeError:
4 changes: 3 additions & 1 deletion pandas/computation/pytables.py
@@ -7,6 +7,8 @@
from datetime import datetime, timedelta
import numpy as np
import pandas as pd

from pandas.types.common import is_list_like
import pandas.core.common as com
from pandas.compat import u, string_types, DeepChainMap
from pandas.core.base import StringMixin
@@ -127,7 +129,7 @@ def pr(left, right):

def conform(self, rhs):
""" inplace conform rhs """
if not com.is_list_like(rhs):
if not is_list_like(rhs):
rhs = [rhs]
if isinstance(rhs, np.ndarray):
rhs = rhs.ravel()