Drop support for NaN categories in Categorical
Deprecated in 0.17.0.

xref gh-10748
gfyoung committed Mar 26, 2017
1 parent c577c19 commit 7d9f5ec
Showing 3 changed files with 10 additions and 18 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.20.0.txt
@@ -811,6 +811,7 @@ Removal of prior version deprecations/changes
in favor of ``iloc`` and ``iat`` as explained :ref:`here <whatsnew_0170.deprecations>` (:issue:`10711`).
- The deprecated ``DataFrame.iterkv()`` has been removed in favor of ``DataFrame.iteritems()`` (:issue:`10711`)
- The ``Categorical`` constructor has dropped the ``name`` parameter (:issue:`10632`)
- ``Categorical`` has dropped support for ``NaN`` categories (:issue:`10748`)
- The ``take_last`` parameter has been dropped from ``duplicated()``, ``drop_duplicates()``, ``nlargest()``, and ``nsmallest()`` methods (:issue:`10236`, :issue:`10792`, :issue:`10920`)
- ``Series``, ``Index``, and ``DataFrame`` have dropped the ``sort`` and ``order`` methods (:issue:`10726`)
- Where clauses in ``pytables`` are only accepted as strings and expression types, not as other data types (:issue:`12027`)
13 changes: 3 additions & 10 deletions pandas/core/categorical.py
@@ -545,18 +545,11 @@ def _validate_categories(cls, categories, fastpath=False):

if not fastpath:

# check properties of the categories
# we don't allow NaNs in the categories themselves

# Categories cannot contain NaN.
if categories.hasnans:
# NaNs in cats deprecated in 0.17
# GH 10748
msg = ('\nSetting NaNs in `categories` is deprecated and '
'will be removed in a future version of pandas.')
warn(msg, FutureWarning, stacklevel=3)

# categories must be unique
raise ValueError('Categorical categories cannot be NaN')

# Categories must be unique.
if not categories.is_unique:
raise ValueError('Categorical categories must be unique')
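The two checks in this hunk can be illustrated without pandas. The following is a minimal stand-alone sketch, not the pandas implementation: `validate_categories` and `_is_nan` are hypothetical names, the categories are assumed to be hashable Python values, and only float NaN is treated as missing.

```python
import math


def _is_nan(value):
    # Hypothetical helper: only a float NaN counts as missing in this sketch.
    return isinstance(value, float) and math.isnan(value)


def validate_categories(categories):
    """Stand-in for the validation shown in the hunk above (not pandas code)."""
    categories = list(categories)

    # Categories cannot contain NaN: this is now an error, not a warning.
    if any(_is_nan(c) for c in categories):
        raise ValueError('Categorical categories cannot be NaN')

    # Categories must be unique (assumes hashable category values).
    if len(set(categories)) != len(categories):
        raise ValueError('Categorical categories must be unique')

    return categories
```

The NaN check runs first, mirroring the order in the hunk, so a NaN category is reported as such rather than surfacing through the uniqueness check.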

14 changes: 6 additions & 8 deletions pandas/tests/test_categorical.py
@@ -222,14 +222,6 @@ def f():
cat = pd.Categorical([np.nan, 1., 2., 3.])
self.assertTrue(is_float_dtype(cat.categories))

# Deprecating NaNs in categories (GH #10748)
# preserve int as far as possible by converting to object if NaN is in
# categories
with tm.assert_produces_warning(FutureWarning):
cat = pd.Categorical([np.nan, 1, 2, 3],
categories=[np.nan, 1, 2, 3])
self.assertTrue(is_object_dtype(cat.categories))

# This doesn't work -> this would probably need some kind of "remember
# the original type" feature to try to cast the array interface result
# to...
@@ -418,6 +410,12 @@ def f():

self.assertRaises(ValueError, f)

# NaN categories included
def f():
Categorical.from_codes([0, 1, 2], ["a", "b", np.nan])

self.assertRaises(ValueError, f)

# too negative
def f():
Categorical.from_codes([-2, 1, 2], ["a", "b", "c"])
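From the user's side, the behaviour these tests cover can be sketched as below. This assumes a pandas release that includes this change (0.20.0 or later). The `raises_value_error` helper is hypothetical and exists only to keep the example short. The exact error message is not checked, since it may differ between versions.

```python
import numpy as np
import pandas as pd


def raises_value_error(build):
    # Hypothetical helper: report whether calling `build` raises ValueError.
    try:
        build()
    except ValueError:
        return True
    return False


# NaN listed among the categories is rejected by the constructor...
nan_in_constructor = raises_value_error(
    lambda: pd.Categorical([np.nan, 1, 2, 3], categories=[np.nan, 1, 2, 3])
)

# ...and by Categorical.from_codes, as in the new test above.
nan_in_from_codes = raises_value_error(
    lambda: pd.Categorical.from_codes([0, 1, 2], ["a", "b", np.nan])
)

# NaN among the *values* is still allowed: it is stored as code -1
# and does not become a category.
cat = pd.Categorical([np.nan, 1.0, 2.0, 3.0])
missing_code = int(cat.codes[0])
```

Missing values therefore remain representable in the data. Only listing NaN as a category is rejected.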