BUG in Series.interpolate: limit_area/limit_direction kwargs with method="pad"/"bfill" have no effect #38106

Merged: 45 commits, Dec 1, 2020
Commits
9afe992
Added failing test for https://github.com/pandas-dev/pandas/issues/26796
cchwala Jan 15, 2020
3a191b9
Added implementation to support `limit_area`
cchwala Jan 15, 2020
fd5d8e8
fix test
cchwala Jan 15, 2020
26d88ed
pep8
cchwala Jan 15, 2020
6597aca
fixed small error that actually had no effect since the input array `…
cchwala Jan 15, 2020
c536d3c
Raise when forbidden combination of `method` and `limit_direction` ar…
cchwala Jan 15, 2020
ed9cf21
Updated docstring with info about allowed combinations of `method` an…
cchwala Jan 15, 2020
2980325
clean up
cchwala Jan 15, 2020
ecf428e
Added entry to whatsnew file
cchwala Jan 15, 2020
f8a3423
Removed `axis` kwarg from `interpolate_1d_fill` because it was unused
cchwala Jan 15, 2020
6733186
Type annotations added to new function `interpolate_1d_fill`
cchwala Jan 15, 2020
c5b77d2
fixed incorrectly sorted imports
cchwala Jan 15, 2020
0bb36de
Added type annotation, updated docstring and removed unnecessary argu…
cchwala Feb 5, 2020
a467afd
Reverting docstring entry for default value of `limit_direction`
cchwala Feb 18, 2020
5466d8c
Moved logic for calling `missing.interpolate_1d_fill` to `missing.int…
cchwala Feb 18, 2020
3e968fc
Moved whatsnew entry to v1.1.0.rst
cchwala Feb 18, 2020
556a3cf
clean up
cchwala Feb 18, 2020
6c1e429
fixed missing Optional in type definition
cchwala Feb 18, 2020
767b0ca
small fix so that CI type validation does not complain
cchwala Mar 16, 2020
b82aaff
Merge remote-tracking branch 'upstream/master' into fix_interpolate_l…
cchwala Mar 17, 2020
26ef7b5
Apply suggestions from code review concerning list instead of set
cchwala Mar 19, 2020
b4b6b5a
added import for missing List type
cchwala Mar 19, 2020
e259549
fixed unsorted order of imports
cchwala Mar 19, 2020
8ceff58
Merge remote-tracking branch 'upstream/master' into fix_interpolate_l…
cchwala Aug 25, 2020
7c5ad7d
Added tests back in
cchwala Aug 25, 2020
92148ff
Added new solution to account for limit_area with pad
cchwala Aug 26, 2020
d62e02e
black formatting
cchwala Aug 26, 2020
c2473f2
added whatsnew entry
cchwala Aug 26, 2020
610e347
Test with moving logic for interpolate_2d with `limite_area` directly…
cchwala Sep 14, 2020
570e3c2
fix wrong arg order by using kwargs as suggested in https://github.co…
cchwala Sep 15, 2020
721304a
Added comment to explain recursion and added typing for interpolate_2d
cchwala Sep 15, 2020
73ab1bf
improved test code coverage
cchwala Sep 15, 2020
a33f629
fixed wrong typing
cchwala Sep 15, 2020
64e03ec
merge with master
arw2019 Nov 27, 2020
80e2a78
merge master
arw2019 Nov 30, 2020
cb9d56a
move func to module level
arw2019 Nov 30, 2020
3aa6efa
typing
arw2019 Nov 30, 2020
47064b6
parametrize tests
arw2019 Nov 30, 2020
bd2e0cc
Merge branch 'master' of https://github.com/pandas-dev/pandas into GH…
arw2019 Nov 30, 2020
8a01725
docstring
arw2019 Dec 1, 2020
e2e7f61
Merge branch 'master' of https://github.com/pandas-dev/pandas into GH…
arw2019 Dec 1, 2020
69bae74
typing
arw2019 Dec 1, 2020
e18f43d
Merge branch 'master' of https://github.com/pandas-dev/pandas into GH…
arw2019 Dec 1, 2020
cc898a9
refactor
arw2019 Dec 1, 2020
5ee59ea
Merge branch 'master' of https://github.com/pandas-dev/pandas into GH…
arw2019 Dec 1, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -670,6 +670,7 @@ Missing

- Bug in :meth:`.SeriesGroupBy.transform` now correctly handles missing values for ``dropna=False`` (:issue:`35014`)
- Bug in :meth:`Series.nunique` with ``dropna=True`` was returning incorrect results when both ``NA`` and ``None`` missing values were present (:issue:`37566`)
- Bug in :meth:`Series.interpolate` where kwarg ``limit_area`` and ``limit_direction`` had no effect when using methods ``pad`` and ``backfill`` (:issue:`31048`)
-

MultiIndex
3 changes: 3 additions & 0 deletions pandas/core/internals/blocks.py
@@ -1261,6 +1261,7 @@ def interpolate(
                axis=axis,
                inplace=inplace,
                limit=limit,
                limit_area=limit_area,
                downcast=downcast,
            )
        # validate the interp method
@@ -1287,6 +1288,7 @@ def _interpolate_with_fill(
        axis: int = 0,
        inplace: bool = False,
        limit: Optional[int] = None,
        limit_area: Optional[str] = None,
        downcast: Optional[str] = None,
    ) -> List["Block"]:
        """ fillna but using the interpolate machinery """
@@ -1301,6 +1303,7 @@ def _interpolate_with_fill(
            method=method,
            axis=axis,
            limit=limit,
            limit_area=limit_area,
        )

        blocks = [self.make_block_same_class(values, ndim=self.ndim)]
86 changes: 81 additions & 5 deletions pandas/core/missing.py
@@ -1,13 +1,13 @@
"""
Routines for filling missing data.
"""

from functools import partial
from typing import Any, List, Optional, Set, Union

import numpy as np

from pandas._libs import algos, lib
-from pandas._typing import ArrayLike, DtypeObj
+from pandas._typing import ArrayLike, Axis, DtypeObj
from pandas.compat._optional import import_optional_dependency

from pandas.core.dtypes.cast import infer_dtype_from_array
@@ -528,16 +528,92 @@ def _cubicspline_interpolate(xi, yi, x, axis=0, bc_type="not-a-knot", extrapolat
    return P(x)


def _interpolate_with_limit_area(
    values: ArrayLike, method: str, limit: Optional[int], limit_area: Optional[str]
) -> ArrayLike:
    """
    Apply interpolation and limit_area logic to values along a to-be-specified axis.

    Parameters
    ----------
    values: array-like
        Input array.
    method: str
        Interpolation method. Could be "bfill" or "pad"
    limit: int, optional
        Index limit on interpolation.
    limit_area: str
        Limit area for interpolation. Can be "inside" or "outside"

    Returns
    -------
    values: array-like
        Interpolated array.
    """

    invalid = isna(values)

    if not invalid.all():
        first = find_valid_index(values, "first")
        last = find_valid_index(values, "last")

        values = interpolate_2d(
            values,
            method=method,
            limit=limit,
        )

        if limit_area == "inside":
            invalid[first : last + 1] = False
        elif limit_area == "outside":
            invalid[:first] = invalid[last + 1 :] = False

        values[invalid] = np.nan

    return values
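
The masking step in `_interpolate_with_limit_area` can be illustrated without pandas. The following is only a minimal pure-Python sketch under stated assumptions, not the pandas implementation: `pad_1d` and `pad_with_limit_area` are hypothetical names, plain lists of floats stand in for arrays, and `limit_area` is assumed to be `"inside"`, `"outside"`, or `None`. The idea is the same as in the diff: fill first, then put NaN back at the originally missing positions that fall in the excluded region.

```python
import math
from typing import List, Optional


def pad_1d(values: List[float], limit: Optional[int] = None) -> List[float]:
    """Forward-fill NaNs, filling at most `limit` consecutive NaNs per gap."""
    out = list(values)
    last_valid = None  # most recent non-NaN value seen so far
    run = 0            # number of NaNs seen since last_valid
    for i, v in enumerate(out):
        if math.isnan(v):
            if last_valid is not None and (limit is None or run < limit):
                out[i] = last_valid
            run += 1
        else:
            last_valid = v
            run = 0
    return out


def pad_with_limit_area(
    values: List[float],
    limit: Optional[int] = None,
    limit_area: Optional[str] = None,
) -> List[float]:
    """Hypothetical helper: pad, then restore NaNs outside the allowed area."""
    invalid = [math.isnan(v) for v in values]
    if all(invalid):
        return list(values)  # nothing to fill from

    first = invalid.index(False)                          # first valid position
    last = len(invalid) - 1 - invalid[::-1].index(False)  # last valid position

    filled = pad_1d(values, limit)
    for i, was_nan in enumerate(invalid):
        inside = first <= i <= last
        # "inside" keeps fills between first and last valid values only;
        # "outside" keeps fills before the first or after the last only.
        if was_nan and (
            (limit_area == "inside" and not inside)
            or (limit_area == "outside" and inside)
        ):
            filled[i] = math.nan
    return filled
```

Because the original NaN mask is computed before filling, the sketch never overwrites values that were valid in the input, which mirrors the `invalid` bookkeeping in the function above.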


def interpolate_2d(
    values,
-    method="pad",
-    axis=0,
-    limit=None,
+    method: str = "pad",
+    axis: Axis = 0,
+    limit: Optional[int] = None,
+    limit_area: Optional[str] = None,
):
    """
    Perform an actual interpolation of values, values will be made 2-d if
    needed, fills inplace, returns the result.

    Parameters
    ----------
    values: array-like
        Input array.
    method: str, default "pad"
        Interpolation method. Could be "bfill" or "pad"
    axis: 0 or 1
        Interpolation axis
    limit: int, optional
        Index limit on interpolation.
    limit_area: str, optional
        Limit area for interpolation. Can be "inside" or "outside"

    Returns
    -------
    values: array-like
        Interpolated array.
    """
    if limit_area is not None:
        return np.apply_along_axis(
            partial(
                _interpolate_with_limit_area,
                method=method,
                limit=limit,
                limit_area=limit_area,
            ),
            axis,
            values,
        )

    orig_values = values

    transf = (lambda x: x) if axis == 0 else (lambda x: x.T)
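
The dispatch in `interpolate_2d` binds the keyword arguments with `functools.partial` and then applies the resulting one-argument function to each 1-D slice along `axis`. The sketch below shows that pattern on nested lists, as an illustration only: `apply_along_axis_2d` is a hypothetical stand-in for `np.apply_along_axis`, and `running_max` is an arbitrary 1-D function chosen for the example.

```python
from functools import partial
from typing import Callable, List

Table = List[List[float]]


def apply_along_axis_2d(
    func1d: Callable[[List[float]], List[float]], axis: int, table: Table
) -> Table:
    """Hypothetical list-based analogue of applying a 1-D function along an axis."""
    if axis == 1:
        # Slices along axis 1 are the rows.
        return [func1d(list(row)) for row in table]
    # Slices along axis 0 are the columns: transpose, apply, transpose back.
    cols = [func1d(list(col)) for col in zip(*table)]
    return [list(row) for row in zip(*cols)]


def running_max(values: List[float], floor: float) -> List[float]:
    """Running maximum of a 1-D sequence, never dropping below `floor`."""
    out, best = [], floor
    for v in values:
        best = max(best, v)
        out.append(best)
    return out


# partial() fixes the keyword argument, so the helper only sees func1d(slice).
per_slice = partial(running_max, floor=0)
by_column = apply_along_axis_2d(per_slice, 0, [[1, 5], [3, 2]])
by_row = apply_along_axis_2d(per_slice, 1, [[1, 5], [3, 2]])
```

One consequence of this design is that the 1-D function never needs to know about the axis; only the dispatcher does.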
76 changes: 76 additions & 0 deletions pandas/tests/series/methods/test_interpolate.py
@@ -458,6 +458,82 @@ def test_interp_limit_direction_raises(self, method, limit_direction, expected):
        with pytest.raises(ValueError, match=msg):
            s.interpolate(method=method, limit_direction=limit_direction)

    @pytest.mark.parametrize(
        "data, expected_data, kwargs",
        (
            (
                [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan],
                [np.nan, np.nan, 3.0, 3.0, 3.0, 3.0, 7.0, np.nan, np.nan],
                {"method": "pad", "limit_area": "inside"},
            ),
            (
                [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan],
                [np.nan, np.nan, 3.0, 3.0, np.nan, np.nan, 7.0, np.nan, np.nan],
                {"method": "pad", "limit_area": "inside", "limit": 1},
            ),
            (
                [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan],
                [np.nan, np.nan, 3.0, np.nan, np.nan, np.nan, 7.0, 7.0, 7.0],
                {"method": "pad", "limit_area": "outside"},
            ),
            (
                [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan],
                [np.nan, np.nan, 3.0, np.nan, np.nan, np.nan, 7.0, 7.0, np.nan],
                {"method": "pad", "limit_area": "outside", "limit": 1},
            ),
            (
                [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
                [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
                {"method": "pad", "limit_area": "outside", "limit": 1},
            ),
            (
                range(5),
                range(5),
                {"method": "pad", "limit_area": "outside", "limit": 1},
            ),
        ),
    )
    def test_interp_limit_area_with_pad(self, data, expected_data, kwargs):
        # GH26796

        s = Series(data)
        expected = Series(expected_data)
        result = s.interpolate(**kwargs)
        tm.assert_series_equal(result, expected)

    @pytest.mark.parametrize(
        "data, expected_data, kwargs",
        (
            (
                [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan],
                [np.nan, np.nan, 3.0, 7.0, 7.0, 7.0, 7.0, np.nan, np.nan],
                {"method": "bfill", "limit_area": "inside"},
            ),
            (
                [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan],
                [np.nan, np.nan, 3.0, np.nan, np.nan, 7.0, 7.0, np.nan, np.nan],
                {"method": "bfill", "limit_area": "inside", "limit": 1},
            ),
            (
                [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan],
                [3.0, 3.0, 3.0, np.nan, np.nan, np.nan, 7.0, np.nan, np.nan],
                {"method": "bfill", "limit_area": "outside"},
            ),
            (
                [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan],
                [np.nan, 3.0, 3.0, np.nan, np.nan, np.nan, 7.0, np.nan, np.nan],
                {"method": "bfill", "limit_area": "outside", "limit": 1},
            ),
        ),
    )
    def test_interp_limit_area_with_backfill(self, data, expected_data, kwargs):
        # GH26796

        s = Series(data)
        expected = Series(expected_data)
        result = s.interpolate(**kwargs)
        tm.assert_series_equal(result, expected)

    def test_interp_limit_direction(self):
        # These tests are for issue #9218 -- fill NaNs in both directions.
        s = Series([1, 3, np.nan, np.nan, np.nan, 11])
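
The expected values in the backfill cases can be reproduced by hand with a small sketch. This is an illustration under stated assumptions rather than what pandas does internally: `bfill_with_limit_area` and `_pad` are hypothetical names, inputs are plain lists of floats, and backward fill is modeled as forward fill on the reversed sequence. `limit_area` is assumed to be `"inside"`, `"outside"`, or `None`.

```python
import math
from typing import List, Optional


def _pad(values: List[float], limit: Optional[int]) -> List[float]:
    """Forward-fill NaNs, at most `limit` consecutive NaNs per gap."""
    out, last_valid, run = list(values), None, 0
    for i, v in enumerate(out):
        if math.isnan(v):
            if last_valid is not None and (limit is None or run < limit):
                out[i] = last_valid
            run += 1
        else:
            last_valid, run = v, 0
    return out


def bfill_with_limit_area(
    values: List[float],
    limit: Optional[int] = None,
    limit_area: Optional[str] = None,
) -> List[float]:
    """Hypothetical helper: backfill, then restore NaNs in the excluded area."""
    # Backward fill is forward fill applied to the reversed sequence.
    filled = _pad(values[::-1], limit)[::-1]

    valid_idx = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not valid_idx or limit_area is None:
        return filled

    first, last = valid_idx[0], valid_idx[-1]
    for i, v in enumerate(values):
        if math.isnan(v):
            inside = first <= i <= last
            # Reset the fill when the position is not in the requested area.
            if (limit_area == "inside") != inside:
                filled[i] = math.nan
    return filled
```

With `limit=1` and `limit_area="outside"`, only the single NaN directly before the first valid value is filled, which matches the last parametrized backfill case.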