BUG: merge_ordered fails with list-like left_by or right_by #38089

Merged 9 commits on Nov 29, 2020
Changes from 7 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -737,6 +737,7 @@ Reshaping
 - Bug in :meth:`DataFrame.apply` not setting index of return value when ``func`` return type is ``dict`` (:issue:`37544`)
 - Bug in :func:`concat` resulting in a ``ValueError`` when at least one of both inputs had a non-unique index (:issue:`36263`)
 - Bug in :meth:`DataFrame.merge` and :meth:`pandas.merge` returning inconsistent ordering in result for ``how=right`` and ``how=left`` (:issue:`35382`)
+- Bug in :func:`merge_ordered` couldn't handle list-like ``left_by`` or ``right_by`` (:issue:`35269`)

Sparse
^^^^^^
4 changes: 1 addition & 3 deletions pandas/core/reshape/merge.py
@@ -140,9 +140,7 @@ def _groupby_and_merge(by, on, left: "DataFrame", right: "DataFrame", merge_piec

         # make sure join keys are in the merged
         # TODO, should merge_pieces do this?
-        for k in by:
-            if k in merged:
-                merged[k] = key
+        merged[by] = key

         pieces.append(merged)
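
To make the effect of this change concrete, here is a minimal sketch of the two assignments in isolation. It assumes that grouping by a list of columns yields a tuple key, and the `merged` frame below is a hand-built, hypothetical stand-in for one merged group, not output captured from `_groupby_and_merge`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one merged group: rows that came only from
# the other frame have gaps in the grouping columns.
merged = pd.DataFrame(
    {
        "G": ["g", np.nan, "g"],
        "H": ["h", np.nan, "h"],
        "T": [1, 2, 3],
        "E": [np.nan, 1.0, np.nan],
    }
)
by = ["G", "H"]
key = ("g", "h")  # assumed shape of the group key when grouping by a list

# Removed loop: assigns the whole tuple to each single column, so the
# 2-element key is treated as column values for a 3-row frame.
old = merged.copy()
old_error = None
try:
    for k in by:
        if k in old:
            old[k] = key
except ValueError as err:
    old_error = err

# Added line: one key element per column in `by`, broadcast down the rows.
new = merged.copy()
new[by] = key

print(type(old_error).__name__)
print(new)
```

The per-column loop only works when `key` is a scalar, which is the single-column `left_by="G"` case; the list assignment covers the list-like case as well.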

32 changes: 32 additions & 0 deletions pandas/tests/reshape/merge/test_merge_ordered.py
@@ -115,3 +115,35 @@ def test_doc_example(self):
         )

         tm.assert_frame_equal(result, expected)
+
+    def test_list_type_by(self):
+        # GH 35269
+        left = DataFrame({"G": ["g", "g"], "H": ["h", "h"], "T": [1, 3]})
+        right = DataFrame({"T": [2], "E": [1]})
+        result1 = merge_ordered(left, right, on=["T"], left_by=["G", "H"])

Contributor:

can you parameterize this test with the cases

+        result2 = merge_ordered(left, right, on="T", left_by=["G", "H"])
+
+        expected = DataFrame(
+            {
+                "G": ["g"] * 3,
+                "H": ["h"] * 3,
+                "T": [1, 2, 3],
+                "E": [np.nan, 1.0, np.nan],
+            }
+        )
+
+        tm.assert_frame_equal(result1, expected)
+        tm.assert_frame_equal(result2, expected)
+
+        result3 = merge_ordered(right, left, on=["T"], right_by=["G", "H"])
+
+        expected = DataFrame(
+            {
+                "T": [1, 2, 3],
+                "E": [np.nan, 1.0, np.nan],
+                "G": ["g"] * 3,
+                "H": ["h"] * 3,
+            }
+        )
+
+        tm.assert_frame_equal(result3, expected)
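
One possible shape for the parametrization the reviewer asks for is sketched below. It is a standalone illustration, not the version that was eventually merged: the module-level names `LEFT`, `RIGHT`, `EXPECTED_LEFT_BY` and `EXPECTED_RIGHT_BY` are hypothetical, the test is written as a plain function instead of a method, and the public `pandas.testing.assert_frame_equal` stands in for `tm.assert_frame_equal`. The inputs and expected frames are copied from the three cases in the diff above.

```python
import numpy as np
import pandas as pd
import pytest
from pandas.testing import assert_frame_equal

# Hypothetical shared fixtures, copied from the test in the diff.
LEFT = pd.DataFrame({"G": ["g", "g"], "H": ["h", "h"], "T": [1, 3]})
RIGHT = pd.DataFrame({"T": [2], "E": [1]})

# The two expected frames hold the same data; only column order differs.
EXPECTED_LEFT_BY = pd.DataFrame(
    {
        "G": ["g"] * 3,
        "H": ["h"] * 3,
        "T": [1, 2, 3],
        "E": [np.nan, 1.0, np.nan],
    }
)
EXPECTED_RIGHT_BY = EXPECTED_LEFT_BY[["T", "E", "G", "H"]]


@pytest.mark.parametrize(
    "left, right, on, left_by, right_by, expected",
    [
        # list-like `on` with list-like `left_by`
        (LEFT, RIGHT, ["T"], ["G", "H"], None, EXPECTED_LEFT_BY),
        # scalar `on` with list-like `left_by`
        (LEFT, RIGHT, "T", ["G", "H"], None, EXPECTED_LEFT_BY),
        # list-like `right_by`, with the frames swapped
        (RIGHT, LEFT, ["T"], None, ["G", "H"], EXPECTED_RIGHT_BY),
    ],
)
def test_list_type_by(left, right, on, left_by, right_by, expected):
    # GH 35269
    result = pd.merge_ordered(
        left, right, on=on, left_by=left_by, right_by=right_by
    )
    assert_frame_equal(result, expected)
```

Each tuple replaces one of `result1`, `result2` and `result3`, so a failure is reported per case instead of stopping at the first failing assertion.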