BUG: DataFrame.to_dict when orient=index data loss (#22810)
jameswinegar authored and jreback committed Oct 11, 2018
1 parent a86501f commit c8ce3d0
Showing 3 changed files with 26 additions and 0 deletions.
16 changes: 16 additions & 0 deletions doc/source/whatsnew/v0.24.0.txt
@@ -373,6 +373,22 @@ is the case with :attr:`Period.end_time`, for example

    p.end_time

.. _whatsnew_0240.api_breaking.frame_to_dict_index_orient:

Raise ValueError in ``DataFrame.to_dict(orient='index')``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:func:`DataFrame.to_dict` now raises ``ValueError`` when used with
``orient='index'`` and a non-unique index, instead of silently losing data (:issue:`22801`)

.. ipython:: python
    :okexcept:

    df = pd.DataFrame({'a': [1, 2], 'b': [0.5, 0.75]}, index=['A', 'A'])
    df

    df.to_dict(orient='index')

.. _whatsnew_0240.api.datetimelike.normalize:

Tick DateOffset Normalize Restrictions
4 changes: 4 additions & 0 deletions pandas/core/frame.py
@@ -1224,6 +1224,10 @@ def to_dict(self, orient='dict', into=dict):
                           for k, v in zip(self.columns, np.atleast_1d(row)))
                    for row in self.values]
        elif orient.lower().startswith('i'):
            if not self.index.is_unique:
                raise ValueError(
                    "DataFrame index must be unique for orient='index'."
                )
            return into_c((t[0], dict(zip(self.columns, t[1:])))
                          for t in self.itertuples())
        else:
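
The guard above exists because the unguarded expression builds a mapping keyed by index label, so repeated labels overwrite one another. The following is a minimal sketch of that effect, assuming pandas 0.24 or later is installed; the names `collapsed` and `error_message` are illustrative and not part of the commit.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [0.5, 0.75]}, index=['A', 'A'])

# Reproduce the unguarded conversion by hand: each row becomes
# (index label, {column: value}), and dict() keeps only the last
# entry for a repeated label.
collapsed = dict((t[0], dict(zip(df.columns, t[1:])))
                 for t in df.itertuples())
print(collapsed)  # a single entry for 'A'; the first row is gone

# With the guard in place, the same frame is rejected up front.
try:
    df.to_dict(orient='index')
except ValueError as err:
    error_message = str(err)
    print(error_message)
else:
    # Only reached on a pandas version without the guard.
    error_message = None
```

Raising early trades a silent, lossy result for an explicit failure, which leaves it to the caller to deduplicate or reset the index before converting.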
6 changes: 6 additions & 0 deletions pandas/tests/frame/test_convert_to.py
@@ -71,6 +71,12 @@ def test_to_dict_timestamp(self):
        tm.assert_dict_equal(test_data_mixed.to_dict(orient='split'),
                             expected_split_mixed)

    def test_to_dict_index_not_unique_with_index_orient(self):
        # GH22801
        # Data loss when indexes are not unique. Raise ValueError.
        df = DataFrame({'a': [1, 2], 'b': [0.5, 0.75]}, index=['A', 'A'])
        pytest.raises(ValueError, df.to_dict, orient='index')

    def test_to_dict_invalid_orient(self):
        df = DataFrame({'A': [0, 1]})
        pytest.raises(ValueError, df.to_dict, orient='xinvalid')
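
The new test uses the callable form of `pytest.raises`. An equivalent check can be written with the context-manager form, which also allows asserting on the error message. This is only a sketch, assuming pytest and pandas 0.24 or later are available; the standalone function name `test_to_dict_non_unique_index_message` is hypothetical and does not appear in the commit.

```python
import pandas as pd
import pytest


def test_to_dict_non_unique_index_message():
    # Same frame as the committed test: two rows sharing the label 'A'.
    df = pd.DataFrame({'a': [1, 2], 'b': [0.5, 0.75]}, index=['A', 'A'])

    # `match` is applied with re.search against the exception text,
    # so a plain substring of the message raised in frame.py suffices.
    with pytest.raises(ValueError, match="index must be unique"):
        df.to_dict(orient='index')
```

Matching on the message distinguishes this failure from the `ValueError` raised for an unrecognised `orient`, which the neighbouring test covers.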