ENH: Allow export of mixed columns to Stata strl (#23692)
Enable export of large columns to Stata strls when the column
contains None as a null value

closes #23633
bashtage authored and jreback committed Nov 14, 2018
1 parent 3edc18d commit fcb8403
Showing 3 changed files with 20 additions and 0 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.24.0.txt
@@ -239,6 +239,7 @@ Other Enhancements
- :meth:`Timestamp.tz_localize`, :meth:`DatetimeIndex.tz_localize`, and :meth:`Series.tz_localize` have gained the ``nonexistent`` argument for alternative handling of nonexistent times. See :ref:`timeseries.timezone_nonexistent` (:issue:`8917`)
- :meth:`read_excel()` now accepts ``usecols`` as a list of column names or callable (:issue:`18273`)
- :meth:`MultiIndex.to_flat_index` has been added to flatten multiple levels into a single-level :class:`Index` object.
- :meth:`DataFrame.to_stata` and :class:`pandas.io.stata.StataWriter117` can write mixed string columns to Stata strl format (:issue:`23633`)

.. _whatsnew_0240.api_breaking:

2 changes: 2 additions & 0 deletions pandas/io/stata.py
@@ -2558,6 +2558,8 @@ def generate_table(self):
        for o, (idx, row) in enumerate(selected.iterrows()):
            for j, (col, v) in enumerate(col_index):
                val = row[col]
                # Allow columns with mixed str and None (GH 23633)
                val = '' if val is None else val
                key = gso_table.get(val, None)
                if key is None:
                    # Stata prefers human numbers
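
The effect of the two added lines can be illustrated outside pandas. The following is a minimal sketch, not the library's implementation: `build_gso_table` is a hypothetical helper, rows are plain dicts rather than a DataFrame, and both the pre-seeded `''` entry and the `(v + 1, o + 1)` key layout are assumptions made for illustration. It shows only that normalizing `None` to `''` before the lookup keeps `None` out of the string table.

```python
def build_gso_table(rows, col_index):
    """Map each strl value to a (variable, observation) key.

    Hypothetical, simplified stand-in for the loop in generate_table;
    rows is a list of dicts and col_index a list of (column, position).
    """
    # Assumption: the empty string is pre-seeded with a reserved key.
    gso_table = {'': (0, 0)}
    keys = []
    for o, row in enumerate(rows):
        row_keys = []
        for col, v in col_index:
            val = row[col]
            # Same normalization as the patch: None is stored as ''.
            val = '' if val is None else val
            key = gso_table.get(val)
            if key is None:
                # Assumed 1-based numbering for newly seen strings.
                key = (v + 1, o + 1)
                gso_table[val] = key
            row_keys.append(key)
        keys.append(row_keys)
    return gso_table, keys


# Same shape of data as the test added in this commit.
rows = [
    {'mixed': 'string' * 500, 'number': 0},
    {'mixed': None, 'number': 1},
]
table, keys = build_gso_table(rows, [('mixed', 0)])
print(keys)
```

In this sketch the `None` in the second row resolves to the key already held by the empty string, so every key in the table is a `str`.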
17 changes: 17 additions & 0 deletions pandas/tests/io/test_stata.py
@@ -1505,3 +1505,20 @@ def test_unicode_dta_118(self):
        expected = pd.DataFrame(values, columns=columns)

        tm.assert_frame_equal(unicode_df, expected)

    def test_mixed_string_strl(self):
        # GH 23633
        output = [
            {'mixed': 'string' * 500,
             'number': 0},
            {'mixed': None,
             'number': 1}
        ]

        output = pd.DataFrame(output)
        with tm.ensure_clean() as path:
            output.to_stata(path, write_index=False, version=117)
            reread = read_stata(path)
            expected = output.fillna('')
            expected.number = expected.number.astype('int32')
            tm.assert_frame_equal(reread, expected)
