BUG: test_orc segfault when missing /usr/share/zoneinfo/US/Pacific #56292

Closed
3 tasks done
WillAyd opened this issue Dec 2, 2023 · 3 comments
Labels
Arrow (pyarrow functionality), Bug, Closing Candidate (May be closeable, needs more eyeballs), Upstream issue (Issue related to pandas dependency)

Comments

@WillAyd
Member

WillAyd commented Dec 2, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

python -m pytest pandas/tests/io/test_orc.py

Issue Description

The test run crashes (the process aborts on an uncaught orc::TimezoneError):

pandas/tests/io/test_orc.py::test_orc_reader_basic terminate called after throwing an instance of 'orc::TimezoneError'
  what():  Can't open /usr/share/zoneinfo/US/Pacific
Fatal Python error: Aborted

Current thread 0x00007eff1a912780 (most recent call first):

It appears to be looking for /usr/share/zoneinfo/US/Pacific, which does not exist on my machine. My timezone directory has an America subdirectory, but nothing called Pacific within it.
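To confirm which timezone files are actually present before running the tests, a small check along these lines may help. This is only a sketch: the helper name `zoneinfo_file_exists` is hypothetical, and the default root `/usr/share/zoneinfo` is taken from the error message above.

```python
from pathlib import Path


def zoneinfo_file_exists(tz_name, root="/usr/share/zoneinfo"):
    """Return True if a tz database file for tz_name exists under root.

    Hypothetical helper for illustration; symlinks to a regular file
    (such as the one created by the workaround) also count as present.
    """
    return (Path(root) / tz_name).is_file()


if __name__ == "__main__":
    # Compare the legacy alias with the canonical zone name.
    for name in ("US/Pacific", "America/Los_Angeles"):
        print(name, zoneinfo_file_exists(name))
```

On a machine affected by this issue, the first name would be expected to report missing while the second is present.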

Expected Behavior

No segfault

A confirmed workaround is:

$ sudo mkdir -p /usr/share/zoneinfo/US
$ sudo ln -s /usr/share/zoneinfo/America/Los_Angeles /usr/share/zoneinfo/US/Pacific

Installed Versions

INSTALLED VERSIONS

commit : 0cce0ed91cd23f17b412a0b345e6df79ed6b522b
python : 3.10.13.final.0
python-bits : 64
OS : Linux
OS-release : 6.5.0-13-generic
Version : #13-Ubuntu SMP PREEMPT_DYNAMIC Fri Nov 3 12:16:05 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.0.dev0+801.g0cce0ed91c
numpy : 1.26.0
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.2.2
pip : 23.3.1
Cython : 3.0.5
pytest : 7.4.3
hypothesis : 6.88.4
sphinx : 7.2.6
blosc : None
feather : None
xlsxwriter : 3.1.9
lxml.etree : 4.9.3
html5lib : 1.1
pymysql : 1.4.6
psycopg2 : 2.9.7
jinja2 : 3.1.2
IPython : 8.17.2
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.2
bottleneck : 1.3.7
dataframe-api-compat : None
fastparquet : 2023.10.1
fsspec : 2023.10.0
gcsfs : 2023.10.0
matplotlib : 3.7.3
numba : 0.58.1
numexpr : 2.8.7
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : 14.0.1
pyreadstat : 1.2.4
python-calamine : None
pyxlsb : 1.0.10
s3fs : 2023.10.0
scipy : 1.11.3
sqlalchemy : 2.0.23
tables : 3.9.1
tabulate : 0.9.0
xarray : 2023.10.1
xlrd : 2.0.1
zstandard : 0.22.0
tzdata : 2023.3
qtpy : None
pyqt5 : None

WillAyd added the Bug and Needs Triage labels Dec 2, 2023
@mroeschke
Member

I suspect that there's a pure pyarrow reproducer as orc.py just wraps pyarrow.orc
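A pure pyarrow reproducer could be sketched roughly as follows. Because the failure aborts the interpreter, the read is run in a child process so the caller survives and can inspect the exit code. The helper name `run_isolated` is hypothetical, and the data file path is an assumption about which file the failing test reads; adjust it to your checkout.

```python
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a child interpreter and return its exit code.

    On POSIX, a negative code means the child was killed by a signal
    (for example -6 for SIGABRT).
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )
    return result.returncode


# Assumed location of the ORC file used by test_orc_reader_basic.
ORC_FILE = "pandas/tests/io/data/orc/TestOrcFile.test1.orc"

# pyarrow.orc.read_table is the call shown in the traceback; no pandas involved.
SNIPPET = f"import pyarrow.orc as orc; orc.read_table({ORC_FILE!r})"

if __name__ == "__main__":
    print("exit code:", run_isolated(SNIPPET))
```

If the abort reproduces without pandas in the call stack, that would support treating this as an upstream issue.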

@WillAyd
Member Author

WillAyd commented Dec 2, 2023

Ah OK cool. I didn't pull a core dump, but this does show up in the Python layer:

pandas/tests/io/test_orc.py::test_orc_reader_basic terminate called after throwing an instance of 'orc::TimezoneError'
  what():  Can't open /usr/share/zoneinfo/US/Pacific
Fatal Python error: Aborted

Current thread 0x00007eff1a912780 (most recent call first):
  File "/home/willayd/mambaforge/envs/pandas-dev/lib/python3.10/site-packages/pyarrow/orc.py", line 187 in read
  File "/home/willayd/mambaforge/envs/pandas-dev/lib/python3.10/site-packages/pyarrow/orc.py", line 308 in read_table
  File "/home/willayd/clones/pandas/pandas/io/orc.py", line 117 in read_orc
  File "/home/willayd/clones/pandas/pandas/tests/io/test_orc.py", line 93 in test_orc_reader_basic

lithomas1 added the Upstream issue, Arrow, and Closing Candidate labels and removed the Needs Triage label Dec 2, 2023
@phofl
Member

phofl commented Mar 18, 2024

Closing here; this seems to be an Arrow-only thing.
