Passing File Object Not Filename To GzipFile and BZ2File #109
Comments
Reproduced. Working on a fix and a test for reading and writing compressed files. In particular, it broke the gensim Travis tests. For completeness, could you paste a code snippet that breaks for you in this version? CC @robottwo
Hi. I am also seeing this bug when attempting to use gensim, which uses smart_open to open gz-compressed files. Here is a minimal reproduction of the issue:
import gensim
model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz', binary=True)
Thanks for reporting. Fixed in #110 and released in 1.5.1 on PyPI.
…and #110. (#112)
* Better support for custom S3 servers. This patch adds support for custom S3 servers in the connection string. It also adds explicit support for setting the server port, and whether or not to use SSL, both as parameters to the smart_open function as well as within the connection string. These changes are necessary to be able to connect to s3proxy and other custom S3 servers which don't run on the default port, or don't necessarily use SSL.
* Fix unit tests
* updated README.rst with new s3 mode.
* Added a new unit test for the unsecured calling form
* Updated style and unit test.
* Check that the port argument isn't normally passed.
* Add generic HTTP and HTTPS streaming support. Adds support for opening vanilla HTTP and HTTPS addresses. Supports efficient streaming, gzip and bz2 compression, as well as Kerberos and username/password (basic) HTTP authentication.
* removed previous merge artifact
* Raise exception instead of returning it :/
* Raise http exceptions properly
* necessary import
* python 3 compatibility
* Reverted make_closing -> closing. We still want to maintain Python 2.6 compatibility, so don't rely on contextlib.closing.
* Refactor the code to get the Python version
* Refactored the GZfile and BZ2File compression wrappers.
* Refactored HttpOpenRead unit tests. Now they don't require internet access, and will test for Basic authentication in the HTTP header.
* Clean up http unit tests. http => https, and remove old versions of the tests.
* Cosmetic changes and doc updates.
* Re-use the open filehandle rather than open a new one. This allows one to use any filehandle-like object instead of just local posix. It also avoids unnecessary filesystem syscalls.
* merge artifact
* Add unit tests for compressed httpd reads. This breaks out the http tests into their own test class. Also fixed a few behaviors in the HttpReader uncovered by the new tests (yay).
* fixed import for python3
* removed stray import
* Handle some python3 byte vs unicode incompatibilities. Works now on Python 2 as well as Python 3.
Here
https://github.com/RaRe-Technologies/smart_open/blob/master/smart_open/smart_open_lib.py#L626
and here
https://github.com/RaRe-Technologies/smart_open/blob/master/smart_open/smart_open_lib.py#L630
the code should be passing in the filename, not the file object.
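For anyone following along, here is a rough sketch of the distinction at issue, using only the standard library. It is not the smart_open code at the linked lines; the file names and the `payload` value are made up for illustration. The point is that `gzip.GzipFile` takes a filename as its first positional argument and an already-open file object through the `fileobj=` keyword, while `bz2.BZ2File`, as far as I know, accepts only a filename on Python 2 (file objects are accepted on Python 3.3+).

```python
import bz2
import gzip
import os
import tempfile

# Hypothetical sample data and file names, used only for this illustration.
payload = b"hello smart_open\n"

with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, "sample.txt.gz")
    bz_path = os.path.join(tmp, "sample.txt.bz2")

    # Write small compressed fixtures, addressing each file by name.
    with gzip.GzipFile(gz_path, mode="wb") as fout:
        fout.write(payload)
    with bz2.BZ2File(bz_path, mode="wb") as fout:
        fout.write(payload)

    # GzipFile: a filename goes in the first positional argument ...
    with gzip.GzipFile(gz_path, mode="rb") as fin:
        from_gz_name = fin.read()

    # ... while an already-open binary file object goes in `fileobj=`.
    with open(gz_path, "rb") as raw:
        with gzip.GzipFile(fileobj=raw, mode="rb") as fin:
            from_gz_fileobj = fin.read()

    # BZ2File opened by filename, the form that works on Python 2 and 3.
    with bz2.BZ2File(bz_path, mode="rb") as fin:
        from_bz_name = fin.read()

# All three reads should give back the original bytes.
assert from_gz_name == from_gz_fileobj == from_bz_name == payload
```

Passing a file object where a filename is expected is the kind of mix-up this issue describes, so keeping the two call forms explicit seems like the safer choice.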