Error while loading data using joblib and import errors of the dependent files for models #376

Open
Rupesh-Darimisetti opened this issue Apr 14, 2024 · 4 comments

Comments

@Rupesh-Darimisetti

The repository cannot import the helper modules that the models use from the tools path. Appending the current directory to the module search path with

sys.path.append(".")

resolves the import error for

from tools.email_preprocess import preprocess

but joblib then fails while loading the data files:

python naive_bayes/nb_author_id.py 
Traceback (most recent call last):
  File "D:\02_learning\udacity\machine_learning\ud120-projects\naive_bayes\nb_author_id.py", line 24, in <module>
    features_train, features_test, labels_train, labels_test = preprocess()
                                                               ^^^^^^^^^^^^
  File "D:\02_learning\udacity\machine_learning\ud120-projects\tools\email_preprocess.py", line 31, in preprocess
    word_data = joblib.load(open("tools/word_data.pkl","rb"))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\02_learning\udacity\machine_learning\ud120-projects\venv\Lib\site-packages\joblib\numpy_pickle.py", line 648, in load
    obj = _unpickle(fobj)
          ^^^^^^^^^^^^^^^
  File "D:\02_learning\udacity\machine_learning\ud120-projects\venv\Lib\site-packages\joblib\numpy_pickle.py", line 577, in _unpickle
    obj = unpickler.load()
          ^^^^^^^^^^^^^^^^
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\pickle.py", line 1337, in load_string
    raise UnpicklingError("the STRING opcode argument must be quoted")
_pickle.UnpicklingError: the STRING opcode argument must be quoted
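
The `sys.path.append(".")` workaround above only helps when the script is started from the repository root, because "." is resolved against the current working directory. A minimal sketch of an alternative that does not depend on the working directory is shown below. The helper name `add_repo_root_to_path` is hypothetical, and the sketch assumes each script sits one folder below the repository root, as `naive_bayes/nb_author_id.py` does.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_file, levels_up=1):
    """Put the repository root on sys.path, independent of the working directory.

    script_file: normally __file__ of the calling script.
    levels_up:   how many folders the script sits below the root (assumed 1 here).
    """
    # parents[0] is the script's own folder, parents[1] its parent, and so on.
    root = Path(script_file).resolve().parents[levels_up]
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root


# Hypothetical usage at the top of naive_bayes/nb_author_id.py:
# add_repo_root_to_path(__file__)
# from tools.email_preprocess import preprocess
```

This only affects imports. The data file is still opened through the relative path "tools/word_data.pkl", so the script would still need to be run from the repository root unless that path is also built from the root.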
@Rupesh-Darimisetti
Author

@jaycode @13rac1 @nmb10 @richardkalehoff please fix the codebase so that the machine learning models can be run.

Rupesh-Darimisetti changed the title from "Error while loading data using jobpickle and import errors of the dependent files for models" to "Error while loading data using joblib and import errors of the dependent files for models" on Apr 14, 2024
@jaycode
Contributor

jaycode commented Apr 18, 2024

Hi @Rupesh-Darimisetti, the code was set up to use Python 3.6.3. It looks like you are using 3.11, which caused the unpickling to fail. I suggest creating a new virtual environment to run it.
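
To confirm which interpreter a virtual environment actually uses, a small check like the following can be run inside it. This is only a sketch: the function name `matches_expected_python` is hypothetical, and the (3, 6) target is taken from the Python 3.6.3 setup mentioned above.

```python
import sys


def matches_expected_python(expected=(3, 6), version_info=None):
    """Return True when the interpreter's major.minor version equals `expected`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(expected)


if not matches_expected_python():
    # Report the running version so a mismatch is visible before any data is loaded.
    print("Running Python %d.%d; the course code was set up for 3.6"
          % sys.version_info[:2])
```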

@Rupesh-Darimisetti
Author

@jaycode Even after switching to Python 3.6.8 and reinstalling the venv, it still throws the following error:

python naive_bayes/nb_author_id.py
Traceback (most recent call last):
  File "naive_bayes/nb_author_id.py", line 24, in <module>
    features_train, features_test, labels_train, labels_test = preprocess()
  File ".\tools\email_preprocess.py", line 34, in preprocess
    word_data = joblib.load(words_file_handler)
  File "D:\02_learning\udacity\machine_learning\ud120-projects\venv\lib\site-packages\joblib\numpy_pickle.py", line 577, in load
    obj = _unpickle(fobj)
  File "D:\02_learning\udacity\machine_learning\ud120-projects\venv\lib\site-packages\joblib\numpy_pickle.py", line 506, in _unpickle
    obj = unpickler.load()
  File "C:\Users\rupes\AppData\Local\Programs\Python\Python36-32\lib\pickle.py", line 1050, in load
    dispatch[key[0]](self)
  File "C:\Users\rupes\AppData\Local\Programs\Python\Python36-32\lib\pickle.py", line 1174, in load_string
    raise UnpicklingError("the STRING opcode argument must be quoted")
_pickle.UnpicklingError: the STRING opcode argument must be quoted

@Keroshi

Keroshi commented May 24, 2024

I was facing the same issue. After reading a few Stack Overflow threads, I found that the errors came from the line-separator format of both word_data.pkl and email_author.pkl. Changing it from CRLF (or CR) to LF might do the trick.
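
The line-ending conversion described above can be sketched as follows. The assumption is that the .pkl files are text-based (protocol 0) pickles whose LF line endings were rewritten to CRLF, for example by Git's line-ending conversion on a Windows checkout. The function name and the `_unix.pkl` output name are hypothetical. The in-memory payload at the end is a hand-written pickle of the string "hello", not data from the repository.

```python
import pickle


def convert_crlf_to_lf(src_path, dst_path):
    """Write a copy of src_path to dst_path with every CRLF replaced by LF.

    Returns the number of CRLF pairs that were replaced.
    """
    with open(src_path, "rb") as src:
        content = src.read()
    replaced = content.count(b"\r\n")
    with open(dst_path, "wb") as dst:
        dst.write(content.replace(b"\r\n", b"\n"))
    return replaced


# Hypothetical usage from the repository root:
# convert_crlf_to_lf("tools/word_data.pkl", "tools/word_data_unix.pkl")

# In-memory illustration of why the line endings matter.
lf_payload = b"S'hello'\n."                         # STRING opcode, then STOP
crlf_payload = lf_payload.replace(b"\n", b"\r\n")   # same pickle after CRLF conversion

print(pickle.loads(lf_payload))
try:
    pickle.loads(crlf_payload)
except pickle.UnpicklingError as exc:
    # The stray "\r" follows the closing quote, so the STRING argument
    # no longer ends with its quote character.
    print("CRLF copy failed to load:", exc)
```

The sketch writes a converted copy and leaves the original untouched, so the loader has to be pointed at the new file or the original replaced after a backup. It handles CRLF only, not bare CR, and it should not be applied to binary-protocol pickles, where a `\r\n` byte pair can be legitimate data.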
