ValueError: Maximum allowed dimension exceeded #470

Closed
iqbaleric opened this issue Sep 28, 2015 · 8 comments

Comments

@iqbaleric

Hi,

I installed numpy and scipy to run the gensim library on my PC. I get the error below on both the 64-bit and 32-bit versions of Windows 7. Any quick help would be highly appreciated.

======================================================================
ERROR: testScoring (__main__.TestWord2VecModel)
Test word2vec scoring.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\gensim\src\gensim\test\test_word2vec.py", line 193, in testScoring
    scores = model.score(sentences)
  File "C:\Users\gensim\src\gensim\models\word2vec.py", line 880, in score
    sentence_scores = matutils.zeros_aligned(total_sentences, dtype=REAL)
  File "C:\Users\gensim\src\gensim\matutils.py", line 147, in zeros_aligned
    buffer = numpy.zeros(nbytes + align, dtype=numpy.uint8)  # problematic on win64 ("maximum allowed dimension exceeded")
ValueError: Maximum allowed dimension exceeded

----------------------------------------------------------------------
Ran 28 tests in 1413.341s

FAILED (errors=1)

Thank you

@tmylk tmylk mentioned this issue Nov 2, 2015
2 tasks
@tmylk
Contributor

tmylk commented Nov 2, 2015

@gojomo Do you have any ideas for fixing this? It is now blocking Windows builds and wheels (#492).

I'm struggling to understand how there can be a memory error on Windows with only 9 sentences in the corpus.

@gojomo
Collaborator

gojomo commented Nov 2, 2015

I know little of Windows implementation limits other than the errors people have reported when hitting them, but from a look at the code:

For some reason, score() is using a default example count total_sentences=int(1e9)?! The comment does warn a value should be provided; maybe the 1B value is left over from some intentional extreme testing? (cc @mataddy)

At least in this case, since sentences is a plain list, the call from testScoring() could be scores = model.score(sentences, len(sentences)).
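To make the arithmetic behind this concrete, here is a rough sketch of the buffer size that the traceback's `nbytes + align` expression would request for the 1B default versus the 9 test sentences. The 4-byte item size for REAL, the 128-byte alignment, and the signed 32-bit cap on a single dimension are assumptions for illustration, not values confirmed in this thread.

```python
# Rough size estimate for the buffer requested in zeros_aligned().
REAL_ITEMSIZE = 4        # assumption: REAL is a 4-byte float
ALIGN = 128              # assumption: alignment padding added to nbytes
INT32_MAX = 2**31 - 1    # assumption: affected builds cap a dimension at signed 32-bit


def buffer_bytes(total_sentences, itemsize=REAL_ITEMSIZE, align=ALIGN):
    """Return the length of the uint8 buffer requested as nbytes + align."""
    return total_sentences * itemsize + align


default_request = buffer_bytes(int(1e9))  # the default total_sentences
actual_request = buffer_bytes(9)          # the 9 sentences in the test corpus

print(default_request, default_request > INT32_MAX)
print(actual_request, actual_request > INT32_MAX)
```

Under these assumptions the default asks for roughly 4 GB in a single dimension regardless of how small the corpus is, while passing len(sentences) asks for a few hundred bytes.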

@mataddy
Contributor

mataddy commented Nov 2, 2015

oops, my bad! I'll fix this right now.

@mataddy
Contributor

mataddy commented Nov 2, 2015

@tmylk and @gojomo this should be fixed. I dropped the total_sentences default and made its role clearer in the documentation.

I already have an open pull request at #500 (adding a notebook demo for scoring) so this will be included in that now.

@mataddy
Contributor

mataddy commented Nov 2, 2015

I'm also adding more safety checks to stop related issues; will ping here when I'm done.
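For readers wondering what a required count plus a safety check might look like, here is a minimal sketch. It is not the code from #500: `score_sketch` and its `score_one` placeholder are hypothetical names, and a plain Python list stands in for the preallocated array.

```python
def score_sketch(sentences, total_sentences, score_one=len):
    """Hypothetical stand-in for a scoring routine (not gensim's code).

    total_sentences has no default, so the caller must size the buffer.
    score_one is a placeholder for the real per-sentence scoring.
    """
    if total_sentences < 0:
        raise ValueError("total_sentences must be non-negative")

    scores = [0.0] * total_sentences  # preallocate exactly what was asked for
    count = 0
    for sentence in sentences:
        if count >= total_sentences:
            break  # safety check: never write past the preallocated buffer
        scores[count] = float(score_one(sentence))
        count += 1
    return scores[:count]  # drop unused slots if fewer sentences arrived


sentences = [["human", "interface"], ["survey"], ["graph", "minors", "trees"]]
print(score_sketch(sentences, len(sentences)))
```

The point of the design is that an overestimate only wastes a little memory and an underestimate truncates the output, so neither a too-large nor a too-small count can overrun the buffer.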

@mataddy
Contributor

mataddy commented Nov 3, 2015

OK this is all done. @tmylk all you need should be in #500

@tmylk
Contributor

tmylk commented Nov 3, 2015

@mataddy Thanks, Win32 build passes after adding len(sentences).

@mataddy
Contributor

mataddy commented Nov 3, 2015

cool @tmylk; I made everything more robust to strange choices for total_sentences in #500 so let me know if you need anything else for that pull req

@tmylk tmylk closed this as completed Jan 9, 2016