Using multiple models in NER #1334

Open
linlinloo opened this issue Jan 22, 2024 · 6 comments
@linlinloo

I want to run the following code, but an error occurred.

import stanza
pipe = stanza.Pipeline("en", processors="tokenize,ner", package={"ner": ["ncbi_disease", "ontonotes"]})
doc = pipe("John Bauer works at Stanford and has hip arthritis. He works for Chris Manning")
print(doc.ents)

WARNING: Language en package default expects mwt, which has been added

I have downloaded ncbi_disease.pt and placed it in site-packages\stanza\stanza_resources\en\ner. What is the problem, and why does it happen?

@AngledLuffa
Collaborator

AngledLuffa commented Jan 22, 2024 via email

@linlinloo
Author

However, running it did not produce any results. Instead, a series of errors appeared: ConnectTimeout, MaxRetryError, ...
When I run other code, ncbi_disease is not listed under ner. Have I placed the wrong package there?
Loading these models for language: en (English):

| Processor    | Package                   |
| tokenize     | combined                  |
| mwt          | combined                  |
| pos          | combined_charlm           |
| lemma        | combined_nocharlm         |
| constituency | ptb3-revised_charlm       |
| depparse     | combined_charlm           |
| sentiment    | sstplus                   |
| ner          | ontonotes-ww-multi_charlm |

@AngledLuffa
Collaborator

AngledLuffa commented Jan 22, 2024 via email

@AngledLuffa
Collaborator

AngledLuffa commented Jan 22, 2024 via email

@linlinloo
Author

I found ontonotes_charlm.pt, and I can download it. Do you mean that I should replace ontonotes-ww-multi_charlm with it?
And sorry, how do I add download_method=None? Like this? pipe = stanza.Pipeline("en", download_method=None)

@AngledLuffa
Collaborator

> I found ontonotes_charlm.pt, and I can download it. Do you mean that I should replace ontonotes-ww-multi_charlm with it?

You can do whatever you like, of course. The ww-multi model was trained on both OntoNotes and the dataset described in this paper

> And sorry, how do I add download_method=None? Like this? pipe = stanza.Pipeline("en", download_method=None)

Yes, exactly. I suggest that because it's the most likely reason you're getting timeouts. If the problem is somewhere else, please include the complete stack trace.
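
For anyone landing here later, this is a rough sketch of how the original multi-model NER call might be combined with download_method=None. The helper names pipeline_kwargs and build_pipeline are made up for illustration and are not part of stanza. The sketch also assumes every requested model is already on disk in the resources directory that stanza reads from, since nothing is downloaded when download_method=None.

```python
from typing import Any, Dict, List


def pipeline_kwargs(ner_packages: List[str], offline: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for stanza.Pipeline (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "lang": "en",
        "processors": "tokenize,ner",
        # Same multi-model NER request as in the original report.
        "package": {"ner": list(ner_packages)},
    }
    if offline:
        # Skip the resource download step, which is the suggested
        # workaround for the ConnectTimeout / MaxRetryError failures.
        kwargs["download_method"] = None
    return kwargs


def build_pipeline(ner_packages: List[str], offline: bool = True):
    """Create the pipeline; stanza is imported lazily (hypothetical helper)."""
    import stanza

    return stanza.Pipeline(**pipeline_kwargs(ner_packages, offline))
```

Calling build_pipeline(["ncbi_disease", "ontonotes"]) should then be equivalent to the call in the first comment with download_method=None added. If a model file is missing locally, expect a loading error rather than a timeout, and the complete stack trace will be more useful for diagnosing it.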
