
Computing global gamma for a corpus of sentences/instances #43

Open
vestedinterests opened this issue Aug 9, 2023 · 1 comment

@vestedinterests

First of all, a great package! I am really happy that gamma exists as a measure at all, and also about this well-documented Python implementation.

I had a brief question: say you are using this for an NER task. Your whole corpus might then contain lots of individual sentences, each annotated separately. I am now wondering how I'd best compute a global gamma for the whole corpus.

  1. Reading the documentation, it seems that using the CLI I could put each sentence in its own file, batch-analyse them to get an individual gamma per file, and then report the SD of gamma along with the lowest and highest values.
  2. Or I could append the sentences one after another, meaning that token 3 in sentence 3 perhaps ends up at token position 12, since I would treat it as one giant annotation task, and then have a single gamma computed for the whole corpus.
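The concatenation in option 2 only needs each sentence's spans shifted by the cumulative length of the sentences before it. A minimal sketch, with an entirely hypothetical tuple layout and toy data (not pygamma-agreement's actual input format):

```python
# Approach 2 sketch: merge per-sentence annotations into one global
# annotation task by offsetting token positions. The data layout here
# (annotator, start, end, label) is hypothetical, for illustration only.

def globalize(sentences):
    """Shift each sentence's token spans by the total length of all
    preceding sentences, yielding corpus-level annotation units that
    could feed a single gamma computation."""
    units = []
    offset = 0
    for sent_len, annotations in sentences:
        for annotator, start, end, label in annotations:
            units.append((annotator, start + offset, end + offset, label))
        offset += sent_len  # next sentence starts after this one
    return units

# Two toy sentences: (length_in_tokens, [(annotator, start, end, label), ...])
corpus = [
    (5, [("A", 0, 2, "PER"), ("B", 0, 2, "PER")]),
    (7, [("A", 3, 5, "LOC"), ("B", 3, 4, "LOC")]),
]

units = globalize(corpus)
# Token 3 of the second sentence now sits at global position 8 (5 + 3).
print(units)
```

The resulting global units could then be loaded into a single continuum (e.g. via `Continuum.add` in pygamma-agreement) and scored once for the whole corpus.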

I seem to see both approaches used in papers citing your work, though most without shared code; I was curious whether you have a recommendation as to which approach makes more sense. Thanks a lot!
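For completeness, the aggregation step of option 1 can be sketched with the standard library alone; the per-sentence gamma values below are hypothetical placeholders for whatever the batch analysis produces:

```python
# Approach 1 sketch: given one gamma per sentence/file (e.g. from a
# batch run of the pygamma-agreement CLI), report summary statistics.
import statistics

per_sentence_gamma = [0.81, 0.74, 0.90, 0.66, 0.85]  # hypothetical values

mean_gamma = statistics.mean(per_sentence_gamma)
sd_gamma = statistics.stdev(per_sentence_gamma)  # sample standard deviation
lowest, highest = min(per_sentence_gamma), max(per_sentence_gamma)

print(f"mean={mean_gamma:.3f} sd={sd_gamma:.3f} "
      f"min={lowest:.2f} max={highest:.2f}")
```

Note that a plain mean over sentences weights every sentence equally regardless of length, which is one reason the two approaches can disagree.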

@hadware
Collaborator

hadware commented Mar 4, 2024

Sorry for the extremely late answer. First of all, are you still working on this topic? Have you found an answer to your question?

I can look into it if you're still interested.
