analysis with "so" before qualifier in a sentence with negation gives wrong results #1

Open
Nek opened this issue Apr 19, 2017 · 3 comments
Nek commented Apr 19, 2017

Thanks for picking this project up. It's a good fit for an app prototype I'm building.
Right now I'm playing with the library from a Clojure REPL.
It works fine, except for a problem with "so": in a negated sentence, adding "so" before the qualifier flips the result to positive.

moodwiz.core=> (calculate-sentiment "good")
{"negative" 0.0, "neutral" 0.0, "positive" 1.0, "compound" 0.4404}
moodwiz.core=> (calculate-sentiment "so good")
{"negative" 0.0, "neutral" 0.238, "positive" 0.762, "compound" 0.4927}
moodwiz.core=> (calculate-sentiment "I feel good")
{"negative" 0.0, "neutral" 0.256, "positive" 0.744, "compound" 0.4404}
moodwiz.core=> (calculate-sentiment "I feel so good")
{"negative" 0.0, "neutral" 0.385, "positive" 0.615, "compound" 0.4927}
moodwiz.core=> (calculate-sentiment "I don't feel good")
{"negative" 0.546, "neutral" 0.454, "positive" 0.0, "compound" -0.3412}
moodwiz.core=> (calculate-sentiment "I don't feel so good")
{"negative" 0.0, "neutral" 0.445, "positive" 0.555, "compound" 0.5777}
@nunoachenriques nunoachenriques self-assigned this Apr 19, 2017
@nunoachenriques
Owner

Hi Nek, this is not an issue in this project's code. I've tested[1] it and the same result comes from the original Python implementation by Hutto and the NLTK team. I recommend that you point it out to them so we can all benefit from the improvement.

Anyway, thanks a lot for the report! I'll mark it as an improvement to be scheduled.
And yes, I also picked VADER because I'm integrating it with another project :-)

Cheers!

[1]

Python 3.5.3 (default, Jan 19 2017, 14:11:04) 
[GCC 6.3.0 20170118] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk.sentiment.vader import SentimentIntensityAnalyzer
>>> sid = SentimentIntensityAnalyzer()
>>> ss = sid.polarity_scores("I don't feel so good")
>>> for k in sorted(ss):
...         print('{0}: {1}\t'.format(k, ss[k]), end='')
... 
compound: 0.5777	neg: 0.0	neu: 0.445	pos: 0.555	>>> 
>>>

apanimesh061 pushed a commit to apanimesh061/VaderSentimentJava that referenced this issue Jun 17, 2017
@apanimesh061

Hi,
I looked into this issue. The issue arises from this line which is the implementation of this.

If you use a statement like "I don't feel completely good", you get the correct result, i.e. {negative=0.466, neutral=0.534, positive=0.0, compound=-0.3865}. If you have "so" in place of "completely", the valence is multiplied by 1.25; otherwise it is multiplied by -0.74.
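
To make the effect of the two multipliers concrete, here is a rough Python sketch of the arithmetic. It is not the project's code. The lexicon valence of "good" (1.9), the booster increment for "so"/"completely" (0.293) and the normalization constant (15) are assumptions that do not appear in this thread; they were chosen because they reproduce the compound scores quoted above.

```python
import math

# Assumed values, not taken from this thread:
GOOD_VALENCE = 1.9   # assumed lexicon valence of "good"
BOOST = 0.293        # assumed increment for "so" / "completely"
ALPHA = 15           # assumed normalization constant

# Multipliers quoted in the comment above.
NEGATION_SCALAR = -0.74
SO_SCALAR = 1.25


def compound(valence, alpha=ALPHA):
    """Squash a summed valence into the open interval (-1, 1)."""
    return valence / math.sqrt(valence * valence + alpha)


boosted = GOOD_VALENCE + BOOST

# Each comment gives the compound score reported in this thread.
print(round(compound(GOOD_VALENCE), 4))                    # "good": 0.4404
print(round(compound(boosted), 4))                         # "so good": 0.4927
print(round(compound(GOOD_VALENCE * NEGATION_SCALAR), 4))  # "I don't feel good": -0.3412
print(round(compound(boosted * SO_SCALAR), 4))             # "I don't feel so good": 0.5777
print(round(compound(boosted * NEGATION_SCALAR), 4))       # "I don't feel completely good": -0.3865
```

Under these assumptions the only difference between the last two lines is the multiplier, which is why "so" turns a negated sentence positive.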

I made a few changes here but they will break tests and also some checkstyle rules.

I basically added a rule that checks whether the trigram before the sentiment word has the form <negative word> <some word | so | this> <so | this> and, if it does, adjusts the score accordingly. Previously, we were only handling <never> <some word | so | this> <so | this>. After this change, "I don't feel so good" gives {negative=0.466, neutral=0.534, positive=0.0, compound=-0.3865}.
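
One possible reading of that rule, sketched in Python rather than taken from the commit: the function name `trigram_scalar`, the small `NEGATIONS` set and the choice to keep the 1.25 boost only for "never" are all assumptions made for illustration.

```python
# Hypothetical names and word lists, for illustration only.
NEGATIONS = {"not", "don't", "isn't", "can't", "never"}
INTENSIFIERS = {"so", "this"}
NEGATION_SCALAR = -0.74   # multiplier quoted earlier in this thread
NEVER_SO_SCALAR = 1.25    # multiplier quoted earlier in this thread


def trigram_scalar(w3, w2, w1):
    """Return the multiplier for a sentiment word preceded by w3 w2 w1.

    Words are expected in lower case; w1 is the word directly before
    the sentiment word and w3 is three positions back.
    """
    has_intensifier = w2 in INTENSIFIERS or w1 in INTENSIFIERS
    # Assumed: "never ... so/this" keeps its boost, as before.
    if w3 == "never" and has_intensifier:
        return NEVER_SO_SCALAR
    # Any other negation flips the valence, with or without "so"/"this".
    if w3 in NEGATIONS:
        return NEGATION_SCALAR
    return 1.0


print(trigram_scalar("don't", "feel", "so"))   # -0.74
print(trigram_scalar("never", "been", "so"))   # 1.25
print(trigram_scalar("i", "feel", "so"))       # 1.0
```

The point of the sketch is only that the "so"/"this" check is tied to "never" explicitly, so a trailing "so" alone can no longer select the 1.25 multiplier.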

@nunoachenriques
Owner

Hi Animesh, thanks for taking the time to try a fix/enhancement to this issue.

I took a look at your changes. I believe it would be better to recode it without breaking the tests... ;-) Otherwise, we will have to guarantee (by extensive testing) that the implementation is indeed better than the original by Hutto!

Cheers, and tell me what you think...
