Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

load_word_topics() returns value #767

Merged
merged 6 commits into from
Jul 1, 2016
Merged

Conversation

bhargavvader
Copy link
Contributor

@bhargavvader bhargavvader commented Jun 30, 2016

@tmylk, @piskvorky , with reference to #764 .

load_word_topics() now returns word topics which is assigned by train to self.wordtopics.

Before, there were two variables; word_topics and a wordtopics, and only the wordtopics variable was being used elsewhere; so I removed word_topics.

Is that ok?

@tmylk
Copy link
Contributor

tmylk commented Jun 30, 2016

Rename wordtopics to word_topics.

Also should there be a meaningful error like 'run train or load_word_topics before showing topics' in https://github.com/bhargavvader/gensim/blob/6fd1ecbfe41d7ab2adc17d20179d467e986f477a/gensim/models/wrappers/ldamallet.py#L244

@bhargavvader
Copy link
Contributor Author

Will do. Also, this PR doesn't seem to have a travis build happening, any reason?

@devashishd12
Copy link
Contributor

oh renaming to word_topics would break some other things actually....

@tmylk
Copy link
Contributor

tmylk commented Jun 30, 2016

@bhargavvader Well spotted about Travis - other PRs are working ok. Let's see what happens on your next commit.

@bhargavvader
Copy link
Contributor Author

bhargavvader commented Jul 1, 2016

@dsquareindia , exactly what else would break? Not if we change all the wordtopics to word_topics right? The tests written still pass fine.

@devashishd12
Copy link
Contributor

@bhargavvader the mallet2ldamodel would fail and also the support for ldamallet in topic coherence would fail. Tests pass because tests for one of them hasn't been added and for the other one hasn't been merged. I'll change the spelling in my PR but could you please change it in coherencemodel.py? Also a quick git grep would help.

@bhargavvader
Copy link
Contributor Author

bhargavvader commented Jul 1, 2016

@dsquareindia , made the changes in test_ldamallet_wrapper.py, which is the only other place wordtopics shows up in.

coherencemodel.py doesn't seem to really use LdaMallet right now - but I do see it in PR #750 which you're working on, so it would make more sense for you to change it yourself.

@devashishd12
Copy link
Contributor

yeah @bhargavvader I'll make both the changes in my pr once this gets merged. Thanks!

@bhargavvader
Copy link
Contributor Author

@tmylk , @piskvorky , can you have a look?
Also, about the warning, I've just used logger.info now. Is that ok?

@@ -242,7 +242,9 @@ def show_topics(self, num_topics=10, num_words=10, log=False, formatted=True):
return shown

def show_topic(self, topicid, topn=10):
topic = self.wordtopics[topicid]
if self.word_topics is None:
logger.info("Run train or load_word_topics before showing topics.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warn instead of info

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, did.

@tmylk tmylk merged commit 003a886 into piskvorky:develop Jul 1, 2016
@tmylk
Copy link
Contributor

tmylk commented Jul 1, 2016

Thanks for the PR. It's a useful fix!

@piskvorky
Copy link
Owner

piskvorky commented Jul 2, 2016

-1 on renaming wordtopics to word_topics. This breaks backward compatibility for any code that uses that (including a lot of our own code). At least make it an alias.

logger.info("loaded assigned topics for %i tokens", wordtopics.sum())
self.wordtopics = wordtopics
word_topics[int(topic), tokenid] += 1.0
logger.info("loaded assigned topics for %i tokens", word_topics.sum())
self.print_topics(15)
Copy link
Owner

@piskvorky piskvorky Jul 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This print makes no sense without the self.wordtopics assignment.

@tmylk did you review this before merging??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - the print was there in the code earlier and I removed the self.wordtopics assignment so now it doesn't make sense.

@bhargavvader bhargavvader deleted the LdaMallet branch July 2, 2016 08:23
@bhargavvader
Copy link
Contributor Author

bhargavvader commented Jul 2, 2016

@piskvorky , how exactly would one make an alias?
And what exactly do you mean by breaking the older code? I had checked for occurrences of wordtopics in the repo and couldn't find anything I didn't change. Or do you mean tutorials and the like?

I'll make another PR for that and for the unnecessary print statement. Is there anything else you think should change?

@piskvorky
Copy link
Owner

piskvorky commented Jul 2, 2016

I mean, we use it in our own code, in commercial projects :)

I think it's safe to assume some other people depend on it too, in the same way. Unless it's unavoidable, we try to be backward-compatible with our changes.

Alias = two variables referring to the same object, i.e. x = y = some_var, or x = some_var; y = some_var.

@bhargavvader
Copy link
Contributor Author

Ok, makes sense.

As for making an alias in this case, would you mean adding a line such as self.wordtopics = self.word_topics = word_topics or such whenever an assignment is being made?

This was referenced Jul 2, 2016
@piskvorky
Copy link
Owner

piskvorky commented Jul 2, 2016

Yes, assign self.wordtopics = self.word_topics, with a big fat comment explaining why this alias is there.

Option two: just continue to call the var self.wordtopics, like it was before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants