HDP improvements: Topic Hierarchy and Tutorial #945

Closed
bhargavvader opened this issue Oct 12, 2016 · 11 comments
Labels
difficulty easy (Easy issue: required small fix), documentation (Current issue related to documentation)

Comments

@bhargavvader
Contributor

bhargavvader commented Oct 12, 2016

As mentioned in #941, and previously in #262, HDP isn't very consistent with LdaModel. #901 and #937 are other documented issues involving HDP. We could use a complete overhaul of the way these topic models are organised (ideally with a parent topic-model class), but for the time being, HDP needs improvements in its documentation and structure.

Particularly, it would be useful to have a comprehensive tutorial of what HDP is and what it does.

I would like to take this up in the future, but if someone else has time on their hands and would like to go ahead right away, we could use this issue to discuss it!

Edit: #938 has already talked about a topic model class.

@markroxor
Contributor

markroxor commented Oct 13, 2016

I am up for it. What other changes can we incorporate? For a start, I guess editing the tutorial should do, like the one here, as pointed out in #941:
hdp = HdpModel(corpus, id2word)
hdp.print_topics(show_topics=20, num_words=10)

show_topics should be changed to num_topics without altering the API.
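
One possible reading of "without altering the API" is that the method keeps accepting the old keyword while steering callers toward the new one. The sketch below illustrates that pattern on a hypothetical `ToyTopicModel` class; it is not gensim's `HdpModel`, and the defaults, output format, and warning text are assumptions made for illustration only.

```python
import warnings


class ToyTopicModel:
    """Hypothetical stand-in for a topic model; NOT gensim's HdpModel."""

    def __init__(self, topics):
        # topics: one list of (word, weight) pairs per topic
        self.topics = topics

    def print_topics(self, num_topics=20, num_words=10, **kwargs):
        # Accept the legacy keyword so existing callers keep working,
        # but warn that the new name should be used instead.
        if "show_topics" in kwargs:
            warnings.warn(
                "'show_topics' is deprecated; use 'num_topics' instead",
                DeprecationWarning,
                stacklevel=2,
            )
            num_topics = kwargs.pop("show_topics")
        if kwargs:
            raise TypeError("unexpected keyword arguments: %s" % sorted(kwargs))

        formatted = []
        for topic_id, topic in enumerate(self.topics[:num_topics]):
            terms = " + ".join("%.3f*%s" % (w, word) for word, w in topic[:num_words])
            formatted.append((topic_id, terms))
        return formatted


model = ToyTopicModel([[("cat", 0.6), ("dog", 0.4)], [("tax", 0.7), ("law", 0.3)]])

# New-style call.
new_style = model.print_topics(num_topics=1, num_words=1)

# Old-style call still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = model.print_topics(show_topics=1, num_words=1)
```

Both calls return the same result, so existing tutorials and user code would keep running while the documentation moves to `num_topics`.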

@tmylk
Contributor

tmylk commented Oct 13, 2016

@markroxor Creating a tutorial in the tutorial folder would be a great place to start.

Agreed about the change to print_topics.

@bhargavvader
Contributor Author

Sounds good! About the tutorial, it would be nice to write it like a 'story' so that it is interesting. It would look even better once we have a Topic Hierarchy functionality.
@markroxor could you look at the ldaseqmodel tutorial and the news-classification tutorial to get an idea of how to make it more reader-friendly? The doc2vec and topic coherence tutorials are good examples too.

@bhargavvader
Contributor Author

@tmylk could you add appropriate labels to this please?

@markroxor
Contributor

@bhargavvader It will take some time, but rest assured it will be done. I will catch up with you on this.

tmylk added the documentation and difficulty easy labels on Oct 13, 2016

@markroxor
Contributor

I am available to take this now. If I understand correctly, the task is to create an hdpmodel tutorial in the tutorial folder?

@tmylk
Contributor

tmylk commented Oct 22, 2016

The folder is correct. The notebook content is in the main description of this issue.

@tmylk
Contributor

tmylk commented Nov 8, 2016

Ping @markroxor

@bhargavvader
Contributor Author

bhargavvader commented Nov 23, 2016

@tmylk, I was looking into how we might do the topic hierarchy, but couldn't find anything.
This post, this, and this ask for the same thing as well. Is there any resource (a paper, a video, or anything else) that hints at how to do this?

@menshikh-iv
Contributor

Resolved in #1055.

@zeyd31

zeyd31 commented Feb 11, 2019

The selection of the number of topics at the end of the process is still unclear.
