
How do I run this code? #2

Open
lilililisa1998 opened this issue Mar 8, 2021 · 4 comments

Comments

@lilililisa1998

How do I obtain the dataset and the embeddings?

@stephantul stephantul self-assigned this Mar 8, 2021
@stephantul stephantul added the question Further information is requested label Mar 8, 2021
@stephantul
Member

Hi, the dataset can be downloaded here: https://github.com/ruidan/Unsupervised-Aspect-Extraction
We trained the embeddings on the SemEval 2014 and 2015 corpora, which you can download here: http://alt.qcri.org/semeval2014/task4/ and here: http://alt.qcri.org/semeval2015/task12/

@dingtingtings

@stephantul Where is the file named "my_data.conllu"?

@stephantul
Member

Hi!

This is just an example; there is no file called "my_data.conllu".

To work with the pipeline, the code needs data in CoNLL-U format (see here: https://universaldependencies.org/format.html). Many tools can produce parser output in CoNLL-U format, such as https://github.com/andreasvc/spacyconllu
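
For readers unfamiliar with the format: a CoNLL-U file is plain text with one token per line, ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), `_` for unspecified values, and a blank line after each sentence. The sketch below only illustrates that layout. The helper `to_conllu` and the hand-written example parse are hypothetical and not part of this repository; in practice the columns would come from a parser, and which columns the pipeline actually reads is not stated in this thread.

```python
# Hypothetical helper for illustration only; not part of this repository.
# Column order as defined by the CoNLL-U specification.
CONLLU_FIELDS = (
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
)


def to_conllu(sentence, text=None):
    """Serialize one parsed sentence (a list of token dicts) as CoNLL-U."""
    lines = []
    if text is not None:
        # Comment lines start with '#'; '# text = ...' holds the raw sentence.
        lines.append(f"# text = {text}")
    for token in sentence:
        # Any column that is not provided is written as an underscore.
        lines.append("\t".join(str(token.get(f, "_")) for f in CONLLU_FIELDS))
    # A blank line terminates the sentence.
    return "\n".join(lines) + "\n\n"


# Hand-written example parse (assumed, not produced by a real parser).
example = [
    {"id": 1, "form": "The", "lemma": "the", "upos": "DET", "head": 2, "deprel": "det"},
    {"id": 2, "form": "food", "lemma": "food", "upos": "NOUN", "head": 4, "deprel": "nsubj"},
    {"id": 3, "form": "was", "lemma": "be", "upos": "AUX", "head": 4, "deprel": "cop"},
    {"id": 4, "form": "great", "lemma": "great", "upos": "ADJ", "head": 0, "deprel": "root"},
]

if __name__ == "__main__":
    print(to_conllu(example, text="The food was great"), end="")
```

Writing the returned string for each sentence to a file, one after another, gives a file in the same layout as the "my_data.conllu" placeholder mentioned above.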

@Omaogegea

Hello, what kind of dataset should be put in the data directory? Can you give me an example? I have downloaded the dataset from https://github.com/ruidan/Unsupervised-Aspect-Extraction, but it does not contain labels_restaurant_train.txt.

4 participants