In the WebNLG challenge, participants are given structured data, in the form of a logical form or a knowledge graph, and are asked to generate natural-language text that accurately and coherently describes the information it contains. The generated text is evaluated against a number of criteria.

CFR2000/WebNLG2022

 
 

Preparation of the French Corpus

The French corpus is a translated version of the WebNLG release 3.0 English dataset. We used an English-to-French [NMT model](https://storage.googleapis.com/samanantar-public/V0.3/models/en-indic.zip) provided by https://pytorch.org/hub/pytorch_fairseq_translation/ to generate the French sentences.

To generate the French corpus:

Install the required packages:

pip install -r requirements.txt

Generate the files for the train, dev and test folders:

python3 run.py <path to the folder containing english xml files>

In our case, we used the English-language data path, because it is easy to replace the English lex entries with their French translations. The WebNLG corpus can be downloaded from this repository.
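The lex replacement described above can be sketched roughly as follows. This is a minimal illustration, not the code in run.py: the function name `replace_lex`, the `fake_translate` placeholder and the sample entry are all hypothetical, and the sketch assumes the WebNLG XML layout in which each entry stores its reference sentences in `<lex>` elements.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def replace_lex(xml_text: str, translate: Callable[[str], str]) -> str:
    """Return a copy of a WebNLG-style XML document whose <lex> texts
    have been passed through `translate` (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    for lex in root.iter("lex"):
        # Skip empty lexicalisations; triples and other elements are untouched.
        if lex.text and lex.text.strip():
            lex.text = translate(lex.text.strip())
            # Assumption: if a language attribute is present, mark it as French.
            if "lang" in lex.attrib:
                lex.set("lang", "fr")
    return ET.tostring(root, encoding="unicode")


# Illustrative sample entry, written for this sketch.
SAMPLE = """<benchmark><entries><entry category="Airport" eid="Id1" size="1">
<modifiedtripleset><mtriple>Aarhus_Airport | cityServed | Aarhus</mtriple></modifiedtripleset>
<lex comment="good" lid="Id1">Aarhus Airport serves the city of Aarhus.</lex>
</entry></entries></benchmark>"""


def fake_translate(sentence: str) -> str:
    # Placeholder standing in for the NMT model; it only tags the sentence.
    return "[fr] " + sentence


result = replace_lex(SAMPLE, fake_translate)
print(result)
```

In practice the placeholder would be swapped for a call to the English-to-French NMT model, while the triples in each entry stay in their original form.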
