longday1102/Demo-QA-Extraction-system

Question & Answering Extraction system

Introduction

This source code is based on Hugging Face's tutorial on extractive question answering with a Transformer language model. The system takes a context and a question as input and extracts the answer from that context.
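To illustrate what "extracting the answer from the context" means, here is a minimal sketch of the usual span-selection step in extractive QA: the model scores every token as a possible answer start and answer end, and the answer is the highest-scoring valid span. The function name `best_span`, the `max_answer_len` limit, and the toy logits are assumptions made for illustration; they are not taken from this repository.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,  # hypothetical cap on span length, in tokens
) -> Tuple[int, int, float]:
    """Return (start, end, score) of the best span with start <= end."""
    best = (0, 0, float("-inf"))
    for s, s_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = s_score + end_logits[e]
            if score > best[2]:
                best = (s, e, score)
    return best


# Toy example: made-up scores for an already tokenized context.
tokens = ["Hà", "Nội", "là", "thủ", "đô", "của", "Việt", "Nam"]
start_logits = [4.0, 0.5, -1.0, 0.2, -0.3, -2.0, 1.0, 0.1]
end_logits = [0.3, 5.0, -1.5, 0.0, 0.4, -2.0, 0.2, 1.5]

start, end, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start : end + 1])
print(answer)
```

In a real system the logits come from the model and the token indices are mapped back to character offsets in the original context.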

Model

The model used is bhavikardeshna/xlm-roberta-base-vietnamese, a language model based on XLM-RoBERTa (a multilingual variant of RoBERTa) and trained on a Vietnamese dataset.
The model is described in the paper Cascading Adaptors to Leverage English Data to Improve Performance of Question Answering for Low-Resource Languages.

Datasets

The dataset used is UIT-ViQuAD. This dataset comprises over 23,000 human-generated question-answer pairs based on 5,109 passages from 174 Vietnamese Wikipedia articles. During preprocessing, I removed more than 3,000 questions that have no answer.
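The filtering step can be sketched as follows. This assumes SQuAD-style records in which each example has an `answers` field holding a list of answer texts; the field names, the helper `has_answer`, and the sample records are assumptions, not the repository's actual preprocessing code.

```python
from typing import Dict, List


def has_answer(example: Dict) -> bool:
    """True if the example has at least one non-empty answer text."""
    texts = example.get("answers", {}).get("text", [])
    return any(t.strip() for t in texts)


# Made-up records in an assumed SQuAD-like layout.
examples: List[Dict] = [
    {"question": "Thủ đô của Việt Nam là gì?",
     "answers": {"text": ["Hà Nội"], "answer_start": [0]}},
    {"question": "Câu hỏi không có câu trả lời?",
     "answers": {"text": [], "answer_start": []}},
    {"question": "Việt Nam nằm ở đâu?",
     "answers": {"text": ["Đông Nam Á"], "answer_start": [12]}},
]

# Keep only the questions that can be answered from the context.
answerable = [ex for ex in examples if has_answer(ex)]
print(len(examples), len(answerable))
```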

Evaluation

After processing, the dataset is split with a test size of 0.06, so 6% of the examples are held out for testing. Below are the evaluation results on the test set:

| EM | F1-score |
|-------|----------|
| 52.38 | 77.67 |
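For readers unfamiliar with these numbers, the sketch below shows a 0.06 held-out split and SQuAD-style exact match (EM) and token-level F1. It is an illustration under stated assumptions, not the evaluation code used here: the helper names, the seed, and the normalization (lowercasing, stripping ASCII punctuation, collapsing whitespace) are assumptions, and the actual scores above may have been computed with a different implementation.

```python
import random
import string
from collections import Counter
from typing import List, Tuple


def split_indices(n: int, test_size: float = 0.06, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle indices 0..n-1 and hold out a test_size fraction for testing."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_size)
    return idx[n_test:], idx[:n_test]


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, collapse whitespace (assumed scheme)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(pred: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(pred) == normalize(gold))


def f1_score(pred: str, gold: str) -> float:
    """Harmonic mean of token precision and recall between pred and gold."""
    p, g = normalize(pred).split(), normalize(gold).split()
    common = sum((Counter(p) & Counter(g)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)


train_idx, test_idx = split_indices(1000)
em = exact_match("thủ đô Hà Nội", "Hà Nội")
f1 = f1_score("thủ đô Hà Nội", "Hà Nội")
print(len(train_idx), len(test_idx), em, round(f1, 4))
```

A prediction that contains the gold answer plus extra words gets no EM credit but partial F1 credit, which is one reason F1 is usually higher than EM.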

Test

Below are some test results:

[Screenshot: longhoang06_fine-tuned-viquad-hgf on Hugging Face]

Test.1

[Screenshot: longhoang06_fine-tuned-viquad-hgf on Hugging Face]

Test.2

[Screenshot: longhoang06_fine-tuned-viquad-hgf on Hugging Face]

Test.3

Relatively good 😅
