This source code is based on Hugging Face's tutorial on extractive question answering with Transformer language models. The system takes a context and a question as input and extracts the answer from that context.
The model used is bhavikardeshna/xlm-roberta-base-vietnamese, a RoBERTa-based language model trained on Vietnamese data. The model is described in the paper Cascading Adaptors to Leverage English Data to Improve Performance of Question Answering for Low-Resource Languages.
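Under the hood, an extractive QA model scores every token as a possible answer start and answer end, and the answer is the span with the best combined score. A minimal sketch of that span-selection step (the tokens and logits below are toy values, not real model output):

```python
def extract_answer(tokens, start_logits, end_logits, max_answer_len=15):
    """Pick the (start, end) token span maximizing start_logits[s] + end_logits[e]."""
    best_score, best_span = float("-inf"), (0, 0)
    for s, s_logit in enumerate(start_logits):
        # Only consider spans that start at s and are not too long
        for e in range(s, min(s + max_answer_len, len(tokens))):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score, best_span = score, (s, e)
    s, e = best_span
    return " ".join(tokens[s:e + 1])

# Toy example: the logits peak on the span "Ha Noi"
tokens = ["The", "capital", "of", "Vietnam", "is", "Ha", "Noi", "."]
start = [0.1, 0.0, 0.0, 0.2, 0.1, 5.0, 0.3, 0.0]
end = [0.0, 0.1, 0.0, 0.1, 0.0, 0.2, 5.0, 0.1]
print(extract_answer(tokens, start, end))  # → Ha Noi
```

In the real pipeline the tokens come from the XLM-RoBERTa tokenizer and the logits from the model's QA head, but the span-selection logic is the same.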
The dataset used is UIT-ViQuAD, which comprises over 23,000 human-generated question-answer pairs drawn from 5,109 passages in 174 Vietnamese Wikipedia articles. During preprocessing, I removed more than 3,000 unanswerable questions.
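The preprocessing can be sketched as: drop every example whose answer list is empty, then hold out a test split. This is a simplified stand-in in plain Python (the SQuAD-style `answers` field layout is assumed from UIT-ViQuAD's format; the toy data is illustrative):

```python
import random

def filter_and_split(examples, test_size=0.06, seed=42):
    """Drop unanswerable questions, then hold out a random test split."""
    answerable = [ex for ex in examples if ex["answers"]["text"]]
    random.Random(seed).shuffle(answerable)
    n_test = round(len(answerable) * test_size)
    return answerable[n_test:], answerable[:n_test]

# Toy data: 50 answerable examples and 10 unanswerable ones
examples = (
    [{"id": str(i), "answers": {"text": ["x"], "answer_start": [0]}} for i in range(50)]
    + [{"id": f"na{i}", "answers": {"text": [], "answer_start": []}} for i in range(10)]
)
train, test = filter_and_split(examples)
print(len(train), len(test))  # → 47 3
```

With the `datasets` library, the same two steps would be `Dataset.filter(...)` followed by `Dataset.train_test_split(test_size=0.06)`.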
The processed dataset is split with a test size of 0.06. Evaluation results on the test set:
| EM    | F1-score |
|-------|----------|
| 52.38 | 77.67    |
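EM (exact match) checks whether the normalized prediction equals the normalized gold answer, while F1 measures token overlap between them. A simplified version of the SQuAD-style metrics (the official script also strips English articles, which is omitted here since the data is Vietnamese):

```python
import string
from collections import Counter

def normalize(s):
    """Lowercase, strip punctuation and extra whitespace (SQuAD-style)."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    return " ".join(s.split())

def exact_match(pred, gold):
    return float(normalize(pred) == normalize(gold))

def f1_score(pred, gold):
    p, g = normalize(pred).split(), normalize(gold).split()
    common = Counter(p) & Counter(g)  # multiset intersection of tokens
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Hà Nội", "hà nội"))                 # → 1.0
print(round(f1_score("thủ đô Hà Nội", "Hà Nội"), 2))   # → 0.67
```

The scores in the table are these two metrics averaged over all test-set examples, scaled to 0-100.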
Below are some test results:
Test.1
Test.2
Test.3
Relatively good 😅