
A GPT-Neo model fine-tuned on a custom dataset using the Hugging Face Transformers package


eshan1347/GPT-NEO-LORA


Build NLQA Using LLMs

This repository contains the implementation of a Natural Language Question-Answering (NLQA) system based on the GPT-Neo model and the Hugging Face Transformers library. The base GPT-Neo 1.3B model is fine-tuned on Stanford CS324 LLM lecture notes, web-scraped and collected with a Python script. Parameter-Efficient Fine-Tuning (PEFT) is used to create a Low-Rank Adapter (LoRA) and attach it to the model.
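As a rough illustration of the PEFT setup, the sketch below attaches a LoRA adapter to GPT-Neo 1.3B with the peft library; the rank, alpha, and target modules are assumptions for illustration and may differ from the configuration actually used in the notebook.

```python
# Minimal sketch: attach a LoRA adapter to GPT-Neo 1.3B with PEFT.
# The rank, alpha, and target modules below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base_model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # adapter rank (assumed)
    lora_alpha=16,                        # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```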

Table of Contents

  • Overview
  • Features
  • Results

Overview

Anthropic Claude / ChatOpenAI models were considered first, but because they are closed source and paid, GPT-Neo was chosen instead. Ollama was also ruled out because the models were too large to download locally, and Llama-3 (8B) and Llama-2 (7B) were too large to fit on the Kaggle-provided GPUs. For the same reason, the GPT-Neo 1.3B and 125M versions were used instead of the 2.7B version.

The text was first scraped with the webScraper.py script: starting from the main lectures link, all text on the linked sub-pages was collected and stored in data0.txt. The text was then cleaned, split into chunks of 96 tokens (memory constraints ruled out a larger chunk size), encoded with the model tokenizer, and converted into dictionaries with input_ids and attention_mask. This data was stored in the Chroma DB vector database. A custom dataset was then created to pass these values to the model properly, with an extra 'labels' field added for supervised learning.

Initially, a custom PyTorch training loop was used to get the most out of training within the memory constraints; mixed precision, distributed data parallel training on multiple GPUs, and gradient clipping were employed. At the start the entire model was trained on the data, and the optimal number of frozen layers was then determined experimentally. In the end, the model was trained with the Hugging Face Transformers Trainer, using the smaller version and the PEFT LoRA adapter. This provided the best results.
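The data-preparation step described above can be sketched roughly as follows; the chunking helper and dataset class are illustrative assumptions, not the notebook's exact code.

```python
# Rough sketch of the data preparation: split the scraped text into 96-token
# chunks and wrap them in a dataset that also exposes a 'labels' field
# (copied from input_ids, as is standard for causal-LM fine-tuning).
import torch
from torch.utils.data import Dataset
from transformers import AutoTokenizer

CHUNK_SIZE = 96  # larger chunks did not fit in memory

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
tokenizer.pad_token = tokenizer.eos_token  # GPT-Neo has no pad token by default

class LectureNotesDataset(Dataset):
    def __init__(self, path="data0.txt"):
        text = open(path, encoding="utf-8").read()
        ids = tokenizer(text)["input_ids"]
        # Split the full token stream into fixed-size blocks.
        self.chunks = [ids[i:i + CHUNK_SIZE]
                       for i in range(0, len(ids) - CHUNK_SIZE, CHUNK_SIZE)]

    def __len__(self):
        return len(self.chunks)

    def __getitem__(self, idx):
        input_ids = torch.tensor(self.chunks[idx])
        return {
            "input_ids": input_ids,
            "attention_mask": torch.ones_like(input_ids),
            "labels": input_ids.clone(),  # supervised target = next-token prediction
        }
```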

REPOSITORY CONTENTS :

  • gptNeo13_lora : This folder contains the trained LoRA adapter to be loaded onto the GPT-Neo 1.3B model
  • data0.txt : This file contains the text extracted from the web pages through a Python script
  • run.py : This Python script loads the LoRA adapter onto the base model and reads user prompts so the fine-tuned model can be tested interactively (a rough sketch of this step follows the list)
  • webScraper.py : This file contains the script for scraping all text from the lecture pages and the links they contain
  • GPT_Neo_FineTuning.ipynb : This notebook contains all of the code described above, organised into sections along with the recorded outputs
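As a rough sketch of what run.py does (the prompt handling and generation settings are assumptions), loading the adapter for inference might look like this:

```python
# Minimal sketch of loading the trained LoRA adapter for inference,
# in the spirit of run.py. Generation settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
model = PeftModel.from_pretrained(base, "gptNeo13_lora")  # adapter folder from this repo
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
model.eval()

prompt = input("Ask a question about the lecture notes: ")
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=96, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```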

Features

  • GPT-Neo Fine-Tuning: Utilizes GPT-Neo models for natural language understanding and question answering.
  • PEFT with LoRA Adapters: Fine-tunes specific layers to efficiently incorporate new knowledge without degrading overall model performance.
  • Data Management: Employs sentence embeddings, attention masks, and ChromaDB for efficient data handling.
  • Custom PyTorch Training Loop: Includes gradient clipping, mixed precision, and explicit garbage collection to optimize memory usage (see the sketch after this list).
  • Web Scraping Bot: Collects, processes, and cleans text data for training.
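The custom training loop combined several memory-saving tricks; the sketch below shows the general pattern (mixed precision via torch.cuda.amp, gradient clipping, and explicit garbage collection), with hyperparameters and names chosen purely for illustration rather than taken from the notebook.

```python
# Illustrative sketch of the memory-conscious training loop: mixed precision,
# gradient clipping, and explicit garbage collection.
import gc
import torch
from torch.cuda.amp import autocast, GradScaler

def train_one_epoch(model, loader, optimizer, device="cuda", max_norm=1.0):
    scaler = GradScaler()
    model.train()
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        optimizer.zero_grad(set_to_none=True)
        with autocast():                      # fp16 forward/backward
            loss = model(**batch).loss
        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)            # unscale before clipping
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
        scaler.step(optimizer)
        scaler.update()
        del batch, loss                       # free memory aggressively
        gc.collect()
        torch.cuda.empty_cache()
```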

Results

The model was trained for 7 epochs. [Training summary image]

Comparison between the original model (below) and the fine-tuned one (above): [Image showing the difference between the two models' outputs]
