This project implements a dynamic few-shot learning approach using PyTorch and Transformers, focused on mathematical problems such as addition, subtraction, multiplication, and division. The model uses GPT-2 as its language-modeling backbone together with prompts tailored to specific tasks, allowing it to adapt to new mathematical operations from minimal training data.
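As a concrete illustration, a few-shot prompt for an arithmetic task might look like the example below; the exact prompt format is not documented in this README, so treat it as an assumption:

```
addition: 12 + 7 = 19
addition: 45 + 3 = 48
addition: 23 + 9 =
```

The model is expected to complete the final line with `32`.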
## Features

- **Dynamic Memory**: a memory module stores embeddings of encountered tasks so the model can adapt to new tasks dynamically (see the sketch after this list).
- **Prompt Tuning**: prompt tuning improves model performance on mathematical problem-solving tasks (sketched after this list).
- **Dataset Generation**: synthetic mathematical tasks (addition, subtraction, multiplication, division) are generated with varying operands and solutions (sketched after this list).
- **Hugging Face Hub integration**: supports model checkpointing and sharing via the Hugging Face Hub for easy replication and deployment.
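The memory module itself is not spelled out in this README, so the following is only a minimal sketch of the idea: keep one embedding per encountered task and look up the nearest stored task for a new query. The names `TaskMemory`, `write`, and `read` are illustrative, not the project's actual API.

```python
import torch
import torch.nn.functional as F

class TaskMemory:
    """Stores one embedding per encountered task and retrieves the
    most similar stored task for a new query embedding."""

    def __init__(self):
        self.keys = []    # task embeddings
        self.values = []  # task names

    def write(self, task_name, embedding):
        self.keys.append(embedding.detach())
        self.values.append(task_name)

    def read(self, query):
        """Return the name of the closest stored task, or None if empty."""
        if not self.keys:
            return None
        sims = F.cosine_similarity(query.unsqueeze(0), torch.stack(self.keys))
        return self.values[int(sims.argmax())]
```

A new task's embedding can be written back after adaptation, so the memory grows as tasks are encountered.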
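Prompt tuning is commonly implemented by prepending trainable embeddings to a frozen backbone's input embeddings. The sketch below follows that pattern for GPT-2 via the Transformers library; `SoftPromptGPT2` and its default arguments are assumptions for illustration, not the repository's actual class.

```python
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel

class SoftPromptGPT2(nn.Module):
    """GPT-2 with a trainable soft prompt prepended to the input
    embeddings; only the prompt parameters are updated during training."""

    def __init__(self, model_name="gpt2", num_tokens=10):
        super().__init__()
        self.model = GPT2LMHeadModel.from_pretrained(model_name)
        for p in self.model.parameters():  # freeze the backbone
            p.requires_grad = False
        embed_dim = self.model.config.n_embd
        self.soft_prompt = nn.Parameter(torch.randn(num_tokens, embed_dim) * 0.02)

    def forward(self, input_ids, attention_mask):
        batch = input_ids.size(0)
        token_embeds = self.model.transformer.wte(input_ids)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prompt, token_embeds], dim=1)
        prompt_mask = torch.ones(batch, self.soft_prompt.size(0),
                                 device=attention_mask.device,
                                 dtype=attention_mask.dtype)
        full_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        return self.model(inputs_embeds=inputs_embeds, attention_mask=full_mask)
```

Because only `soft_prompt` requires gradients, each new task can be tuned with very few parameters and very little data.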
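Generating the synthetic arithmetic data is straightforward; here is one possible shape for the generator, where the function name `make_samples` and the `"a op b ="` prompt format are assumptions:

```python
import random

# Each task maps to (symbol, function); division results are rounded.
TASKS = {
    "addition": ("+", lambda a, b: a + b),
    "subtraction": ("-", lambda a, b: a - b),
    "multiplication": ("*", lambda a, b: a * b),
    "division": ("/", lambda a, b: round(a / b, 2)),
}

def make_samples(task, n, lo=1, hi=99):
    """Return n (prompt, solution) text pairs for one arithmetic task."""
    symbol, fn = TASKS[task]
    pairs = []
    for _ in range(n):
        a, b = random.randint(lo, hi), random.randint(lo, hi)
        pairs.append((f"{a} {symbol} {b} =", str(fn(a, b))))
    return pairs
```

For example, `make_samples("addition", 3)` would yield pairs like `("12 + 7 =", "19")`.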
## Requirements

- Python 3.6+
- PyTorch
- Transformers
- PyTorch Lightning
- Hugging Face Hub (`huggingface_hub`)
- python-dotenv
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/sanowl/Adaptive-Language-Modeling-for-Dynamic-Few-Shot-Tasks.git
   cd Adaptive-Language-Modeling-for-Dynamic-Few-Shot-Tasks
   ```
2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
3. Set up environment variables: create a `.env` file in the root directory with the following content:

   ```
   HF_REPO_ID=your_huggingface_username
   HF_TOKEN=your_huggingface_token
   ```
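These variables are read at runtime (the requirements include python-dotenv and the Hugging Face Hub client). Below is a minimal sketch of how they might be loaded and used to push checkpoints; the `checkpoints` folder name is an assumption for illustration, not a path the repository guarantees:

```python
import os

from dotenv import load_dotenv
from huggingface_hub import HfApi

load_dotenv()  # reads HF_REPO_ID and HF_TOKEN from the .env file

# Push a local checkpoint directory to the Hub; "checkpoints" is a
# hypothetical folder name used here for illustration.
api = HfApi(token=os.environ["HF_TOKEN"])
api.upload_folder(folder_path="checkpoints", repo_id=os.environ["HF_REPO_ID"])
```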
## Usage

To train the model, run:

```bash
python main.py
```
## Configuration

Adjust the configuration parameters in `main.py` to customize the model's behavior:

- `model_name`: pretrained model name from the Hugging Face model hub.
- `num_tokens`: number of tokens in the prompt for task-specific tuning.
- `num_tasks`: number of different mathematical tasks (e.g., addition, subtraction).
- `learning_rate`: learning rate for the optimizer.
- `batch_size`: batch size for training.
- `num_epochs`: number of training epochs.
- `samples_per_task`: number of samples generated per mathematical task.
- `max_seq_length`: maximum sequence length for tokenization.
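For orientation, these parameters typically sit together near the top of `main.py`; the values below are illustrative defaults chosen for this sketch, not the repository's actual settings:

```python
# Illustrative values only; the real defaults live in main.py.
config = {
    "model_name": "gpt2",
    "num_tokens": 10,        # soft-prompt length
    "num_tasks": 4,          # addition, subtraction, multiplication, division
    "learning_rate": 1e-4,
    "batch_size": 16,
    "num_epochs": 5,
    "samples_per_task": 1000,
    "max_seq_length": 64,
}
```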
## License

This project is licensed under the MIT License - see the LICENSE file for details.