Dialogue Knowledge Tracing

This repository contains the code for the paper "Exploring Knowledge Tracing in Tutor-Student Dialogues". It includes the LLMKT and DKT-Sem models, code for running deep KT and BKT models on dialogue KT, and code for automatically annotating dialogues with KT labels using the OpenAI API.

If you use our code or find this work useful in your research, please cite us!

@misc{scarlatos2024exploringknowledgetracingtutorstudent,
      title={Exploring Knowledge Tracing in Tutor-Student Dialogues},
      author={Alexander Scarlatos and Andrew Lan},
      year={2024},
      eprint={2409.16490},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2409.16490},
}

Setup

Download Data

Achieve the Core (ATC): Download the ATC HuggingFace dataset and put standards.jsonl and domain_groups.json under src/ATC. When this code was released, the data was not accessible via HuggingFace due to a bug. If the data is still not accessible, you can contact us or the authors of the paper for a copy.

CoMTA: Download the CoMTA data file and put it under data/src.

MathDial: Clone the MathDial repo and put the root under data/src.

Environment

We used Python 3.10.12 in the development of this work. Run the following to set up a Python environment:

python -m venv dk
source dk/bin/activate
pip install -r requirements.txt

Also add the following to your environment:

export OPENAI_API_KEY=<your key here> # For automated annotation via OpenAI
export CUBLAS_WORKSPACE_CONFIG=:4096:8 # For enabling deterministic operations
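
Before launching a long run, it can help to confirm that both variables are actually set. The following is a minimal sketch and not part of the repository; the helper name `missing_env_vars` is hypothetical.

```python
import os

# Variables named in the setup instructions above.
REQUIRED_VARS = ("OPENAI_API_KEY", "CUBLAS_WORKSPACE_CONFIG")


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

Per the comments on the export lines, OPENAI_API_KEY is only needed for automated annotation, so a missing key may be acceptable if you only train models.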

Prepare Dialogues for KT (Run Annotation with OpenAI)

Dialogue KT requires each dialogue turn to be annotated with correctness and KC labels. We automated this process with LLM prompting via the OpenAI API. You can run the following to tag correctness and ATC standard KCs on the two datasets:

python main.py annotate --mode collect --openai_model gpt-4o --dataset comta
python main.py annotate --mode collect --openai_model gpt-4o --dataset mathdial

To see statistics on the resulting labels, run:

python main.py annotate --mode analyze --dataset comta
python main.py annotate --mode analyze --dataset mathdial
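
To make the annotation target concrete, the sketch below shows one way per-turn correctness and KC labels could be represented and summarized. The `AnnotatedTurn` class, its field names, and the example KC strings are illustrative assumptions; they are not the repository's data format, nor the output of the analyze mode.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AnnotatedTurn:
    """One dialogue turn with hypothetical annotation fields."""
    text: str
    # None marks a turn with no gradable student response (an assumption).
    correct: Optional[bool]
    kcs: List[str] = field(default_factory=list)


def label_stats(turns):
    """Summarize the correctness rate and KC frequencies over graded turns."""
    graded = [t for t in turns if t.correct is not None]
    kc_counts = Counter(kc for t in graded for kc in t.kcs)
    rate = sum(t.correct for t in graded) / len(graded) if graded else 0.0
    return {"num_graded": len(graded), "correct_rate": rate, "kc_counts": kc_counts}


# Made-up example turns, for illustration only.
example_turns = [
    AnnotatedTurn("x = 4", True, ["solve linear equations"]),
    AnnotatedTurn("x = 5", False, ["solve linear equations", "add fractions"]),
    AnnotatedTurn("ok, thanks", None),
]
example_stats = label_stats(example_turns)
```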

Evaluate KT Methods

Each of the following runs a train/test cross-validation on the CoMTA data for a different model:

python main.py train --dataset comta --crossval --model_type lmkt --model_name lmkt_model         # LLMKT
python main.py train --dataset comta --crossval --model_type dkt-sem --model_name dkt_sem_model   # DKT-Sem
python main.py train --dataset comta --crossval --model_type dkt --model_name dkt_model           # DKT
python main.py train --dataset comta --crossval --model_type dkvmn --model_name dkvmn_model       # DKVMN
python main.py train --dataset comta --crossval --model_type akt --model_name akt_model           # AKT
python main.py train --dataset comta --crossval --model_type saint --model_name saint_model       # SAINT
python main.py train --dataset comta --crossval --model_type simplekt --model_name simplekt_model # simpleKT
python main.py train --dataset comta --crossval --model_type bkt                                  # BKT

Check the results folder for metric summaries and turn-level predictions for analysis.
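
The eight commands above differ only in the model type and name, so they can be generated in a loop. This is a minimal sketch that uses only the flags shown above; the `build_train_command` helper is hypothetical, and the script prints the commands rather than running them.

```python
import shlex

# Model types and names copied from the commands above; BKT takes no model name.
MODELS = [
    ("lmkt", "lmkt_model"),
    ("dkt-sem", "dkt_sem_model"),
    ("dkt", "dkt_model"),
    ("dkvmn", "dkvmn_model"),
    ("akt", "akt_model"),
    ("saint", "saint_model"),
    ("simplekt", "simplekt_model"),
    ("bkt", None),
]


def build_train_command(dataset, model_type, model_name=None):
    """Build the argument list for one cross-validation training run."""
    args = ["python", "main.py", "train", "--dataset", dataset,
            "--crossval", "--model_type", model_type]
    if model_name is not None:
        args += ["--model_name", model_name]
    return args


if __name__ == "__main__":
    for model_type, model_name in MODELS:
        print(shlex.join(build_train_command("comta", model_type, model_name)))
```

To execute a command instead of printing it, the argument list can be passed to `subprocess.run(args, check=True)`.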

To see all training options, run:

python main.py train --help

Hyperparameter Sweep

We run a grid search to find the optimal hyperparameters for the DKT family models. For example, to run a search for DKT on CoMTA, run the following (crossval is inferred and model_name is set automatically):

python main.py train --dataset comta --hyperparam_sweep --model_type dkt

The output will indicate the model that achieved the highest validation AUC. To get its performance on the test folds, run:

python main.py test --dataset comta --crossval --model_type dkt --model_name <copy from output> --emb_size <get from model_name>
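
The selection rule behind the sweep can be sketched as follows: evaluate every combination in a grid and keep the one with the highest validation AUC. The grid values and the `evaluate` callback here are assumptions for illustration (the values are simply those that appear in the results listed below); the repository's actual grid and training loop may differ.

```python
from itertools import product


def grid_search(evaluate, lrs, emb_sizes):
    """Return (best_val_auc, best_config) over all lr/emb_size pairs.

    `evaluate` is a hypothetical callback that trains with the given
    config and returns a validation AUC.
    """
    best_auc, best_cfg = float("-inf"), None
    for lr, emb_size in product(lrs, emb_sizes):
        auc = evaluate(lr=lr, emb_size=emb_size)
        if auc > best_auc:
            best_auc, best_cfg = auc, {"lr": lr, "emb_size": emb_size}
    return best_auc, best_cfg


def fake_evaluate(lr, emb_size):
    """Toy stand-in for training; its score peaks at lr=5e-3, emb_size=64."""
    return 0.6 - abs(lr - 5e-3) * 10 - abs(emb_size - 64) / 1000


best_auc, best_cfg = grid_search(
    fake_evaluate,
    lrs=[1e-4, 2e-4, 5e-4, 1e-3, 5e-3],
    emb_sizes=[8, 16, 32, 64, 128, 256],
)
```

Selecting on validation AUC and only then evaluating on the test folds, as the test command above does, keeps the test folds out of the hyperparameter choice.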

Best Hyperparameters Found

CoMTA:

DKT-Sem: lr=5e-3, emb_size=128
DKT: lr=5e-3, emb_size=64
DKVMN: lr=2e-4, emb_size=8
AKT: lr=1e-3, emb_size=16
SAINT: lr=1e-4, emb_size=8
simpleKT: lr=1e-3, emb_size=32

MathDial:

DKT-Sem: lr=1e-3, emb_size=256
DKT: lr=5e-4, emb_size=256
DKVMN: lr=5e-3, emb_size=32
AKT: lr=2e-4, emb_size=256
SAINT: lr=1e-3, emb_size=64
simpleKT: lr=5e-4, emb_size=256
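
For scripted experiments, the values above can be kept in a small lookup table. This is only a transcription of the two lists, keyed by the `--dataset` and `--model_type` strings used in the commands earlier; the `best_hyperparams` helper is hypothetical and not part of the repository.

```python
# Best lr and emb_size per dataset and model, transcribed from the lists above.
BEST_HYPERPARAMS = {
    "comta": {
        "dkt-sem": {"lr": 5e-3, "emb_size": 128},
        "dkt": {"lr": 5e-3, "emb_size": 64},
        "dkvmn": {"lr": 2e-4, "emb_size": 8},
        "akt": {"lr": 1e-3, "emb_size": 16},
        "saint": {"lr": 1e-4, "emb_size": 8},
        "simplekt": {"lr": 1e-3, "emb_size": 32},
    },
    "mathdial": {
        "dkt-sem": {"lr": 1e-3, "emb_size": 256},
        "dkt": {"lr": 5e-4, "emb_size": 256},
        "dkvmn": {"lr": 5e-3, "emb_size": 32},
        "akt": {"lr": 2e-4, "emb_size": 256},
        "saint": {"lr": 1e-3, "emb_size": 64},
        "simplekt": {"lr": 5e-4, "emb_size": 256},
    },
}


def best_hyperparams(dataset, model_type):
    """Look up the tuned values, with a readable error for unknown keys."""
    try:
        return BEST_HYPERPARAMS[dataset][model_type]
    except KeyError:
        raise KeyError(
            f"No tuned hyperparameters recorded for {model_type!r} on {dataset!r}"
        ) from None
```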

Visualize Learning Curves

To generate the learning curve graphs, run the following (they will be placed in results):

python main.py visualize --dataset comta --model_name <trained model to visualize predictions for>
