Automatic generation of descriptive radiological reports from X-RAY scans

CTCycle/XREPORT-radiological-reports-generator

XREPORT: Radiological Reports Generation

1. Project Overview

The X-ray Report Generator (XREPORT) is a machine learning-based tool designed to assist radiologists in generating descriptive reports from X-ray images. It aims to reduce the time and effort radiologists spend writing detailed reports from X-ray scans, thereby increasing efficiency and turnover. The generative model is trained on pairs of X-ray images and their labels (descriptions), in the same way that image captioning models learn a sequence of word tokens associated with specific parts of the image. While originally developed around the MIMIC-CXR Database (https://www.kaggle.com/datasets/wasifnafee/mimic-cxr), this project can be applied to any dataset of X-ray scans labeled with their respective radiological reports (or any kind of description). The XREPORT deep learning (DL) model developed for this purpose uses a transformer encoder-decoder architecture, which relies on both self-attention and cross-attention to improve the relevance of the generated text within the clinical image context. Image features are extracted using a custom convolutional encoder with pooling layers to reduce dimensionality. Once a model has been pretrained on a large number of X-ray scans and their descriptions, it can be used in inference mode to generate radiological reports from raw images.

1.1 Supplementary information

Further information is available in the docs folder (to be added).

2. XREPORT model

The XREPORT model is based on a transformer encoder-decoder architecture. The X-ray scans are first processed and reduced in dimensionality by a series of convolutional layers followed by max-pooling operations. Three stacked encoders with multi-head self-attention and feedforward networks are placed downstream of this convolutional image encoder and produce vectors of extracted X-ray scan features. These image vectors are then fed into the transformer decoder, which applies cross-attention between encoder and decoder inputs to determine the most important image features associated with specific words in the text.

DistilBERT tokenization: to improve the vectorization and semantic representation of the training text corpus, the pretrained tokenizer of the DistilBERT model is used to split text into subwords and vectorize the tokens. The base model is taken from distilbert/distilbert-base-uncased and is automatically downloaded to training/BERT. Once saved, the weights are loaded each time a new training session is started. The XREPORT model performs word embedding by coupling token embeddings with positional embeddings, and supports masking for variable-length sequences, so that it can handle text sequences of different lengths.
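The coupling of token and positional embeddings, together with masking of padded positions, can be sketched in plain Python. This is only an illustration under stated assumptions: the toy embedding tables, the padding id of 0 and all function names are hypothetical and do not reproduce the project's actual layers.

```python
def pad_sequence(token_ids, max_len, pad_id=0):
    """Truncate or right-pad a list of token ids to exactly max_len."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


def padding_mask(token_ids, pad_id=0):
    """1 marks a real token, 0 marks padding that should be ignored."""
    return [0 if t == pad_id else 1 for t in token_ids]


def embed(token_ids, token_table, position_table):
    """Sum the token embedding and the positional embedding at each position."""
    return [
        [tok + pos for tok, pos in zip(token_table[t], position_table[i])]
        for i, t in enumerate(token_ids)
    ]


# Toy tables (assumed): 4 token ids, 3 positions, embedding size 2.
token_table = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
position_table = [[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]]

ids = pad_sequence([2, 1], max_len=3)
mask = padding_mask(ids)
vectors = embed(ids, token_table, position_table)
```

The mask is kept alongside the embedded vectors so that later layers can skip the padded positions, which is what allows reports of different lengths to share one fixed-size batch.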

XREP transformers: the body of the model comprises a series of transformer encoders/decoders. The transformer encoder employs multi-head self-attention and feedforward networks to further process the encoded images. These transformed image vectors are then fed into the transformer decoder, which applies cross-attention between encoder and decoder inputs. To ensure coherent report generation, the model employs causal masking on token sequences during decoding. This auto-regressive mechanism guarantees that generated reports consider the context of previously generated tokens.
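The causal masking mentioned above can be illustrated with a small, dependency-free sketch. It is not the project's implementation: the function names are hypothetical, and uniform attention scores are used only to make the effect of the mask visible.

```python
import math


def causal_mask(seq_len):
    """Lower-triangular mask: position i may attend to positions 0..i only."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]


def masked_softmax(scores, mask_row):
    """Softmax over one row of attention scores, ignoring masked positions."""
    exps = [math.exp(s) if m else 0.0 for s, m in zip(scores, mask_row)]
    total = sum(exps)  # never zero: each position can always attend to itself
    return [e / total for e in exps]


mask = causal_mask(4)
# Uniform raw scores: each token spreads its attention over itself and earlier tokens.
weights = [masked_softmax([0.0] * 4, row) for row in mask]
```

Each row of weights sums to one and has zeros to the right of the diagonal, so a token being generated never receives information from tokens that come after it.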

3. Installation

The installation process on Windows has been designed for simplicity and ease of use. To begin, simply run XREPORT.bat. On its first execution, the installation procedure starts automatically and requires minimal user input. The script checks whether Anaconda or Miniconda is installed on your system. If neither is found, you will need to install one manually. You can download and install Miniconda by following the instructions at https://docs.anaconda.com/miniconda/.

After setting up Anaconda/Miniconda, the installation script will install all the necessary Python dependencies. This includes Keras 3 (with PyTorch as the backend) and the required CUDA dependencies (CUDA 12.1) to enable GPU acceleration. If you'd prefer to handle the installation process separately, you can run the standalone installer by executing setup/XREPORT_installer.bat. You can also use a custom Python environment by modifying settings/launcher_configurations.ini, setting use_custom_environment to true and specifying the name of your custom environment.

Important: After installation, if the project folder is moved or its path is changed, the application will no longer function correctly. To fix this, you can either:

  • Open the main menu, select "XREPORT setup," and choose "Install project packages"

  • Manually run the following commands in the terminal, ensuring the project folder is set as the current working directory (CWD):

    conda activate XREPORT

    pip install -e . --use-pep517

3.1 Additional Package for XLA Acceleration

XLA is designed to optimize computations for speed and efficiency, which is particularly beneficial when working with TensorFlow and other machine learning frameworks that support it. Since this project uses Keras 3 with PyTorch as the backend, the approach to optimizing computations has shifted from XLA to PyTorch's native acceleration tools, particularly TorchScript (currently not implemented). For those who wish to use TensorFlow as the backend, XLA acceleration can be enabled globally by setting the XLA_FLAGS environment variable to the following value: --xla_gpu_cuda_data_dir=path\to\XLA, where path\to\XLA is the actual path to the folder containing the nvvm subdirectory (where the file libdevice.10.bc resides).
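As an alternative to a system-wide setting, the same variable can be set from Python for the current process only. The sketch below is a minimal example: the helper name is hypothetical and the path is the placeholder used above. It assumes the variable is set before the backend is imported, which is the safe ordering.

```python
import os


def xla_flags_value(xla_dir):
    """Build the flag string described above for a given XLA directory."""
    return f"--xla_gpu_cuda_data_dir={xla_dir}"


# Placeholder path: replace it with the folder that contains the nvvm subdirectory.
os.environ["XLA_FLAGS"] = xla_flags_value(r"path\to\XLA")
```

A value set this way is visible only to the running Python process and its child processes, so it does not replace the global configuration described above.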

4. How to use

On Windows, run XREPORT.bat to launch the main navigation menu and browse through the various options. Alternatively, you can run each file separately using python path/filename.py or jupyter notebook path/notebook.ipynb.

4.1 Navigation menu

1) Data analysis: run validation/data_validation.ipynb to perform data validation using a series of metrics for the analysis of the dataset. This feature cannot be directly started from the launcher due to unpredictable behavior of .ipynb files when executed from batch scripts.

2) Data preprocessing: prepare data for machine learning, starting from raw radiological images and their reports in text format. This is done by running preprocessing/data_preprocessing.py.

3) Model training and evaluation: open the machine learning menu to explore various options for model training and validation. Once the menu is open, you will see different options:

  • train from scratch: runs training/model_training.py to start training an instance of the XREPORT model from scratch using the available data and parameters.
  • train from checkpoint: runs training/train_from_checkpoint.py to resume training a pretrained XREPORT checkpoint for an additional number of epochs, using the pretrained model settings and data.
  • model evaluation: runs validation/model_validation.ipynb to evaluate the performance of pretrained model checkpoints using different metrics. This feature cannot be directly started from the launcher due to unpredictable behavior of .ipynb files when executed from batch scripts.

4) Generate radiological reports: use the pretrained transformer decoder from a model checkpoint to generate radiological reports starting from an input image. This option executes inference/report_generator.py.

5) XREPORT setup: provides maintenance commands such as "Install project packages", which reinstalls the project in editable (developer) mode, and "Remove logs", which deletes all logs saved in resources/logs.

6) Exit and close: exit the program immediately.

4.2 Resources

The resources folder is used to organize data and results for the various stages of the project, including data validation, model training, and evaluation. Here are the key subfolders:

  • checkpoints: pretrained model checkpoints are stored here, and can be used either for resuming training or performing inference with an already trained model.

  • dataset: contains images used to train the XREPORT model (dataset/images), as well as the file XREPORT_dataset.csv that should be provided for training purposes. This .csv file must contain two columns: id where the image names are given, and text where the associated text is saved.

  • generation: contains two subfolders: input_images, where you place images intended for inference with the pretrained XREPORT model, and reports, where the generated radiological reports are saved.

  • logs: the application logs are saved in this folder.

  • validation: used to save the results of data validation processes, which helps in keeping track of validation metrics and logs.
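The two-column layout required for XREPORT_dataset.csv (id and text) can be checked before training with a few lines of standard-library Python. This is a hedged sketch: the function name, the row-filtering rule and the sample rows are assumptions made for illustration, and an in-memory file stands in for the real one.

```python
import csv
import io

REQUIRED_COLUMNS = {"id", "text"}


def read_dataset_rows(handle):
    """Read the CSV rows, failing early if a required column is missing."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # Keep only rows that have both an image name and a non-empty report.
    return [row for row in reader if row["id"] and (row["text"] or "").strip()]


# In-memory stand-in for XREPORT_dataset.csv (the sample rows are invented).
sample = io.StringIO("id,text\nscan_001,No acute findings.\nscan_002,\n")
rows = read_dataset_rows(sample)
```

To check a real file, pass an open file handle instead of the in-memory sample, for example one created with open(path, newline="", encoding="utf-8").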

5. Configurations

For customization, you can modify the main configuration parameters in settings/app_configurations.json.

Dataset Configuration

Parameter          Description
SAMPLE_SIZE        Number of samples to use from the dataset
VALIDATION_SIZE    Proportion of the dataset to use for validation
IMG_NORMALIZE      Whether to normalize image data
IMG_AUGMENT        Whether to apply data augmentation to images
MAX_REPORT_SIZE    Max length of text report
SPLIT_SEED         Seed for random splitting of the dataset
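How VALIDATION_SIZE and SPLIT_SEED interact can be shown with a short sketch of a seeded split. This is an assumed procedure for illustration (the function name and the shuffle-then-slice approach are not taken from the project); it only shows why a fixed seed makes the split reproducible.

```python
import random


def split_dataset(samples, validation_size, seed):
    """Shuffle deterministically, then hold out a fraction for validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_size)
    return shuffled[n_val:], shuffled[:n_val]


train, validation = split_dataset(range(10), validation_size=0.2, seed=42)
```

Calling the function again with the same seed returns the same partition, so training and later evaluation can refer to the same held-out samples.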

Model Configuration

Parameter          Description
IMG_SHAPE          Shape of the input images (height, width, channels)
EMBEDDING_DIMS     Embedding dimensions (valid for both models)
NUM_HEADS          Number of attention heads
NUM_ENCODERS       Number of encoder layers
NUM_DECODERS       Number of decoder layers
SAVE_MODEL_PLOT    Whether to save a plot of the model architecture

Training Configuration

Parameter          Description
EPOCHS             Number of epochs to train the model
LEARNING_RATE      Learning rate for the optimizer
BATCH_SIZE         Number of samples per batch
MIXED_PRECISION    Whether to use mixed precision training
USE_TENSORBOARD    Whether to use TensorBoard for logging
XLA_STATE          Whether to enable XLA (Accelerated Linear Algebra)
ML_DEVICE          Device to use for training (e.g., GPU)
NUM_PROCESSORS     Number of processors to use for data loading

Evaluation Configuration

Parameter          Description
BATCH_SIZE         Number of samples per batch during evaluation
SAMPLE_SIZE        Number of samples from the dataset (evaluation only)
VALIDATION_SIZE    Fraction of validation data (evaluation only)
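Reading these parameters from a JSON file might look like the sketch below. The layout of settings/app_configurations.json is not documented here, so the grouping into "dataset" and "training" sections, the values and the helper name are all assumptions made for illustration.

```python
import json

# Hypothetical excerpt: section names and values are assumptions, not the real file.
EXAMPLE_CONFIG = """
{
  "dataset": {"SAMPLE_SIZE": 5000, "VALIDATION_SIZE": 0.2, "SPLIT_SEED": 42},
  "training": {"EPOCHS": 10, "LEARNING_RATE": 0.0001, "BATCH_SIZE": 32}
}
"""


def get_parameter(config, section, name):
    """Look up one parameter, with a readable error if it is absent."""
    try:
        return config[section][name]
    except KeyError as error:
        raise KeyError(f"{section}.{name} not found in configuration") from error


config = json.loads(EXAMPLE_CONFIG)
batch_size = get_parameter(config, "training", "BATCH_SIZE")
```

Because BATCH_SIZE, SAMPLE_SIZE and VALIDATION_SIZE appear in more than one table, looking parameters up by section and name keeps the training and evaluation values from being confused.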

6. License

This project is licensed under the terms of the MIT license. See the LICENSE file for details.
