Intro

This repository contains an implementation of the Local Binary Pattern (LBP) algorithm using GPU acceleration with CUDA. The project compares its running time with that of a sequential CPU-only version.
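For readers unfamiliar with LBP, the following is a minimal sequential sketch, not the repository's code. It assumes the basic 3x3 variant (8 neighbors, a bit set when the neighbor is greater than or equal to the center), an 8-bit grayscale image stored row-major, and a 256-bin histogram; the function name `lbp_histogram` and its parameters are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: 256-bin histogram of basic 3x3 LBP codes.
// `pixels` holds width * height grayscale bytes in row-major order.
// Border pixels are skipped because they lack a full 8-neighborhood.
std::array<std::uint32_t, 256> lbp_histogram(const std::vector<std::uint8_t>& pixels,
                                             int width, int height) {
    std::array<std::uint32_t, 256> hist{};  // zero-initialized bins
    if (width < 3 || height < 3) return hist;

    // Neighbor offsets, clockwise starting from the top-left pixel.
    // Neighbor k contributes bit k of the code (an assumed bit order).
    static const int dx[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
    static const int dy[8] = {-1, -1, -1, 0, 1, 1, 1, 0};

    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            const std::uint8_t center =
                pixels[static_cast<std::size_t>(y) * width + x];
            unsigned code = 0;
            for (int k = 0; k < 8; ++k) {
                const std::size_t idx =
                    static_cast<std::size_t>(y + dy[k]) * width + (x + dx[k]);
                if (pixels[idx] >= center) code |= 1u << k;
            }
            ++hist[code];
        }
    }
    return hist;
}
```

Each interior pixel is processed independently of the others, which is what makes the algorithm a natural fit for a one-thread-per-pixel GPU kernel.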

Usage

  • Place an image in .jpg format in the input/ folder
  • Run the program, specifying the image name
  • At the end of the run, a histogram will be generated in output/

Performance

We compared the running time of three implementations:

  • Simple sequential CPU version
  • Non-optimized GPU-accelerated version that uses only global memory
  • Optimized GPU-accelerated version that also uses shared memory

Running time for different sizes of a square image

We reached a speed-up of up to 15x on a GeForce GTX 980 Ti.
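Speed-up here is the ratio of the baseline running time to the accelerated running time. A minimal sketch of how such a measurement could be taken on the host follows; the helpers `best_time_ms` and `speedup` are hypothetical and are not taken from the repository's benchmark code.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper: run `work` `repeats` times and return the best
// wall-clock time in milliseconds, which reduces noise from other processes.
template <typename F>
double best_time_ms(F&& work, int repeats = 5) {
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        best = std::min(best, ms);
    }
    return best;
}

// Speed-up of a candidate implementation over a baseline.
inline double speedup(double baseline_ms, double candidate_ms) {
    return baseline_ms / candidate_ms;
}
```

When timing a GPU version this way, keep in mind that CUDA kernel launches are asynchronous: the timed callable has to wait for the kernel to finish (for example with `cudaDeviceSynchronize()`) before the clock stops, otherwise only the launch overhead is measured.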

More details

For a detailed description of the implementation and the tests, see our report (available in Italian only, sorry).

We also made a similar comparison between sequential and multithreaded CPU-only versions.

Acknowledgments

Parallel Computing - Computer Engineering Master's Degree @ University of Florence.