CUDA SAXPY Kernel Implementation

Overview

This code provides a CUDA implementation of the SAXPY operation, a common computation in linear algebra. SAXPY stands for "Single-Precision A*X Plus Y", where A, X, and Y are vectors. The code performs this operation in parallel on a GPU using CUDA, NVIDIA's parallel computing platform and programming model. Requirements

NVIDIA GPU with CUDA support
CUDA Toolkit installed on your system

Description

The code is divided into two main parts:

CUDA Kernel (saxpy_parallel): This function runs on the GPU and performs the SAXPY computation. Each thread computes a single element of the result vector.

Host Code (main function): This part runs on the CPU. It allocates memory on the GPU, copies data from the host to the GPU, invokes the CUDA kernel, and then copies the result back to the host.

Key Components

saxpy_parallel: CUDA kernel function for the SAXPY operation.
main: Host function to set up CUDA environment and invoke the kernel.
n: Size of the input vectors.
alpha: Scalar value in the SAXPY operation.
x, y: Input vectors.
d_x, d_y: Device (GPU) copies of x and y.

Usage

Compile the code using the CUDA compiler (nvcc).
Run the executable. It will perform the SAXPY operation and print the results.

Important Considerations

The code assumes a certain size for the vectors (n = 2000), but this can be adjusted as needed.
The number of threads per block is set to 256, but this value can be tuned for different GPUs.
Memory allocation and free operations are crucial to avoid memory leaks.
Error checks for CUDA operations (like cudaMalloc, cudaMemcpy) are not included but are recommended for robust code.

Conclusion

This CUDA SAXPY implementation demonstrates basic GPU programming concepts, including memory management, kernel invocation, and parallel computation. It serves as a starting point for more complex GPU-accelerated applications. Before running the make command, ensure that the makefile has the correct paths and instructions for your system's architecture and the intended CUDA toolkit version.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
2000.txt		2000.txt
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
main.cu		main.cu
saxpy		saxpy

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CUDA SAXPY Kernel Implementation

About

Releases

Packages

Languages

License

MarcRamonMoreno/CUDA-SAXPY-KERNEL

Folders and files

Latest commit

History

Repository files navigation

CUDA SAXPY Kernel Implementation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages