Yichao Yu edited this page Jan 18, 2021 · 16 revisions

For the AWG (and also for software radio) we need to evaluate a large number of sine functions to generate the output data in real time. Based on the raw compute power of the hardware, we should be able to generate data for ~100 channels on the CPU (currently an Intel Core i9-7900X). However, making full use of this compute power is not trivial: the code needs to be highly optimized so that we do not hit a bottleneck elsewhere, e.g. thread synchronization or memory bandwidth/latency. We can also use GPUs to accelerate this further; a modern one in the $300-800 range has enough compute power to generate hundreds of traps. This adds even more options, and more challenges, to the code. In this repo, we'll explore a few different ways to use the CPU and the GPU at the same time in order to achieve maximum performance.
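
To make the workload concrete, here is a minimal, unoptimized sketch of the kind of computation involved: each output sample is a sum of sines, one per tone. This is only an illustration and not the code used in the tests; the function name `generate_samples`, the `(amplitude, frequency, phase)` tuple layout, and all numeric values are assumptions chosen for the example.

```python
import math

def generate_samples(tones, sample_rate, n_samples):
    """Return n_samples of a sum of sines.

    tones is a list of (amplitude, frequency_hz, phase_rad) tuples
    (a hypothetical layout used only for this sketch).
    """
    out = [0.0] * n_samples
    for amp, freq, phase in tones:
        # Phase advance per output sample for this tone.
        step = 2.0 * math.pi * freq / sample_rate
        for i in range(n_samples):
            # One sine evaluation per tone per sample.
            out[i] += amp * math.sin(phase + step * i)
    return out

# Hypothetical example values: two tones at an assumed 100 MS/s output rate.
tones = [(0.5, 1.0e6, 0.0), (0.5, 2.0e6, math.pi / 2)]
samples = generate_samples(tones, sample_rate=100.0e6, n_samples=8)
```

The cost is one sine evaluation per tone per output sample, so the total work scales as (number of tones) × (sample rate). The inner loop is what the tests listed below try to make as fast as the hardware allows.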

The pages in this wiki describe each measurement: the motivation, the hardware used, and the conclusion. I plan to write these up as I go, one test at a time. All the pages related to this initial round of testing are stored under Tests1 in the git repo for the wiki.

Computation on the CPU:

  1. Single thread performance

  2. Memory write bandwidth

  3. Ring buffer between two threads

  4. Combined test with multiple threads

Computation on the GPU:

  1. Introduction

  2. Computation accuracy

  3. Performance of sin function

  4. Memory bandwidth

  5. Event overhead