[API] Accessing new loops created by reuse_at #503

Open
zzzDavid opened this issue Jun 6, 2023 · 3 comments

@zzzDavid
Collaborator

zzzDavid commented Jun 6, 2023

Currently, users don't have access to some of the new loops created by reuse_at. For example, consider this 2D convolution with a line buffer and a window buffer:

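    import heterocl as hcl

    # 10x10 input; a 3x3 window summed over r and c gives an 8x8 output
    A = hcl.placeholder((10, 10))
    r = hcl.reduce_axis(0, 3, name="r")
    c = hcl.reduce_axis(0, 3, name="c")
    B = hcl.compute((8, 8), lambda y, x: hcl.sum(A[y + r, x + c], axis=[r, c]))
    s = hcl.create_schedule([A, B])
    # Line buffer: reuse A along the y axis
    LB = s.reuse_at(A, s[B], B.axis[0])
    # Window buffer: reuse the line buffer along the x axis
    WB = s.reuse_at(LB, s[B], B.axis[1])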

The v9 loop that shifts the window buffer (see the generated HLS code below) is not directly accessible through the B.axis list.

    void top(
      int32_t v0[10][10],
      int32_t v1[8][8]
    ) {
      int32_t tensor_1_reuse_0[3][10];      // L472
      int32_t tensor_1_reuse_1[3][3];       // L472
      l_tensor_1_y: for (int y = 0; y < 10; y++) {  // L472
        l_x: for (int x = 0; x < 10; x++) { // L472
          int32_t v6 = tensor_1_reuse_0[1][x];      // L472
          tensor_1_reuse_0[0][x] = v6;      // L472
          int32_t v7 = tensor_1_reuse_0[2][x];      // L472
          tensor_1_reuse_0[1][x] = v7;      // L472
          int32_t v8 = v0[y][x];    // L472
          tensor_1_reuse_0[2][x] = v8;      // L472
          if ((y - 2) >= 0) {       // L472
            // THIS LOOP IS NOT DIRECTLY ACCESSIBLE
            for (int v9 = 0; v9 < 3; v9++) {        // L256
              int32_t v10 = tensor_1_reuse_1[v9][1];        // L256
              tensor_1_reuse_1[v9][0] = v10;        // L256
              int32_t v11 = tensor_1_reuse_1[v9][2];        // L256
              tensor_1_reuse_1[v9][1] = v11;        // L256
              int32_t v12 = tensor_1_reuse_0[v9][x];        // L256
              tensor_1_reuse_1[v9][2] = v12;        // L256
            }
            if ((x - 2) >= 0) {     // L256
              int32_t sum;  // L256
              sum = 0;      // L472
              l_r: for (int r = 0; r < 3; r++) {    // L256
                l_c: for (int c = 0; c < 3; c++) {  // L256
                  if (1) {  // L472
                    int32_t v16 = tensor_1_reuse_1[r][c];   // L6
                    int32_t v17 = sum;      // L256
                    ap_int<33> v18 = v16;   // L472
                    ap_int<33> v19 = v17;   // L472
                    ap_int<33> v20 = v18 + v19;     // L6
                    int32_t v21 = v20;      // L472
                    sum = v21;      // L472
                  }
                }
              }
              int32_t v22 = sum;    // L256
              v1[(y - 2)][(x - 2)] = v22;   // L472
            }
          }
        }
      }
    }

I think the axis-list representation of a loop nest makes sense for perfectly nested loops, but I'm not sure it works for imperfect loop nests. Here, loop v9 and loop r are siblings at the same level, both nested inside loop x.
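
To make the distinction concrete, here is a minimal sketch in plain Python, not HeteroCL code. It models the loop structure of the generated kernel as nested dicts (loop name mapped to its child loops) and reports where the nest stops being perfect. The dict layout and the helpers `is_perfect` and `sibling_groups` are assumptions made purely for illustration.

```python
# Loop structure of the generated kernel above: loop name -> child loops.
# Illustrative model only; this is not how HeteroCL stores loops.
nest = {"y": {"x": {"v9": {}, "r": {"c": {}}}}}


def is_perfect(loops):
    """Return True if no level of the nest holds more than one loop.

    Only the loop structure is checked; statements placed between loops
    (which also make a nest imperfect) are not modeled here.
    """
    if len(loops) > 1:
        return False
    return all(is_perfect(children) for children in loops.values())


def sibling_groups(loops, parent=None):
    """Yield (parent, [loop names]) for every level with more than one loop."""
    if len(loops) > 1:
        yield parent, list(loops)
    for name, children in loops.items():
        yield from sibling_groups(children, name)


print(is_perfect(nest))            # False
print(list(sibling_groups(nest)))  # [('x', ['v9', 'r'])]
```

A flat list indexed by depth has a single slot per level, so it cannot name both v9 and r under x.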

@zzzDavid zzzDavid added the API label Jun 6, 2023
@zzzDavid zzzDavid self-assigned this Jun 6, 2023
@chhzh123
Member

chhzh123 commented Jun 6, 2023

Right, this is a bit tricky. Do we have any use cases that require access to those loops?

@zzzDavid
Collaborator Author

zzzDavid commented Jun 6, 2023

I received this feedback from a HeteroCL user; their use case is also convolution:

For example, my original conv layer had 4 outer layers and 4 inner layers of loops. After applying "reuse_at" with a buffer to this conv layer, some new loops have been added. I would like to perform operations like pipeline on these new loops. Now the question is whether there is a way to apply "reuse_at" and access the new loops after these changes.

@zzzDavid
Collaborator Author

zzzDavid commented Jun 6, 2023

We can represent such imperfect loop nests as trees, but I'm not sure we want to do that.
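
As a rough illustration of the tree idea, the sketch below uses plain Python rather than any existing HeteroCL API. The `LoopNode` class, its `find` and `walk` methods, and the `pipelined` flag are hypothetical names chosen for this example. It only shows how a tree would let a user reach the v9 loop by path and mark it for pipelining.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class LoopNode:
    """One loop in a possibly imperfect loop nest (hypothetical type)."""
    name: str
    children: List["LoopNode"] = field(default_factory=list)
    pipelined: bool = False  # stand-in for a real schedule annotation

    def find(self, path: str) -> Optional["LoopNode"]:
        """Look up a loop by a '/'-separated path that starts at this node."""
        head, _, rest = path.partition("/")
        if head != self.name:
            return None
        if not rest:
            return self
        for child in self.children:
            hit = child.find(rest)
            if hit is not None:
                return hit
        return None

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "LoopNode"]]:
        """Yield (depth, node) pairs in pre-order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Loop structure of the generated kernel: v9 and r are siblings under x.
root = LoopNode("y", [LoopNode("x", [LoopNode("v9"), LoopNode("r", [LoopNode("c")])])])

# Reach the window-buffer shift loop by path and annotate it.
v9 = root.find("y/x/v9")
if v9 is not None:
    v9.pipelined = True

for depth, loop in root.walk():
    mark = "  [pipeline]" if loop.pipelined else ""
    print("  " * depth + loop.name + mark)
```

Path-based lookup is one possible design; returning handles to the new loops from reuse_at would be another, and the sketch does not argue for either.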
