[Operation] Constant Tensor Bufferization Issue #61

Closed
zzzDavid opened this issue Apr 11, 2022 · 2 comments

@zzzDavid

MLIR refers to the process of converting tensors to memrefs as "bufferization". This thread records the issues I found while bufferizing constant tensors.

Relevant test case: test_compute_basic.py::test_copy

MLIR assembly:

module {
  func @top() -> memref<10xi32> attributes {extra_itypes = "", extra_otypes = "s", llvm.emit_c_interface, top} {
    %cst = arith.constant dense<[7, 0, 0, 2, 1, 0, 2, 5, 8, 8]> : tensor<10xi32>
    %0 = memref.alloc() {name = "const_tensor"} : memref<10xi32>
    memref.tensor_store %cst, %0 : memref<10xi32>
    %cst_0 = arith.constant dense<[7, 0, 0, 2, 1, 0, 2, 5, 8, 8]> : tensor<10xi32>
    %1 = memref.alloc() {name = "const_tensor"} : memref<10xi32>
    memref.tensor_store %cst_0, %1 : memref<10xi32>
    %2 = memref.alloc() {name = "compute_1"} : memref<10xi32>
    affine.for %arg0 = 0 to 10 {
      %3 = affine.load %0[%arg0] {from = "const_tensor"} : memref<10xi32>
      %4 = affine.load %1[%arg0] {from = "const_tensor"} : memref<10xi32>
      %5 = arith.addi %3, %4 : i32
      affine.store %5, %2[%arg0] {to = "compute_1"} : memref<10xi32>
    } {loop_name = "i0", stage_name = "compute_1"}
    return %2 : memref<10xi32>
  }
}

We need to lower tensor<10xi32> to memref. The existing MLIR passes don't seem to work when a constant tensor and memref.tensor_store are used together:

mlir-opt test_copy.mlir --arith-bufferize --finalizing-bufferize

These two passes should bufferize the arith.constant operations and then remove the resulting bufferization.to_tensor operations. However, after the first pass, memref.tensor_store is marked as illegal, so we get this error:

`memref.tensor_store %cst, %0`:
  failed to legalize operation `memref.tensor_store`
@zzzDavid

A potential solution is to generate constant memrefs directly in the frontend, so that constants never go through the bufferization passes.

An example of a constant memref:

// RUN: hcl-opt %s --jit
module {
  memref.global "private" @gv0 : memref<10xi64> = dense<[0,0,0,0,0,0,0,0,0,0]>
  memref.global "private" @gv1 : memref<10xi64> = dense<[5,5,5,5,5,5,5,5,5,5]>

  func @top() attributes {bit, extra_itypes = "ss", extra_otypes = "", llvm.emit_c_interface, top} {
    %arg0 = memref.get_global @gv0 : memref<10xi64>
    %arg1 = memref.get_global @gv1 : memref<10xi64>
    %0 = memref.alloc() : memref<10xi1>
    affine.for %arg2 = 0 to 10 {
      %2 = affine.load %arg0[%arg2] : memref<10xi64>
      %3 = arith.trunci %2 : i64 to i1
      affine.store %3, %0[%arg2] : memref<10xi1>
    }
    %1 = memref.alloc() : memref<10xi32>
    affine.for %arg2 = 0 to 10 {
      %2 = affine.load %arg1[%arg2] : memref<10xi64>
      %3 = arith.trunci %2 : i64 to i32
      affine.store %3, %1[%arg2] : memref<10xi32>
    }
    affine.for %arg2 = 0 to 10 {
      %2 = affine.load %0[%arg2] {from = "compute_0"} : memref<10xi1>
      %3 = affine.load %1[%arg2] {from = "compute_1"} : memref<10xi32>
      %c0 = arith.constant 0 : index
      hcl.set_bit(%3 : i32, %c0, %2 : i1)
      affine.store %3, %1[%arg2] {to = "compute_1"} : memref<10xi32>
    } {loop_name = "loop_0"}
    affine.for %arg2 = 0 to 10 {
      %2 = affine.load %0[%arg2] : memref<10xi1>
      %3 = arith.extsi %2 : i1 to i64
      affine.store %3, %arg0[%arg2] : memref<10xi64>
    }
    affine.for %arg2 = 0 to 10 {
      %2 = affine.load %1[%arg2] : memref<10xi32>
      %3 = arith.extsi %2 : i32 to i64
      affine.store %3, %arg1[%arg2] : memref<10xi64>
    }
    hcl.print(%arg1) {format="%.0f \n"} : memref<10xi64>
    return
  }
}
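
To illustrate what "generating constant memrefs from the frontend" could look like, here is a minimal Python sketch that formats a list of integers as the memref.global / memref.get_global pair used in the example above. The helper names `emit_global_memref` and `emit_get_global` and the symbol `const_tensor_0` are hypothetical and not part of any existing frontend API; the sketch only builds text and assumes 1-D integer constants.

```python
from typing import Sequence


def emit_global_memref(symbol: str, values: Sequence[int], elem_type: str = "i32") -> str:
    """Return a memref.global declaration holding `values` as a 1-D constant.

    Hypothetical helper: it only formats MLIR text in the same shape as the
    memref.global lines in the example above.
    """
    if not values:
        raise ValueError("expected at least one value")
    body = ", ".join(str(int(v)) for v in values)
    memref_type = f"memref<{len(values)}x{elem_type}>"
    return f'memref.global "private" @{symbol} : {memref_type} = dense<[{body}]>'


def emit_get_global(result: str, symbol: str, length: int, elem_type: str = "i32") -> str:
    """Return the memref.get_global line that binds the global to an SSA value."""
    return f"%{result} = memref.get_global @{symbol} : memref<{length}x{elem_type}>"


if __name__ == "__main__":
    # Constant data taken from the test_copy assembly earlier in this thread.
    data = [7, 0, 0, 2, 1, 0, 2, 5, 8, 8]
    print(emit_global_memref("const_tensor_0", data))
    print(emit_get_global("0", "const_tensor_0", len(data)))
```

With this approach, the arith.constant / memref.alloc / memref.tensor_store triple in test_copy would be replaced by one global declaration at module scope plus one memref.get_global inside the function, so no tensor value remains to be bufferized.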

@zzzDavid

I found some documentation in the LLVM repo about bufferization: https://github.com/llvm/llvm-project/blob/09c2b7c35af8c4bad39f03e9f60df8bd07323028/mlir/docs/Bufferization.md

However, the code snippet there is obsolete as of LLVM 14; some passes were probably removed or renamed. I tried it, but could not get it to compile.
