Flash attention for rocm #1

Open
wants to merge 322 commits into base: main
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Feb 16, 2023

  1. speed up

    fsx950223 committed Feb 16, 2023
    f5d8763
  2. remove useless changes

    fsx950223 committed Feb 16, 2023
    5c257c9
  3. update ck

    fsx950223 committed Feb 16, 2023
    63405db
  4. Merge pull request #9 from fsx950223/optimize

    Optimize
    guangzlu authored Feb 16, 2023
    17ea3a7

Commits on Feb 17, 2023

  1. optimize performance

    fsx950223 committed Feb 17, 2023
    d1bf99a
  2. 84ed6d5

Commits on Feb 20, 2023

  1. fix a bug

    fsx950223 committed Feb 20, 2023
    43f28bd

Commits on Feb 21, 2023

  1. Update Dockerfile for ROCm

    Use public repo for build.
    groenenboomj committed Feb 21, 2023
    c730d50

Commits on Feb 25, 2023

  1. Add forward pass benchmark

    Run the FlashAttention benchmark on more configs and on forward pass only.
    groenenboomj committed Feb 25, 2023
    24c81ea
  2. 228bb1a

Commits on Feb 27, 2023

  1. 53dd6cd

Commits on Feb 28, 2023

  1. 55165c0

Commits on Mar 1, 2023

  1. modified fmha_api.cpp

    guangzlu committed Mar 1, 2023
    c7ec4c0
  2. modified some files

    guangzlu committed Mar 1, 2023
    b1473a8

Commits on Mar 2, 2023

  1. added dropout verify

    guangzlu committed Mar 2, 2023
    f788e7d
  2. batched seqlen can pass

    guangzlu committed Mar 2, 2023
    9f6d0ae
  3. fix bugs

    fsx950223 committed Mar 2, 2023
    7164b75

Commits on Mar 3, 2023

  1. merge updates

    fsx950223 committed Mar 3, 2023
    de43726

Commits on Mar 6, 2023

  1. fix multi gpu

    fsx950223 committed Mar 6, 2023
    f51255f

Commits on Mar 7, 2023

  1. fb1be67
  2. fix bugs

    fsx950223 committed Mar 7, 2023
    f6b11c7
  3. 9de5c29

Commits on Mar 8, 2023

  1. update ck

    fsx950223 committed Mar 8, 2023
    f1eb89e
  2. f26ced0
  3. speed up tests

    fsx950223 committed Mar 8, 2023
    93677be
  4. fix test cases

    fsx950223 committed Mar 8, 2023
    9b94f55
  5. remove z tensor

    fsx950223 committed Mar 8, 2023
    40978cd
  6. fix a bug

    fsx950223 committed Mar 8, 2023
    ccd80bf
  7. 05aec02

Commits on Mar 9, 2023

  1. merge updates

    fsx950223 committed Mar 9, 2023
    75458cb
  2. 83c46c8
  3. enable bf16

    fsx950223 committed Mar 9, 2023
    27f84e8

Commits on Mar 10, 2023

  1. optimize

    fsx950223 committed Mar 10, 2023
    06acbdb

Commits on Mar 11, 2023

  1. optimize api

    fsx950223 committed Mar 11, 2023
    065c2f0
  2. optimize api

    fsx950223 committed Mar 11, 2023
    324bcbf

Commits on Mar 13, 2023

  1. update ck

    fsx950223 committed Mar 13, 2023
    d3b9fc6
  2. format code

    fsx950223 committed Mar 13, 2023
    890091e

Commits on Mar 14, 2023

  1. Merge pull request #14 from fsx950223/mlperf_test2

    Optimized api and enabled bwd pass
    guangzlu authored Mar 14, 2023
    cefe848
  2. 80b3a49
  3. d4d0c6f
  4. fixed test file

    guangzlu committed Mar 14, 2023
    92cedaf

Commits on Mar 15, 2023

  1. optimized dropout verify

    guangzlu committed Mar 15, 2023
    0103fcb
  2. modified test file

    guangzlu committed Mar 15, 2023
    2d64089

Commits on Apr 11, 2023

  1. Merge pull request #12 from ROCmSoftwarePlatform/dropout-verify

    Added dropout for flash_attention_for_rocm
    groenenboomj authored Apr 11, 2023
    a3ecabe

Commits on Apr 13, 2023

  1. modified ck backend

    guangzlu committed Apr 13, 2023
    d0cc349
  2. modified api

    guangzlu committed Apr 13, 2023
    79d7ca1
  3. can run now

    guangzlu committed Apr 13, 2023
    f4827f8
  4. modified output of dq dk dv

    guangzlu committed Apr 13, 2023
    325367c

Commits on Apr 14, 2023

  1. fixed fp16 path

    guangzlu committed Apr 14, 2023
    7a81af7
  2. can pass unpadded test now

    guangzlu committed Apr 14, 2023
    a67bc9c

Commits on Apr 19, 2023

  1. 36de0b6

Commits on Apr 21, 2023

  1. e3ff7b1
  2. f7d1133
  3. optimized fmha_api.cpp

    guangzlu committed Apr 21, 2023
    e84f4a0

Commits on Apr 27, 2023

  1. fix patch path

    fsx950223 committed Apr 27, 2023
    963dfb9

Commits on May 22, 2023

  1. 9ee09b1

Commits on May 30, 2023

  1. add switch for RTZ and deterministic

    Junhao committed May 30, 2023
    3b883df
  2. add switches for RTZ and deterministic

    Junhao committed May 30, 2023
    58b0844
  3. modify ignores

    Junhao committed May 30, 2023
    44a17a5

Commits on May 31, 2023

  1. submodule updates

    Junhao committed May 31, 2023
    66cd14d

Commits on Jun 1, 2023

  1. b6b4090
  2. update python api

    Junhao committed Jun 1, 2023
    618918f
  3. update python api

    Junhao committed Jun 1, 2023
    c0be910
  4. update python api

    Junhao committed Jun 1, 2023
    261c92a
  5. bug fix

    Junhao committed Jun 1, 2023
    f4854a2
  6. bug fix

    Junhao committed Jun 1, 2023
    93844af
  7. bug fixes

    Junhao committed Jun 1, 2023
    f638aa6

Commits on Jun 2, 2023

  1. Update README.md

    Junhao Zhang authored Jun 2, 2023
    6cb6b26
  2. bug fixes

    Junhao committed Jun 2, 2023
    d5d80c5
  3. Update README.md

    Junhao Zhang authored Jun 2, 2023
    adcd98f
  4. Merge pull request #15 from ROCmSoftwarePlatform/jhzhan/release_test

    Release merged with test_rtz
    sabreshao authored Jun 2, 2023
    0c84715
  5. 7633247
  6. 782e7ab
  7. modify readme and minor changes

    Junhao committed Jun 2, 2023
    918cd00
  8. modify readme

    Junhao committed Jun 2, 2023
    cfb7f3f
  9. refine readme

    Junhao committed Jun 2, 2023
    2205fdc
  10. Update flash_attn_interface.py

    Junhao Zhang authored Jun 2, 2023
    9273197

Commits on Jun 5, 2023

  1. Update dockerfile

    Junhao committed Jun 5, 2023
    7e6a96a

Commits on Jun 6, 2023

  1. update docker and readme to remove private reference

    9c01c25
  2. unify data types of input, output, and gemm in either FP16 or BF16 for tuning performance; refactor codes

    Junhao committed Jun 6, 2023
    ceea624

Commits on Jun 7, 2023

  1. using BF16 as GEMM type in performance mode

    Junhao committed Jun 7, 2023
    d565fad
  2. e488af5

Commits on Jun 15, 2023

  1. change random seeds api in accordance with PyTorch 1.13.1+

    ee0665c
  2. Update fmha_utils.h

    bug fix
    Junhao Zhang authored Jun 15, 2023
    8559ccd

Commits on Jun 19, 2023

  1. fix pt2.0 build

    fsx950223 committed Jun 19, 2023
    9887a29
  2. fix setup.py

    fsx950223 committed Jun 19, 2023
    3e1f9ea
  3. fix bugs

    fsx950223 committed Jun 19, 2023
    8512242

Commits on Jun 20, 2023

  1. support torch1.12

    fsx950223 committed Jun 20, 2023
    beab3fb
  2. update dockerfile

    fsx950223 committed Jun 20, 2023
    4d05af4
  3. update README

    fsx950223 committed Jun 20, 2023
    9838670
  4. rename folder

    fsx950223 committed Jun 20, 2023
    0317244
  5. rename files

    fsx950223 committed Jun 20, 2023
    662535c
  6. remove useless code

    fsx950223 committed Jun 20, 2023
    6a51836

Commits on Jun 21, 2023

  1. optimize performance

    fsx950223 committed Jun 21, 2023
    ad3259a
  2. Merge remote-tracking branch 'public/flash_attention_for_rocm2' into flash_attention_for_rocm2

    fsx950223 committed Jun 21, 2023
    99637e4
  3. remove useless code

    fsx950223 committed Jun 21, 2023
    983d299
  4. update README

    fsx950223 committed Jun 21, 2023
    e90010b
  5. 63ce40f
  6. disable triton test cases

    fsx950223 committed Jun 21, 2023
    78aada9
  7. 1a11344

Commits on Jun 26, 2023

  1. changed submodule

    guangzlu committed Jun 26, 2023
    dedea21

Commits on Jun 27, 2023

  1. added qloop

    guangzlu committed Jun 27, 2023
    424141b
  2. 994eca4
  3. fixed unittest mode

    guangzlu committed Jun 27, 2023
    f67f948

Commits on Jun 28, 2023

  1. 7e190a4
  2. modified backend to use rtz

    guangzlu committed Jun 28, 2023
    98258ef
  3. updated ck

    guangzlu committed Jun 28, 2023
    8bb4d98

Commits on Jul 4, 2023

  1. ab576b9

Commits on Jul 5, 2023

  1. add rtz

    guangzlu committed Jul 5, 2023
    6834c97

Commits on Jul 11, 2023

  1. Revert "Update fmha_utils.h"

    This reverts commit 8559ccd.
    
    Revert "change random seeds api in accordance with PyTorch 1.13.1+"
    
    This reverts commit ee0665c.
    
    Revert "using BF16 as GEMM type in performance mode"
    
    This reverts commit d565fad.
    
    Revert "unify data types of input, output, and gemm in either FP16 or BF16 for tuning performance; refactor codes"
    
    This reverts commit ceea624.
    
    Revert "update docker and readme to remove private reference"
    
    This reverts commit 9c01c25.
    
    Revert "Update dockerfile"
    
    This reverts commit 7e6a96a.
    sabreshao committed Jul 11, 2023
    10d7481
  2. fix pt2.0 build

    fsx950223 authored and sabreshao committed Jul 11, 2023
    777e166
  3. fix setup.py

    fsx950223 authored and sabreshao committed Jul 11, 2023
    b8f2ee6
  4. fix bugs

    fsx950223 authored and sabreshao committed Jul 11, 2023
    3e4f367
  5. support torch1.12

    fsx950223 authored and sabreshao committed Jul 11, 2023
    bfb1d75
  6. update dockerfile

    fsx950223 authored and sabreshao committed Jul 11, 2023
    b83723c
  7. update README

    fsx950223 authored and sabreshao committed Jul 11, 2023
    6e2a304
  8. rename folder

    fsx950223 authored and sabreshao committed Jul 11, 2023
    5ad9386
  9. rename files

    fsx950223 authored and sabreshao committed Jul 11, 2023
    e29d75f
  10. remove useless code

    fsx950223 authored and sabreshao committed Jul 11, 2023
    deb2e94
  11. optimize performance

    fsx950223 authored and sabreshao committed Jul 11, 2023
    3f5297b
  12. remove useless code

    fsx950223 authored and sabreshao committed Jul 11, 2023
    22b64b1
  13. update README

    fsx950223 authored and sabreshao committed Jul 11, 2023
    535f1b7
  14. disable triton test cases

    fsx950223 authored and sabreshao committed Jul 11, 2023
    1ddabb8
  15. db62edc
  16. cf2ffe1

Commits on Jul 12, 2023

  1. added kloop into qloop

    guangzlu committed Jul 12, 2023
    6aacb04

Commits on Jul 14, 2023

  1. 67d897b
  2. 0cb0cd5
  3. Merge pull request #6 from ROCmSoftwarePlatform/attn-qloop-kloop-v2

    Enable both Qloop and Kloop
    sabreshao authored Jul 14, 2023
    9551449
  4. updated ck

    guangzlu committed Jul 14, 2023
    0ba1882
  5. default using qloop

    guangzlu committed Jul 14, 2023
    489a673
  6. modified README.md

    guangzlu committed Jul 14, 2023
    a988787
  7. 34e29f7

Commits on Jul 31, 2023

  1. Reduce the compiling time by splitting into several cpp files (#7)

    Tested the elapsed time of "python setup.py install" on ROCm5.7/PyTorch 1.13.1:
    Older version: 26m1.244s
    This version: 4m11.111s on PyTorch 1.13.1; 3m39.470s on PyTorch 2.0.1
    Unit tests passed on ROCm5.7 + PyTorch 1.13.1: 2113 passed, 2848 skipped in 119.70s
    
    * refactoring code
    
    * update ignores
    
    * bug fixes
    
    * patch updates
    
    * fix test cases
    
    * remove useless files
    
    * update ck
    
    ---------
    
    Co-authored-by: Junhao <junhzhan@amd.com>
    Co-authored-by: fsx950223 <fsx950223@outlook.com>
    3 people authored Jul 31, 2023
    0821eb0
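As a rough check on the build timings quoted in this commit message, the sketch below converts the `time`-style durations to seconds and computes the ratio. The helper name `to_seconds` is made up for illustration; only the three durations come from the commit message. Note that the older build was timed on PyTorch 1.13.1, so the PyTorch 2.0.1 ratio compares across PyTorch versions.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a duration such as '26m1.244s' into seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized duration: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Durations copied from the commit message.
old = to_seconds("26m1.244s")        # older version, PyTorch 1.13.1
new_pt113 = to_seconds("4m11.111s")  # this version, PyTorch 1.13.1
new_pt201 = to_seconds("3m39.470s")  # this version, PyTorch 2.0.1

speedup_pt113 = old / new_pt113
speedup_pt201 = old / new_pt201

print(f"PyTorch 1.13.1: {speedup_pt113:.2f}x shorter build")
print(f"PyTorch 2.0.1:  {speedup_pt201:.2f}x shorter build")
```

Under these numbers the split build takes roughly one sixth of the original time on PyTorch 1.13.1 and roughly one seventh on PyTorch 2.0.1.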

Commits on Aug 7, 2023

  1. Remove PyTorch patch by updating PyTorch

    Use a version of PyTorch with the hipify
    changes included.
    groenenboomj committed Aug 7, 2023
    05d45e4

Commits on Aug 11, 2023

  1. ck sync up

    root committed Aug 11, 2023
    a2e81ca
  2. 0627500

Commits on Aug 12, 2023

  1. ed7ccb3

Commits on Aug 14, 2023

  1. fixed bugs

    guangzlu committed Aug 14, 2023
    b6d78bd

Commits on Aug 16, 2023

  1. 21b45c3
  2. 52427b5

Commits on Aug 17, 2023

  1. 6d88e70
  2. fixed bug

    guangzlu committed Aug 17, 2023
    eabcebf

Commits on Aug 18, 2023

  1. added unpad for kloop

    guangzlu committed Aug 18, 2023
    fe1cb5a

Commits on Aug 22, 2023

  1. Merge pull request #10 from ROCmSoftwarePlatform/inference-opt

    Optimization based on profiling for forward.
    guangzlu authored Aug 22, 2023
    4619d9c

Commits on Aug 28, 2023

  1. gfx941 support

    root committed Aug 28, 2023
    c902c75

Commits on Aug 29, 2023

  1. fix RTN logic

    Junhao committed Aug 29, 2023
    ce59e9f

Commits on Aug 31, 2023

  1. Optimized API for packed conditions (#12)

    * optimized api for fwd in packed conditions
    
    * optimized api for bwd
    guangzlu authored Aug 31, 2023
    d394549

Commits on Sep 13, 2023

  1. compatible with xformers (#13)

    * compatible with xformers
    
    * add get_package_version function
    fsx950223 authored Sep 13, 2023
    efd5e04

Commits on Sep 15, 2023

  1. Merge tag 'v2.0.0' of https://github.com/Dao-AILab/flash-attention into junhzhan/ifu-v2.0.0

    Junhao Zhang committed Sep 15, 2023
    0b037c2
  2. 0de9665
  3. modified mha_fwd; added mha_varlen_fwd

    Junhao Zhang committed Sep 15, 2023
    48f57bf

Commits on Sep 18, 2023

  1. enable mha_bwd + mha_varlen_bwd

    Junhao Zhang committed Sep 18, 2023
    cef81d1
  2. updated ck and removed kloop

    guangzlu committed Sep 18, 2023
    16b0c17
  3. update python interface

    Junhao Zhang committed Sep 18, 2023
    1e9ddf8
  4. removed kloop related files

    guangzlu committed Sep 18, 2023
    7581be2
  5. updated test file

    guangzlu committed Sep 18, 2023
    94273a8

Commits on Sep 19, 2023

  1. modified test file

    guangzlu committed Sep 19, 2023
    b13603f
  2. fix get_env_

    Junhao Zhang authored Sep 19, 2023
    f978f3e
  3. added bwd light version

    guangzlu committed Sep 19, 2023
    c656139
  4. format code

    Junhao Zhang authored Sep 19, 2023
    3f53461
  5. optimize code for light

    guangzlu committed Sep 19, 2023
    631f027
  6. sync to 2.0.4

    Junhao Zhang committed Sep 19, 2023
    a6900a4
  7. sync to 2.0.4

    Junhao Zhang committed Sep 19, 2023
    dc98ee5
  8. bug fixes

    Junhao committed Sep 19, 2023
    37e5961

Commits on Sep 20, 2023

  1. 609262f
  2. modified ratit for bwd

    guangzlu committed Sep 20, 2023
    8216584
  3. added padding branch

    guangzlu committed Sep 20, 2023
    de70d9d

Commits on Sep 21, 2023

  1. removed kloop stuff

    guangzlu committed Sep 21, 2023
    e82b97a
  2. added rtn to ck

    guangzlu committed Sep 21, 2023
    d7be208

Commits on Sep 22, 2023

  1. Merge pull request #15 from ROCmSoftwarePlatform/bwd-prof-opt

    * updated ck and removed kloop
    
    * removed kloop related files
    
    * updated test file
    
    * modified test file
    
    * added bwd light version
    
    * optimize code for light
    
    * stage process for bwd nonpadding
    
    * modified ratit for bwd
    
    * added padding branch
    
    * removed kloop stuff
    
    * added rtn to ck
    
    TBD: Fix accuracy degradation since introduction of int8 drop.
    sabreshao authored Sep 22, 2023
    444e15a

Commits on Sep 28, 2023

  1. bug fixes

    Junhao committed Sep 28, 2023
    1d4913f
  2. bug fixes

    Junhao committed Sep 28, 2023
    39c6578
  3. bug fixes

    Junhao Zhang committed Sep 28, 2023
    0d557df
  4. Merge pull request '#15' into junhzhan/ifu-v2.0.0;

    Junhao Zhang committed Sep 28, 2023
    623ffbb
  5. Update README.md

    Junhao Zhang authored Sep 28, 2023
    e61ba7a

Commits on Oct 7, 2023

  1. added batched template

    Junhao Zhang authored Oct 7, 2023
    94b9dd5
  2. added batched template

    Junhao Zhang authored Oct 7, 2023
    67162e3
  3. bug fixes for batched template

    Junhao Zhang committed Oct 7, 2023
    fe31011
  4. bug fixes for batched template

    Junhao Zhang committed Oct 7, 2023
    ae87e65

Commits on Oct 8, 2023

  1. added batched

    Junhao Zhang committed Oct 8, 2023
    89806a4

Commits on Oct 11, 2023

  1. params -> BaseParams for static members

    Junhao Zhang authored Oct 11, 2023
    aa59b0f
  2. hpp suffix is preferred in cpp hence changed

    Junhao Zhang authored Oct 11, 2023
    aa96f3e
  3. removing deprecated files for ifu readiness

    Junhao Zhang authored Oct 11, 2023
    f47a112
  4. improving logic

    Junhao Zhang authored Oct 11, 2023
    86964ee
  5. refine params

    Junhao Zhang committed Oct 11, 2023
    185fd79
  6. cleaned redundancies

    Junhao Zhang authored Oct 11, 2023
    46172fb

Commits on Oct 12, 2023

  1. bug fixes

    Junhao Zhang committed Oct 12, 2023
    bc37a40
  2. bug fixes

    Junhao Zhang committed Oct 12, 2023
    14df6f1
  3. bug fixed

    Junhao Zhang committed Oct 12, 2023
    149c2b4

Commits on Oct 13, 2023

  1. update CK and RTN logic

    Junhao Zhang committed Oct 13, 2023
    8cfc14c
  2. bug fixes

    Junhao Zhang committed Oct 13, 2023
    f047ddb
  3. bug fixes

    Junhao Zhang committed Oct 13, 2023
    2338516
  4. bug fixing

    Junhao Zhang committed Oct 13, 2023
    7a382ae
  5. bug fixing

    Junhao Zhang committed Oct 13, 2023
    889d4a8

Commits on Oct 18, 2023

  1. using ck index

    Junhao Zhang authored Oct 18, 2023
    182ef77
  2. initialize vectors

    Junhao Zhang authored Oct 18, 2023
    89e44a0
  3. fixing mqa/gqa params

    mqa/gqa readiness
    Junhao Zhang authored Oct 18, 2023
    790eca7
  4. added mqa/gqa APIs

    Junhao Zhang authored Oct 18, 2023
    6bcce4f
  5. remove zombie code

    Junhao Zhang authored Oct 18, 2023
    393b1ad
  6. update interface

    Junhao Zhang authored Oct 18, 2023
    cbca76f
  7. bug fixes

    Junhao Zhang committed Oct 18, 2023
    490a01b
  8. bug fixes

    Junhao Zhang committed Oct 18, 2023
    232e5a9
  9. bug fixing

    Junhao Zhang authored Oct 18, 2023
    db9541b

Commits on Oct 20, 2023

  1. fixing unit test cases

    Junhao Zhang committed Oct 20, 2023
    f046d04
  2. bug fixes

    Junhao Zhang committed Oct 20, 2023
    1fe24cf

Commits on Oct 24, 2023

  1. passed all unit test

    Junhao Zhang committed Oct 24, 2023
    f5783bb
  2. sync interface

    Junhao Zhang committed Oct 24, 2023
    cd463f9
  3. Merge branch 'junhzhan/ifu-v2.0.0' of https://github.com/ROCmSoftwarePlatform/flash-attention into junhzhan/ifu-v2.0.0

    Junhao Zhang committed Oct 24, 2023
    9f90750

Commits on Oct 25, 2023

  1. added optional FP32 dQKV for unit tests

    Junhao Zhang committed Oct 25, 2023
    5d1365a
  2. pass qkv.contiguous() instead of assigning values

    Junhao Zhang committed Oct 25, 2023
    a807948
  3. simple code

    fsx950223 committed Oct 25, 2023
    3a31e7e

Commits on Oct 26, 2023

  1. add time kernel env

    fsx950223 committed Oct 26, 2023
    f4c8dde
  2. updated ckbackend

    guangzlu committed Oct 26, 2023
    6bc3374
  3. simple code

    fsx950223 committed Oct 26, 2023
    b4d20b2
  4. 15c19e2
  5. fix dropout z tensors allocation; enable unit test

    Junhao Zhang committed Oct 26, 2023
    0c5b579
  6. Merge branch 'junhzhan/ifu-v2.0.0' of https://github.com/ROCmSoftwarePlatform/flash-attention into junhzhan/ifu-v2.0.0

    Junhao Zhang committed Oct 26, 2023
    d7b631a
  7. added mqa gqa

    guangzlu committed Oct 26, 2023
    b5ba498
  8. updated ck backend

    guangzlu committed Oct 26, 2023
    cc78698

Commits on Oct 27, 2023

  1. 6daeb0c
  2. fixed params

    guangzlu committed Oct 27, 2023
    b6a9f6e
  3. 5e80fc7

Commits on Oct 30, 2023

  1. Merge pull request #16 from ROCmSoftwarePlatform/ifu-mqa

    Add MQA & GQA
    guangzlu authored Oct 30, 2023
    02c234b
  2. Update .gitignore

    Junhao Zhang authored Oct 30, 2023
    b27bd1d
  3. better .gitignore

    Junhao Zhang authored Oct 30, 2023
    9a5273d
  4. update git ignore

    Junhao committed Oct 30, 2023
    2d11119
  5. update uint8 dropout in FA

    Junhao committed Oct 30, 2023
    4d79450
  6. update RTN switch; enable MQA/GQA UT

    Junhao committed Oct 30, 2023
    a197406
  7. 8da5b66

Commits on Oct 31, 2023

  1. tidy codes

    Junhao committed Oct 31, 2023
    5378a20
  2. add legacy interface support

    Junhao committed Oct 31, 2023
    1b808f4
  3. 23ee8fb
  4. code formatting

    Junhao committed Oct 31, 2023
    0c92f31

Commits on Nov 1, 2023

  1. 2c057b4

Commits on Nov 3, 2023

  1. Disable MQA UT

    Junhao Zhang authored Nov 3, 2023
    1cd7f89
  2. Merge pull request #14 from ROCmSoftwarePlatform/junhzhan/ifu-v2.0.0

    IFU to v2.0.4
    Add MQA/GQA, but MQA UT is disabled due to some failures.
    Support new hardware.
    sabreshao authored Nov 3, 2023
    edc7698

Commits on Nov 17, 2023

  1. Remove Hardcoded Building Options (#19)

    * Update README
    
    * Update Dockerfile for customized image building
    
    * Sync test scripts
    
    * Remove internal cmake file since it no longer works
    
    * Remove headers that are used for internal testing
    
    * Refine and add options for different GCN archs
    
    * Add clang-format file
    
    * Remove dockerfile that is no longer used
    
    * Change utils location
    Junhao Zhang authored Nov 17, 2023
    5f1ae07

Commits on Nov 21, 2023

  1. Add build script

    jayz0123 committed Nov 21, 2023
    675d324
  2. 8a77d72

Commits on Nov 29, 2023

  1. Update README.md

    jayz0123 authored Nov 29, 2023
    18060ee
  2. Update README.md

    jayz0123 authored Nov 29, 2023
    fa589c3
  3. Update README.md

    jayz0123 authored Nov 29, 2023
    fa285bf
  4. Update README.md

    jayz0123 authored Nov 29, 2023
    3d2b6f5
  5. Update README.md

    Fix numbers
    Naomiusearch authored Nov 29, 2023
    3b786a2
  6. Update README.md

    Naomiusearch authored Nov 29, 2023
    820b2b1

Commits on Dec 5, 2023

  1. Merge pull request #23 from Naomiusearch/flash_attention_for_rocm

    Make installation steps look better
    jayz0123 authored Dec 5, 2023
    68aac13

Commits on Jan 26, 2024

  1. Allow gfx908 to build

    luizanao committed Jan 26, 2024
    b64f45e

Commits on Feb 4, 2024

  1. Merge pull request #38 from luizanao/add-support-gfx908

    Allow gfx908 to build
    jayz0123 authored Feb 4, 2024
    ae7928c

Commits on Mar 8, 2024

  1. add benchmark script (#49)

    * add benchmark script
    
    * fix bugs
    
    * fix a bug
    
    * add output csv
    fsx950223 authored Mar 8, 2024
    2554f49