llama : fix attention layer count sanity check #6550

ggerganov · 2024-04-08T18:18:50Z

There was otherwise a warning when compiling.

compilade

Thanks! From my tests it seems to work.

The assertion will need to change again for Jamba, because only some of its layers are attention layers, but that can be fixed later, once it becomes relevant.
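For context, here is a minimal sketch of the kind of check under discussion. It assumes the check accepts either zero attention layers (a pure Mamba model) or exactly one attention layer per model layer (a plain Transformer). The function and parameter names are hypothetical and this is not the actual llama.cpp code. A hybrid model such as Jamba would fail a check of this shape, which is why the assertion would need to change:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the sanity check (not the actual llama.cpp code).
// n_attention: number of attention layers counted while scanning the model.
// n_layer:     total number of layers reported by the model hyperparameters.
inline bool attention_layer_count_ok(int n_attention, uint32_t n_layer) {
    // Assumed rule: either no attention layers at all, or exactly one per layer.
    return n_attention == 0 || n_attention == static_cast<int>(n_layer);
}

// Assert-style wrapper: aborts (in debug builds) when the count is unexpected.
inline void check_attention_layer_count(int n_attention, uint32_t n_layer) {
    assert(attention_layer_count_ok(n_attention, n_layer) && "unexpected attention layer count");
}
```

With 32 layers, for example, counts of 0 and 32 pass, while a hybrid with only a few attention layers is rejected.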

* llama : fix attention layer count sanity check

* llama : fix parentheses in attention layer count sanity check

There was otherwise a warning when compiling.

---------

Co-authored-by: Francis Couture-Harpin <git@compilade.net>

llama : fix attention layer count sanity check

6804714

ggerganov requested a review from compilade April 8, 2024 18:18

llama : fix parentheses in attention layer count sanity check

7bab4c0

There was otherwise a warning when compiling.
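As an illustration of the kind of warning involved: GCC and Clang typically warn when `&&` appears unparenthesized inside `||`, which can happen when a message string is appended to an assertion condition. The sketch below assumes that idiom and uses hypothetical names; it is not the actual diff. Both forms give the same result, since a string literal is always truthy, so the explicit grouping mainly silences the warning and makes the intent clear:

```cpp
#include <cassert>

// Hypothetical condition with a message appended, grouped explicitly.
// Without the outer parentheses, `a || b && "msg"` parses as `a || (b && "msg")`,
// and compilers commonly emit a parentheses warning for it.
inline bool count_ok_grouped(int n, int n_layer) {
    return (n == 0 || n == n_layer) && "attention layer count is unexpected";
}

// The implicit grouping written out: what the unparenthesized form means.
inline bool count_ok_implicit(int n, int n_layer) {
    return n == 0 || (n == n_layer && "attention layer count is unexpected");
}
```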

compilade approved these changes Apr 8, 2024


ggerganov merged commit cc4a954 into master Apr 8, 2024
57 of 60 checks passed

ggerganov deleted the gg/quantize-mamba-assert branch April 8, 2024 19:25
