[`core`] Replace `QuantLlamaMLP` with `QuantFusedMLP` #188

younesbelkada · 2023-11-13T10:54:04Z

In the context of huggingface/transformers#27411, I wanted to create a more generic class for fused MLP layers. I kept the previous `QuantLlamaMLP` for backward compatibility.

cc @casper-hansen

This PR also fixes a strange issue that users run into whenever they try to perform multi-turn generation.

casper-hansen · 2023-11-13T14:01:38Z

Nice one @younesbelkada! 👍

What do you think about setting a default activation=F.silu? Then we can use QuantFusedMLP internally in AutoAWQ as well when migrating from QuantLlamaMLP.
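The suggestion above can be sketched without any dependencies. This is only an illustration of the pattern (a generic gated MLP whose activation defaults to SiLU, plus a thin legacy subclass kept for backward compatibility), not the code merged in this PR. The names `FusedMLPSketch`, `LlamaMLPSketch`, and `silu` are hypothetical, and plain Python lists stand in for the quantized tensors and projection layers that the real classes operate on.

```python
import math
from typing import Callable, List

Vector = List[float]
Projection = Callable[[Vector], Vector]


def silu(x: float) -> float:
    """SiLU activation: x * sigmoid(x), written as x / (1 + e^-x)."""
    return x / (1.0 + math.exp(-x))


class FusedMLPSketch:
    """Hypothetical stand-in for a generic gated MLP.

    Computes down_proj(activation(gate_proj(x)) * up_proj(x)).
    The activation defaults to SiLU, so callers that do not care
    about the activation get the Llama-style behavior.
    """

    def __init__(
        self,
        gate_proj: Projection,
        down_proj: Projection,
        up_proj: Projection,
        activation: Callable[[float], float] = silu,
    ) -> None:
        self.gate_proj = gate_proj
        self.down_proj = down_proj
        self.up_proj = up_proj
        self.activation = activation

    def __call__(self, x: Vector) -> Vector:
        # Apply the activation element-wise to the gate branch.
        gate = [self.activation(v) for v in self.gate_proj(x)]
        up = self.up_proj(x)
        # Element-wise product of the two branches, then project down.
        return self.down_proj([g * u for g, u in zip(gate, up)])


class LlamaMLPSketch(FusedMLPSketch):
    """Hypothetical legacy class: the generic MLP fixed to the default SiLU."""

    def __init__(
        self, gate_proj: Projection, down_proj: Projection, up_proj: Projection
    ) -> None:
        super().__init__(gate_proj, down_proj, up_proj)
```

With a default in place, existing call sites that pass only the three projections keep working, while other architectures can pass their own `activation`.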

younesbelkada · 2023-11-13T14:06:54Z

Sounds great! Happy to address these changes

younesbelkada · 2023-11-13T14:09:51Z

Done! LMK what you think.

casper-hansen · 2023-11-15T10:50:56Z

I am accepting this PR for now to unblock further work on the integration with transformers. However, I do intend to run more tests to confirm the last patch for multiple generations does not cause bugs with inference in AutoAWQ.

I might also consider adding an `is_transformers` flag that defaults to True, so that transformers gets explicitly different behavior, which is evidently needed in some cases.

v1

b1c3352

younesbelkada requested a review from casper-hansen November 13, 2023 10:54

add F.silu to default and replace in other architectures as well

cac10e5

fix multiple generate

03ab25c

casper-hansen merged commit 3b362c0 into main Nov 15, 2023

casper-hansen deleted the fix-fused-modules branch December 3, 2023 14:42
