Add different swish implementations #88

qubvel · 2019-10-14T09:27:22Z

Add different swish implementations:

Memory efficient swish

GPU memory friendly
Less computationally efficient while training
Not supported by torch.jit / torch.onnx

Original swish (x * torch.sigmoid(x))

Less memory efficient
More computationally efficient while training
Model can be saved with torch.jit / torch.onnx

Default: memory efficient
The model's swish implementation can be changed with the .set_swish(memory_efficient=False/True) method
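For readers who want to see the trade-off concretely, here is a minimal sketch of the two variants. It is not necessarily the exact code merged in this PR: the class names Swish, MemoryEfficientSwish and _SwishFunction are assumptions chosen for illustration. The memory-efficient variant is written as a custom torch.autograd.Function that saves only the input and recomputes the sigmoid in the backward pass, which is one common way to get the behaviour described above.

```python
import torch
import torch.nn as nn


class Swish(nn.Module):
    """Original swish: autograd stores the intermediates it needs for backward."""

    def forward(self, x):
        return x * torch.sigmoid(x)


class _SwishFunction(torch.autograd.Function):
    """Illustrative custom function that saves only the input tensor."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)  # recomputed here instead of being stored
        # d/dx [x * sigmoid(x)] = s * (1 + x * (1 - s))
        return grad_output * (s * (1 + x * (1 - s)))


class MemoryEfficientSwish(nn.Module):
    """Wraps the custom function so it can be used like any other module."""

    def forward(self, x):
        return _SwishFunction.apply(x)


# Compare forward outputs and input gradients of the two variants.
x1 = torch.randn(4, 8, dtype=torch.float64, requires_grad=True)
x2 = x1.detach().clone().requires_grad_(True)

y1 = Swish()(x1)
y2 = MemoryEfficientSwish()(x2)
y1.sum().backward()
y2.sum().backward()

same_forward = torch.allclose(y1, y2)
same_grad = torch.allclose(x1.grad, x2.grad)
```

Both variants compute the same function, so same_forward and same_grad should both hold. The difference lies in what is kept in memory between the forward and backward passes, and in the extra sigmoid evaluation during backward.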

glenn-jocher · 2019-11-26T20:49:04Z

@qubvel thanks for the function! I've tried to implement this in our repo: https://github.com/ultralytics/yolov3, but I get worse results (lower mAP and higher loss) compared to a default Swish() class. Do you know why this might be? See ultralytics/yolov3#441 (comment)

cswwp · 2020-08-05T06:52:12Z

@qubvel If I train with the memory-efficient swish and then export model.pt with model.set_swish(memory_efficient=False) + torch.jit.trace(model, example), will it hurt the score?
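One way to check the workflow in this question empirically is to compare a model's outputs before and after the swap. The sketch below is hedged: TinyNet is a hypothetical stand-in rather than the EfficientNet model from this repository, and its set_swish method only imitates the interface described in the PR. Since both variants compute x * sigmoid(x) and the activation has no trainable parameters, the forward outputs are expected to agree up to floating-point error.

```python
import torch
import torch.nn as nn


class Swish(nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)


class _SwishFunction(torch.autograd.Function):
    # Illustrative memory-efficient variant: saves only the input.
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)
        return grad_output * (s * (1 + x * (1 - s)))


class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return _SwishFunction.apply(x)


class TinyNet(nn.Module):
    """Hypothetical model; set_swish imitates the method described in the PR."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)
        self.act = MemoryEfficientSwish()

    def set_swish(self, memory_efficient=True):
        self.act = MemoryEfficientSwish() if memory_efficient else Swish()

    def forward(self, x):
        return self.act(self.fc(x))


torch.manual_seed(0)
model = TinyNet().eval()
example = torch.randn(2, 8)

# Output with the memory-efficient swish (as used during training).
with torch.no_grad():
    out_before = model(example)

# Swap to the original swish, then trace for export.
model.set_swish(memory_efficient=False)
traced = torch.jit.trace(model, example)

with torch.no_grad():
    out_after = traced(example)

max_diff = (out_before - out_after).abs().max().item()
```

A max_diff close to zero indicates the swap did not change the model's predictions on the example input. Running the same comparison on a real validation batch would be the more convincing check.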

Add different swish implementations

8a5da1d

lukemelas merged commit 8a5da1d into lukemelas:master Oct 15, 2019

okanlv mentioned this pull request Nov 25, 2019

Activation Function Experiments ultralytics/yolov3#441

Closed

mryab mentioned this pull request Feb 24, 2020

Speed up GELU computation with torch.jit huggingface/transformers#2988

Merged
