Added Hardtanh activation function #14

Merged
merged 1 commit into sdatkinson:main on Mar 31, 2023

Conversation

@mikeoliphant (Contributor) commented on Mar 30, 2023

This adds support for the "Hardtanh" activation function in WaveNet models. It will have no impact on current models.

My (admittedly limited so far) testing indicates that using a hard tanh activation function (basically a clamp to [-1, 1]) gives the same ESR as a regular tanh, but is much faster to compute.
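For reference, a hard tanh is just a clamp. Here is a minimal element-wise sketch; the name hard_tanh matches the helper discussed in the review below, but the exact signature in the PR may differ:

```cpp
#include <algorithm>

// Hard tanh: identity on [-1, 1], saturated outside - a simple clamp,
// so there is no exponential to evaluate as with std::tanh.
inline float hard_tanh(float x)
{
  return std::clamp(x, -1.0f, 1.0f);
}
```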

It seems to perform about the same as the fast tanh function that is currently disabled, but it might be further optimizable with some SSE magic. It also has the benefit of being available as an activation function in PyTorch.

If we get it implemented here now, then once it has been out in the wild for a bit we can switch to using it in training (maybe by default, maybe as an option, maybe only on lite/feather?).

If you want to test training a model with it, just swap out "Tanh" for "Hardtanh" (lowercase "t") in the WaveNet layer config.

@sdatkinson (Owner) commented:

> My (admittedly limited so far) testing indicates that using a hard tanh activation function (basically a clamp to [-1, 1]) gives the same ESR as a regular tanh, but is much faster to compute.

Very exciting! I'll have to do some looking into this as well 🙂

@sdatkinson merged commit 2e5e1b2 into sdatkinson:main on Mar 31, 2023
Review thread on the diff - excerpt of the element-wise loop applying hard_tanh_:

```cpp
                        const long j_start, const long j_end) {
  for (long j = j_start; j < j_end; j++)
    for (long i = i_start; i < i_end; i++)
      x(i, j) = hard_tanh_(x(i, j));
```


I wonder if it would perform better if you used unaryExpr instead? Like so:

```cpp
x = x.unaryExpr([](float in) { return hard_tanh(in); });
```

Seems to do better according to this comparison: https://godbolt.org/z/zfePz8MTa


Well, you don't want to update the entire matrix, so some x.middleCols() calls are needed.
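For instance, something along these lines (a sketch only, assuming an Eigen::MatrixXf x, the j_start/j_end bounds from the excerpt above, and the element-wise hard_tanh helper suggested earlier):

```cpp
// Apply hard_tanh only to columns [j_start, j_end) of x, leaving the rest
// of the matrix untouched. If the row range also needs restricting,
// a .block() would be needed instead of .middleCols().
x.middleCols(j_start, j_end - j_start) =
    x.middleCols(j_start, j_end - j_start)
        .unaryExpr([](float in) { return hard_tanh(in); });
```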

@mikeoliphant (Contributor, Author) replied:

> I wonder if it would perform better if you used unaryExpr instead? Like so:
> x = x.unaryExpr([](float in) { return hard_tanh(in); });

Agreed - I just copy/pasted a duplicate of the tanh function structure, but I didn't worry about it for now since that code path isn't used by WaveNet - it just uses the full-matrix overload, and I've already optimized those (for both tanh and hardtanh).

My plan is to refactor the activation code in general soon, but I wanted to get hardtanh in there ASAP so we can start thinking about targeting it when training models.
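As a point of reference, the full-matrix case can be written without explicit loops using Eigen's coefficient-wise min/max. This is only a sketch, with an illustrative name (hard_tanh_full), and not necessarily the exact optimization used in the PR:

```cpp
#include <Eigen/Dense>

// Clamp every coefficient of x to [-1, 1] in one vectorized expression.
// Eigen can SIMD-vectorize cwiseMax/cwiseMin, unlike a scalar std::tanh call.
inline void hard_tanh_full(Eigen::MatrixXf& x)
{
  x = x.cwiseMax(-1.0f).cwiseMin(1.0f);
}
```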

@mikeoliphant (Contributor, Author) added:

Btw, I tried using unaryExpr(), and if I recall it performed slightly worse than just rolling over the data directly.

That Compiler Explorer tool you linked to is pretty cool.

@mikeoliphant deleted the hard_tanh branch on April 2, 2023