[RFC] Add f8E4M3 and f8E3M4 types support #2486

apivovarov · 2024-08-09T21:48:08Z

Summary

This is a proposal to add Float8E4M3 and Float8E3M4 floating point types to StableHLO.
Feedback is welcome; see RFC: Float8E4M3 and Float8E3M4 for more details.

References and Links

LLVM PR-97179 [APFloat] Add support for f8E4M3 IEEE 754 type (Merged)
LLVM PR-97118 [MLIR] Add f8E4M3 IEEE 754 type (Merged)
LLVM PR-99698 [APFloat] Add support for f8E3M4 IEEE 754 type (Merged)
LLVM PR-101230 [MLIR] Add f8E3M4 IEEE 754 type (Merged)
RFC: FP8 in StableHLO
RFC: Float8E4M3FNUZ and Float8E5M2FNUZ
StableHLO PR-2482 Add f8E4M3 and f8E3M4 types support
Amazon EC2 Trn1 Instances
ml_dtypes PR-161 Add float8_e4m3 (Merged)
ml_dtypes PR-171 Add float8_e3m4 (Merged)
XLA PR-16585 Add support for float8_e4m3

GleasonK · 2024-08-13T20:10:21Z

I signal-boosted this to the XLA devs, and the feedback has been generally positive. I'll give this a one-week comment period, but the general consensus is LGTM.

Summarizing some early feedback I've heard:

Given that Amazon hardware supports these types, it makes sense to add to StableHLO.
These new types are very well defined. Underspecification tends to be the bigger risk with new type support (we wouldn't want a correction to lead to an fp8..._v2 type).
Given the Trainium compilation pipeline (requires support in HLO/MHLO as well), adding type support elsewhere makes sense, and will be mostly boilerplate in XLA.

I'll also continue socializing and signal-boosting this. I'll report back any additional feedback as I hear it, or ask that it be left on the PR.

apivovarov · 2024-08-23T20:06:37Z

The RFC has been open for two and a half weeks. Should we keep it open longer, or is it ready to proceed?

You can find the implementation draft here: #2482
@GleasonK

GleasonK · 2024-08-28T01:18:52Z

Hello! Yes, sorry, I'm waiting on feedback from one more person from XLA who said they wanted to look into this. I'll follow up first thing tomorrow.

GleasonK · 2024-09-03T19:20:50Z

RFC LGTM. I wasn't able to get a hold of that last dev that wanted to chime in, but found a few proxy approvals internally who all agree that given that this is in IEEE and LLVM it should be good to go. Thanks for the contribution and apologies for the delay!

apivovarov · 2024-09-03T19:51:53Z

Great news! Thank you, Kevin, for your help and support!

This PR adds f8E4M3 and f8E3M4 types support.

f8E4M3 and f8E3M4 types follow IEEE 754 convention.

```c
f8E4M3 (IEEE 754)
- Exponent bias: 7
- Maximum stored exponent value: 14 (binary 1110)
- Maximum unbiased exponent value: 14 - 7 = 7
- Minimum stored exponent value: 1 (binary 0001)
- Minimum unbiased exponent value: 1 - 7 = -6
- Precision specifies the total number of bits used for the significand (mantissa), including implicit leading integer bit = 3 + 1 = 4
- Follows IEEE 754 conventions for representation of special values
- Has Positive and Negative zero
- Has Positive and Negative infinity
- Has NaNs

Additional details:
- Max exp (unbiased): 7
- Min exp (unbiased): -6
- Infinities (+/-): S.1111.000
- Zeros (+/-): S.0000.000
- NaNs: S.1111.{001, 010, 011, 100, 101, 110, 111}
- Max normal number: S.1110.111 = +/-2^(7) x (1 + 0.875) = +/-240
- Min normal number: S.0001.000 = +/-2^(-6)
- Max subnormal number: S.0000.111 = +/-2^(-6) x 0.875 = +/-2^(-9) x 7
- Min subnormal number: S.0000.001 = +/-2^(-6) x 0.125 = +/-2^(-9)
```

```c
f8E3M4 (IEEE 754)
- Exponent bias: 3
- Maximum stored exponent value: 6 (binary 110)
- Maximum unbiased exponent value: 6 - 3 = 3
- Minimum stored exponent value: 1 (binary 001)
- Minimum unbiased exponent value: 1 - 3 = -2
- Precision specifies the total number of bits used for the significand (mantissa), including implicit leading integer bit = 4 + 1 = 5
- Follows IEEE 754 conventions for representation of special values
- Has Positive and Negative zero
- Has Positive and Negative infinity
- Has NaNs

Additional details:
- Max exp (unbiased): 3
- Min exp (unbiased): -2
- Infinities (+/-): S.111.0000
- Zeros (+/-): S.000.0000
- NaNs: S.111.{0,1}⁴ except S.111.0000
- Max normal number: S.110.1111 = +/-2^(6-3) x (1 + 15/16) = +/-2^3 x 31 x 2^(-4) = +/-15.5
- Min normal number: S.001.0000 = +/-2^(1-3) x (1 + 0) = +/-2^(-2)
- Max subnormal number: S.000.1111 = +/-2^(-2) x 15/16 = +/-2^(-2) x 15 x 2^(-4) = +/-15 x 2^(-6)
- Min subnormal number: S.000.0001 = +/-2^(-2) x 1/16 = +/-2^(-2) x 2^(-4) = +/-2^(-6)
```

Related PRs:
- LLVM [PR-97179](llvm/llvm-project#97179) [APFloat] Add support for f8E4M3 IEEE 754 type (Merged)
- LLVM [PR-97118](llvm/llvm-project#97118) [MLIR] Add f8E4M3 IEEE 754 type (Merged)
- LLVM [PR-99698](llvm/llvm-project#99698) [APFloat] Add support for f8E3M4 IEEE 754 type (Merged)
- LLVM [PR-101230](llvm/llvm-project#101230) [MLIR] Add f8E3M4 IEEE 754 type (Merged)
- StableHLO [PR-2486](#2486) [RFC] Add f8E4M3 and f8E3M4 types support
- ml_dtypes [PR-161](jax-ml/ml_dtypes#161) Add float8_e4m3 (Merged)
- ml_dtypes [PR-171](jax-ml/ml_dtypes#171) Add float8_e3m4 (Merged)
- XLA [PR-16585](openxla/xla#16585) Add support for float8_e4m3
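The limits quoted in the PR description for f8E4M3 and f8E3M4 can be sanity-checked with a small bit-level decoder. The following is only a minimal sketch in plain Python. `decode_fp8` is a hypothetical helper written for illustration, not an API from StableHLO, LLVM, or ml_dtypes. It assumes the layout described in the PR description: 1 sign bit, then the exponent bits, then the mantissa bits, with bias `2^(exp_bits - 1) - 1` and IEEE 754 handling of zeros, subnormals, infinities, and NaNs.

```python
import math


def decode_fp8(byte: int, exp_bits: int, man_bits: int) -> float:
    """Decode one 8-bit IEEE-754-style value (hypothetical helper).

    Assumed layout: 1 sign bit | exp_bits exponent bits | man_bits mantissa bits.
    Use (4, 3) for f8E4M3 and (3, 4) for f8E3M4.
    """
    if 1 + exp_bits + man_bits != 8 or not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value and an 8-bit layout")

    bias = (1 << (exp_bits - 1)) - 1          # 7 for E4M3, 3 for E3M4
    exp_all_ones = (1 << exp_bits) - 1        # stored exponent reserved for inf/NaN

    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & exp_all_ones   # stored (biased) exponent
    man = byte & ((1 << man_bits) - 1)        # stored mantissa bits

    if exp == exp_all_ones:
        # All-ones exponent: infinity if the mantissa is zero, NaN otherwise.
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:
        # Zero or subnormal: no implicit leading 1, exponent fixed at 1 - bias.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    # Normal number: implicit leading 1.
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)
```

For example, `decode_fp8(0b0_1110_111, 4, 3)` corresponds to the f8E4M3 "Max normal number" entry, and `decode_fp8(0b0_110_1111, 3, 4)` to the f8E3M4 one. Looping over all 256 byte values with `math.isnan` is a quick way to count the NaN encodings of each format.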

apivovarov mentioned this pull request Aug 9, 2024

Add f8E4M3 and f8E3M4 types support #2482

Merged

apivovarov force-pushed the rfc_f8E4M3_f8E3M4 branch 3 times, most recently from e4fddfa to 78b7485 Compare August 10, 2024 00:53

[RFC] Add f8E4M3 and f8E3M4 types support

389e72c

apivovarov force-pushed the rfc_f8E4M3_f8E3M4 branch from 78b7485 to 389e72c Compare August 10, 2024 01:43

apivovarov mentioned this pull request Aug 15, 2024

Add float8_e4m3 jax-ml/ml_dtypes#161

Merged

GleasonK added the RFC label Aug 20, 2024

apivovarov mentioned this pull request Aug 22, 2024

Add float8_e3m4 jax-ml/ml_dtypes#171

Merged

Merge branch 'main' into rfc_f8E4M3_f8E3M4

b31f83e

apivovarov added 2 commits August 23, 2024 21:17

Merge branch 'main' into rfc_f8E4M3_f8E3M4

fe1c439

Merge branch 'main' into rfc_f8E4M3_f8E3M4

f6c65f5

Merge branch 'main' into rfc_f8E4M3_f8E3M4

a189c09

apivovarov mentioned this pull request Aug 28, 2024

Add support for float8_e4m3 and float8_e3m4 types openxla/xla#16585

Open

apivovarov added 2 commits August 29, 2024 17:25

Merge branch 'main' into rfc_f8E4M3_f8E3M4

1397a85

Merge branch 'main' into rfc_f8E4M3_f8E3M4

35d79d0

GleasonK approved these changes Sep 3, 2024

GleasonK merged commit d68ab07 into openxla:main Sep 3, 2024
10 checks passed

apivovarov mentioned this pull request Sep 12, 2024

Add float8_e4m3 and float8_e3m4 types support jax-ml/jax#23585

Open
