Add support for converting between Stochastic FP numbers, and convenience methods for array conversions #56

jprhyne · 2022-04-12T18:14:24Z

Motivation
I would like to convert between stochastic floating-point types just as we can between deterministic types, and to have convenience functions that convert entire arrays at once.
Goal
Add the above functionality without breaking any existing features, and test the implementation.
Implementation
Added functions that convert between the stochastic types, using deterministic types as an intermediate step.
Testing
Generated a large number of random floating-point numbers, ensured that they were exactly representable in BFloat16, Float16, Float32, and Float64 by "daisy chaining" conversions through these types, and then tested that the conversions were done properly.
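
The daisy-chaining idea described under Testing might look roughly like the sketch below. It is not the PR's test code: it uses only Base Julia, emulates BFloat16 by masking the lower 16 bits of a Float32 instead of loading BFloat16s, and `daisy_chain` is a hypothetical helper name.

```julia
using Random

# Hypothetical helper: push a Float32 through the narrower formats so the result
# is exactly representable in Float16, BFloat16 (emulated), Float32 and Float64.
function daisy_chain(x::Float32)
    x16 = Float16(x)                          # round to Float16 precision
    bits = reinterpret(UInt32, Float32(x16))  # widen exactly back to Float32
    bits &= 0xffff0000                        # keep the upper 16 bits (BFloat16 precision, emulated)
    return reinterpret(Float32, bits)
end

rng = MersenneTwister(1234)
xs = daisy_chain.(rand(rng, Float32, 1000))

# Every value now survives a round trip through each format unchanged
@assert all(x -> Float32(Float16(x)) == x, xs)
@assert all(x -> Float32(Float64(x)) == x, xs)
@assert all(x -> reinterpret(UInt32, x) & 0x0000ffff == 0, xs)
```

Values prepared this way have a single exact representation in every format, so any disagreement after a conversion points to the conversion itself rather than to rounding.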


milankl

Sorry, I didn't see this PR before. Thanks for the contribution 😄 Yes, I appreciate the conversions between stochastic rounding formats!

milankl · 2022-08-27T13:37:04Z

src/conversions.jl

+Float16sr(x::Float32sr) = Float16sr(Float64(x))
+
+# Conversion of arrays as long as they are some kind of floating point number
+function Float16sr(list::T where T<:AbstractArray{<:Union{BFloat16,AbstractFloat}})


BFloat16 is already <:AbstractFloat?

julia> BFloat16 <: AbstractFloat
true

milankl · 2022-08-27T13:39:29Z

src/float16sr.jl

@@ -1,3 +1,4 @@
+import BFloat16s.BFloat16


Instead of adding this to individual files, could we add import BFloat16s: BFloat16 to src/StochasticRounding.jl?

milankl · 2022-08-27T13:40:53Z

src/float32sr.jl

@@ -53,6 +54,10 @@ Base.Float64(x::Float32sr) = Float64(Float32(x))
 Float32sr(x::Integer) = Float32sr(Float32(x))
 (::Type{T})(x::Float32sr) where {T<:Integer} = T(Float32(x))

+# converting to and from BFloat16
+Float32sr(x::BFloat16) = Float32sr(Float64(x))


This should go via Float32 and not Float64? That will just pad trailing zeros and reinterpret.
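
To illustrate the "pad trailing zeros and reinterpret" point, here is a minimal sketch in Base Julia. It works on raw bit patterns and does not use the BFloat16s package; `bf16_bits_to_float32` is a hypothetical helper, not part of either library.

```julia
# Hypothetical helper: widen raw BFloat16 bits to Float32 by appending 16 zero bits
bf16_bits_to_float32(b::UInt16) = reinterpret(Float32, UInt32(b) << 16)

# 0x3fc0 is the BFloat16 bit pattern of 1.5 (sign 0, exponent 127, mantissa 0b1000000)
x = bf16_bits_to_float32(0x3fc0)
@assert x == 1.5f0

# Widening further to Float64 is also exact, so a detour via Float64 yields the same value
@assert Float64(x) == 1.5
```

Because BFloat16 shares its sign and exponent layout with Float32, the widening is a shift, which is why going via Float32 is the more direct route.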

milankl · 2022-08-27T13:44:27Z

src/conversions.jl

+
+# Conversions from other stochastic floating points to Float16sr
+Float16sr(x::BFloat16sr) = Float16sr(Float32(x))
+Float16sr(x::Float32sr) = Float16sr(Float64(x))


This should go via Float32 not Float64

Johnathan Rhyne added 2 commits April 12, 2022 14:03

adding support to convert between Stochastic Floating Points, and convenience functions for array conversions

62ae9de

removing accidental swap file

46a2648

milankl self-assigned this Aug 27, 2022

milankl reviewed Aug 27, 2022


milankl added the enhancement New feature or request label Nov 25, 2022

milankl added this to the v0.7 milestone Nov 25, 2022

milankl added 2 commits January 12, 2023 16:53

BFloat16 centrally imported

d84fa09

suggested array conversions removed as avail via . broadcast

30f78f9
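
The commit above relies on Julia's dot-broadcast syntax, which applies any scalar function or constructor elementwise. A small sketch with Base types follows; `to_half` is a hypothetical stand-in for a scalar conversion, and it is an assumption here that, with StochasticRounding loaded, a call such as Float16sr.(A) behaves the same way.

```julia
# Hypothetical scalar conversion standing in for a constructor such as Float16sr
to_half(x::AbstractFloat) = Float16(x)

A = Float32[1.0, 0.5, 0.25]
B = to_half.(A)            # broadcast the scalar method over the array
@assert B isa Vector{Float16}
@assert B == Float16[1.0, 0.5, 0.25]

# Types are callable too, so a constructor broadcasts the same way
C = Float64.(B)
@assert eltype(C) == Float64
```

With broadcasting available, dedicated array methods only duplicate what the scalar methods already provide.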

milankl merged commit ee4c157 into milankl:main Jan 18, 2023
