Efficient uniform int-to-float conversion #93
Thank you @mr-matthew-jones. This is great work. Just tagging #84 here so that GH knows to link that ticket.
joshbainbridge left a comment
This is such a great write up of the method. Thank you for both the work to find it, and then explain the reasoning so clearly.
If we get consensus that this is the right method, I'd be happy for this to be merged. My only comment was on the unit test cost.
@mr-matthew-jones Awesome observation/investigation!! Tiny sidenote: on some platforms (e.g. CUDA), rounding is encoded into (most of) the math opcodes themselves, so there the round-down version could be even more efficient (though it most likely will not matter in practice).
Tried out the original idea of rounding down in CUDA, and yes, |
Thanks @toxieainc! Do you think we should have an ifdef here to call __uint2float_rd directly, or is that not worth it / would NVCC be smart enough to do that for us?
I don't think the compiler will do a transform from some integer bit fiddling operations to an intrinsic. That sounds like too much effort to analyze.
Okay cool. Thinking about this more, there are probably a few places we could add some CUDA intrinsics. I think it would be best to get this merged as is. I can then look at doing another pass and possibly push up a separate PR with CUDA specifics.
@mr-matthew-jones I think we are good to go with this! Thank you for addressing the feedback. Can I just ask that you add a note to the CHANGELOG.md file, and then we can get this into main.
Conversion from an integer value across the full range of representable values to floating point values within the range of [0, 1) is a key part of QMC algorithms, as most calculations are done using integer arithmetic, but the resulting output often needs to be floating point.
The current implementation uses a standard conversion to float followed by a division by 2^32. However, the conversion uses the default rounding mode (round to nearest) and may therefore round either up or down. This produces uneven probabilities across the final distribution of values within the representable range. Rounding up also means that a value of exactly 1 may be generated. To avoid this, a min operation clamps all values to the last representable number before 1, which adds further bias.
The high-quality mapping presented in 'Quasi-Monte Carlo Algorithms (not only) for Graphics Software' by Keller, Wächter, and Binder provides an optimal distribution, but it is computationally more expensive and often costs more than the creation of the QMC point itself.
This patch implements a new method that is simpler and more efficient, while providing results identical to the Keller et al. method. A simple bitwise shift and mask operation is applied to the input integer to ensure that the value is rounded down. This guarantees that the probability of each output is equal to the density of float representations, and constant in each power of two. For example, every float in [0.5, 1.0) has a 2^-24 probability and is produced by exactly 256 input values.
(Issue 84)
Signed-off-by: Matthew Jones <mrmatthewjones@icloud.com>
Merged commit 21b65e8 into AcademySoftwareFoundation:main
Conversion from an integer value across the full range of representable values to floating point values within the range of [0, 1) is a key part of QMC algorithms, as most calculations are done using integer arithmetic, but the resulting output often needs to be floating point.
The current implementation uses a standard conversion to float followed by a division by 2^32. However, the conversion uses the default rounding mode (round to nearest) and may therefore round either up or down. This produces uneven probabilities across the final distribution of values within the representable range. Rounding up also means that a value of exactly 1 may be generated. To avoid this, a min operation clamps all values to the last representable number before 1, which adds further bias.
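To make the bias concrete, here is a minimal sketch of the "convert, scale, clamp" approach described above. It is not the project's actual code: the function name `uintToFloatClamped` and the constant names are hypothetical, and it assumes a 32-bit IEEE 754 `float` with the default round-to-nearest mode.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical name; a sketch of the conversion described in the text,
// not the library's implementation.
inline float uintToFloatClamped(std::uint32_t value)
{
    // 2^-32 is a power of two, so it is exact in float.
    constexpr float scale = 1.0f / 4294967296.0f;
    // Largest float below 1, i.e. 1 - 2^-24.
    constexpr float oneMinusEpsilon =
        1.0f - std::numeric_limits<float>::epsilon() / 2.0f;

    // The integer-to-float conversion rounds to nearest. A float holds only
    // 24 significant bits, so large inputs may round up; inputs at or above
    // 2^32 - 128 round up to 2^32, which scales to exactly 1.0f.
    const float converted = static_cast<float>(value);

    // The clamp keeps the result inside [0, 1), but it folds those extra
    // inputs onto the last float before 1.
    return std::min(converted * scale, oneMinusEpsilon);
}
```

For instance, `uintToFloatClamped(0xFFFFFFFFu)` and `uintToFloatClamped(0xFFFFFF00u)` return the same value, even though the first input only gets there through the clamp.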
The high-quality mapping presented in 'Quasi-Monte Carlo Algorithms (not only) for Graphics Software' by Keller, Wächter, and Binder provides an optimal distribution, but it is computationally more expensive and often costs more than the creation of the QMC point itself.
This patch implements a new method that is simpler and more efficient, while providing results identical to the Keller et al. method. A simple bitwise shift and mask operation is applied to the input integer to ensure that the value is rounded down. This guarantees that the probability of each output is equal to the density of float representations, and constant in each power of two. For example, every float in [0.5, 1.0) has a 2^-24 probability and is produced by exactly 256 input values.
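One way a shift and mask can force rounding down is sketched below. This is an illustration under assumptions, not a copy of the patch: the name `uintToFloatRoundDown` is hypothetical, the exact expression used in the merged code may differ, and a 32-bit IEEE 754 `float` with round-to-nearest conversion is assumed.

```cpp
#include <cstdint>

// Hypothetical name; one possible shift-and-mask formulation, shown only to
// illustrate the idea described in the text.
inline float uintToFloatRoundDown(std::uint32_t value)
{
    // 2^-32 is a power of two, so the final scaling is exact.
    constexpr float scale = 1.0f / 4294967296.0f;

    // A float keeps 24 significant bits. If the highest set bit of value is
    // at position p >= 24, then (value >> 24) has its highest set bit at
    // position p - 24, which is the "half ulp" bit that decides the rounding
    // direction. Clearing that bit leaves the discarded tail strictly below
    // half an ulp, so round-to-nearest always rounds down. Any other bits
    // the mask clears also lie in the discarded tail. For p < 24 the shift
    // yields zero and the input, already exact in float, is unchanged.
    value &= ~(value >> 24);

    return static_cast<float>(value) * scale;
}
```

With this sketch, the inputs `0x80000000` through `0x800000FF` (256 values) all return `0.5f`, and `0x80000100` returns the next float up, which matches the 256-inputs-per-float example for [0.5, 1.0). The largest input, `0xFFFFFFFF`, returns 1 - 2^-24, so no clamp is needed.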
(Issue 84)