Now that networks have continued to grow exponentially in size, the exploration of FP8 is the next logical step in reducing training bandwidth demands.” How we got here Floating-point arithmetic is a ...