The other day I discovered that 32-bit FreeBSD 11.2 has strange trouble running in an emulated environment. Utilities like ping or top would just hang when trying to print floating-point numbers through printf(). The dtoa() library routine was getting stuck in an endless loop (FreeBSD has excellent support for debugging the binaries shipped with the OS, so finding out where things were going wrong was unexpectedly easy).
Closer inspection identified the following instruction sequence:
    fldz
    fxch    st(1)
    fucom   st(1)
    fstp    st(1)
    fnstsw  ax
    sahf
    jne     ...
    jnp     ...
This code relies on “undefined” behavior. The FUCOM instruction compares two floating-point values and sets the FPU condition code bits. The FNSTSW instruction stores the FPU status word, which contains those bits, into the AX register. From there they can be tested directly, or the SAHF instruction can first copy them into the flags register, where conditional jump instructions can test them conveniently.
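To make the bit shuffling concrete, here is a small Python model of the FUCOM / FNSTSW AX / SAHF chain. It is only a sketch of the documented bit layout (C0, C2 and C3 in status word bits 8, 10 and 14, landing in CF, PF and ZF), not an emulator. The function names are made up for illustration, and the reading of the two jumps is my interpretation, since their targets are elided in the listing.

```python
import math

# x87 status word positions of the condition codes used by FUCOM.
C0 = 1 << 8
C2 = 1 << 10
C3 = 1 << 14

# Flag bits that SAHF loads from AH.
CF = 1 << 0
PF = 1 << 2
ZF = 1 << 6


def fucom(st0: float, src: float) -> int:
    """Condition code bits FUCOM sets when comparing ST(0) with src."""
    if math.isnan(st0) or math.isnan(src):
        return C3 | C2 | C0   # unordered
    if st0 > src:
        return 0              # C3 = C2 = C0 = 0
    if st0 < src:
        return C0
    return C3                 # equal


def fnstsw_ax(status_word: int) -> int:
    """FNSTSW AX: copy the 16-bit status word into AX."""
    return status_word & 0xFFFF


def sahf(ax: int) -> int:
    """SAHF: load SF, ZF, AF, PF and CF from AH bits 7, 6, 4, 2 and 0."""
    ah = (ax >> 8) & 0xFF
    return ah & 0xD5          # keep only the five bits SAHF transfers


def branch_after_sahf(flags: int) -> str:
    """Which of the two jumps in the sequence would be taken."""
    if not flags & ZF:
        return "jne"          # ZF clear: operands differ
    if not flags & PF:
        return "jnp"          # ZF set, PF clear: equal and ordered
    return "fall through"     # ZF and PF both set: unordered (NaN)


if __name__ == "__main__":
    for x in (1.0, -1.0, 0.0, float("nan")):
        flags = sahf(fnstsw_ax(fucom(x, 0.0)))
        print(x, format(flags, "08b"), branch_after_sahf(flags))
```

The model assumes the condition codes reach FNSTSW exactly as FUCOM left them, which is precisely the assumption the real code makes about whatever executes between the two instructions.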
The problem is the FSTP instruction in between. According to Intel and AMD documentation, the FSTP instruction leaves the FPU condition codes in an undefined state. So the FreeBSD library is testing undefined bits… but it just happens to work on all commonly available CPUs, in a very predictable and completely deterministic manner, because in reality the FSTP instruction leaves the condition bits alone. What is going on?