An interesting diversion this week, when we noticed regression tests failing on our TeamCity build server. The differences were mostly tiny, but warranted a closer inspection since there were no significant code changes to explain them, and the tests were still passing on most developer PCs. The only thing that had really changed was that we’d moved the TeamCity build agents to newer hardware (with later-model Haswell chipsets).
We were able to boil the differences in behaviour down to a simple 3-line C++ program (essentially a call to std::exp), and, sure enough, the program gave a different result when compiled in VS2013 and executed on the new Haswell CPU. Any other combination of C++ runtime and chip (and any x86 build) gave us our “expected” answer. Obviously you can’t have your build server and your developer PCs disagree when it comes to floating-point calculations without chaos (or games of baseline whack-a-mole between developers when some of them are running newer PCs than others).
We tried forcing the rounding mode with _controlfp_s, but just ended up with different differences. We tried the Intel C++ compiler (which was slightly more stable but still off between Haswell and older CPUs).
As it turns out, the VC++ 2013 runtime uses FMA3 instructions for some transcendental functions (std::exp included), when available at runtime, and this was the difference for us. After disabling this behaviour in the runtime with _set_FMA3_enable(0), our tests started passing on both types of CPU, with no other code or baseline changes necessary.
Thanks to James McNellis for pointing us in the right direction on the FMA3 optimizations.
James also noted that the FMA3 optimizations are much faster, so at some point we will experiment with enabling those, and update our baselines, but for now we can move on with stable numbers between builds.