Changes in floating point calculation since Windows 24H2

Timothé VAN DEPUTTE 15 Reputation points
2025-01-03T13:22:21.1066667+00:00

Hello,
Since the latest Windows 11 update (24H2), we observe slight differences between our continuous integration machines (Windows Server 2022 21H2) and our developers' computers in some of our double-precision calculations.

We tested successive Windows versions and confirmed that the change appears with the update to 24H2. The same binary yields different results there, which makes our unit tests fail, since we expect deterministic results and bit-exact equality (e.g. dumping a value and re-reading it must give a binary-equal value).
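The dump / re-read requirement can be checked in isolation. The following is a minimal sketch, assuming values are dumped as decimal text (the post does not say which format is used); the helper names `dump`, `reload` and `round_trips` are hypothetical. Seventeen significant digits are enough to identify any IEEE 754 double, provided the C runtime formats and parses with correct rounding.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

// Reinterpret a double as its raw 64-bit pattern.
std::uint64_t bits_of(double v) {
    std::uint64_t b;
    std::memcpy(&b, &v, sizeof b);
    return b;
}

// Write a double as decimal text with 17 significant digits
// (hypothetical dump format, not taken from the post).
std::string dump(double v) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.17g", v);
    return buf;
}

// Parse the text back into a double.
double reload(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}

// True when dumping and re-reading reproduces exactly the same bits.
bool round_trips(double v) {
    return bits_of(reload(dump(v))) == bits_of(v);
}
```

Comparing `bits_of` values rather than using `==` on the doubles also distinguishes `0.0` from `-0.0`, which matters when the requirement is binary equality.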

What is interesting is that it does not happen with all our tests, so it is probably due to some special floating-point handling.

Note that:

  • we compile with the /fp:strict compiler option
  • the results change at runtime
  • we checked on the same hardware with different Windows versions, and the results are not the same
  • the results are deterministic (identical between runs in the same environment)
  • the floating-point control word (_controlfp) is the same in both environments, for rounding and exceptions
  • we don't mix float and double; we use double everywhere
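For comparing the floating-point environment of the two machines from inside the test binary, a portable sketch based on `<cfenv>` could look like this. It is not the MSVC-specific control-word check mentioned in the list, and the function names are hypothetical. The subnormal test is an extra assumption about what "special float handling" might involve, not something the post establishes.

```cpp
#include <cfenv>
#include <limits>
#include <string>

// Human-readable name of a rounding mode as returned by std::fegetround().
std::string rounding_mode_name(int mode) {
    switch (mode) {
        case FE_TONEAREST:  return "to-nearest";
        case FE_UPWARD:     return "upward";
        case FE_DOWNWARD:   return "downward";
        case FE_TOWARDZERO: return "toward-zero";
        default:            return "unknown";
    }
}

// True when subnormal results survive, i.e. no flush-to-zero mode is active.
// volatile prevents the compiler from evaluating the division at compile time.
bool subnormals_preserved() {
    volatile double tiny = std::numeric_limits<double>::min(); // smallest normal double
    volatile double half = tiny / 2.0;                         // subnormal unless flushed
    return half > 0.0;
}
```

Logging `rounding_mode_name(std::fegetround())` and `subnormals_preserved()` at the start of the failing test on both Windows versions would show whether the environment seen by the code really is identical.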

I didn't find anything about this in the release notes or forums, except this post (Changes to SEH on Windows 11 24H2 causing problems), which is more or less related.

I also don't have a minimal reproducible example to share yet, as I haven't found the exact culprit behind this change. I'm still investigating and will update if I find something new.

Is this something that can happen between Windows versions, or was a bug fixed in 24H2 that led to this breaking change?

Do you have any clue how to investigate this further, or even a potential fix?

Thanks for your time.

Have a nice day.

C++
Windows 11

1 answer

  1. Timothé VAN DEPUTTE 15 Reputation points
    2025-01-06T16:51:17.3633333+00:00

    Hi,

    Thanks for your answer and support.

    Sorry if my statement was not clear enough. What I meant is that compilation isn't involved at all and everything happens at runtime (after reading here and there, most floating-point operations are done at runtime, so that makes sense).

    Running the binary twice on the exact same computer (at a given time) yields the same results. Running the binary on the exact same computer before and after the upgrade to 24H2 yields different results.

    It's clear something changed between the two Windows versions (it might be a driver or, as you said, some optimizations).

    Following your suggestion regarding the 1 ULP comparison, I checked the binary representation of the wrong result we're getting, and it seems to be way more than 1 ULP.

    Windows = 24H2 result: 0011111010100001001010101010100000111101100101111000111010011111

    Windows < 24H2 result: 0011111010011011110001100110111100000110011010111101101011110000
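The distance between two such bit patterns can be measured directly. The sketch below decodes a 64-character bit string and counts the representable doubles between two values; all function names are hypothetical, and the distance helper assumes finite values of the same sign, which holds for the two results here.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Parse a 64-character string of '0'/'1' into the raw bit pattern.
std::uint64_t parse_bits(const std::string& s) {
    std::uint64_t b = 0;
    for (char c : s) {
        b = (b << 1) | static_cast<std::uint64_t>(c == '1');
    }
    return b;
}

// Reinterpret a raw 64-bit pattern as a double.
double to_double(std::uint64_t b) {
    double v;
    std::memcpy(&v, &b, sizeof v);
    return v;
}

// Exponent field (bits 62..52) with the IEEE 754 bias of 1023 removed.
int unbiased_exponent(std::uint64_t b) {
    return static_cast<int>((b >> 52) & 0x7FF) - 1023;
}

// Number of representable doubles between two finite values of the
// same sign: for such values the bit patterns are ordered like integers.
std::uint64_t ulp_distance_same_sign(std::uint64_t a, std::uint64_t b) {
    return a > b ? a - b : b - a;
}
```

Applied to the two patterns above, the exponent fields already differ (-21 versus -22 once the bias is removed), so the values are roughly 5.1e-7 and 4.1e-7, which is far more than one ULP apart.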

    I'm not yet familiar enough with the ULP concept, but am I correct to assume that it shouldn't affect a sequence of floating-point operations (add, subtract, multiply, etc.) and that the result of the sequence should still be within 1 ULP?
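On the question of sequences: each individual IEEE 754 add, subtract, multiply or divide is correctly rounded, but that bound applies per operation, not to a chain of them. A small sketch of how a single-ULP difference in an input can grow through cancellation (function names are hypothetical, and this is an illustration, not a diagnosis of the failing test):

```cpp
#include <cmath>

// Relative difference between two nonzero values.
double relative_diff(double a, double b) {
    return std::fabs(a - b) / std::fmax(std::fabs(a), std::fabs(b));
}

// Subtracting a nearby value cancels the leading digits, so a difference
// in the last bit of x becomes a large share of the result.
double offset_from_one(double x) {
    return x - 1.0;
}

// Factor by which a 1-ULP change of the input grows in the output.
// Assumes 1 < x < 2 so that nextafter towards 2.0 moves up by one ULP
// and the subtraction result is nonzero.
double amplification(double x) {
    const double x_next = std::nextafter(x, 2.0);
    const double in  = relative_diff(x, x_next);
    const double out = relative_diff(offset_from_one(x), offset_from_one(x_next));
    return out / in;
}
```

For an input such as 1.0000001 the amplification is on the order of ten million, while for 1.5 it stays in the single digits. An iterative algorithm that feeds results back into later steps can therefore turn a last-bit difference into one that is visible in the exponent.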

    Note that the tests that succeed (see below) match exactly.


    To give a bit of context on the case in which we observe this behaviour:

    We have one test suite with 4 different cases; only one of them is failing.

    The 4 cases form the following matrix:

    Mesh file   Algorithm   Test result
    File1       Algo1       failure
    File1       Algo2       success
    File2       Algo1       success
    File2       Algo2       success

    So, from my understanding, something not yet identified happens with the File1 / Algo1 combination.

    During those tests, we register one plane to another plane.

    We open a mesh, read the triangles, select some of them and transform them into a point cloud that serves as input to the algorithm.

    We build a distance map from the triangles and use it in the cost function.

    From there, we iteratively try to register the points to the appropriate plane based on distance.

    So there are some operations performed, but nothing too fancy: we're just using Eigen::Matrix4d (mostly inverse and multiplication) and the TriangleMesh class from Open3D, read from an STL file. Everything is done on the CPU side.


    I don't know the nature of the algorithms / optimisations that could have been included in Windows 24H2 or in a new Intel microcode shipped with it, but I think it's actually a good lead.

    Do you know where I could find additional information on this specific topic? A detailed release note for Windows 24H2, a way to inspect driver / microcode versions, or something like that.
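One way to gather such information empirically is a small probe run on both Windows versions. IEEE 754 requires correctly rounded results for add, multiply, divide and square root, but not for functions such as sin, exp or pow, whose results come from the runtime's math library. The sketch below prints bit patterns for both groups; whether the failing code path calls any of these functions is an assumption, and the names `bits_hex` and `probe` are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hexadecimal bit pattern of a double, as 16 lowercase digits.
std::string bits_hex(double v) {
    std::uint64_t b;
    std::memcpy(&b, &v, sizeof b);
    char buf[17];
    std::snprintf(buf, sizeof buf, "%016llx", static_cast<unsigned long long>(b));
    return buf;
}

// One "name=bits" line per probed operation for a fixed input x.
std::vector<std::string> probe(double x) {
    volatile double guard = x;  // keeps the compiler from folding the calls
    const double v = guard;
    return {
        "add="   + bits_hex(v + 0.1),
        "mul="   + bits_hex(v * 0.1),
        "div="   + bits_hex(1.0 / v),
        "sqrt="  + bits_hex(std::sqrt(v)),
        "sin="   + bits_hex(std::sin(v)),
        "cos="   + bits_hex(std::cos(v)),
        "exp="   + bits_hex(std::exp(v)),
        "log="   + bits_hex(std::log(v)),
        "pow="   + bits_hex(std::pow(v, 1.5)),
        "atan2=" + bits_hex(std::atan2(v, 3.0)),
    };
}
```

If the first four lines match across the two systems while some of the later ones differ, the change sits in the math library rather than in the basic arithmetic. If everything matches for many inputs, the cause lies elsewhere.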

    Otherwise, is there any way to roll back from 24H2 to 23H2? Our target product is stuck on 22H2 for now and we need to ensure reproducibility as much as possible.

    I'll continue investigating the tests step by step, reducing the samples used, in case there are some "bad" triangles somewhere.
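For that step-by-step reduction, it can help to record intermediate values on both systems and locate the first one whose bits differ. A minimal sketch, assuming the intermediates have been collected into two vectors in the same order (how they are collected is left open, and the function names are hypothetical):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Raw 64-bit pattern of a double.
std::uint64_t raw_bits(double v) {
    std::uint64_t b;
    std::memcpy(&b, &v, sizeof b);
    return b;
}

// Index of the first element whose bit pattern differs between the two
// traces. Returns -1 when both traces are identical; when one trace is a
// strict prefix of the other, returns the length of the shorter one.
std::ptrdiff_t first_divergence(const std::vector<double>& a,
                                const std::vector<double>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (raw_bits(a[i]) != raw_bits(b[i])) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return a.size() == b.size() ? -1 : static_cast<std::ptrdiff_t>(n);
}
```

The first diverging index points at the earliest operation that behaves differently, which is usually a better starting point than the final result, where the difference has already been amplified.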

    Thank you for your time, have a nice day.


    Additional info, in case it helps:

    CPU-Z output from the machine where this bug happens after upgrading:
    [image: CPU-Z screenshot]

    Windows (Specs dump):

    Windows 11 Pro Edition Version 24H2
    OS Build 26100.2605
    Windows Features Packs 1000.26100.36.0

