It gives an algorithm for addition, subtraction, multiplication, division and square root, and requires that implementations produce the same result as that algorithm. Thus when a program is moved from one machine to another, the results of the basic operations will be the same in every bit if...
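As a rough illustration of that bit-for-bit claim, here is a minimal sketch. It assumes the standard in question is IEEE 754, that Python's `float` is a binary64 double (true on mainstream platforms), and that `math.sqrt` is correctly rounded by the underlying C library. The helper name `bits` is made up for this example.

```python
import math
import struct


def bits(x: float) -> str:
    """Return the 64-bit encoding of x as 16 hex digits (big-endian).

    Hypothetical helper for illustration; assumes float is IEEE 754 binary64.
    """
    return struct.pack(">d", x).hex()


# Each basic operation below is rounded to the nearest representable double,
# so a conforming machine should produce exactly these bit patterns.
results = {
    "add": bits(0.1 + 0.2),
    "sub": bits(1.0 - 0.9),
    "mul": bits(0.1 * 3.0),
    "div": bits(1.0 / 3.0),
    "sqrt": bits(math.sqrt(2.0)),
}

for name, pattern in results.items():
    print(f"{name:>4}: {pattern}")
```

Running this on two different conforming machines and comparing the printed patterns is a quick way to check the portability claim for these five operations. It says nothing about library functions such as `sin` or `exp`, which this sketch does not cover.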
That's exactly what I tried next: a scaling test. I didn't post the result here because it's not consistent, but here it is. With 2 threads per core, the following is the performance as I keep increasing the number of cores. Ellipses are used where the values do not change much. Number of ...