Jaa


Is there such a thing as too much precision?

Well, enough chit-chat, back to programming language design.

Suppose you’re building electronic piano software. As we’ve discussed before, the “equal temperament” tuning for a piano goes like this: the 49th note from the left on a standard 88 key piano is A, and its frequency is 440 Hz. Each octave above or below that doubles or halves the frequency. Why? Because humans perceive the ratio between two frequencies as the relevant factor, not the (subtractive) difference. There are twelve semitones in the chromatic scale, so to make each one sound like it has the same relationship with its previous note, each one multiplies the frequency of the previous one by the twelfth root of two.

In short, the frequency of the nth key is 440 x 2 (n-49) / 12

Now, on a real piano, its a bit more complicated than that. The high notes have some interesting problems associated with them. For the long strings in the middle of the piano, the ratio of the diameter of the string to its length is really small; zero, for practical purposes. But the short strings at the high end are short enough that the ratio is no longer practically zero. The thickness of the string causes its harmonic vibrations to be flatter than they ought to be; this is called “inharmonicity”. In addition, humans tend to perceive high notes as being flatter than they really are; our ability to determine relative pitches is somewhat skewed at the high end. For both these reasons, piano tuners often “stretch” the octaves at the high end, making the fundamental frequencies of the high notes slightly sharper than the equal temperament would suggest. Obviously electronic pianos do not have inharmonicity, and we’ll ignore the psychoacoustic problems for our purposes. Because I actually want to talk about floating point arithmetic.

Suppose you say “our expression above is equivalent to 440 x R(n-49) where R is the twelfth root of two. You compute the 12th root of 2 using calc.exe, and write some code:

static double Frequency(int key)
{
const double TwelfthRootOfTwo = 1.0594630943592952645618252949463;
const double A4Frequency = 440.0;
return A4Frequency * Math.Pow(TwelfthRootOfTwo, key-49);
}

Now, this code is fine, it works, and there’s no need to mess with it. It might be nice to validate the parameter to make sure that its between 1 and 88, but, whatever. However, does it strike you as odd how much precision is in the constant? There are about 20 more decimal digits of precision in there than can be represented in a double! A double is only accurate to around 14 or 15 digits of precision.

One thing that we can deduce from this is that calc.exe does not do its calculations in doubles, which I suppose is interesting. It must use some much higher-precision internal format.

Another thing we can note is that for this application, even fifteen digits is way more precision than we need. The highest value we’re going to get out of this is less than 4200 Hz; a rounding error after the fifteenth digit is going to be far, far too small for a human to hear the difference. If you divide an octave up into 1200 notes instead of 12, each note is called a “cent” (because 100 of them fit into a semitone). Most humans cannot hear a difference of less than a handful of cents, and this is plenty of precision to be accurate to within a few cents.

But if the compiler is simply throwing away all that precision, then that portion of the program is essentially meaningless. Shouldn’t the compiler be warning you that it is  throwing away your precision?

Well, no, not really. And here’s why.

Suppose you say

const double d = 108.595;

We wish to represent the exact mathematical value 108595 / 1000 in a double. But doubles use binary arithmetic; we need a power of two on the denominator, not a power of ten. The closest we can get using the 52 precision bits of a double is 1101100.1001100001010001111010111000010100011110101110 or, if you prefer fractions, that’s 3820846886986711 / 35184372088832. That fraction has the exact value in decimal of 108.594999999999998863131622783839702606201171875, which you must admit is awfully close to 108.595, though a touch lower than it.

So here’s the thing. You want to represent the exact value of 108.595, but we don’t let you; we make you accrue some error. And we don’t warn about that; we assume that you know that some error will be accrued. (If you wanted an exact result then you should have used decimal, not double.) But what if the exact value you actually wanted to represent was in fact 108.594999999999998863131622783839702606201171875 ?  That number can be represented in binary exactly! We certainly should not give a warning that you have too much precision if you say

const double d = 108.594999999999998863131622783839702606201171875;

because you have exactly the right amount of precision in order to have zero representation error. It seems wacky to say “please round that off to fifteen decimal digits so that we can give you this exact value right back again.” Why should we punish you for being more accurate?

The best we could do is give a warning when there was more precision than could be represented, *and* the representation error was non-zero, *and* losing the extra precision makes the representation error smaller, *and*… now we’ve written a whole lot of complicated math code, the result of which makes the user scratch their head and say “how the heck am I supposed to make the compiler happy with this constant?” Warnings that are nothing but a source of pain are bad warnings, and we try to avoid them.

Comments

  • Anonymous
    January 07, 2010
    You may find this piano interesting: http://www.youtube.com/watch?v=t7Cq3pbcMkI Now, if only it was computer controlled...

  • Anonymous
    January 07, 2010
    The tuning stuff gets even more interesting.  The "equal temperament" approach results in everything being equally out of tune.  For instance, a perfect fifth is a 3:2 harmonic ratio, wheras 2^(7/12) is just slightly too flat.  Organs are often tuned so that a certain key is perfect, and others are way off.  I'm pretty sure your floating point precision analogy could be extended.  :)

  • Anonymous
    January 07, 2010
    Maybe there should be a warning for any variable containing the word "price" which was a float or double :) I suspect that would actually improve the code of more products than the "excess precision" warning. Another feature which might help would be the ability to get the debugger to say: "give me the exact value" of this double. Actually, I'd like that as a custom numeric format, too - "r" doesn't do it, as that's just a round-trippable string. Currently I have my own little DoubleConverter class which I use every time I want to look at the real exact value of a double...

  • Anonymous
    January 07, 2010
    "Because I actually want to talk about floating point arithmetic." Does anyone else sometimes read this blog in Arlo Guthrie's voice? Apparently you've seen through my attempt to be subtle about my pop culture references. -- Eric

  • Anonymous
    January 07, 2010
    I think, to a certain extent, floating point computation in computers is an example of a leaky abstraction. The language (and libraries, and hardware) is attempting to model the concept of a "rational number" - which human beings have a particular interpretation and understanding of. Unfortunately, the fact that binary floating point representations are (by their nature) approximations - results in quite a few human errors when reasoning about mathematical operations involving floating point math. In fact, if you want to write a function as simple as Variance( IEnumerable<double> ) - you need to take special care to use "computational" versions of the algorithm to avoid catastrophic cancellation or dramatic error accumulation. Languages and compilers are trying to make it easier for people to write computational code - but the fact that too many software developers do not understand (or take the time to understand) the internal representation - and the limitations it brings - creates much of the problem. The language/libraries makes it easier for us to ignore the fact that floating point numbers are approximations. But even if some day, computer languages abandon the floating point abstraction, and adopt something like Mathematica's arbitrary precision representation - we'll simply trade one set of problems for another. There's no getting away from it. Ultimately, developers need to become familiar with the tools of their trade and the abstractions that they are built on - if they want to write correct code.

  • END OF RANT
  • Anonymous
    January 07, 2010
    The simple solution is to have either
  • extended precision floating point numbers with a large 32 byte mantissa
  • use a library supporting rational number storage of fractions and rational number arithmatic (symbolic algebra front ends do this - mathmathica)
  • use an arbritrary precision library These areas are well worth spelunking through the old 8 and 16 bit systems where they originated.
  • A longer term and much better solution is to precompute the values ahead of time, given there are 88 keys/values to compute, and store them a lookup table.
  • Anonymous
    January 07, 2010
    As an example of why you might want that level of precision, look at the following code:        // this is from GMP        #define BYTE_OFS(x) (((char *)&internalEndianMagic)[7-(x)])        static double internalEndianMagic = 7.949928895127363e-275; Yes, it works by exploiting the exact IEEE binary representation of that constant. Source: http://www.steike.com/code/useless/evil-c/

  • Anonymous
    January 07, 2010
    @Bill: I'm not sure what Eric's middle initial is, so I had to guess.  Substitute the correct one if you wish, provide it's not W, which wouldn't scan: You can get anything that you want At Eric J Lippert's blog. You can get anything that you want At Eric J Lippert's blog. Log right in, there's an Atom feed, Grab your coffee cup for a good long read You can get anything that you want At Eric J Lippert's blog (excepting Leah!) You can get anything that you want At Eric J Lippert's blog Cute. FYI, Eric is my middle name. Another beautiful theory spoiled by an ugly fact. -- Eric

  • Anonymous
    January 07, 2010
    "Obviously electronic pianos do not have inharmonicity, and we’ll ignore the psychoacoustic problems for our purposes. Because I actually want to talk about floating point arithmetic." That just makes such a wild yet great quote ;-p

  • Anonymous
    January 07, 2010
    The comment has been removed

  • Anonymous
    January 07, 2010
    @Douglas: Possibly not... but a lot of the time this is for answering quick questions, and I don't even have VS up. It would be nice for there to be a debugger visualizer built into VS which does this and framework support :)

  • Anonymous
    January 07, 2010
    Actually, Raymond wrote about Calc and why it does not use doubles: http://blogs.msdn.com/oldnewthing/archive/2004/05/25/141253.aspx

  • Anonymous
    January 07, 2010
    So do compilers actually use the "excess" digits when they initialize the const (or variable for that matter)? Interesting question. No, we don't, in fact. I just checked the code and we call the OLE Automation library method VarR8FromStr to parse literal doubles. I took a look at its implementation, and it truncates the passed-in string to twenty significant decimal digits, which is enough precision to find the double that has the least representation error. -- Eric

  • Anonymous
    January 07, 2010
    @Jon - any chance you'd be willing to share that code?

  • Anonymous
    January 08, 2010
    So here's another barely-on-topic question... when will we get a BigDecimal/arprec library in C#? Come on, Java has it! Beats the heck out of me. I have no clue what the BCL team is working on next, and even if I did, I would not want to speculate about their schedules. They write a blog too; try asking them! -- Eric  

  • Anonymous
    January 08, 2010
    .NET also has it. Find a library with Google.

  • Anonymous
    January 08, 2010
    @Jon et al: Sounds you are looking for something like C's "hex floating point litterals" which directly map to the underlying binary representation. E.g. 0x1.5Ap2 (if I have the syntax right.)

  • Anonymous
    January 09, 2010
    At some point, VarR8FromStr contained a rounding bug: it was discarding a bit that could be significant in about 1 in 8000 cases. The .NET code to convert strings to doubles was correct, and so you had the strange effect that, for example, the C# expression    4.170404 == Convert.ToDouble(“4.170404“) was false. This was the case in Visual Studio 2005. I don't know if it was ever fixed. If you want the details, I did a write-up on this a while ago: http://blogs.extremeoptimization.com/jeffrey/archive/2006/06/16/16647.aspx

  • Anonymous
    January 09, 2010
    Actually, it was the other way around: VarR8FromStr was correct, and the Rotor code (and presumably the corresponding retail .NET code) contained the error. FWIW, I think it can make sense to give a warning if the exact value, when rounded to the same precision as given by the user, gives a different result. So, when someone writes     108.59500000000000000000000 the warning could be something like: "The value '108.59500000000000000000000' cannot be represented accurately to the specified precision. The value '108.59499999999999886313162' will be used instead." where the replacement value is rounded to the same number of digits as the original literal.

  • Anonymous
    January 10, 2010
    Arron G: http://blogs.msdn.com/bclteam/archive/2009/05/22/what-s-new-in-the-bcl-in-net-4-beta-1-justin-van-patten.aspx (Note BigInteger was scheduled for .NET 3.5 but was made internal at the last minute. It came back for .NET 4. :-)

  • Anonymous
    January 10, 2010
    The comment has been removed

  • Anonymous
    January 16, 2010
    Regarding constants that can be represented in binary exactly -- try printing one with the %f format specifier in printf (or equivalent). Most compilers will not print all the digits, even if you ask for them (gcc on Linux is one of the exceptions). I did a study which you may find interesting: http://www.exploringbinary.com/print-precision-of-dyadic-fractions-varies-by-language/ .

  • Anonymous
    July 01, 2010
    "Another thing we can note is that for this application, even fifteen digits is way more precision than we need." Yep, a hardware engineer once told me that you pretty much never need more than 5 sig figs for engineering purposes.