PEP 327: Decimal Data Type
Stephen Horne
steve at ninereeds.fsnet.co.uk
Tue Feb 3 20:59:41 EST 2004
On Mon, 02 Feb 2004 17:07:52 -0500, cookedm+news at
physics.mcmaster.ca (David M. Cooke) wrote:

>At some point, "Batista, Facundo" <FBatista at uniFON.com.ar> wrote:
>
>> danb_83 wrote:
>>
>> #- On the other hand, when I say that I am 1.80 m tall, it doesn't imply
>> #- that humans height comes in discrete packets of 0.01 m.  It means that
>> #- I'm *somewhere* between 1.795 and 1.805 m tall, depending on my
>> #- posture and the time of day, and "1.80" is just a convenient
>> #- approximation.  And it wouldn't be inaccurate to express my height as
>> #- 0x1.CC (=1.796875) or (base 12) 1.97 (=1.7986111...) meters, because
>> #- these are within the tolerance of the measurement.  So number base
>> #- doesn't matter here.
>>
>> Are you saying that it's ok to store your number imprecisely because you
>> don't take well measures?
>
>What we need for this is an interval type. 1.80 m shouldn't be stored
>as '1.80', but as '1.80 +/- 0.005', and operations such as addition
>and multiplication should propagate the intervals.

I disagree with this, not because it is a bad idea to keep track of
precision, but because this should not be a part of the float type or
of basic arithmetic operations.

When you write a value with its precision specified in the form of an
interval, that interval is a second number. The value with the
precision is a compound representation, built up using simpler
components. It doesn't mean that the components no longer have uses
outside of the compound.

In Python, the same should apply - a numeric type that can track
precision sounds useful, but it shouldn't replace the existing float.

One good reason is simply that knowledge of the precision is only
sometimes useful. As an obvious example, what would be the point of
keeping track of the precision of the calculations in a 3D game? There
is none, as the information about precision has no bearing on the
rendering of the image.

Besides this, there is a much more fundamental problem.
The whole point of using an imprecise representation is that
manipulating a perfect representation is impractical - mainly slow. It
is true that in general the source is inherently approximate too,
meaning that floats are quite a good match for the physical
measurements they are often used to represent. Still, if it were
practical to do perfect arithmetic on those approximate values, it
would give slightly more precise answers, as the arithmetic would not
introduce additional sources of error.

Having an approximate representation with an interval sounds good, but
remember that one error source is the arithmetic itself - e.g. 1.0 /
3.0 cannot be finitely represented in either binary or decimal without
error (except as a rational, of course).

So, in answer to your question...

>How to do that is another question: for addition, do you add the
>magnitudes of the intervals, or use the square root of the sums of the
>squares, or something else? It greatly depends on what _type_ of error
>0.005 measures (is it the width of a Gaussian distribution? a uniform
>distribution? something skewed that's not representable by one
>number?).

None of these is sufficient - they may track the errors resulting from
measurement issues (if you choose the appropriate method for your
application), but none of them takes into account errors resulting from
the imprecision of the arithmetic.

Furthermore, to keep track of such imprecision precisely means you
need an infinitely precise numeric representation for your interval -
and if it were practical to do that, it would be far better to just use
that representation for the value itself.

This doesn't mean that tracking precision is a bad idea. It just means
that when it is done, the error interval itself should be imprecise.
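As a rough illustration of what an "imprecise" error interval could
look like, here is a minimal Python sketch. Nothing like it is proposed
in the PEP; the Interval class and its contains method are hypothetical
names made up for this example. It assumes Python 3.9 or later (for
math.nextafter) and the default round-to-nearest float arithmetic.
After each addition the bounds are nudged outward by one representable
step, so rounding in the arithmetic cannot push the exact result
outside them.

```python
import math
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] with float bounds (hypothetical helper)."""
    lo: float
    hi: float

    def __add__(self, other):
        # Each float addition may round up or down, so move the lower
        # bound down and the upper bound up by one representable step.
        return Interval(
            math.nextafter(self.lo + other.lo, -math.inf),
            math.nextafter(self.hi + other.hi, math.inf),
        )

    def contains(self, exact):
        # Fraction(float) converts exactly, so this comparison itself
        # introduces no rounding error.
        return Fraction(self.lo) <= exact <= Fraction(self.hi)


# 1.80 +/- 0.005 m plus 0.10 +/- 0.005 m. The bounds are float literals,
# so they are already binary approximations of the decimal values.
height = Interval(1.795, 1.805)
heel = Interval(0.095, 0.105)
total = height + heel

# Exact sums of the bounds as actually stored, computed as rationals.
exact_lo = Fraction(height.lo) + Fraction(heel.lo)
exact_hi = Fraction(height.hi) + Fraction(heel.hi)

print(total)
print(total.contains(exact_lo), total.contains(exact_hi))
```

The resulting bounds are slightly wider than strictly necessary. That
is the trade being made: containment is guaranteed, tightness is not.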
You should have the guarantee that the real value is never going to be
outside of the given bounds, but not the guarantee that the bounds are
as close together as possible - the bounds should be allowed to get a
little further apart to allow for imprecision in the calculation of
the interval.

And if the error interval is itself an approximation, why track it on
every single arithmetic operation? Unless you have a specific good
reason to do so, it makes much more sense to handle the precision
tracking at a higher level. And as those higher level operations are
often going to be application specific, having a single library for it
(i.e. not tailored to some particular type of task) is IMO unlikely to
work.

For instance, consider calculating and applying a 3D rotation matrix
to a vector. If you track errors on every float value, that is 9
values in the matrix with error values (due to limited precision trig
functions etc) and 3 values in the vector, a dozen for the
intermediate results in the matrix multiplication, and 3 error
intervals for the 3 dimensions of the output vector. But the odds are
that all you want is a single float value - the maximum distance
between the real point and the point represented by the output vector,
and you can probably get a good value for that by multiplying the
length of the input vector by some 'potential error from rotation'
constant.

Incidentally, it would not always be appropriate to include arithmetic
errors in error intervals. For instance, some statistical interval
types do not guarantee that all values are within the interval range.
They may guarantee that 95% of values are within the interval, for
instance - _and_ that 5% of values are outside the interval. The 5%
outside is as important as the 95% inside, so there is no acceptable
direction to move the bounds a little 'just to be safe'.

In some cases, you might even want to track the error interval (from
arithmetic error) for your error interval value.
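To make the rotation example above concrete, here is a minimal Python
sketch of keeping one error estimate per operation instead of one
interval per float. The names rotate_z, rotation_error_bound and
ROTATION_ERROR are hypothetical, and the constant of 8 machine epsilons
is an assumed safety factor chosen for illustration, not a derived
bound. For brevity it rotates about the z axis only.

```python
import math
import sys

# Hypothetical 'potential error from rotation' constant: a few units of
# float rounding error. Assumed for illustration, not derived.
ROTATION_ERROR = 8 * sys.float_info.epsilon


def rotate_z(v, angle):
    """Rotate the 3-vector v about the z axis with plain float arithmetic."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)


def rotation_error_bound(v):
    """One float: estimated maximum distance between the computed point
    and the true rotated point, scaled by the input vector's length."""
    x, y, z = v
    return math.sqrt(x * x + y * y + z * z) * ROTATION_ERROR


v = (3.0, 4.0, 12.0)
rotated = rotate_z(v, 0.7)
bound = rotation_error_bound(v)

print(rotated)
print(bound)
```

A true rotation preserves length, so the difference between the input
and output lengths gives a cheap sanity check that the estimate is not
obviously too small.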
I can certainly imagine a result with the form...

  The average widginess of a blodgit is 9.5 +/- 0.2
  95% differ from the average by less than 2.7 +/- 0.03

  Thus I can say that this randomly chosen blodgit has a widginess of
  (9.5 +/- 0.2) +/- (2.7 +/- 0.03) with 95% confidence.

You might even get results like that if you had estimated the average
and distribution of widginess from a sample of the blodgits - in which
case, you may still need to account for the arithmetic error, which
potentially requires another four values ;-)

-- 
Steve Horne

steve at ninereeds dot fsnet dot co dot uk