Precision issue
Tim Peters
tim.one at comcast.net
Mon Oct 13 10:29:26 EDT 2003
[Duncan Booth]
>>> There's no reason why Python couldn't do the same:
>>>
>>> def float_repr(x):
>>>     s = "%.15g" % x
>>>     if float(s)==x: return s
>>>     return "%.17g" % x

[Tim]
>> Sorry, but there is a reason: if done on a platform whose C library
>> implements perfect-rounding double->string (e.g., I think gcc does
>> now), this can hit cases where the string may not reproduce x when
>> eval'ed back on a different platform whose C library isn't so
>> conscientious but which nevertheless meets the 754 standard's more
>> forgiving (than perfect rounding) requirements.
>>
>> This is acutely important because Python's marshal format (used for
>> .pyc files) represents floats as repr'ed strings. By making repr()
>> pump out 17 digits, we maximize the odds that .pyc files ported
>> across platforms load back exactly the same 754 doubles across (754)
>> platforms.

[Duncan]
> Thanks for giving me the reason, but I find this argument
> unconvincing on several counts.
>
> If a system has an inaccurate floating point library, then introducing
> further inconsistencies based on whether the .pyc file was compiled
> locally or copied from another system doesn't sound like a good
> solution. Surely if the library is inaccurate you are going to get
> inaccurate results no matter what tweaks Python tries to apply?

You snipped most of my msg. As explained in the parts not reproduced
here, Python is aiming to work correctly across (at least) platforms
where the native C library meets the minimal requirements of the 754
standard for float <-> string accuracy. That doesn't require perfect
rounding in all cases, but to call a system meeting no more than the
minimal requirements "inaccurate" is quite a stretch. It can require
multi-thousand bit arithmetic (in some cases) to do perfect rounding,
and that's why the standard allowed for a small bit of slop.
Perfect rounding isn't necessary for eval(str(float)) == float to hold
always; it's enough that platforms meet the minimal 754 requirements
and at least 17 significant digits are produced in the float->string
direction.

> Also the marshal code doesn't actually use repr. For that matter the
> interactive prompt which is what causes the problems I want to avoid in
> the first place doesn't use repr either! (Marshal uses
> PyFloat_AsReprString which comments say should be deprecated, repr
> uses float_repr, and interactive mode uses float_print.)

PyFloat_AsReprString(afloat) is the C API spelling of the Python-level
repr(afloat), as documented in floatobject.h. The comments say it
should be deprecated because it "pass[es] a char buffer without passing
a length", which has nothing to do with the result it produces; adding
a buffer length argument would satisfy the complaint.

It's a general rule that repr(obj) is produced at the interactive
prompt regardless of the type of obj; the specific function called to
produce that result in the specific case of isinstance(obj, float)
isn't really interesting; what's relevant is that it *does* produce
repr(float), however it's implemented.

It's also a general rule that eval(repr(obj)) == obj should hold when
sanely possible, again without regard to type(obj). That last rule is
why repr(float) does what it does; marshal exploits it.

There are other complaints that can be made about the interactive
prompt using repr(), and many such have been made over the years.
sys.displayhook was introduced in the hopes that people would build
prompt format functions they like better, and share them. It's
remarkable (to me -- that's why I'm remarking <wink>) that so few have.

> If you think it is important, I don't have any problems with leaving
> the marshalling code generating as many digits as it wants.

It's vital for marshal to try to reproduce floats across platforms.
It does OK at that now, but I think it would be better for marshal to
move to a binary format. That's got problems of its own, due to
compatibility hassles.

Regardless of what marshal does, it's still a general rule that Python
strive to maintain that eval(repr(x)) == x. This is true now for all
builtin scalar types, and for lists, tuples and dicts composed
(possibly recursively) of those.

repr(obj) can be an undesirable thing to produce at an interactive
prompt for many reasons, some depending on taste. That's why
sys.displayhook exists, so you can change interactive prompt behavior.
The reason I like, e.g., 0.1 *not* to display as "0.1" by default was
given toward the end of my msg (and had nothing to do with marshal,
btw).
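
A minimal sketch of the two mechanisms discussed above, for readers who
want to try them. It is not code from the thread apart from Duncan's
float_repr, which is reproduced from the quoted text. The names
roundtrips and short_float_displayhook are hypothetical, and the sketch
assumes a modern Python 3 interpreter. It formats with "%.17g"
explicitly instead of calling repr(), because Python 3.1 and later
changed repr(float) to emit the shortest string that round-trips, so
repr() no longer shows the 17-digit behavior described here.

```python
import builtins
import sys


def roundtrips(x, digits):
    """Return True if x survives float -> string -> float at this precision."""
    return float("%.*g" % (digits, x)) == x


def float_repr(x):
    # Duncan Booth's proposal, quoted in the thread: prefer 15 digits,
    # fall back to 17 when 15 do not reproduce x on this platform.
    s = "%.15g" % x
    if float(s) == x:
        return s
    return "%.17g" % x


def short_float_displayhook(value):
    # Hypothetical replacement for the default interactive display hook.
    # It mirrors the default behavior (skip None, remember the value in _)
    # and only changes how floats are shown.
    if value is None:
        return
    builtins._ = value
    if isinstance(value, float):
        print(float_repr(value))
    else:
        print(repr(value))


if __name__ == "__main__":
    x = 0.1 + 0.2
    print(roundtrips(x, 15))  # 15 digits lose information for this value
    print(roundtrips(x, 17))  # 17 digits are enough for any 754 double
    # To change what the interactive prompt shows:
    sys.displayhook = short_float_displayhook
```

The check in float_repr only proves that the short string round-trips
on the platform that produced it, which is exactly the portability gap
described above; always emitting 17 digits avoids relying on the
producing platform's C library being perfectly rounded.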