[Python-ideas] Customizing format()
Terry Reedy
tjreedy at udel.edu
Tue Mar 17 22:43:10 CET 2009
More information about the Python-ideas mailing list

Raymond Hettinger wrote:
> I've been exploring how to customize our thousands separators and decimal
> separators and wanted to offer-up an idea. It arose when I was looking
> at Java's DecimalFormat class and its customization tool
> DecimalFormatSymbols
> http://java.sun.com/javase/6/docs/api/java/text/DecimalFormat.html .
> Also, I looked at how regular expression patterns provide options to change
> the meaning of its special characters using (?iLmsux).
>
> I. Simplest version -- Translation pairs
>
>     format(1234, "8,.1f")          --> ' 1,234.0'
>     format(1234, "(,_)8,.1f")      --> ' 1_234.0'
>     format(1234, "(,_)(.,)8,.1f")  --> ' 1_234,0'
>
> This approach is very easy to implement and it doesn't make life difficult
> for the parser which can continue to look for just a comma and period
> with their standardized meaning. It also fits nicely in our current
> framework and doesn't require any changes to the format() builtin.
> Of all the options, I find this one to be the easiest to read.

I strongly prefer suffix to prefix modification. The format gives the
overall structure of the output, the rest are details, which a reader
may not care so much about.

> Also, this version makes it easy to employ a couple of techniques to
> factor-out

These techniques apply to any "augment the basic format with an affix"
method.

> formatting decisions. Here's a gettext() style approach.
>
>     def _(s):
>         return '(,.)(.,)' + s
>     . . .
>     format(x, _('8.1f'))
>
> Here's another approach using implicit string concatenation:
>
>     DEB = '(,_)'   # style for debugging
>     EXT = '(, )'   # style for external display
>     . . .
>     format(x, DEB '8.1f')
>     format(y, EXT '8d')
>
> There are probably many ways to factor-out the decision. We don't need to
> decide which is best, we just need to make it possible.
>
> One other thought, this approach makes it possible to customize all of the
> characters that are currently hardwired (including zero and space padding
> characters and the 'E' or 'e' exponent symbols).

Any "augment the format with affixes" method should do the same.

I prefer at most a separator (;) between affixes rather than fences
around them. I also prefer mnemonic key letters to mark the start of
each affix, such as in Guido's quick suggestion: Thousands,
Decimal_point, Exponent, Grouping, Pad_char, Money, and so on. But I do
not think '=' is needed. Since the replacement will almost always be a
single non-capital-letter char, I am not sure a separator is even
needed, but it would make parsing much easier.

G would be followed by one or more digits indicating grouping from
Decimal_point leftward, with the last repeated. If grouping by 9s is
not large enough, allow a-f to get grouping up to 15 ;-).

The example above would be

    format(1234, '8.1f;T.;P,')

> II. Javaesque version -- FormatSymbols object
>
> This is essentially the same idea as previous one but involves modifying
> the format() builtin to accept a symbols object and pass it to
> __format__ methods. This moves the work outside of the format string
> itself:
>
>     DEB = FormatSymbols(comma='_')
>     EXT = FormatSymbols(comma=' ')
>     . . .
>     format(x, '8.1f', DEB)
>     format(y, '8d', EXT)
>
> The advantage is that this technique is easily extendable beyond simple
> symbol translations and could possibly allow specification of grouping
> sizes in hundreds and whatnot. It also looks more like a real program
> as opposed to a formatting mini-language. The disadvantage is that
> it is likely slower and it requires mucking with the currently dirt simple
> format() / __format__() protocol. It may also be harder to integrate
> with existing __format__ methods which are currently very string oriented.

In the thread on exposing the format parse result, I suggested that the
resulting structure (dict or named tuple) could become an alternative,
wordy interface to the format functions. I think the mini-language
itself should stay mini.

Terry Jan Reedy
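
The suffix-affix idea in the reply above (a base format followed by
';'-separated key letters) can be sketched as a post-processing step on
top of the existing format() builtin. This is only an illustration under
stated assumptions, not anything proposed verbatim in the thread: the
helper name format_with_affixes is hypothetical, only two keys are
handled (T for the thousands separator, D for the decimal point, taken
from the key-letter list), and the base spec keeps the ',' grouping
option explicitly rather than letting T imply it.

```python
def format_with_affixes(value, spec):
    """Format value with a base spec plus optional ';'-separated affixes.

    Hypothetical affix keys assumed for this sketch:
      T<char>  character to use as the thousands separator
      D<char>  character to use as the decimal point
    """
    base, *affixes = spec.split(';')

    symbols = {}
    for affix in affixes:
        # Each affix is one key letter followed by one replacement char.
        if len(affix) != 2 or affix[0] not in 'TD':
            raise ValueError('unsupported affix: %r' % affix)
        symbols[affix[0]] = affix[1]

    # The parser still sees only the standard ',' and '.' meanings.
    text = format(value, base)

    # Translate both symbols in a single pass, so that swapping
    # ',' and '.' does not clobber one with the other.
    table = str.maketrans({
        ',': symbols.get('T', ','),
        '.': symbols.get('D', '.'),
    })
    return text.translate(table)


print(repr(format_with_affixes(1234, '8,.1f')))        # no affixes
print(repr(format_with_affixes(1234, '8,.1f;T_')))     # underscore grouping
print(repr(format_with_affixes(1234, '8,.1f;T.;D,')))  # swapped symbols
```

Because the affixes trail the base spec, a reader sees the overall
layout ('8,.1f') first and the symbol details afterwards. One limitation
of splitting on ';' this way is that ';' itself cannot be used as a
replacement character.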