PyYaml?
Clark C. Evans
cce at clarkevans.com
Mon Sep 20 17:01:12 EDT 2004
More information about the Python-list mailing list
On Sun, Sep 19, 2004 at 02:53:22PM +0100, Paul Moore wrote:
| It seems to claim to be different things at different times - a
| serialization format, a config file format, a replacement for XML

At conception, I wanted a text format for invoices and other
transactional business documents that: (a) was very human readable,
(b) loaded into native data structures without requiring a DOM or a
bunch of parser-hand-holding, and (c) had a simple enough information
model that a schema and transformation language would not be a serious
exercise in topology. Brian Ingerson, one of the other co-authors, was
working on something similar to Pickle for Perl.

| At the time, I was looking for a config format, and it wasn't
| *quite* what I wanted, because some of the serialization and XML
| aspects made it slightly clumsy as a config format.

That some people use it for configuration files is due to Brian's
influence on the more-than-one-way-to-write-it. Also, our earlier
goals of a cross-language serialization tool got in the way of making
it a great configuration file language. We've since had to make some
compromises in this regard. Two other good uses for YAML are log
files and test suites. Neither was the initial focus, but alas, some
things take on a life of their own.

| I suspect that people who want to use YAML for serialization,
| or as an XML replacement, may feel the same way. And yet, I don't get
| the feeling that YAML is being developed as a "compromise" format, so
| I am obviously missing a key design principle.

I work with business documents all the time, especially ones that
move between computer systems using different programming languages.
So, this was my primary goal; we advertise YAML as a serialization
language since this is the 'easiest category' to put ourselves in.

| As regards the existing YAML libraries for Python, when I looked I
| found that the PyYAML website claimed that it was out of date with
| respect to the latest spec. I also tried SYCK, which looks OK, but
| which I did manage to provoke a crash from without trying too hard.

Er ya. Don't use "syck.parse"; I need to remove that function from
the public interface. The newest release of Syck is far more stable,
so you may want to try it again.

| None of this is a criticism of YAML and/or its libraries themselves.
| However, it does make any suggestion that YAML be used to replace a
| key part of the Python standard library seem a little premature, at
| least.

Definitely. YAML has at least two more years of work before it'd be
ready for even proposing that it be considered as a core library.

| I just re-read some of the YAML website. It appears clear from there
| that YAML is designed as a serialization format. But there seems to
| be a lack of justification as to *why* the design goals (section 1.1
| of the spec) are important. Also, security is *not* an explicit goal,
| and section 3.1.6 (the "Construct" process) is completely lacking in
| any discussion of the security or other implications of converting a
| YAML file to a native language object. This seems somewhat surprising
| in a specification for a serialization format...

*nods* I hope the discussion above helps. I doubt that YAML would
ever be a good 'drop-in' replacement for pickle. If in the far-distant
future someone were to propose using YAML in this way, it'd probably
be one of N 'formats' for a more pluggable pickle module.

| More portable - hmm, OK. I'm not sure where you want portability
| *between*, though. Pickle is, as far as I know, portable across
| platforms. Are you talking about portability between languages? I
| can't think where I'd want to dump a Python object for loading into
| Perl or Ruby, though. Can you offer me some real-life use cases?

Certainly. I work with several programmers in different shops; we
move transactional documents around, traditionally with XML, but more
and more with YAML. By this time next year I hope it is all YAML. If
you are just using hash/list/scalar data types (90% of our use cases)
then YAML is a great option. In fact, recently we had a customer start
using the Perl version of YAML with our system and it worked.

| More readable - I'll give you this. And yes, it can be useful. I've
| been stuffed before now with Java programs whose configuration is
| stored as a serialized-to-disk object which is completely opaque to
| external tools, let alone human readers. But this is a property that
| is useful only in case of failure (if the config gets stuffed, I can
| hand-hack the dump file, or if I forget what I set parameter X to, I
| can look in the dump). If the application design *requires* the dump
| format to be readable, we've moved away from serialization, and
| started to talk about configuration formats (which is a separate
| issue, one in which it is quite possible that YAML is strong, but
| *not* one in which it is competing with Pickle).

Exactly. The older PyYaml made configuration files painful, as it
was trying to implicitly type all kinds of data (recognizing floating
point numbers, dates, etc.). We found this behavior to be a bit
counter-productive for config files, and hence this "implicit typing"
is now strictly optional, application-directed behavior.

Best,

Clark
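To make the point about implicit typing concrete, here is a minimal Python sketch of what "optional, application-directed" could look like. It is not the PyYaml or Syck API: `resolve_scalar` and its `implicit` flag are hypothetical names made up for this illustration, and the set of recognized types (integers, floats, ISO dates) is an assumption.

```python
from datetime import date


def resolve_scalar(text, implicit=False):
    """Hypothetical helper: keep a scalar as a string unless the
    application explicitly opts in to implicit typing."""
    if not implicit:
        return text  # config-friendly default: every scalar stays a string
    # Opt-in path: try each interpretation in turn, fall back to the string.
    for convert in (int, float, date.fromisoformat):
        try:
            return convert(text)
        except ValueError:
            continue
    return text


# A version string is a typical config value that implicit typing damages.
print(repr(resolve_scalar("1.10")))                 # left as the string "1.10"
print(repr(resolve_scalar("1.10", implicit=True)))  # read as a float, trailing zero lost
print(repr(resolve_scalar("2004-09-20", implicit=True)))  # read as a date
```

With typing off by default, a config value such as a version string survives a round trip unchanged; an application that really wants numbers and dates asks for them.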