With that explanation (that only one case out of six fails, for whatever reason), I agree.
That leaves the issue of whether the fix is the right one. I currently agree with Victor that we should do what the rest of Python does and what is most universally useful. The fact that an old standard requires a *storage* encoding for a nearly unused field in .gz files, one that (I believe) only works for Western Europe, does not mean we should use it for *opening* .tar files. WestEuro-centrism is as bad as Anglo-centrism. If the unicode filename cannot be Latin-1 encoded, the filename field should be left blank. But it seems to me that the filename should be converted to the bytes that the user wants, expects, and can use.
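
To make the "leave it blank" rule concrete, here is a rough sketch. The name `gzip_fname_field` is hypothetical (it is not a function in the gzip or tarfile modules), and the sketch covers only the stored header field, not the separate question of which bytes to use when opening the file.

```python
def gzip_fname_field(filename: str) -> bytes:
    """Hypothetical helper: bytes for the optional filename field of a .gz header.

    Returns the Latin-1 encoding when the name can be represented,
    otherwise an empty value so the field is left blank.
    """
    try:
        return filename.encode("latin-1")
    except UnicodeEncodeError:
        # The name has characters outside Latin-1: store nothing
        # rather than a mangled or lossy name.
        return b""
```

With this, a name such as "r\u00e9sum\u00e9.tar" is stored as Latin-1 bytes, while a Cyrillic or CJK name produces an empty field.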