[Python-Dev] Decoding incomplete unicode
Walter Dörwald
walter at livinglogic.de
Wed Aug 18 22:11:48 CEST 2004
Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>
>> I've thought about this some more. Perhaps I'm still missing
>> something, but wouldn't it be possible to add a feeding
>> mode to the existing stream codecs by creating a new queue
>> data type (much like the queue you have in the test cases of
>> your patch) and using the stream codecs on these ?
>
> Here is the problem. In UTF-8, how does the actual algorithm
> tell (the application) that the bytes it got on decoding provide
> for three fully decodable characters, and that 2 bytes are left
> undecoded, and that those bytes are not inherently ill-formed,
> but lack a third byte to complete the multi-byte sequence?
>
> On top of that, you can implement whatever queuing or streaming
> APIs you want, but you *need* an efficient way to communicate
> incompleteness.
We already have an efficient way to communicate incompleteness:
the decode method returns the number of decoded bytes.
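
A minimal sketch of that return value, using Martin's example of three decodable characters followed by two bytes of an unfinished three-byte sequence. It assumes a current Python 3, where the low-level function codecs.utf_8_decode accepts a final flag; the helper name split_decodable is hypothetical and only serves the illustration.

```python
import codecs

def split_decodable(data):
    """Decode as much of `data` as possible; return (text, undecoded_tail).

    Hypothetical helper for illustration. With final=False the decoder
    stops before a truncated multi-byte sequence instead of raising,
    and reports how many bytes it actually consumed.
    """
    text, consumed = codecs.utf_8_decode(data, "strict", False)
    return text, data[consumed:]

# b"abc" plus the first two bytes of U+20AC (EURO SIGN, b"\xe2\x82\xac")
text, tail = split_decodable(b"abc\xe2\x82")
print(repr(text), tail)  # 'abc' b'\xe2\x82'
```

The caller learns that three bytes were consumed and that two bytes remain, without those two bytes being reported as ill-formed.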

The questions remaining are:

1) Communicate to whom? IMHO the info should only be used
   internally by the StreamReader.

2) When is incompleteness OK? Incompleteness is of course
   not OK in the stateless API. For the stateful API,
   incompleteness has to be OK even when the input stream
   is (temporarily) exhausted, because otherwise a feed mode
   wouldn't work anyway. But then incompleteness is always OK,
   because the StreamReader can't distinguish a temporarily
   exhausted input stream from a permanently exhausted one.
   The only fix for this I can think of is the final argument.

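
The role of a final argument can be sketched with the incremental decoder API found in current Python 3 (codecs.getincrementaldecoder). This is an illustration under that assumption, not the patch discussed in this thread; the function name feed_decode is hypothetical.

```python
import codecs

def feed_decode(chunks):
    """Decode an iterable of byte chunks fed one at a time.

    Hypothetical helper for illustration. While feeding, final=False
    tells the decoder that a trailing incomplete sequence is acceptable
    and should be buffered. The last call with final=True declares the
    input permanently exhausted, so leftover bytes become an error.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = []
    for chunk in chunks:
        parts.append(decoder.decode(chunk, final=False))
    # No more input will ever arrive: incompleteness is no longer OK.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)

# The euro sign is split across two chunks and still decodes correctly.
print(feed_decode([b"abc\xe2", b"\x82\xac"]))

# A stream that ends in the middle of a character is rejected only at
# the final call.
try:
    feed_decode([b"abc\xe2\x82"])
except UnicodeDecodeError as exc:
    print("truncated input:", exc.reason)
```

Only the caller knows whether the stream is temporarily or permanently exhausted, which is why that knowledge has to be passed in explicitly.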
Bye,
Walter Dörwald