bpo-30588: document codecs.escape_decode by carlbordum · Pull Request #14747 · python/cpython

@carlbordum

@mangrisano

@zooba

I believe this is okay, but if @serhiy-storchaka and @asvetlov want to block it (as discussed in the bug), happy to let them overrule me :)

@doerwalter

IMHO, if we document this function, the added description shouldn't describe what a generic *_decode function does; it should describe what escape_decode specifically does. I.e., as it is now, the first sentence is redundant and the second is too vague.

gpshead

Comment on lines +248 to +249

length consumed). This is useful for decoding ascii escape sequences mixed
with unicode characters.

What is "mixed with unicode characters" supposed to mean in this context? data is a bytes-like object, so it can't contain unicode runes. We should include an example of what this does that is different from using one of the text encoding codecs.

As Matthieu Dartiailh described on bugs.python.org, an example of ASCII escape sequences mixed with Unicode characters is 'Δ\nΔ'.

Here is the difference:

>>> codecs.unicode_escape_decode('Δ\nΔ')
('Î\x94\nÎ\x94', 5)
>>> codecs.escape_decode('Δ\nΔ')
(b'\xce\x94\n\xce\x94', 5)
>>> codecs.escape_decode('Δ\nΔ')[0].decode('utf-8')
'Δ\nΔ'
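
For input that actually contains a backslash escape (a literal backslash followed by `n`, rather than a real newline), the contrast can be sketched as below. This is a minimal illustration, assuming CPython, where the undocumented `codecs.escape_decode` is available; the variable names are made up for the example.

```python
import codecs

# UTF-8 bytes for 'Δ', then a literal backslash and 'n', then 'Δ' again.
raw = "Δ\\nΔ".encode("utf-8")  # b'\xce\x94\\n\xce\x94' (6 bytes)

# escape_decode only processes backslash escapes and leaves other bytes alone,
# so the UTF-8 sequences survive and can be decoded afterwards.
unescaped, consumed = codecs.escape_decode(raw)
text = unescaped.decode("utf-8")

# The unicode_escape codec also processes the escape, but it maps every
# non-escape byte to a code point as if it were Latin-1, which garbles
# the multi-byte UTF-8 characters.
garbled = raw.decode("unicode_escape")

print(repr(text))     # 'Δ\nΔ'
print(repr(garbled))  # 'Î\x94\nÎ\x94'
```

The two-step form (unescape the bytes first, then decode as UTF-8) is what keeps the non-ASCII characters intact.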

A real-world example. We can assume that many more homegrown 'solutions' exist.

Wouldn't it be great if kludgy, slow, error-prone workarounds people have come up with were replaced with something elegant and Python-worthy?

Please consider that this function is so rarely seen outside the Python developer world because it is kept almost a secret.

@csabella

@carlbordum please address the review comments. Thanks!

JelleZijlstra


.. function:: escape_decode(data, errors=None)

Decode the bytes-like object *data* and return a tuple (decoded object,

"bytes-like object" seems incorrect; it accepts strings too.

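The point about accepted input types can be checked with a short sketch. It assumes CPython's current behaviour for this undocumented function, where a `str` argument is encoded as UTF-8 before the escapes are processed; the variable names are illustrative only.

```python
import codecs

escaped_text = "a\\nb"  # four characters: 'a', backslash, 'n', 'b'

# The same logical input passed as str, bytes, and bytearray.
from_str = codecs.escape_decode(escaped_text)
from_bytes = codecs.escape_decode(escaped_text.encode("utf-8"))
from_bytearray = codecs.escape_decode(bytearray(escaped_text, "utf-8"))

# In each case the result is a (bytes, length consumed) tuple.
print(from_str)    # (b'a\nb', 4)
print(from_bytes)  # (b'a\nb', 4)
```

Whatever the input type, the decoded object is `bytes`, which is why a separate `.decode('utf-8')` step is needed to get text back.
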
@JelleZijlstra

Closing as it's been a few years and the feedback hasn't been addressed.