bpo-33305: Improve SyntaxError for invalid numerical literals. by serhiy-storchaka · Pull Request #6517 · python/cpython
Reporting the offending character is easy only when it is an invalid digit (in the range 2-9 for binary literals or 8-9 for octal literals). In the general case there are many subtle details, and handling them would complicate the code too much:
- An invalid character does not always exist. This error can be raised at the end of the input.
- It can be non-ASCII. In this case we need to decode a multibyte UTF-8 sequence to get the character.
- It can be non-printable.
- Even if it is printable from Unicode's point of view, it can look indistinguishable from other characters. For example, a non-breaking space looks like an ordinary space to humans, but not to the Python parser.
- Even in ASCII there are non-printable characters, or characters that need special handling: tab, newline, single quote, backslash, ...
It may be worth producing more specialized error messages for some cases, but just reporting the next invalid character is not the way to go.
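
To make the cases listed above concrete, here is a rough Python sketch of what "describe the next character" would have to deal with. The helper name `describe_next_char` is hypothetical and is not part of this PR or of the tokenizer; it only illustrates the situations (end of input, multibyte UTF-8, non-printable characters) on a UTF-8 encoded source buffer.

```python
def describe_next_char(source: bytes, pos: int) -> str:
    """Hypothetical helper: describe the character starting at byte `pos`."""
    if pos >= len(source):
        # Nothing follows the literal, so there is no character to report.
        return "end of input"
    # A UTF-8 character occupies 1 to 4 bytes; try each length in turn.
    for length in range(1, 5):
        try:
            ch = source[pos:pos + length].decode("utf-8")
        except UnicodeDecodeError:
            continue
        break
    else:
        return "undecodable bytes"
    traits = [
        "ASCII" if ch.isascii() else f"non-ASCII ({length} bytes)",
        # str.isprintable() treats every separator except the ASCII space
        # as non-printable, so a non-breaking space is flagged here.
        "printable" if ch.isprintable() else "non-printable",
    ]
    return f"U+{ord(ch):04X} {', '.join(traits)}"


for src, pos in [(b"0b12", 3), (b"0x", 2), ("1\u00a0".encode(), 1), (b"1\t", 1)]:
    print(repr(src), "->", describe_next_char(src, pos))
```

Even this sketch does not decide how to quote a backslash or a single quote inside the message, which is one more reason the generic approach gets complicated quickly.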