Message65490
| Author | ishimoto |
|---|---|
| Recipients | amaury.forgeotdarc, gvanrossum, ishimoto |
| Date | 2008-04-15.01:40:25 |
| Message-id | <1208223627.31.0.909635757536.issue2630@psf.upfronthosting.co.za> |
| In-reply-to | |
| Content | |
|---|---|
> I think this has potential, but it is too liberal. There are many more
> characters that cannot be assumed printable, e.g. many of the Latin-1
> characters in the range 0x80 through 0x9F. Isn't there some Unicode
> data table that shows code points that are safely printable?
As Michael Urman pointed out, we can use Unicode properties.
Or we can define a set of non-printable characters (e.g.
sys.nonprintablechars).
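
A minimal sketch of the Unicode-property approach, assuming that "non-printable" means the general categories for control, format, surrogate, private-use, and unassigned code points, plus separators other than the ASCII space. The names `NONPRINTABLE_CATEGORIES`, `is_printable`, and `escape_nonprintable` are hypothetical and only illustrate the idea; the exact category set is a judgment call, and the sketch does not escape the backslash itself.

```python
import unicodedata

# Assumption: these general categories are treated as non-printable.
# Cc = control, Cf = format, Cs = surrogate, Co = private use,
# Cn = unassigned, Zl/Zp/Zs = line, paragraph, and space separators.
NONPRINTABLE_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp", "Zs"}


def is_printable(ch):
    """Return True if the single character ch can be shown as-is."""
    if ch == " ":
        # The ASCII space is in category Zs but is kept as printable.
        return True
    return unicodedata.category(ch) not in NONPRINTABLE_CATEGORIES


def escape_nonprintable(s):
    """Replace non-printable characters in s with backslash escapes."""
    out = []
    for ch in s:
        if is_printable(ch):
            out.append(ch)
            continue
        cp = ord(ch)
        if cp <= 0xFF:
            out.append("\\x%02x" % cp)
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)
        else:
            out.append("\\U%08x" % cp)
    return "".join(out)
```

With this definition the characters in the range 0x80 through 0x9F fall into category Cc and are escaped, while letters such as U+00E9 or U+3042 pass through unchanged.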
> OTOH there are other potential use cases where it would be nice to see
> the \u escapes, e.g. when one is concerned about sequences that print
> the same but don't have the same content (e.g. pre-normalization).
For such cases, print(s.encode("ascii", "backslashreplace")) might work. |
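
As a rough illustration of the pre-normalization case, the sketch below compares two strings that render the same but contain different code points. The helper name `show_escapes` is hypothetical; it only wraps the `encode("ascii", "backslashreplace")` call mentioned above and decodes the result back to text.

```python
import unicodedata


def show_escapes(s):
    """Return s with every non-ASCII character shown as a backslash escape."""
    return s.encode("ascii", "backslashreplace").decode("ascii")


composed = "\u00e9"      # e with acute accent as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# The two strings display alike but are not equal until normalized.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

print(show_escapes(composed))    # \xe9
print(show_escapes(decomposed))  # e\u0301
```

The escaped forms make the difference visible even when a terminal renders both strings identically.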
|