char 128? no... 256
Roman Suzi
rnd at onego.ru
Wed Feb 12 12:18:02 EST 2003
On Wed, 12 Feb 2003, Afanasiy wrote:

>On Wed, 12 Feb 2003 15:50:53 GMT, Afanasiy <abelikov72 at hotmail.com> wrote:
>>On Wed, 12 Feb 2003 03:18:43 GMT, Afanasiy <abelikov72 at hotmail.com> wrote:
>>Now, even encoding the 'latin-1', 8 bit, is problematic, because symbols
>>which are 8 bit in Windows, such as the TradeMark symbol will not encode
>>into 8 bit, as the ordinal value in the Unicode object is 8482.
>>
>>This is hex 99 on a plain Windows 2000 install, I presume 'latin-1'.

That is why your Windows doesn't use latin-1.

$ grep -i trade /usr/local/lib/python2.3/encodings/*.py
cp1250.py:        0x0099: 0x2122, # TRADE MARK SIGN
cp1251.py:        0x0099: 0x2122, # TRADE MARK SIGN
cp1252.py:        0x0099: 0x2122, # TRADE MARK SIGN
cp1253.py:        0x0099: 0x2122, # TRADE MARK SIGN
cp1254.py:        0x0099: 0x2122, # TRADE MARK SIGN
cp1255.py:        0x0099: 0x2122, # TRADE MARK SIGN
cp1256.py:        0x0099: 0x2122, # TRADE MARK SIGN
cp1257.py:        0x0099: 0x2122, # TRADE MARK SIGN
cp1258.py:        0x0099: 0x2122, # TRADE MARK SIGN
mac_cyrillic.py:  0x00aa: 0x2122, # TRADE MARK SIGN
mac_greek.py:     0x0093: 0x2122, # TRADE MARK SIGN
mac_iceland.py:   0x00aa: 0x2122, # TRADE MARK SIGN
mac_latin2.py:    0x00aa: 0x2122, # TRADE MARK SIGN
mac_roman.py:     0x00aa: 0x2122, # TRADE MARK SIGN
mac_turkish.py:   0x00aa: 0x2122, # TRADE MARK SIGN
palmos.py:        0x0099: 0x2122, # TRADE MARK SIGN

So, you need to convert to one of these instead of latin-1.
(Hmmm... I thought cp1250 is latin1.)

Aliases of latin-1:

    '8859'            : 'latin_1',
    'cp819'           : 'latin_1',
    'csisolatin1'     : 'latin_1',
    'ibm819'          : 'latin_1',
    'iso8859'         : 'latin_1',
    'iso_8859_1'      : 'latin_1',
    'iso_8859_1_1987' : 'latin_1',
    'iso_ir_100'      : 'latin_1',
    'l1'              : 'latin_1',
    'latin'           : 'latin_1',
    'latin1'          : 'latin_1',

>>(Which is iso-8859-1 afaik) This will show up in webpages designated :
>>
>><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
>>
>>This will show up in notepad... and in my non-unicode text editors.
>>
>>It always shows up as the TradeMark symbol.
>>
>>So how would I encode this Unicode character, 8482 so that it would
>>show up as a TradeMark symbol on Windows 2000 machines. Windows 2000
>>can display a TradeMark symbol in non Unicode applications.
>
>To clarify, the TradeMark symbol is being transformed to Unicode #8482
>automatically, presumably by COM or ADO. In Python, I do not know how
>I am supposed to be able to print (for example) the Unicode object I
>receive which contains this transformed TradeMark symbol.

    s = u"Your Unicode string\u2122"
    s = s.replace(u"\u2122", u"(tm)")
    print s.encode("latin-1")

Or, most probably:

    s = u"Bill Gates Makes Your Life Interesting\u2122"
    print s.encode("cp1250")

Sincerely yours, Roman Suzi

P.S. All Trademarks belong to their respective owners. ;-)

--
rnd at onego.ru =\= My AI powered by Linux RedHat 7.3
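
For readers trying this today, here is a minimal sketch of the same idea in Python 3 syntax (the post itself uses Python 2, with u"..." literals and the print statement). It assumes the target machine uses cp1252, the usual Western European Windows code page; the post suggests cp1250, which maps the same byte 0x99 to the trademark sign. The helper name codecs_that_encode and the candidate list are made up for illustration and are not part of any library.

```python
# Sketch only: Python 3 syntax, where str is already Unicode.
TRADE_MARK = "\u2122"  # TRADE MARK SIGN, ordinal 8482

s = "Your Unicode string" + TRADE_MARK

# latin-1 (iso-8859-1) has no trademark sign, so a strict encode fails.
try:
    s.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# The Windows code pages map U+2122 to the single byte 0x99.
# cp1252 is an assumption about the target machine; cp1250 gives the same byte.
cp1252_bytes = s.encode("cp1252")

# Fallback from the post: substitute plain text, then latin-1 works.
fallback_bytes = s.replace(TRADE_MARK, "(tm)").encode("latin-1")


def codecs_that_encode(char, candidates):
    """Return the candidate codec names that can encode `char`.

    Hypothetical helper: a rough in-Python counterpart of the grep above.
    """
    usable = []
    for name in candidates:
        try:
            char.encode(name)
        except UnicodeEncodeError:
            continue
        usable.append(name)
    return usable


usable = codecs_that_encode(TRADE_MARK, ["latin-1", "cp1250", "cp1252", "mac_roman"])
print(latin1_ok, cp1252_bytes, fallback_bytes, usable)
```

Which code page to pick depends on the locale of the machine that will display the bytes, so treat the codec name as something to confirm on the target system rather than a fixed answer.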