7.1. Unicode — Python - from None to AI
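re.UNICODE: Unicode-aware matching (the default for str patterns in Python 3)
re.ASCII: restricts \w, \d, \s and \b to ASCII characters
ASCII covers the unaccented letters of the Latin alphabet
Unicode also includes letters with diacritics and accents (ą, ś, ć, ł, ó, ź, ż, etc.)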
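The difference between the two flags can be sketched with `\w`. This is a minimal illustration; the sample text and the variable names are assumptions chosen for the example, not part of the original material.

```python
import re

# Hypothetical sample text containing Polish diacritics
TEXT = 'zażółć gęślą jaźń'

# str patterns are Unicode-aware by default, so \w matches accented letters
unicode_words = re.findall(r'\w+', TEXT)

# re.ASCII restricts \w to [a-zA-Z0-9_], so accented letters split the words
ascii_words = re.findall(r'\w+', TEXT, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
```

With `re.ASCII` every accented letter acts as a word separator, which is rarely what is wanted for non-English text.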
>>> import string
>>>
>>>
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
>>>
>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>>
>>> string.ascii_letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> import unicodedata
>>>
>>>
>>> unicodedata.name('a')
'LATIN SMALL LETTER A'
>>>
>>> unicodedata.name('ą')
'LATIN SMALL LETTER A WITH OGONEK'
>>>
>>> unicodedata.name('ś')
'LATIN SMALL LETTER S WITH ACUTE'
>>>
>>> unicodedata.name('ł')
'LATIN SMALL LETTER L WITH STROKE'
>>>
>>> unicodedata.name('ż')
'LATIN SMALL LETTER Z WITH DOT ABOVE'
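The reverse direction, from an official name back to the character, can be sketched with `unicodedata.lookup()` and the `\N{...}` string escape. The variable names are illustrative assumptions.

```python
import unicodedata

# Look a character up by its official Unicode name
a_ogonek = unicodedata.lookup('LATIN SMALL LETTER A WITH OGONEK')

# The \N{...} escape performs the same lookup inside a string literal
z_dot = '\N{LATIN SMALL LETTER Z WITH DOT ABOVE}'

print(a_ogonek, z_dot)
```

Both forms are handy when a source file has to stay ASCII-only but still needs accented characters.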
>>> print('\U0001F680')
🚀
>>> import unicodedata
>>>
>>>
>>> a = '\U0001F9D1'  # 🧑
>>> b = '\U0000200D'  # zero width joiner (invisible)
>>> c = '\U0001F680'  # 🚀
>>>
>>> astronaut = a + b + c
>>> print(astronaut)
🧑‍🚀
>>>
>>> unicodedata.name(a)
'ADULT'
>>>
>>> unicodedata.name(b)
'ZERO WIDTH JOINER'
>>>
>>> unicodedata.name(c)
'ROCKET'
>>>
>>> unicodedata.name(astronaut)
Traceback (most recent call last):
TypeError: name(): argument 1 must be a unicode character, not a string of length 3
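Because `unicodedata.name()` accepts exactly one character, a joined emoji has to be named one code point at a time. A minimal sketch, assuming the same three code points as above (the variable `names` is illustrative):

```python
import unicodedata

# Adult + zero width joiner + rocket, written as escapes so the invisible
# joiner is not lost when the code is copied
astronaut = '\U0001F9D1\u200D\U0001F680'

# Iterating over a str yields one code point at a time
names = [unicodedata.name(char) for char in astronaut]

print(len(astronaut))
print(names)
```

What is displayed as a single emoji is therefore a string of length 3 as far as Python is concerned.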