Google Docs API returns dates as Unicode (e.g., \ue907)
Example code:
import os

from google.auth import default
from google.oauth2.service_account import Credentials
from googleapiclient.discovery import build


def get_document_dates(doc_id, creds_file=None):
    scopes = ['https://www.googleapis.com/auth/documents.readonly']
    if creds_file and os.path.exists(creds_file):
        # from_service_account_file is defined on service_account.Credentials,
        # not on google.oauth2.credentials.Credentials
        creds = Credentials.from_service_account_file(creds_file, scopes=scopes)
    else:
        # Fall back to Application Default Credentials
        creds, project = default(scopes=scopes)

    # Build the Docs API service
    service = build('docs', 'v1', credentials=creds)

    # Get the document
    document = service.documents().get(
        documentId=doc_id,
        fields='body'
    ).execute()

    # Access the document's content
    content = document.get('body').get('content')

    # Process each element
    for element in content:
        if 'paragraph' in element:
            paragraph = element.get('paragraph')
            elements = paragraph.get('elements', [])
            for elem in elements:
                print(elem)
The first section of the doc:
I want to parse this date via the Python API: Jan 13, 2025.
The first few elements printed:
{'startIndex': 1, 'endIndex': 5, 'textRun': {'content': '\ue907 | ', 'textStyle': {}}}
{'startIndex': 5, 'endIndex': 6, 'richLink': {'richLinkId': 'kix.p3Xj3hkh7bXl', 'textStyle': {}, 'richLinkProperties': {'title': 'Asana Board New NGS Submissions', 'uri': 'https://www.google.com/calendar/event?eid=XXX'}}}
{'startIndex': 6, 'endIndex': 7, 'textRun': {'content': '\n', 'textStyle': {}}}
{'startIndex': 7, 'endIndex': 18, 'textRun': {'content': 'Attendees: ', 'textStyle': {}}}
The date is returned in the first element as \ue907. How can that be converted to a date?
Note: there is a richLinkId in the second element, but that is for a separate calendar element, and not the Jan 13, 2025 date element.
More generally, why are date elements returned as an opaque Unicode character instead of something easier to work with?
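For what it's worth, \ue907 is code point U+E907, which falls in the Unicode Private Use Area (U+E000 to U+F8FF). Code points in that range have no standard meaning, so the character on its own cannot be decoded into Jan 13, 2025. The sketch below uses only the standard library to flag such characters in the printed elements. The helper name `private_use_chars` is hypothetical, the sample list is trimmed from the output above, and reading the character as a stand-in for the date element is an assumption rather than documented API behavior.

```python
import unicodedata


def private_use_chars(text):
    """Return the characters in `text` whose Unicode category is 'Co' (private use)."""
    return [ch for ch in text if unicodedata.category(ch) == "Co"]


# Sample elements copied from the output above, trimmed to the relevant keys.
elements = [
    {"startIndex": 1, "endIndex": 5, "textRun": {"content": "\ue907 | "}},
    {"startIndex": 7, "endIndex": 18, "textRun": {"content": "Attendees: "}},
]

for elem in elements:
    content = elem.get("textRun", {}).get("content", "")
    flagged = private_use_chars(content)
    if flagged:
        codes = ", ".join(f"U+{ord(ch):04X}" for ch in flagged)
        print(f"index {elem['startIndex']}: private-use character(s) {codes}")
```

Run against the sample, this flags only the first element. It does not recover the date. It only confirms that the text run carries a private-use character and not encoded date text.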
