bpo-37111: Add 'encoding' and 'errors' parameters to logging.basicCon… by vsajip · Pull Request #14008 · python/cpython

sourcejedi added a commit to sourcejedi/etckeeper that referenced this pull request

@sourcejedi

stderr does not belong to etckeeper-dnf
---------------------------------------

The Ansible dnf module uses the python dnf bindings.  In contexts like
these, stdout/stderr is owned by the host app (Ansible).  dnf should not
mess with stdout/stderr, unless the host app asks it to log there.

Specifically, we were breaking the JSON output of the Ansible dnf module.
This was only noticeable when the etckeeper message began with a "["
character.  Ansible has a mechanism to try and skip header messages like
login banners.  However, "[" is a valid character to start a JSON document.

https://unix.stackexchange.com/questions/511210/ansible-dnf-module-module-failure/

Solution
--------

Instead, log any non-zero exit status through the dnf logger.

For the pre-transaction etckeeper run, this message replaces an exception.
So we now avoid logging a traceback, which did not appear to add anything
useful.  (In my testing, dnf was continued to install after the exception
was logged).

I have also added a warning message for the post-transaction etckeeper run.

I switched from os.system() to subprocess.call().  The latter interface
supports better error reporting.  (When I tested os.system(), it returned
an errno number which was indistinguishable from an exit code).

etckeeper >/dev/null 2>&1
--------------------------

Any specific error messages from etckeeper are now discarded.  Because
unfortunately, python conventions about passing text strings have been
somewhat unclear.  It is an error to log strings if the encoding settings
of a registered logger cannot handle them.  Because a "bad" character
causes the entire string to be discarded, and optionally an exception
printed to stderr.  In a previous proposal I showed code that should be
correct in all expected cases.  However, the length of comment required to
define "all expected cases" was not very reassuring.

That was on top of explaining that dnf has a workaround that will avoid
raising exceptions when arbitrary unicode is logged to stdout.  My proposal
had to include that explanation, to show that we were not breaking a
system package manager.

It might sound strange to talk about error messages with characters
we cannot encode.  But I have one word for you: "filenames".

(After the recent python PR #14008[1], logging.basicConfig() defaults
to errors='backslashreplace'.  Also, errors= can be manually specified when
creating a lower-level logging "Handler".  I think this will resolve the
problem in the future.  It means that when programs are written for this
new version of python, it should be the responsibility of the log setup
code to prevent UnicodeEncodeError.)

[1] python/cpython#14008