Add support for newlines, backslashes, trailing comments and unquoted UTF-8 by bbc2 · Pull Request #148 · theskumar/python-dotenv
This was also caught by Flake8 as:
./dotenv/main.py:19:2: W605 invalid escape sequence '\$'
./dotenv/main.py:19:4: W605 invalid escape sequence '\{'
./dotenv/main.py:19:8: W605 invalid escape sequence '\}'
./dotenv/main.py:19:12: W605 invalid escape sequence '\}'
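W605 is reported when a regular string literal contains a backslash sequence that Python does not recognize, such as `\$` or `\{`. The usual fix is a raw string, so that the backslashes reach the regex engine unchanged. The sketch below illustrates this with a hypothetical pattern and helper name (`POSIX_VARIABLE`, `find_variables`); it is not necessarily the exact expression used in `dotenv/main.py`.

```python
import re

# Hypothetical pattern for illustration only.
# The r prefix keeps each backslash literal, so "\$", "\{" and "\}" are
# passed to the regex engine instead of being read as string escapes.
POSIX_VARIABLE = re.compile(r"\$\{[^\}]*\}")


def find_variables(text):
    """Return every ${...} reference found in text."""
    return POSIX_VARIABLE.findall(text)


print(find_variables("${HOME}/bin:${PATH}"))  # ['${HOME}', '${PATH}']
```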
This avoids the use of the `is_file` class variable by abstracting away the difference between `StringIO` and a file stream.
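A minimal sketch of that idea, assuming the source is either an in-memory `StringIO` or a filesystem path. The signature here is an assumption made for illustration and may differ from the method in the package.

```python
import io
from contextlib import contextmanager


@contextmanager
def get_stream(source):
    """Yield a readable text stream for either a StringIO or a path.

    Assumed signature for illustration; not the package's actual method.
    """
    if isinstance(source, io.StringIO):
        # The caller owns an in-memory stream, so it is left open.
        yield source
    else:
        # A path is opened here and closed when the with-block exits.
        with open(source, encoding="utf-8") as stream:
            yield stream


with get_stream(io.StringIO("KEY=value\n")) as stream:
    print(stream.read())
```

Callers use the same `with` statement in both cases, so no flag is needed to remember whether the stream has to be closed.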
Parsing .env files is a critical part of this package. To make it easier to change it and test it, it is important that it is done in only one place. Also, code that uses the parser now doesn't depend on the fact that each key-value binding spans exactly one line. This will make it easier to handle multiline bindings in the future.
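The design can be sketched as a single generator that yields key-value bindings, with every consumer built on top of it. The parsing logic below is deliberately simplistic, and the helpers `as_dict` and `lookup` are hypothetical names, not the package's API. The point is only that consumers iterate over bindings and never over lines.

```python
import io


def parse_stream(stream):
    """Yield (key, value) bindings; the only place that knows the syntax.

    Simplified for illustration: one binding per line, no quoting.
    """
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        yield key.strip(), value.strip()


def as_dict(stream):
    """Hypothetical consumer: collect all bindings into a dict."""
    return dict(parse_stream(stream))


def lookup(stream, wanted):
    """Hypothetical consumer: return the value bound to wanted, or None."""
    for key, value in parse_stream(stream):
        if key == wanted:
            return value
    return None


print(as_dict(io.StringIO("A=1\n# comment\nB=2\n")))  # {'A': '1', 'B': '2'}
```

Because neither consumer looks at lines, the body of `parse_stream` could later be replaced by a parser whose bindings span several lines without touching them.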
bbc2 deleted the improve-parser branch
johnbergvall pushed a commit to johnbergvall/python-dotenv that referenced this pull request
Aug 13, 2021

… UTF-8 (theskumar#148)

* Fix deprecation warning for POSIX variable regex

  This was also caught by Flake8 as:

      ./dotenv/main.py:19:2: W605 invalid escape sequence '\$'
      ./dotenv/main.py:19:4: W605 invalid escape sequence '\{'
      ./dotenv/main.py:19:8: W605 invalid escape sequence '\}'
      ./dotenv/main.py:19:12: W605 invalid escape sequence '\}'

* Turn get_stream into a context manager

  This avoids the use of the `is_file` class variable by abstracting away the difference between `StringIO` and a file stream.

* Deduplicate parsing code and abstract away lines

  Parsing .env files is a critical part of this package. To make it easier to change it and test it, it is important that it is done in only one place.

  Also, code that uses the parser now doesn't depend on the fact that each key-value binding spans exactly one line. This will make it easier to handle multiline bindings in the future.

* Parse newline, UTF-8, trailing comment, backslash

  This adds support for:

  * multiline values (i.e. containing newlines or escaped \n), fixes theskumar#89
  * backslashes in values, fixes theskumar#112
  * trailing comments, fixes theskumar#141
  * UTF-8 in unquoted values, fixes theskumar#147

  Parsing is no longer line-based. That's why `parse_line` was replaced by `parse_binding`. Thanks to the previous commit, users of `parse_stream` don't have to deal with this change.

  This supersedes a previous pull-request, theskumar#142, which would add support for multiline values in `Dotenv.parse` but not in the CLI (`dotenv get` and `dotenv set`).

  The key-value binding regular expression was inspired by https://github.com/bkeepers/dotenv/blob/d749366b6009126b115fb7b63e0509566365859a/lib/dotenv/parser.rb#L14-L30

  Parsing of escapes was fixed thanks to https://stackoverflow.com/questions/4020539/process-escape-sequences-in-a-string-in-python/24519338#24519338
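To illustrate what parsing bindings instead of lines can look like, here is a rough sketch. The regular expression and the names `parse_bindings` and `decode_escapes` are simplified assumptions, not the package's actual code. It also assumes that escapes are decoded only inside quoted values, and that a `#` always starts a comment in an unquoted value. Escape decoding follows the general approach of the linked Stack Overflow answer: only recognized escape sequences are decoded, so non-ASCII text is left untouched.

```python
import codecs
import re

# Hypothetical, simplified binding pattern; matched against the whole text.
_binding = re.compile(
    r"""
    [ \t]*                       # leading whitespace
    (?:export[ \t]+)?            # optional export prefix
    ([A-Za-z_][A-Za-z0-9_]*)     # key
    [ \t]*=[ \t]*                # separator
    (?:
        "((?:[^"\\]|\\.)*)"      # double-quoted value, may contain newlines
      | '((?:[^'\\]|\\.)*)'      # single-quoted value, may contain newlines
      | ([^\#\r\n]*)             # unquoted value, ends at a comment or EOL
    )
    [ \t]*(?:\#[^\r\n]*)?        # optional trailing comment
    (?:\r\n|\n|\r|$)             # end of line or end of input
    """,
    re.VERBOSE | re.DOTALL,
)

# Single-character escapes only, to keep the sketch short.
_escape = re.compile(r"\\[\\'\"abfnrtv]")


def decode_escapes(value):
    """Decode recognized backslash escapes and leave everything else alone."""
    return _escape.sub(
        lambda m: codecs.decode(m.group(0), "unicode-escape"), value
    )


def parse_bindings(text):
    """Yield (key, value) pairs; a binding may span several lines."""
    pos = 0
    while pos < len(text):
        match = _binding.match(text, pos)
        if match is None:
            # Not a binding (blank line, comment, junk): skip to the next line.
            newline = text.find("\n", pos)
            if newline == -1:
                break
            pos = newline + 1
            continue
        key, double, single, unquoted = match.groups()
        if double is not None:
            value = decode_escapes(double)
        elif single is not None:
            value = decode_escapes(single)
        else:
            value = unquoted.rstrip()
        yield key, value
        pos = match.end()


sample = 'A="line1\nline2" # note\nB=café # trailing\nC="tab\\there"\n'
print(dict(parse_bindings(sample)))
```

With this input, `A` keeps its embedded newline, `B` keeps its non-ASCII character and loses the trailing comment, and the `\t` in `C` is decoded to a tab character.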