Add Janet Lexer by sogaiu · Pull Request #2557 · pygments/pygments

@sogaiu

This PR is an attempt at adding a lexer for Janet.

There was an earlier attempt, PR #1519 by uvtc [1], but the current PR is mostly an independent work.

Snippet test content was adapted from the custom tests for tree-sitter-janet-simple.

Result of running tox locally gave roughly:

============ 4274 passed, 12 skipped in 17.71s =============
.pkg: _exit> python <path-elided>/pygments/venv/lib/python3.12/site-packages/pyproject_api/_backend.py True setuptools.build_meta
  py: OK (19.22=setup[0.94]+cmd[18.28] seconds)
  congratulations :) (19.31 seconds)

Cc: @pyrmont, @uvtc


[1] The earlier PR was studied however, and thus I've included uvtc in the AUTHORS file (along with pyrmont who helped with this PR).

@sogaiu

The most recent force-push was for updating handling of radix-based numbers. Number handling should be more uniform across decimal, hex, and radix now.

@sogaiu

I removed the use of an f-string and pushed that change.

@sogaiu

BTW, I've not had luck getting tox -e check to work locally yet.

Here is part of the invocation output that seems to result in erroring out:

python scripts/check_sources.py -i pygments/lexers/_mapping.py -i pygments/styles/_mapping.py -i docs/_build -i pygments/formatters/_mapping.py -i pygments/unistring.py -i tests/support/empty.py
venv/lib/python3.12/site-packages/virtualenv/config/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/config/cli/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/seed/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/seed/embed/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/seed/embed/via_app_data/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/seed/embed/via_app_data/pip_install/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/discovery/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/util/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/create/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/create/via_global_ref/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/create/via_global_ref/builtin/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/create/via_global_ref/builtin/cpython/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/create/via_global_ref/builtin/pypy/__init__.py:0: empty file
venv/lib/python3.12/site-packages/virtualenv/run/plugin/__init__.py:0: empty file
venv/lib/python3.12/site-packages/chardet/cli/__init__.py:0: empty file
venv/lib/python3.12/site-packages/chardet/metadata/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pluggy/_manager.py:218: using == None/True/False
venv/lib/python3.12/site-packages/pip/_vendor/chardet/cli/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_vendor/chardet/metadata/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_vendor/urllib3/connectionpool.py:321: using == None/True/False
venv/lib/python3.12/site-packages/pip/_vendor/urllib3/contrib/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_vendor/urllib3/contrib/_securetransport/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_vendor/urllib3/packages/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_vendor/urllib3/packages/backports/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_vendor/resolvelib/compat/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_internal/resolution/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_internal/resolution/legacy/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_internal/utils/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_internal/operations/__init__.py:0: empty file
venv/lib/python3.12/site-packages/pip/_internal/operations/build/__init__.py:0: empty file
venv/lib/python3.12/site-packages/tox/config/__init__.py:0: empty file
venv/lib/python3.12/site-packages/tox/config/cli/__init__.py:0: empty file
venv/lib/python3.12/site-packages/tox/config/loader/__init__.py:0: empty file
venv/lib/python3.12/site-packages/tox/util/__init__.py:0: empty file
venv/lib/python3.12/site-packages/tox/tox_env/python/__init__.py:0: empty file
venv/lib/python3.12/site-packages/tox/tox_env/python/virtual_env/__init__.py:0: empty file
venv/lib/python3.12/site-packages/tox/tox_env/python/virtual_env/package/__init__.py:0: empty file
venv/lib/python3.12/site-packages/tox/tox_env/python/pip/__init__.py:0: empty file
venv/lib/python3.12/site-packages/tox/session/cmd/__init__.py:0: empty file
40 errors found.

That output looks related to this line.

The version of tox here appears to be 4.11.3 and it looks to me like in that version, tox/session/cmd/__init__.py is empty. At least, that's what this suggests to me.

Similarly for pip, the version of pip here appears to be 23.2.1 and it looks to me like in that version, pip/_internal/operations/build/__init__.py is empty.

Or may be I am just confused...in any case, any hints about how I might get tox -e check to work locally?

@sogaiu

I made some modifications to check_sources.py to proceed beyond the point of the empty file error output.

I got further error output that looks like:

python scripts/check_sources.py -i pygments/lexers/_mapping.py -i pygments/styles/_mapping.py -i docs/_build -i pygments/formatters/_mapping.py -i pygments/unistring.py -i tests/support/empty.py
venv/lib/python3.12/site-packages/pluggy/_manager.py:218: using == None/True/False
venv/lib/python3.12/site-packages/pip/_vendor/urllib3/connectionpool.py:321: using == None/True/False
2 errors found.

I took a look at those lines and what I found for pluggy 1.3.0 was:

        # if self._name2plugin[name] == None registration was blocked: ignore

and for pip 23.2.1 was:

            # This should never happen if self.block == True

These lines do not look like they are active to me (comment / commented?).

I tried disabling the associated checking in check_sources.py and reran tox -e check. The final output was:

  check: OK (18.14=setup[0.95]+cmd[0.04,0.14,2.01,1.81,13.19] seconds)
  congratulations :) (18.23 seconds)

@jeanas

You apparently have a virtual environment venv/ in your source tree. The tox -e check command simply looks at all .py files, so it's also finding the sources of other Python packages installed in the venv. You should just remove it.

@sogaiu

Thanks a lot for taking a look and the explanation.

venv is the virtual environment I'm using while working on pygments.

It seems that adding -i venv to an appropriate location in tox.ini works as well so I went with that.

@jeanas

You don't need a venv. The whole point of tox is that it manages venvs for you.

@sogaiu

Ah, indeed? I had no idea.

That sounds nice. Thanks for the explanation.

@sogaiu

@jeanas

As a rule, you should use your distro exclusively to install things into the global environment, and use virtual environments for everything else, possibly through a tool like tox or pipx.

Never ever use pip outside of a virtual environment unless you know very well what you're doing. (Depending on your distro, it might already give an error to protect you, if it has already adopted a standard known as PEP 668).

There is inconsistent advice about this in the wild because the situation was confusing and bad for a long time, but it's clarified now.

In the case of tox, simply try installing the tox package.

@sogaiu

It's nice to hear things have been clarified, thanks for mentioning it. The last time I used Python in any significant capacity I don't remember it being that way.

Working with multiple projects also means that one might want a different version of a tool per project for reliable / reproducible results, so I try to avoid installing certain tools globally.

jeanas

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work, here is some review.

sogaiu added 10 commits

November 7, 2023 12:02

@sogaiu

Thanks a lot for the review and suggestions.

I think I've responded to all comments and made attempts to address each item (except the Unicode identifier bits for which I gave my current thoughts).

(I've also renamed following_chars* and made related adjustments for readability and length reasons.)

jeanas

sogaiu added 2 commits

November 7, 2023 19:09

@Anteru

Uh ok. I was just about to merge this. What's the reason for closing this? We'd obviously like to have a Janet lexer in the next release.

@sogaiu

Ah sorry, there is a discussion starting here that gives background.

To give a little bit more information, we're using a hosted Zulip setup (which uses pygments AFAICT) for Janet community discussions and (AFAIK) we have no fine-grained control over how our pygments setup would behave (i.e. I don't think we can install a plugin to get alternate behavior for Janet syntax highlighting).

If pygments ends up with a Janet lexer with the requested changes to identifiers to include most of Unicode and we specify using Janet syntax highlighting, I believe (non-ASCII containing) Unicode identifiers will end up being shown as acceptable with no warning (see this comment for some of my concerns).

At this time, I do not feel comfortable being partially responsible for that kind of situation.

I didn't realize this might happen before submitting the initial PR. I realize the pygments project has its own priorities and can appreciate the comment here. Thus, I thought it would be better to hold off until the situation is better understood (or we find some way to customize our highlighting).

@jeanas

After discussing it we are OK with merging this in its current form. Thus I will reopen it and merge it. However, the 2.17 release was done in the meantime and there needs to be a bugfix release, so please allow a few days before the merge.

@sogaiu

Thanks for the consideration and explanation.

@jeanas

Thank you for your contribution.