use `sys.platform == "cygwin"` to figure out when we are using cygwin by gcmarx · Pull Request #2027 · gitpython-developers/GitPython
I've looked into this further.
Keeping uname can be temporary, I hope
I’m wondering if the git executable itself can be convinced to tell us about itself, but don’t currently have a way to test that. I’ll try to poke around at that later this week.
Thanks! Yes, using uname for this in the way we have been doing (or in any way at all, really) doesn't facilitate high-quality detection of whether git is a Cygwin program. Nonetheless, I think the changes in #2026 up to cffa264 (i.e. the tip of the branch prior to recent changes) are strictly improvements compared to what is on our main now, and that they can probably be integrated as an initial mitigation of--though still not a complete fix for--#1979.
I agree that, ultimately, we will want to do something else. I do think that will involve inspecting the contents of the git executable; though it might imaginably use some other technique to figure it out; or it might turn out not to be worthwhile for GitPython to try to figure this out, in which case it could still be changed to just check sys.platform == "cygwin", as in this PR.
Why I advocate starting that way
The reason I would favor the (original) approach taken in #2026 as a first step is that it is maximally consistent with the previous behavior. This is valuable because GitPython is widely used, and it's easy to break existing usage by accident, including in Cygwin-related ways (e.g. #1646).
I think that makes it a good initial change, and since it addresses #1979 in the cases where people seem usually to have encountered it--where the environment has nothing to do with Windows or Cygwin--we could even maybe do a patch release shortly afterwards. Even if not, it would be in place so that, if a patch release is done for some other reason, it will have it.
Also, as detailed below (at the very end), I am not sure if always returning True when sys.platform == "cygwin" does the right thing on MSYS2.
Counterpoints/caveats
Although my position is that we should not rush to replace the implementation of is_cygwin_git with a simple check to sys.platform == "cygwin", my analysis would be incomplete if I did not acknowledge three arguments for doing so:
- It's very simple.
- It's not necessarily less accurate than what we've been doing.
- Code from before 2022 that it would break has probably already broken.
The first point is self-evident, while the second and third touch on some things that are potentially of interest to continuing work on is_cygwin_git no matter how we proceed, and are thus detailed below.
2. The uname way isn't very accurate
Neither the value of
sys.platformnor “unamein the same directory asgit” seem like a bulletproof way to tell whether thegitin question is Cygwin git or not.
Yes, the "uname in the same directory as git" way we've been doing is admittedly not good, and very brittle.
In terms of accuracy, it may be worse than the approach here of just checking sys.platform == "cygwin" and assuming git is a Cygwin program if the Python interpreter is a Cygwin program. This may be so even when changed to include such a check (though that change, in 3f5a942, is good and definitely one of the benefits of #2026). Here are a few examples of why this may be so:
-
GitPython could be used in a Python script run by (or otherwise through)
git, such as a customgitcommand, a transport command, or a hook. Then thegitexecutable GitPython finds will usually be the one ingit'slibexec/git-coredirectory--more precisely, the one in the directorygit --exec-pathreports. This is becausegitcurrently unconditionally prepends that directory toPATHwhen performing most subprocess invocations. That directory does not also contain aunameexecutable. -
Checking if the
gitexecutable is in the same directory as aunameexecutable that reports the operating system as Cygwin is--when it works--effectively a way to check ifgitis in a Cygwin/usr/bindirectory where the Cygwin package manager places both binaries.is_cygwin_gitand supportingpy_wherefunctions are written as they are for historical reasons. As detailed below, when they were introduced, the check had different semantics, and the standard library had fewer facilities. But if we were to introduce wholly new code today that guesses, based on the location ofgit, whethergitis from Cygwin, then (ignoring the MSYS2 issue) we would probably do something like this instead:sys.platform == "cygwin" and git_executable and shutil.which(git_executable) == "/usr/bin/git"
The problem is that one can have Cygwin
gitelsewhere--even when using the "primary"gitexecutable rather than the one inlibexec/git-core. This happens if one builds it from source oneself to test a patch or use a different version, or if one runs it through a symlink. -
Eliminating the existing implementation, as done in this PR, would solve other problems unrelated to #1979.
When
is_cygwin_gitis used, the value of itsgit_executableargument is most often"git". To avoid an excessive slowdown,is_cygwin_gitcaches results per-argument in_is_cygwin_cache. This cache entry is never invalidated. But which executable"git"runs--and even where that executable is found--can change. This can happen by changing the contents of the filesystem or changing the value of thePATHenvironment variable. Such a change has other notable effects, including that the cachedgitversion number may no longer be correct. We intend that programs that use GitPython can adjust to such a change by calling its top-levelrefreshfunction. For the most part, this works properly (including invalidating the version). But it does not modify_is_cygwin_cache.This can, of course, be fixed. But I acknowledge that if we no longer try to figure out if the
gitexecutable is a Cygwin program, then that would eliminate this bug automatically. (To be clear, the onus is not on #2026 to fix this bug, since nothing in that PR makes it any worse.)
So it may be that such cases--of which I suspect there are various others--make it so that the simpler change here of just using sys.platform == "cygwin" is more accurate not only than the current implementation but than anything resembling it. However, even if improving overall accuracy, we would run the risk of introducing new inaccuracies in particular unanticipated use cases.
3. The semantics of is_cygwin_git changed a few years ago
In #2027 (review) I noted that the distinction between the Python interpreter being a Cygwin program and the git executable being a Cygwin program is long-standing, and that it can be important. That's true. However, the current behavior of is_cygwin_git only really dates back to 2022. Anything in software using GitPython that a major change to is_cygwin_git would break now is probably not older than that, or it would've been broken then.
There have been two big pushes to improve Cygwin compatibility in GitPython:
-
The first was Support Cygwin's Git on Windows #533 in 2016, which introduced
is_cygwin_gitin e6e23ed. At that time, it began:if not is_win: return False Where:
is_win = (os.name == 'nt') I believe that, even at that time and even in Python 2,
os.nameon Cygwin evaluated to"posix"as it does today. Thus, the scenario whereis_cygwin_gitwould returnTruewas when the Python interpreter was a native Windows executable andgitwas a Cygwin executable! This makes sense, since the title of that PR was "Support Cygwin's Git on Windows".At that time, AppVeyor was used for CI. The
.appveyor.ymlfile from that time confirms that this is what was going on--the original purpose ofis_cygwin_gitwas to allow native Windows Python programs to use GitPython with either Git for Windows or Cygwingit. In relevant part:environment: GIT_DAEMON_PATH: "C:\\Program Files\\Git\\mingw64\\libexec\\git-core" CYGWIN_GIT_PATH: "C:\\cygwin\\bin;%GIT_DAEMON_PATH%" CYGWIN64_GIT_PATH: "C:\\cygwin64\\bin;%GIT_DAEMON_PATH%" matrix: ## MINGW # - PYTHON: "C:\\Python27" PYTHON_VERSION: "2.7" GIT_PATH: "%GIT_DAEMON_PATH%" - PYTHON: "C:\\Python34-x64" PYTHON_VERSION: "3.4" GIT_PATH: "%GIT_DAEMON_PATH%" - PYTHON: "C:\\Python35-x64" PYTHON_VERSION: "3.5" GIT_PATH: "%GIT_DAEMON_PATH%" - PYTHON: "C:\\Miniconda35-x64" PYTHON_VERSION: "3.5" IS_CONDA: "yes" GIT_PATH: "%GIT_DAEMON_PATH%" ## Cygwin # - PYTHON: "C:\\Miniconda-x64" PYTHON_VERSION: "2.7" IS_CONDA: "yes" IS_CYGWIN: "yes" GIT_PATH: "%CYGWIN_GIT_PATH%" - PYTHON: "C:\\Python35-x64" PYTHON_VERSION: "3.5" GIT_PATH: "%CYGWIN64_GIT_PATH%" IS_CYGWIN: "yes" install: - set PATH=%PYTHON%;%PYTHON%\Scripts;%GIT_PATH%;%PATH% -
The second was Re-enable Cygwin CI and get most tests passing #1455 in 2022, which included this very significant change to
is_cygwin_gitin 96fae83:def is_cygwin_git(git_executable: Union[None, PathLike]) -> bool: - if not is_win: + if is_win: + # is_win seems to be true only for Windows-native pythons + # cygwin has os.name = posix, I think return False
Much as #533 was guided by testing on AppVeyor set up so that Cygwin
gitwas being called only from native Windows builds of Python, #1455 was guided by testing on GitHub Actions set up so that Cygwingitwas being called only from Cygwin builds of Python.I cannot tell from the history whether this distinction was recognized at the time! It may be that this change was made based on the belief that
is_cygwin_githad already been intended to returnFalseexcept in Cygwin builds of Python, I am not sure. But it may be that it was intentional. It would make sense if, over time, it has become rarer and rarer to use anything but Git for Windows forgiton Windows, except when operating entirely within some other specific environment.
An important question is whether, since #1455, there is anything in GitPython that actually works because of the ability to identify a non-Cygwin git executable from Cygwin. Operations with such an executable seem either to work on main, in the original #2026 (cffa264), and here... or not to work in any of the three. An example of something that should in principle work if GitPython on Cygwin supports non-Cygwin git, but that fails and fails the same way in all three, is:
(.venv) ✔ ~/repos-cygwin/GitPython [main|⚑ 1]
09:27 $ GIT_PYTHON_GIT_EXECUTABLE=/cygdrive/c/Users/ek/scoop/shims/git python3.9 -c 'import git; r1 = git.Repo.clone_from("https://github.com/EliahKagan/trivial.git", "trivial"); r2 = r1.clone("trivial-clone", bare=True); print(r2.git_dir)'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/ek/repos-cygwin/GitPython/git/repo/base.py", line 1482, in clone
return self._clone(
File "/home/ek/repos-cygwin/GitPython/git/repo/base.py", line 1414, in _clone
finalize_process(proc, stderr=stderr)
File "/home/ek/repos-cygwin/GitPython/git/util.py", line 504, in finalize_process
proc.wait(**kwargs)
File "/home/ek/repos-cygwin/GitPython/git/cmd.py", line 834, in wait
raise GitCommandError(remove_password_if_present(self.args), status, errstr)
git.exc.GitCommandError: Cmd('/cygdrive/c/Users/ek/scoop/shims/git') failed due to: exit code(128)
cmdline: /cygdrive/c/Users/ek/scoop/shims/git clone -v --bare -- /home/ek/repos-cygwin/GitPython/trivial/.git trivial-clone
stderr: 'fatal: repository '/home/ek/repos-cygwin/GitPython/trivial/.git' does not exist
'
What happens there is that a decygpath operation is either not being done or not being done correctly in at least one place where it would need to happen to support Git for Windows git from Cygwin. I am not sure if this has ever worked.
So that's admittedly a reason to prefer this PR: it's not clear under what circumstances, if any, is_cygwin_git has done any better than this, since #1455.
But I think this examination of the history also shows that the original #2026 (up to cffa264) is safe and strictly improves correctness. The sys.platform == "win32" condition on main descends from the is_win check. (sys.platform == "win32" always has the same value as os.name == "nt", at least in Python 3 and I think since well before that.) When I introduced the TODO comment in 42e10c0 (#1859), I was worried that doing the check even when the platform doesn't seem to be Cygwin might've been intentional, and that checking if git seems to be a Cygwin executable only when sys.platform == "cygwin" might somehow break something. I think looking at the history confirms that this is not the case.
But another reason to start slow is MSYS2
Somewhat confusingly, Cygwin is not the only platform in which sys.platform is "cygwin".
We don't test MSYS2 on CI, though maybe we should. I don't know how often, if at all, people use GitPython in a Python interpreter that targets MSYS2. (This should not be confused with other environments that MSYS2 supplies, such as MINGW64, which is actually a native Windows target rather than a Cygwin-like target. I am talking about the MSYS environment of MSYS2, which is a non-Cygwin but Cygwin-like target.)
In MSYS2, sys.platform is "cygwin":
ek@Glub MSYS ~
$ python -c 'import sys; print(sys.platform)'
cygwin
But uname does not report CYGWIN:
ek@Glub MSYS ~
$ uname
MSYS_NT-10.0-19045
The facilities within GitPython that do Cygwin-related things, such as cygpath and decygpath, and their helpers, in git/util.py--which are used in more places when git is detected to be a Cygwin build--don't look like they will always work on MSYS2. While a MSYS2 does have /proc/cygdrive, it does not have /cygdrive; a path like /cygdrive/c in Cygwin is just /c is MSYS2. There may be other relevant differences.
As things stand now, where MSYS2 git run from MSYS2 is not considered to be a Cygwin git, MSYS2 does seem to work pretty well, though maybe not perfectly. I say that based on this test run; see #1988 (comment) for context. It may be good to look into that further before doing anything that would cause MSYS2 to be treated more like Cygwin.
This also applies to the strings approach pushed to #2026 after cffa264--strings will turn up cygwin in MSYS2 git as well:
ek@Glub MSYS ~
$ strings /usr/bin/git | grep -F cygwin
cygwin_conv_path
cygwin_internal
__imp_cygwin_internal
__imp_cygwin_conv_path
So this is part of why I'd be interested to integrate cffa264 first.