`Git.execute` doc is inaccurate on Windows and recommends `shlex` too broadly

The behavior of `Git.execute` when a single string is passed has long been unclear from the documentation, and remains unclear after #2144. I am inclined to consider that PR to have been a move in the right direction--but it also introduced some significant inaccuracies, with possible security implications.

This is a documentation bug only. It can be seen as a sequel to #2016, reflecting the situation as it stands since #2144 (which closed #2016). I hope to fix this soon by building on #2144 to combine the advantages of the wording both before and after that change while at least mostly addressing the disadvantages of both. But I may not be able to do so immediately--and also maybe someone else has a better idea of how to do it than I do--so I'm opening this to track it.

### The documentation bug

Since #2144, the `Git.execute` docstring claims:

> A string is accepted, but with `shell` set to ``False`` it is passed as a single executable name to `subprocess.Popen`. For example, ``"git log -n 1"`` looks for an executable literally named ``git log -n 1`` and will fail with `GitCommandNotFound`.

While that is accurate on Unix-like systems, it is inaccurate on Windows. For example:

```text
(GitPython) E:\repos\GitPython [master]> python
Python 3.12.13 (main, May  4 2026, 21:19:41) [MSC v.1944 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import git
>>> g = git.Git()
>>> g.execute(['git', 'version'])
'git version 2.54.0.windows.1'
>>> g.execute('git version')
'git version 2.54.0.windows.1'
>>> g.execute(['git', 'version'], shell=False)
'git version 2.54.0.windows.1'
>>> g.execute('git version', shell=False)
'git version 2.54.0.windows.1'
```

`shell=False` is the default and thus makes no difference there, but I included it to make extra clear that no shell is involved in parsing the command. And to use the example from the documentation:

```text
>>> g.execute('git log -n 1')
'commit 7b83f7a99af4313264650339e08601834b3ca968\nMerge: c7648c0c d172e541\nAuthor: Sebastian Thiel <sebastian.thiel@icloud.com>\nDate:   Thu May 7 18:50:11 2026 +0800\n\n    Merge pull request #2144 from mvanhorn/osc/2016-clarify-execute-string-arg\n    \n    docs(cmd): clarify Git.execute() string vs list command argument'
>>> g.execute('git log -n 1', shell=False)
'commit 7b83f7a99af4313264650339e08601834b3ca968\nMerge: c7648c0c d172e541\nAuthor: Sebastian Thiel <sebastian.thiel@icloud.com>\nDate:   Thu May 7 18:50:11 2026 +0800\n\n    Merge pull request #2144 from mvanhorn/osc/2016-clarify-execute-string-arg\n    \n    docs(cmd): clarify Git.execute() string vs list command argument'
```

### What's going on

On Unix-like systems, a command's arguments are an array of separate strings. If a command is written as a string, a shell--or something else implementing shell-like syntax--has to parse that string into separate arguments, and they are passed separately. The program actually receives a C-style `argv` array. This is generally intuitive to programmers, and it is Python quirks that are unintuitive: that `subprocess.Popen`, and thus `Git.execute`, convert a single string into a singleton sequence is confusing; I think this motivated #2144, and the clarity problem that it has made progress toward addressing is indeed real.

The opposite happens on Windows, where programs do not really receive separate arguments, but are instead handed a command-line string, which the program is then itself responsible for parsing. Different programs can, and sometimes do, even choose different ways to parse them! Fortunately, most Windows programs use the same rules to do so, and the operating system provides a facility to do this easily (which C programs that use 2- or 3-argument signatures for `main` call automatically before `main` runs). Unfortunately, these quoting rules are different from those used in any other operating system or any shell, and they are somewhat confusing. The command-line string furthermore is *also* parsed before the program runs, but only to determine what part of it identifies the program that is to be run (unless that is specified in another way, which is uncommon).

So on Windows, passing separate arguments actually requires them to be combined using Windows-specific whitespace and quoting/escaping syntax *before* the operating system is asked to run the program. `Git.execute` delegates to `subprocess.Popen`, which takes care of combining arguments into a command-line string. Currently it does so by calling its [`list2cmdline`](https://github.com/python/cpython/blob/92d323aa023ee07b14ddca71a8185018f15e279f/Lib/subprocess.py#L583) helper, but that is an implementation detail on which we are not supposed to rely. Having obtained a single string, `Popen` then uses `CreateProcessW` in the Windows API to run the program. So long as the program being run recognizes the same syntax `list2cmdline` uses when combining arguments--which is the syntax Microsoft suggests, though some programs use custom parsing rules--running a program using a sequence of separate argument strings works.

### The linked issue

#2144 closed #2016, which was an issue mainly about this confusion. But the confusion there, as least as I understand it based on https://github.com/gitpython-developers/GitPython/issues/2016#issuecomment-2736556483, was that the Windows behavior was expected on macOS.

The documentation is now clearer about what happens on Unix-like systems. But I think this is not going to help people who think they can pass multiple arguments as a string (without a shell) *because they can* on the system they are using, and whose actual problem is that they need to make their code portable.

See also https://github.com/gitpython-developers/GitPython/issues/2016#issuecomment-2736689007 on the Windows-specific behavior of how `Popen`, and thus `Git.execute`, treat a single string.

### We should be careful about `shlex.split`

Another change in #2144 is the recommendation to use `shlex.split`:

> To split a command string into argv tokens, pass ``shlex.split(...)`` as a sequence or set `shell` to ``True`` (see the warning below).

`shlex.split` is indeed much safer than using a shell (https://github.com/gitpython-developers/GitPython/discussions/1896). It nonetheless often does not do the right thing, and it is not always safe. We should not recommend it as a default practice. At the same time, it may be right to mention it, as a form of harm reduction--it is usually a far better way to turn a string into separate arguments than passing them through a shell.

`shlex.split` is safe when the string is written in the syntax of a POSIX shell and fully trusted. If developers are able to write an argument sequence like `['foo', 'bar baz', 'quux']` and prefer to write it as `foo 'bar baz' quux` and pass that to `shlex.split`, that's fine.

Otherwise, it's not fine.

Sometimes someone may have a command that is quoted according to the syntax expected for a single command-line string on Windows, in which case this will sometimes be misparsed by `shlex.split` because `shlex.split` implements POSIX-shell-style syntax.

The bigger issue is security. If the input string to `shlex.split` contains attacker-controlled parts, they can still inject whitespace and quoting characters. It may feel more covenient and readable to write something like `f"git diff {ref1} {ref2} -- {path}"` or `f"git diff '{ref1}' '{ref2}' -- '{path}'"`, neither of which is safe. Especially with the current blanket recommendation to use `shlex.split`, I think developers may be misled into thinking those forms (and others) are okay provided that they have verified that untrusted arguments don't begin with the well-known `-`. If `ref1` or `ref2` is something like `v3.1.49 --output=target` for the first case, or `v3.1.49' --output=target'` for the second case, then the arbitrary path `target` has been written.

Of course, programs should use much greater sanitation, at least whenever using GitPython facilities that do not explicitly document that a particular form of sanitization is performed (and sometimes maybe even then, for purposes of defense in depth). Even when passing a list, one should sanitize untrusted input carefully. But once one has done this, one is thinking in terms of specific arguments--in which case, passing an explicitly constructed list is then easier to write and to read than using `shlex.split`, as well as being easier to reason about.

I think, though I am not sure, that the intention behind the current `shlex.split` recommendation is that it only be followed for hard-coded commands that were never built by interpolating anything. I don't think that would be clear to anyone who does not already know about `shlex.split`.

### The fix

Obviously it would be bad to write all that, or even a tenth of it, in the docstring. My first few attempts to write something that is acceptably concise, clear, and accurate have not been successful. I have also tried directing Claude Code to write something, and I have not been happy with that either; if that approach does work, then I'll of course follow the recently merged AI disclosure policy. (To clarify, no part of this issue is AI-generated.)

I hope to figure out a good wording in the next couple of days, unless someone else gets there first. For now, I'm opening this issue. (The process of writing it has hopefully gotten me closer to figuring out the fix.)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

`Git.execute` doc is inaccurate on Windows and recommends `shlex` too broadly #2146

The documentation bug

What's going on

The linked issue

We should be careful about `shlex.split`

The fix

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Git.execute doc is inaccurate on Windows and recommends shlex too broadly #2146

Description

The documentation bug

What's going on

The linked issue

We should be careful about shlex.split

The fix

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`Git.execute` doc is inaccurate on Windows and recommends `shlex` too broadly #2146

We should be careful about `shlex.split`