GH-120794: Use example paths with multiple parts in pathlib docs (GH-122887)
In the documentation of `PosixPath` and `WindowsPath`, and their `Pure*`
equivalents, use example paths with multiple non-anchor parts.
(cherry picked from commit 363374cf69)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Re-order table of corresponding functions with the following priorities:
1. Pure functionality is at the top
2. `os.path` functions are shown before `os` functions
3. Similar functionality is kept together
4. Functionality follows docs order where possible
Add a few missed correspondences:
- `os.path.isjunction` and `Path.is_junction`
- `os.path.ismount` and `Path.is_mount`
- `os.lstat()` and `Path.lstat()`
- `os.lchmod()` and `Path.lchmod()`
Also add footnotes describing a few differences.
(cherry picked from commit cbac8a3888)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
Docs: spelling and grammar fixes (GH-122084)
Corrected some grammar and spelling issues in documentation.
(cherry picked from commit bc264eac3a)
Co-authored-by: Ville Skyttä <ville.skytta@iki.fi>
Co-authored-by: Russell Keith-Magee <russell@keith-magee.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
GH-119054: Add alt text to pathlib inheritance diagram (GH-121158)
(cherry picked from commit 6b280a8498)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
GH-119054: Add "Expanding and resolving paths" section to pathlib docs. (GH-120970)
Add dedicated subsection for `home()`, `expanduser()`, `cwd()`,
`absolute()`, `resolve()` and `readlink()`. The position of this section
keeps all the `Path` constructors (`Path()`, `Path.from_uri()`,
`Path.home()` and `Path.cwd()`) near the top. Within the section, closely
related methods are kept adjacent. Specifically:
- `home()` and `expanduser()` (the former calls the latter)
- `cwd()` and `absolute()` (the former calls the latter)
- `absolute()` and `resolve()` (both make paths absolute)
- `resolve()` and `readlink()` (both read symlink targets)
- Ditto `cwd()` and `absolute()`
- Ditto `absolute()` and `resolve()`
The "Other methods" section is removed.
(cherry picked from commit d6d8707ff2)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
GH-119054: Add "Renaming and deleting" section to pathlib docs. (GH-120465)
Add dedicated subsection for `pathlib.Path.rename()`, `replace()`,
`unlink()` and `rmdir()`.
(cherry picked from commit d88a1f2e15)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
GH-119054: Add "Creating files and directories" section to pathlib docs. (GH-120186)
Add dedicated subsection for `pathlib.Path.touch()`, `mkdir()`,
`symlink_to()` and `hardlink_to()`. Also note that `open()`, `write_text()`
and `write_bytes()` are often used to create files.
(cherry picked from commit c2d810b6d4)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Add a dedicated subsection for `open()`, `read_text()`, `read_bytes()`,
`write_text()` and `write_bytes()`.
(cherry picked from commit bd6d4ed645)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
Add a dedicated subsection for `Path.stat()`-related methods, specifically
`stat()`, `lstat()`, `exists()`, `is_*()`, and `samefile()`.
(cherry picked from commit 81d6336230)
docs: module page titles should not start with a link to themselves (GH-117099)
(cherry picked from commit bcb435ee8f)
Co-authored-by: Ned Batchelder <ned@nedbatchelder.com>
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
Replace tri-state `follow_symlinks` with boolean `recurse_symlinks` argument. The new argument controls whether symlinks are followed when expanding recursive `**` wildcards. The possible argument values correspond as follows:
follow_symlinks recurse_symlinks
=============== ================
False N/A
None False
True True
We therefore drop support for not following symlinks when expanding non-recursive pattern parts; it wasn't requested in the original issue, and it's a feature not found in any shells.
This makes the API a easier to grok by eliminating `None` as an option.
No news blurb as `follow_symlinks` was new in 3.13.
Explain the `full_match()` / `glob()` / `rglob()` pattern language in its own section. Move `rglob()` documentation under `glob()` and reduce duplicated text.
Return files and directories from `pathlib.Path.glob()` if the pattern ends
with `**`. This is more compatible with `PurePath.full_match()` and with
other glob implementations such as bash and `glob.glob()`. Users can add a
trailing slash to match only directories.
In my previous patch I added a `FutureWarning` with the intention of fixing
this in Python 3.15. Upon further reflection I think this was an
unnecessarily cautious remedy to a clear bug.
Add `ntpath.isreserved()`, which identifies reserved pathnames such as "NUL", "AUX" and "CON".
Deprecate `pathlib.PurePath.is_reserved()`.
---------
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
In 49f90ba we added support for the recursive wildcard `**` in
`pathlib.PurePath.match()`. This should allow arbitrary prefix and suffix
matching, like `p.match('foo/**')` or `p.match('**/foo')`, but there's a
problem: for relative patterns only, `match()` implicitly inserts a `**`
token on the left hand side, causing all patterns to match from the right.
As a result, it's impossible to match relative patterns from the left:
`PurePath('foo/bar').match('bar/**')` is true!
This commit reverts the changes to `match()`, and instead adds a new
`full_match()` method that:
- Allows empty patterns
- Supports the recursive wildcard `**`
- Matches the *entire* path when given a relative pattern
Remove a double negative in the documentation of `mkdir()`'s *exist_ok*
parameter.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Allow `os.PathLike` objects to be passed as patterns to `pathlib.Path.glob()` and `rglob()`. (It's already possible to use them in `PurePath.match()`)
While we're in the area:
- Allow empty glob patterns in `PathBase` (but not `Path`)
- Speed up globbing in `PathBase` by generating paths with trailing slashes only as a final step, rather than for every intermediate directory.
- Simplify and speed up handling of rare patterns involving both `**` and `..` segments.
This is a very soft deprecation of `PurePath.as_uri()`. We instead document
it as a `Path` method, and add a couple of sentences mentioning that it's
also available in `PurePath`.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
This method supports file URIs (including variants) as described in RFC 8089, such as URIs generated by `pathlib.Path.as_uri()` and `urllib.request.pathname2url()`.
The method is added to `Path` rather than `PurePath` because it uses `os.fsdecode()`, and so its results vary from system to system. I intend to deprecate `PurePath.as_uri()` and move it to `Path` for the same reason.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Brings `pathlib.Path.is_dir()` and `in line with `os.DirEntry.is_dir()`, which
will be important for implementing generic path walking and globbing.
Likewise `is_file()`.
This new exception type is raised instead of `NotImplementedError` when
a path operation is not supported. It can be raised from `Path.readlink()`,
`symlink_to()`, `hardlink_to()`, `owner()` and `group()`. In a future
version of pathlib, it will be raised by `AbstractPath` for these methods
and others, such as `AbstractPath.mkdir()` and `unlink()`.
This commit introduces a 'walk-and-match' strategy for handling glob patterns that include a non-terminal `**` wildcard, such as `**/*.py`. For this example, the previous implementation recursively walked directories using `os.scandir()` when it expanded the `**` component, and then **scanned those same directories again** when expanded the `*.py` component. This is wasteful.
In the new implementation, any components following a `**` wildcard are used to build a `re.Pattern` object, which is used to filter the results of the recursive walk. A pattern like `**/*.py` uses half the number of `os.scandir()` calls; a pattern like `**/*/*.py` a third, etc.
This new algorithm does not apply if either:
1. The *follow_symlinks* argument is set to `None` (its default), or
2. The pattern contains `..` components.
In these cases we fall back to the old implementation.
This commit also replaces selector classes with selector functions. These generators directly yield results rather calling through to their successors. A new internal `Path._glob()` method takes care to chain these generators together, which simplifies the lazy algorithm and slightly improves performance. It should also be easier to understand and maintain.
`PurePath.match()` now handles the `**` wildcard as in `Path.glob()`, i.e. it matches any number of path segments.
We now compile a `re.Pattern` object for the entire pattern. This is made more difficult by `fnmatch` not treating directory separators as special when evaluating wildcards (`*`, `?`, etc), and so we arrange the path parts onto separate *lines* in a string, and ensure we don't set `re.DOTALL`.
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Add a keyword-only *follow_symlinks* parameter to `pathlib.Path.glob()` and`rglob()`.
When *follow_symlinks* is `None` (the default), these methods follow symlinks except when evaluating "`**`" wildcards. When set to true or false, symlinks are always or never followed, respectively.
Add `pathlib.PurePath.with_segments()`, which creates a path object from arguments. This method is called whenever a derivative path is created, such as from `pathlib.PurePath.parent`. Subclasses may override this method to share information between path objects.
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
This argument allows case-sensitive matching to be enabled on Windows, and
case-insensitive matching to be enabled on Posix.
Co-authored-by: Steve Dower <steve.dower@microsoft.com>