Commit graph

20 commits

Author SHA1 Message Date
Barney Gale
5d8a3e74b5
pathlib ABCs: Require one or more initialiser arguments (#113885)
Refuse to guess what a user means when they initialise a pathlib ABC
without any positional arguments. In mainline pathlib it's normalised to
`.`, but in the ABCs this guess isn't appropriate; for example, the path
type may not represent the current directory as `.`, or may have no concept
of a "current directory" at all.
2024-01-10 01:12:58 +00:00
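
As a rough illustration of the behaviour described in 5d8a3e74b5: mainline `PurePosixPath()` turns an empty argument list into `.`, whereas a class that declares a required first positional argument simply refuses the call. `StrictPath` and `accepts_no_arguments()` are hypothetical names used for this sketch, not the actual `PurePathBase` code.

```python
from pathlib import PurePosixPath

# Mainline pathlib guesses that "no arguments" means the current directory.
assert str(PurePosixPath()) == "."


class StrictPath:
    """Hypothetical stand-in for a pathlib ABC subclass (not CPython code)."""

    def __init__(self, path, *paths):
        # Requiring `path` means StrictPath() raises TypeError instead of guessing.
        self._raw_paths = [path, *paths]


def accepts_no_arguments(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```
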
Barney Gale
beb80d11ec
GH-113528: Deoptimise pathlib._abc.PurePathBase (#113559)
Apply pathlib's normalization and performance tuning in `pathlib.PurePath`, but not `pathlib._abc.PurePathBase`.

With this change, the pathlib ABCs do not normalize away alternate path separators, empty segments, or dot segments. A single string given to the initialiser will round-trip by default, i.e. `str(PurePathBase(my_string)) == my_string`. Implementors can set their own domain-specific path normalization scheme by overriding `__str__()`.

Eliminating path normalization makes maintaining and caching the path's parts and string representation both optional and not very useful, so this commit moves the `_drv`, `_root`, `_tail_cached` and `_str` slots from `PurePathBase` to `PurePath`. Only `_raw_paths` and `_resolving` slots remain in `PurePathBase`. This frees the ABCs from the burden of some of pathlib's hardest-to-understand code.
2024-01-09 23:52:15 +00:00
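
The round-trip property from beb80d11ec can be sketched without touching the private `pathlib._abc` module. `RawPath` and `CollapsingPath` below are hypothetical classes, and the "collapse repeated slashes" rule is only an assumed example of a domain-specific normalization scheme.

```python
from pathlib import PurePosixPath


class RawPath:
    """Hypothetical un-normalised path type (illustration only)."""

    def __init__(self, path):
        self._raw_path = path

    def __str__(self):
        # No normalisation: the string comes back exactly as given.
        return self._raw_path


class CollapsingPath(RawPath):
    """Hypothetical subclass that opts in to its own normalisation."""

    def __str__(self):
        # Assumed domain-specific rule: collapse runs of slashes,
        # but leave dot segments alone.
        raw = self._raw_path
        while "//" in raw:
            raw = raw.replace("//", "/")
        return raw


my_string = "foo//bar/./baz"
normalised = str(PurePosixPath(my_string))  # mainline pathlib normalises
round_tripped = str(RawPath(my_string))     # the sketch round-trips
collapsed = str(CollapsingPath(my_string))  # subclass applies its own rule
```
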
Barney Gale
cdca0ce0ad
GH-113528: Deoptimise pathlib._abc.PurePathBase.relative_to() (again) (#113882)
Restore full battle-tested implementations of `PurePath.[is_]relative_to()`. These were recently split up in 3375dfe and a15a773.

In `PurePathBase`, add entirely new implementations based on `_stack`, which itself calls `pathmod.split()` repeatedly to disassemble a path. These new implementations preserve features like trailing slashes where possible, while still observing that a `..` segment cannot be added to traverse an empty or `.` segment in *walk_up* mode. They do not rely on `parents` nor `__eq__()`, nor do they spin up temporary path objects.

Unfortunately calling `pathmod.relpath()` isn't an option, as it calls `abspath()` and in turn `os.getcwd()`, which is impure.
2024-01-09 23:04:14 +00:00
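
The impurity mentioned in cdca0ce0ad can be observed directly: on a POSIX system, `posixpath.relpath()` returns different answers for identical arguments once the working directory changes. `relpath_from()` is a hypothetical helper written for this demonstration.

```python
import os
import posixpath
import tempfile


def relpath_from(cwd, path, start):
    """Call posixpath.relpath() with the working directory set to cwd."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return posixpath.relpath(path, start)
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    # Identical arguments, different working directories.
    result_first = relpath_from(first, "foo", "../bar")
    result_second = relpath_from(second, "foo", "../bar")
```

Each result embeds the name of the directory it was computed in, which a pure path operation must not depend on.
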
Barney Gale
5c7bd0e398
GH-113528: Deoptimise pathlib._abc.PurePathBase.parts (#113883)
Implement `parts` using `_stack`, which itself calls `pathmod.split()`
repeatedly. This avoids use of `_tail`, which will be moved to `PurePath`
shortly.
2024-01-09 22:46:50 +00:00
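
A minimal sketch of the repeated-split idea behind `_stack`, assuming `posixpath` as the path module. The function names `split_stack()` and `parts()` are hypothetical, and the real implementation may differ in detail.

```python
import posixpath


def split_stack(path, pathmod=posixpath):
    """Return (anchor, names), with names holding segments in reverse order."""
    parent, name = pathmod.split(path)
    names = []
    while path != parent:
        # Peel one segment off the right-hand end per iteration.
        names.append(name)
        path = parent
        parent, name = pathmod.split(path)
    # Splitting no longer changes the path: what is left is the anchor.
    return path, names


def parts(path, pathmod=posixpath):
    """Rebuild a parts tuple from the stack, anchor first."""
    anchor, names = split_stack(path, pathmod)
    ordered = tuple(reversed(names))
    return (anchor,) + ordered if anchor else ordered
```

In this sketch a trailing slash survives as an empty final name, e.g. `split_stack("a/b/")` gives `("", ["", "b", "a"])`.
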
Barney Gale
1092cfb201
GH-113528: Deoptimise pathlib._abc.PathBase.resolve() (#113782)
Replace use of `_from_parsed_parts()` with `with_segments()` in
`resolve()`.

No effect on `Path.resolve()`, which uses `os.path.realpath()`.
2024-01-09 19:50:23 +00:00
Barney Gale
9100fc407e
GH-113528: Deoptimise pathlib._abc.PathBase._make_child_relpath() (#113532)
Call straight through to `joinpath()` in `PathBase._make_child_relpath()`.
Move optimised/caching code to `pathlib.Path._make_child_relpath()`.
2024-01-09 19:11:17 +00:00
Barney Gale
b3dba18eab
GH-113528: Speed up pathlib ABC tests. (#113788)
- Add `__slots__` to dummy path classes.
- Return namedtuple rather than `os.stat_result` from `DummyPath.stat()`.
- Reduce maximum symlink count in `DummyPathWithSymlinks.resolve()`.
2024-01-08 19:31:52 +00:00
Barney Gale
a15a7735e6
GH-113528: Deoptimise pathlib._abc.PurePathBase.relative_to() (#113529)
Replace use of `_from_parsed_parts()` with `with_segments()` in
`PurePathBase.relative_to()`, and move the assignment of `_drv`, `_root`
and `_tail_cached` slots into `PurePath.relative_to()`.
2024-01-06 21:37:38 +00:00
Barney Gale
37bd893a22
GH-113528: Deoptimise pathlib._abc.PurePathBase.parent (#113530)
Replace use of `_from_parsed_parts()` with `with_segments()`, and move
assignments to `_drv`, `_root`, `_tail_cached` and `_str` slots into
`PurePath`.
2024-01-06 21:17:51 +00:00
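
Several of these commits route path construction through `with_segments()`. The sketch below shows why that helps a subclass carrying extra state: every derived path goes through a single hook. `SessionPath` and its `session_id` argument are hypothetical, and the class is deliberately independent of `pathlib`.

```python
import posixpath


class SessionPath:
    """Hypothetical path type carrying extra state (not CPython code)."""

    def __init__(self, path, *, session_id):
        self._raw_path = path
        self.session_id = session_id

    def __str__(self):
        return self._raw_path

    def with_segments(self, *pathsegments):
        # Single construction hook: derived paths are built here,
        # so the extra state is carried along.
        joined = posixpath.join(*pathsegments)
        return type(self)(joined, session_id=self.session_id)

    @property
    def parent(self):
        head, _tail = posixpath.split(self._raw_path)
        if head == self._raw_path:
            return self  # already at the anchor
        return self.with_segments(head)


path = SessionPath("/srv/data/file.txt", session_id=42)
parent = path.parent
root = SessionPath("/", session_id=1)
```
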
Barney Gale
1e914ad89d
GH-113528: Deoptimise pathlib._abc.PurePathBase.name (#113531)
Replace usage of `_from_parsed_parts()` with `with_segments()` in
`with_name()`, and take a similar approach in `name` for consistency's
sake.
2024-01-06 20:50:25 +00:00
Barney Gale
3375dfed40
GH-113568: Stop raising deprecation warnings from pathlib ABCs (#113757)
2024-01-05 22:56:04 +00:00
Barney Gale
3c4e972d6d
GH-113568: Stop raising auditing events from pathlib ABCs (#113571)
Raise auditing events in `pathlib.Path.glob()`, `rglob()` and `walk()`,
but not in `pathlib._abc.PathBase` methods. Also move generation of a
deprecation warning into `pathlib.Path` so it gets the right stack level.
2024-01-05 21:41:19 +00:00
Barney Gale
c2e8298eba
GH-113225: Speed up pathlib.Path.glob() (#113226)
Use `os.DirEntry.path` as the string representation of child paths, unless
the parent path is empty, in which case we use the entry `name`.
2024-01-04 20:48:26 +00:00
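
The property that c2e8298eba relies on is that `os.DirEntry.path` already holds the joined parent and child string, so no further join is needed. The following is a small demonstration of that property, not the code from the commit.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as parent:
    # Create a single child so the scan yields exactly one entry.
    open(os.path.join(parent, "child.txt"), "w").close()
    with os.scandir(parent) as entries:
        entry = next(entries)
        # DirEntry.path is already the joined parent/name string.
        child_from_entry = entry.path
        child_from_join = os.path.join(parent, entry.name)
        child_name = entry.name
```

When a directory is scanned as `.`, `entry.path` comes back with a leading `./`, which is presumably why the empty-parent case falls back to the entry `name`.
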
Barney Gale
b664d91599
GH-113225: Speed up pathlib._abc.PathBase.glob() (#113556)
`PathBase._scandir()` is implemented using `iterdir()`, so we can use its
results directly, rather than passing them through `_make_child_relpath()`.
2023-12-28 22:23:01 +00:00
Barney Gale
1b19d73768
GH-110109: pathlib ABCs: drop use of warnings._deprecated() (#113419)
The `pathlib._abc` module will be made available as a PyPI backport
supporting Python 3.8+. The `warnings._deprecated()` function was only
added last year, and it's private from an external package perspective, so
here we switch to `warnings.warn()` instead.
2023-12-27 15:40:03 +00:00
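
For reference, a deprecation warning raised through the public `warnings.warn()` API looks roughly like this. `old_method()` and its message are hypothetical, and `stacklevel=2` is what attributes the warning to the caller rather than to the deprecated function itself.

```python
import warnings


def old_method():
    """Hypothetical deprecated API used only for this sketch."""
    warnings.warn(
        "old_method() is deprecated; use new_method() instead",
        DeprecationWarning,
        stacklevel=2,  # point at the caller, not at this function
    )
    return "result"


with warnings.catch_warnings(record=True) as caught:
    # Record every warning, even ones the default filters would hide.
    warnings.simplefilter("always")
    value = old_method()
```
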
Barney Gale
f8b6e171ad
GH-110109: pathlib ABCs: drop use of io.text_encoding() (#113417)
Do not use the locale-specific default encoding in `PathBase.read_text()`
and `write_text()`. Locale settings shouldn't influence the operation of
these base classes, which are intended mostly for implementing rich paths
on *nonlocal* filesystems.
2023-12-27 15:32:35 +00:00
Barney Gale
a0d3d3ec9d
GH-110109: pathlib ABCs: do not vary path syntax by host OS. (#113219)
Change the value of `pathlib._abc.PurePathBase.pathmod` from `os.path` to
`posixpath`.

User subclasses of `PurePathBase` and `PathBase` previously used the host
OS's path syntax, e.g. backslashes as separators on Windows. This is wrong
in most use cases, and likely to catch developers out unless they test on
both Windows and non-Windows machines.

In this patch we change the default to POSIX syntax, regardless of OS. This
is somewhat arguable (why not make all aspects of syntax abstract and
individually configurable?) but an improvement all the same.

This change has no effect on `PurePath`, `Path`, nor their subclasses. Only
private APIs are affected.
2023-12-22 18:09:50 +00:00
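
The difference between the two path modules is easy to see by calling `ntpath` and `posixpath` directly, which works on any host OS. This only illustrates why the `pathmod` default matters; it does not use the ABCs themselves.

```python
import ntpath
import posixpath

name = r"dir\file.txt"

# Windows syntax treats the backslash as a separator...
windows_split = ntpath.split(name)
# ...whereas POSIX syntax sees one segment containing a backslash.
posix_split = posixpath.split(name)

# Joining also differs, which changes every derived path string.
windows_joined = ntpath.join("dir", "file.txt")
posix_joined = posixpath.join("dir", "file.txt")
```
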
Barney Gale
237e2cff00
GH-110109: Fix misleading pathlib._abc.PurePathBase repr (#113376)
`PurePathBase.__repr__()` produces a string like `MyPath('/foo')`. This
repr is incorrect/misleading when a subclass's `__init__()` method is
customized, which I expect to be very common.

This commit moves the `__repr__()` method to `PurePath`, leaving
`PurePathBase` with the default `object` repr.

No user-facing changes because the `pathlib._abc` module remains private.
2023-12-22 15:11:16 +00:00
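
A sketch of why a `MyPath('/foo')`-style repr misleads once `__init__()` is customised: the repr suggests a constructor call that does not actually work. `BucketPath`, its `bucket` argument and `naive_repr()` are hypothetical names used for illustration.

```python
class BucketPath:
    """Hypothetical subclass with a customised __init__ (illustration only)."""

    def __init__(self, path, *, bucket):
        self._raw_path = path
        self.bucket = bucket

    def __str__(self):
        return self._raw_path


def naive_repr(path):
    """Build the `ClassName('/foo')` style of repr discussed above."""
    return "{}({!r})".format(type(path).__name__, str(path))


original = BucketPath("/foo", bucket="assets")
text = naive_repr(original)

try:
    # Evaluating the repr drops the required keyword argument.
    eval(text)
except TypeError:
    reconstructed = False
else:
    reconstructed = True
```
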
Barney Gale
23df46a1dd
GH-112906: Fix performance regression in pathlib path initialisation (#112907)
This was caused by 76929fdeeb, specifically its use of `super()` and its
packing/unpacking `*args`.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2023-12-10 00:06:27 +00:00
Barney Gale
a98e7a8112
GH-110109: Move pathlib ABCs to new pathlib._abc module. (#112881)
Move `_PurePathBase` and `_PathBase` to a new `pathlib._abc` module, and
drop the underscores from the class names.

Tests are mostly left alone in this commit, but they'll be similarly split
in a subsequent commit.

The `pathlib._abc` module will be published as an independent PyPI package
(similar to how `zipfile._path` is published as `zipp`), to be refined
and stabilised prior to its possible addition to the standard library.
2023-12-09 16:07:40 +01:00