Commit graph

37 commits

Author SHA1 Message Date
Barney Gale
22a442181d
GH-128520: Divide pathlib ABCs into three classes (#128523)
In the private pathlib ABCs, rename `PurePathBase` to `JoinablePath`, and
split `PathBase` into `ReadablePath` and `WritablePath`. This improves the
API fit for read-only virtual filesystems.

The split of `PathBase` entails a similar split of `CopyWorker` (implements
copying) and the test cases in `test_pathlib_abc`.

In a later patch, we'll make `WritablePath` inherit directly from
`JoinablePath` rather than `ReadablePath`. For a couple of reasons,
this isn't quite possible yet.
2025-01-11 19:27:47 +00:00
Barney Gale
95352dcb93
GH-127381: pathlib ABCs: remove PathBase.move() and move_into() (#128337)
These methods combine `_delete()` and `copy()`, but `_delete()` isn't part
of the public interface, and it's unlikely to be added until the pathlib
ABCs are made official, or perhaps even later.
2025-01-04 12:53:51 +00:00
Barney Gale
ef63cca494
GH-127381: pathlib ABCs: remove uncommon PurePathBase methods (#127853)
Remove `PurePathBase.relative_to()` and `is_relative_to()` because they
don't account for *other* being an entirely different kind of path, and
they can't use `__eq__()` because it's not on the `PurePathBase` interface.

Remove `PurePathBase.drive`, `root`, `is_absolute()` and `as_posix()`.
These are all too specific to local filesystems.
2024-12-29 22:07:12 +00:00
Barney Gale
c78729f2df
GH-127381: pathlib ABCs: remove PathBase.stat() (#128334)
Remove the `PathBase.stat()` method. Its use of the `os.stat_result` API,
with its 10 mandatory fields and low-level types, makes it an awkward fit
for virtual filesystems.

We'll look to add a `PathBase.info` attribute later - see GH-125413.
2024-12-29 21:42:07 +00:00
Barney Gale
8d9f52a7be
GH-127807: pathlib ABCs: move private copying methods to dedicated class (#127810)
Move 9 private `PathBase` attributes and methods into a new `CopyWorker`
class. Change `PathBase.copy` from a method to a `CopyWorker` instance.

The methods remain private in the `CopyWorker` class. In future we might
make some/all of them public so that user subclasses of `PathBase` can
customize the copying process (in particular reading/writing of metadata,)
but we'd need to make `PathBase` public first.
2024-12-22 02:22:08 +00:00
Barney Gale
f5ba74b819
GH-127807: pathlib ABCs: remove a few private attributes (#127851)
From `PurePathBase` delete `_globber`, `_stack` and `_pattern_str`, and
from `PathBase` delete `_glob_selector`. This helps avoid an unpleasant
surprise for a users who try to use these names.
2024-12-22 01:41:38 +00:00
Barney Gale
a959ea1b0a
GH-127807: pathlib ABCs: remove PurePathBase._raw_paths (#127883)
Remove the `PurePathBase` initializer, and make `with_segments()` and
`__str__()` abstract. This allows us to drop the `_raw_paths` attribute,
and also the `Parser.join()` protocol method.
2024-12-22 01:17:59 +00:00
Barney Gale
7146f18946
GH-127807: pathlib ABCs: remove PathBase._unsupported_msg() (#127855)
This method helped us customise the `UnsupportedOperation` message
depending on the type. But we're aiming to make `PathBase` a proper ABC
soon, so `NotImplementedError` is the right exception to raise there.
2024-12-12 17:39:24 +00:00
Barney Gale
292afd1d51
GH-127381: pathlib ABCs: remove remaining uncommon PathBase methods (#127714)
Remove the following methods from `pathlib._abc.PathBase`:

- `expanduser()`
- `hardlink_to()`
- `touch()`
- `chmod()`
- `lchmod()`
- `owner()`
- `group()`
- `from_uri()`
- `as_uri()`

These operations aren't regularly supported in virtual filesystems, so they
don't win a place in the `PathBase` interface. (Some of them probably don't
deserve a place in `Path` :P.) They're quasi-abstract (except `lchmod()`),
and they're not called by other `PathBase` methods.
2024-12-12 06:49:34 +00:00
Barney Gale
12b4f1a5a1
GH-127381: pathlib ABCs: remove PathBase.samefile() and rarer is_*() (#127709)
Remove `PathBase.samefile()`, which is fairly specific to the local FS, and
relies on `stat()`, which we're aiming to remove from `PathBase`.

Also remove `PathBase.is_mount()`, `is_junction()`, `is_block_device()`,
`is_char_device()`, `is_fifo()` and `is_socket()`. These rely on POSIX
file type numbers that we're aiming to remove from the `PathBase` API.
2024-12-11 00:09:55 +00:00
Barney Gale
7f8ec52302
GH-127381: pathlib ABCs: remove PathBase.unlink() and rmdir() (#127736)
Virtual filesystems don't always make a distinction between deleting files
and empty directories, and sometimes support deleting non-empty directories
in a single operation. Here we remove `PathBase.unlink()` and `rmdir()`,
leaving `_delete()` as the sole deletion method, now made abstract. I hope
to drop the underscore prefix later on.
2024-12-08 18:45:09 +00:00
Barney Gale
5b6635f772
GH-127381: pathlib ABCs: remove PathBase.rename() and replace() (#127658)
These methods are obviated by `PathBase.move()`, which can move directories
and supports any `PathBase` object as a target.
2024-12-06 18:10:00 +00:00
Barney Gale
8b3cccf3f9
GH-125413: Revert addition of pathlib.Path.scandir() method (#127377)
Remove documentation for `pathlib.Path.scandir()`, and rename the method to
`_scandir()`. In the private pathlib ABCs, make `iterdir()` abstract and
call it from `_scandir()`.

It's not worthwhile to add this method at the moment - see discussion:
https://discuss.python.org/t/ergonomics-of-new-pathlib-path-scandir/71721

Co-authored-by: Steve Dower <steve.dower@microsoft.com>
2024-12-05 21:39:43 +00:00
Barney Gale
328187cc4f
GH-127381: pathlib ABCs: remove PathBase.cwd() and home() (#127427)
These classmethods presume that the user has retained the original
`__init__()` signature, which may not be the case. Also, many virtual
filesystems don't provide current or home directories.
2024-11-30 18:39:39 +00:00
Barney Gale
38264a060a
GH-127381: pathlib ABCs: remove PathBase.lstat() (#127382)
Remove the `PathBase.lstat()` method, which is a trivial variation of
`stat()`.

No user-facing changes because the pathlib ABCs are still private.
2024-11-29 21:03:39 +00:00
Barney Gale
5e9168492f
pathlib ABCs: defer path joining (#126409)
Defer joining of path segments in the private `PurePathBase` ABC. The new
behaviour matches how the public `PurePath` class handles path segments.

This removes a hard-to-grok difference between the ABCs and the main
classes. It also slightly reduces the size of `PurePath` objects by
eliminating a `_raw_path` slot.
2024-11-05 21:19:36 +00:00
Barney Gale
9b7294c3a5
GH-126363: Speed up pattern parsing in pathlib.Path.glob() (#126364)
The implementation of `Path.glob()` does rather a hacky thing: it calls
`self.with_segments()` to convert the given pattern to a `Path` object, and
then peeks at the private `_raw_path` attribute to see if pathlib removed a
trailing slash from the pattern.

In this patch, we make `glob()` use a new `_parse_pattern()` classmethod
that splits the pattern into parts while preserving information about any
trailing slash. This skips the cost of creating a `Path` object, and avoids
some path anchor normalization, which makes `Path.glob()` slightly faster.
But mostly it's about making the code less naughty.

Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
2024-11-04 19:29:57 +00:00
Barney Gale
260843df1b
GH-125413: Add pathlib.Path.scandir() method (#126060)
Add `pathlib.Path.scandir()` as a trivial wrapper of `os.scandir()`. This
will be used to implement several `PathBase` methods more efficiently,
including methods that provide `Path.copy()`.
2024-11-01 01:19:01 +00:00
Barney Gale
cb8e5995d8
GH-125069: Fix inconsistent joining in WindowsPath(PosixPath(...)) (#125156)
`PurePath.__init__()` incorrectly uses the `_raw_paths` of a given
`PurePath` object with a different flavour, even though the procedure to
join path segments can differ between flavours.

This change makes the `_raw_paths`-enabled deferred joining apply _only_
when the path flavours match.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-10-13 17:46:10 +00:00
Barney Gale
5002f17794
GH-119518: Stop interning strings in pathlib GH-123356)
Remove `sys.intern(str(x))` calls when normalizing a path in pathlib. This
speeds up `str(Path('foo/bar'))` by about 10%.
2024-09-02 18:14:09 +02:00
Daniel Hollas
2304774465
gh-118761: Speedup pathlib import by deferring shutil (#123520)
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
2024-09-01 15:44:48 +01:00
Barney Gale
033d537cd4
GH-73991: Make pathlib.Path.delete() private. (#123315)
Per feedback from Paul Moore on GH-123158, it's better to defer making
`Path.delete()` public than ship it with under-designed error handling
capabilities.

We leave a remnant `_delete()` method, which is used by `move()`. Any
functionality not needed by `move()` is deleted.
2024-08-26 16:26:34 +01:00
Barney Gale
a6644d4464
GH-73991: Rework pathlib.Path.copytree() into copy() (#122369)
Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.)

Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `delete()` methods (which will also
accept any type of file.)

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2024-08-11 22:43:18 +01:00
Barney Gale
98dba73010
GH-73991: Rework pathlib.Path.rmtree() into delete() (#122368)
Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `copy()` methods (which will also
accept any type of file.)
2024-08-07 01:34:44 +01:00
Barney Gale
094375b9b7
GH-73991: Add pathlib.Path.rmtree() (#119060)
Add a `Path.rmtree()` method that removes an entire directory tree, like
`shutil.rmtree()`. The signature of the optional *on_error* argument
matches the `Path.walk()` argument of the same name, but differs from the
*onexc* and *onerror* arguments to `shutil.rmtree()`. Consistency within
pathlib is probably more important.

In the private pathlib ABCs, we add an implementation based on `walk()`.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-07-20 20:14:13 +00:00
Barney Gale
88fc0655d4
GH-73991: Support preserving metadata in pathlib.Path.copy() (#120806)
Add *preserve_metadata* keyword-only argument to `pathlib.Path.copy()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`. The argument has no effect on Windows, where metadata is always copied.

Internally (in the pathlib ABCs), path types gain `_readable_metadata` and `_writable_metadata` attributes. These sets of strings describe what kinds of metadata can be retrieved and stored. We take an intersection of `source._readable_metadata` and `target._writable_metadata` to minimise reads/writes. A new `_read_metadata()` method accepts a set of metadata keys and returns a dict with those keys, and a new `_write_metadata()` method accepts a dict of metadata. We *might* make these public in future, but it's hard to justify while the ABCs are still private.
2024-07-06 17:18:39 +01:00
Barney Gale
f09d184821
GH-73991: Support copying directory symlinks on older Windows (#120807)
Check for `ERROR_INVALID_PARAMETER` when calling `_winapi.CopyFile2()` and
raise `UnsupportedOperation`. In `Path.copy()`, handle this exception and
fall back to the `PathBase.copy()` implementation.
2024-07-03 04:30:29 +01:00
Barney Gale
20d5b84f57
GH-73991: Add follow_symlinks argument to pathlib.Path.copy() (#120519)
Add support for not following symlinks in `pathlib.Path.copy()`.

On Windows we add the `COPY_FILE_COPY_SYMLINK` flag is following symlinks is disabled. If the source is symlink to a directory, this call will fail with `ERROR_ACCESS_DENIED`. In this case we add `COPY_FILE_DIRECTORY` to the flags and retry. This can fail on old Windowses, which we note in the docs.

No news as `copy()` was only just added.
2024-06-19 00:59:54 +00:00
Barney Gale
7c38097add
GH-73991: Add pathlib.Path.copy() (#119058)
Add a `Path.copy()` method that copies the content of one file to another.

This method is similar to `shutil.copyfile()` but differs in the following ways:

- Uses `fcntl.FICLONE` where available (see GH-81338)
- Uses `os.copy_file_range` where available (see GH-81340)
- Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`.

The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata.

Incorporates code from GH-81338 and GH-93152.

Co-authored-by: Eryk Sun <eryksun@gmail.com>
2024-06-14 17:15:49 +01:00
Barney Gale
7ff61f51b6
GH-119169: Implement pathlib.Path.walk() using os.walk() (#119573)
For silly reasons, pathlib's generic implementation of `walk()` currently
resides in `glob._Globber`. This commit moves it into
`pathlib._abc.PathBase.walk()` where it really belongs, and makes
`pathlib.Path.walk()` call `os.walk()`.
2024-05-29 20:51:04 +00:00
Barney Gale
e418fc3a6e
GH-82805: Fix handling of single-dot file extensions in pathlib (#118952)
pathlib now treats "`.`" as a valid file extension (suffix). This brings
it in line with `os.path.splitext()`.

In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method
that splits a path into a `(root, ext)` pair, like `os.path.splitext()`.
This method is called by `PurePathBase.stem`, `suffix`, etc. In a future
version of pathlib, we might make these base classes public, and so users
will be able to define their own `splitext()` method to control file
extension splitting.

In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes`
properties that don't use `splitext()`, which avoids computing the path
base name twice.
2024-05-25 21:01:36 +01:00
Kirill Podoprigora
31a28cbae0
gh-119049: Defer import warnings in pathlib._local (#119111) 2024-05-17 17:12:02 +01:00
Barney Gale
7d8725ac6f
GH-74033: Drop deprecated pathlib.Path keyword arguments (#118793)
Remove support for supplying keyword arguments to `pathlib.Path()`. This
has been deprecated since Python 3.12.
2024-05-14 20:14:07 +00:00
Barney Gale
fbe6a0988f
GH-101357: Suppress OSError from pathlib.Path.exists() and is_*() (#118243)
Suppress all `OSError` exceptions from `pathlib.Path.exists()` and `is_*()`
rather than a selection of more common errors as we do presently. Also
adjust the implementations to call `os.path.exists()` etc, which are much
faster on Windows thanks to GH-101196.
2024-05-14 17:53:15 +00:00
Barney Gale
f772d0d08a
GH-78707: Drop deprecated pathlib.PurePath.[is_]relative_to() arguments (#118780)
Remove support for supplying additional positional arguments to
`PurePath.relative_to()` and `is_relative_to()`. This has been deprecated
since Python 3.12.
2024-05-10 15:53:46 +00:00
Barney Gale
b4bdf83cc6
GH-116380: Revert move of pathlib globbing code to pathlib._glob (#118678)
The previous change made the `glob` module slower to import, because it
imported `pathlib._glob` and hence the rest of `pathlib`.

Reverts a40f557d7b.
2024-05-07 00:32:48 +00:00
Barney Gale
d8d94911e2
Move pathlib implementation out of __init__.py (#118582)
Use the `__init__.py` file only for imports that define the API, following the example of asyncio.
2024-05-05 20:57:19 +01:00