Option to resolve at a fixed timestamp with pip-compile --exclude-newer YYYY-MM-DD (#434)

This works by filtering out files with a more recent upload time, so if
the index you use does not provide upload times, the results might be
inaccurate. pypi provides upload times for all files. This is, the field
is non-nullable in the warehouse schema, but the simple API PEP does not
know this field.

If you have only pypi dependencies, this means deterministic,
reproducible(!) resolution. We could try doing the same for git repos
but it doesn't seem worth the effort, i'd recommend pinning commits
since git histories are arbitrarily malleable and also if you care about
reproducibility and such you such not use git dependencies but a custom
index.

Timestamps are given either as RFC 3339 timestamps such as
`2006-12-02T02:07:43Z` or as UTC dates in the same format such as
`2006-12-02`. Dates are interpreted as including this day, i.e. until
midnight UTC that day. Date only is required to make this ergonomic and
midnight seems like an ergonomic choice.

In action for `pandas`:

```console
$ target/debug/puffin pip-compile --exclude-newer 2023-11-16 target/pandas.in
Resolved 6 packages in 679ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    target/debug/puffin pip-compile --exclude-newer 2023-11-16 target/pandas.in
numpy==1.26.2
    # via pandas
pandas==2.1.3
python-dateutil==2.8.2
    # via pandas
pytz==2023.3.post1
    # via pandas
six==1.16.0
    # via python-dateutil
tzdata==2023.3
    # via pandas
$ target/debug/puffin pip-compile --exclude-newer 2022-11-16 target/pandas.in
Resolved 5 packages in 655ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    target/debug/puffin pip-compile --exclude-newer 2022-11-16 target/pandas.in
numpy==1.23.4
    # via pandas
pandas==1.5.1
python-dateutil==2.8.2
    # via pandas
pytz==2022.6
    # via pandas
six==1.16.0
    # via python-dateutil
$ target/debug/puffin pip-compile --exclude-newer 2021-11-16 target/pandas.in
Resolved 5 packages in 594ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    target/debug/puffin pip-compile --exclude-newer 2021-11-16 target/pandas.in
numpy==1.21.4
    # via pandas
pandas==1.3.4
python-dateutil==2.8.2
    # via pandas
pytz==2021.3
    # via pandas
six==1.16.0
    # via python-dateutil
```
This commit is contained in:
konsti 2023-11-16 20:46:17 +01:00 committed by GitHub
parent 0d455ebd06
commit e41ec12239
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
14 changed files with 218 additions and 38 deletions

View file

@ -14,6 +14,7 @@ pep440_rs = { path = "../pep440-rs", features = ["serde"] }
pep508_rs = { path = "../pep508-rs", features = ["serde"] }
puffin-normalize = { path = "../puffin-normalize" }
chrono = { workspace = true, features = ["serde"] }
mailparse = { workspace = true }
once_cell = { workspace = true }
regex = { workspace = true }

View file

@ -1,3 +1,4 @@
use chrono::{DateTime, Utc};
use pep440_rs::VersionSpecifiers;
use serde::{de, Deserialize, Deserializer, Serialize};
use std::str::FromStr;
@ -28,7 +29,7 @@ pub struct File {
#[serde(deserialize_with = "deserialize_version_specifiers_lenient")]
pub requires_python: Option<VersionSpecifiers>,
pub size: Option<usize>,
pub upload_time: String,
pub upload_time: Option<DateTime<Utc>>,
pub url: String,
pub yanked: Option<Yanked>,
}