177 commits

Author SHA1 Message Date
Andrew Gallant
c789dd50b9 fmt: disable snapshot tests on core-only
For whatever reason, these seem to take a hideously long time to run in
CI. They even take a long time to run locally, *relatively* speaking. In
core-only, `insta` doesn't support snapshotting at all, which is a huge
bummer. So we just tell insta to force the tests to pass and don't do
any updating. So these tests weren't really being run anyway.

I'm not sure what insta is doing here to be honest, and I don't really
understand why insta can't handle the core-only tests. I mean, I am
still importing the standard library when tests are run, even in
core-only mode. Maybe the insta macros assume the standard library
prelude is present or something? IDK.
2025-03-06 16:05:00 -05:00
Andrew Gallant
eb61588126 zoned: improve docs for Zoned::subsec_nanosecond
It is tempting to think of this method as just being a shortcut for
`zdt.timestamp().subsec_nanosecond()`, but it actually isn't! It's
returning the fractional seconds on the *civil* datetime, not the
timestamp. These are usually the same for times after the Unix epoch,
but can differ for times before it.
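
To illustrate why the two can differ before the Unix epoch, here is a rough sketch using only integer arithmetic. It is not Jiff's implementation, and the function names are hypothetical. It assumes a timestamp is a signed count of nanoseconds since the epoch, that the timestamp-style fraction carries the sign of the timestamp, and that the civil-style fraction is always non-negative.

```rust
const NANOS_PER_SEC: i64 = 1_000_000_000;

/// Fractional second in the "timestamp" sense (hypothetical helper).
/// Truncating division means the result has the same sign as the input.
fn timestamp_subsec_nanos(unix_nanos: i64) -> i64 {
    unix_nanos % NANOS_PER_SEC
}

/// Fractional second in the "civil" sense (hypothetical helper).
/// A clock reading never shows a negative fraction, so this uses the
/// Euclidean remainder, which is always in 0..=999_999_999.
fn civil_subsec_nanos(unix_nanos: i64) -> i64 {
    unix_nanos.rem_euclid(NANOS_PER_SEC)
}

fn main() {
    // 1.5 seconds after the epoch: both views agree.
    let after = 1_500_000_000;
    assert_eq!(timestamp_subsec_nanos(after), 500_000_000);
    assert_eq!(civil_subsec_nanos(after), 500_000_000);

    // 1.5 seconds before the epoch, i.e. 1969-12-31T23:59:58.5Z.
    let before = -1_500_000_000;
    assert_eq!(timestamp_subsec_nanos(before), -500_000_000);
    assert_eq!(civil_subsec_nanos(before), 500_000_000);
}
```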

While the original request in #283 asks for `subsec_millisecond()` and
`subsec_microsecond()` on `Zoned` in order to be consistent with
`Timestamp`, I'm going to pass on those for now. In particular, since
these would return the fractional second value from the *civil*
datetime, `subsec_millisecond()` would always be equivalent to
`millisecond()`. `subsec_microsecond()` wouldn't be the same as
`microsecond()` (just like `subsec_nanosecond()` isn't the same as
`nanosecond()`), but I find that this just overall adds to the confusion
of the methods here. And if you do need `subsec_microsecond()`, you can
just do `subsec_nanosecond() / 1_000`.

The reason that these additional methods make sense for `Timestamp` is
that `Timestamp` doesn't have a civil datetime. So there are no
individual `millisecond()` or `microsecond()` units. A `Timestamp` is
closer to a `SignedDuration` than a `civil::DateTime`.

Closes #283
2025-03-06 16:05:00 -05:00
Andrew Gallant
77554cfb9b ci: split the "main" job into more pieces
This is the longest job by far, so I think it's worth splitting out some
of its tasks to see if we can reduce overall wall clock time on CI.
2025-03-06 10:09:11 -05:00
Andrew Gallant
98164d6263 tz: pack TZif civil datetime into i64
This leads to amazing speed-ups for TZ lookups on civil datetimes:

    tz/posix_datetime_to_offset/jiff               1.00     32.6±0.70ns        ? ?/sec    1.01     32.9±0.28ns        ? ?/sec
    tz/posix_timestamp_to_offset/jiff              1.00     21.8±0.17ns        ? ?/sec    1.04     22.6±0.15ns        ? ?/sec
    tz/tzif_bundled_datetime_to_offset/jiff        2.57     23.4±0.19ns        ? ?/sec    1.00      9.1±0.06ns        ? ?/sec
    tz/tzif_bundled_timestamp_to_offset/jiff       1.00      6.0±0.05ns        ? ?/sec    1.04      6.2±0.05ns        ? ?/sec
    tz/tzif_future_datetime_to_offset/jiff         1.35     50.5±0.60ns        ? ?/sec    1.00     37.4±0.67ns        ? ?/sec
    tz/tzif_future_timestamp_to_offset/jiff        1.00     21.2±0.15ns        ? ?/sec    1.00     21.2±0.17ns        ? ?/sec
    tz/tzif_historical_datetime_to_offset/jiff     2.68     23.4±0.17ns        ? ?/sec    1.00      8.7±0.08ns        ? ?/sec
    tz/tzif_historical_timestamp_to_offset/jiff    1.00      6.0±0.05ns        ? ?/sec    1.00      6.0±0.05ns        ? ?/sec

It turns out that comparing civil datetimes is actually quite
expensive. Getting them down to a single integer comparison makes
the binary search much quicker.
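
A minimal sketch of the idea, not the layout Jiff actually uses: the field widths below are assumptions chosen for illustration. The only property that matters is that more significant fields sit in higher bits, so comparing the packed integers gives the same answer as comparing the datetimes field by field.

```rust
/// Packs a civil datetime into one `i64` (hypothetical layout).
/// Widths: month 4 bits, day 5, hour 5, minute 6, second 6. The year takes
/// the remaining high bits and keeps its sign.
fn pack(year: i16, month: u8, day: u8, hour: u8, minute: u8, second: u8) -> i64 {
    let mut bits = i64::from(year);
    bits = (bits << 4) | i64::from(month);
    bits = (bits << 5) | i64::from(day);
    bits = (bits << 5) | i64::from(hour);
    bits = (bits << 6) | i64::from(minute);
    bits = (bits << 6) | i64::from(second);
    bits
}

fn main() {
    // Sorted transition datetimes, already packed.
    let transitions = [
        pack(2024, 3, 10, 2, 0, 0),
        pack(2024, 11, 3, 2, 0, 0),
        pack(2025, 3, 9, 2, 0, 0),
    ];
    let needle = pack(2024, 7, 4, 12, 0, 0);
    // Binary search using a single integer comparison per step.
    let at_or_before = transitions.partition_point(|&t| t <= needle);
    assert_eq!(at_or_before, 1);
}
```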
2025-03-05 18:48:44 -05:00
Andrew Gallant
f6a5cc6a22 tz: refactor TZif representation to use column storage
This makes binary search for TZ lookups substantially faster.
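
As a rough illustration of column storage (the type and field names are hypothetical, not Jiff's), each field of a transition lives in its own contiguous sequence. The binary search then touches only the densely packed timestamps, and the other columns are indexed once the position is known.

```rust
/// Column-oriented transitions: `timestamps[i]` and `type_indices[i]`
/// together describe transition `i`. Timestamps must be sorted ascending.
struct Transitions {
    timestamps: Vec<i64>,
    type_indices: Vec<u8>,
}

impl Transitions {
    /// Returns the type index in effect at `timestamp`, or `None` when
    /// `timestamp` precedes the first transition.
    fn type_index_at(&self, timestamp: i64) -> Option<u8> {
        // Count of transitions at or before `timestamp`.
        let after = self.timestamps.partition_point(|&t| t <= timestamp);
        after.checked_sub(1).map(|i| self.type_indices[i])
    }
}

fn main() {
    let tz = Transitions {
        timestamps: vec![0, 100, 200],
        type_indices: vec![0, 1, 0],
    };
    assert_eq!(tz.type_index_at(150), Some(1));
    assert_eq!(tz.type_index_at(-1), None);
}
```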

This is yet another brutal refactor. Changing anything in POSIX time
zones or TZif handling is now a monster pain in the ass because all
of that code is shared in a very awkward way with `jiff-static`.

Ref #271
2025-03-05 18:48:44 -05:00
Andrew Gallant
c9f72a5b56 tz: make tzif::Transition smaller
This is an easy win that uses 64-bit integers to represent a timestamp
instead of 96-bit integers. This is okay because this reflects what the
actual source IANA time zone database uses.

This makes the binary search lookup a fair bit faster.

Next I'd like to split `Transition` into three sequences: timestamps,
civil datetimes and the local type index. This should make them as
small as possible and further improve binary search lookups (I hope).
2025-03-05 18:48:44 -05:00
Andrew Gallant
5af655e5ea tz: add new enabled-by-default tz-fat feature
When enabled, this feature will "fatten" TZif data by adding more time
zone transitions. This corresponds to what tzdb's `zic` program does
when `-b fat` is given, except Jiff does it at runtime. If the TZif data
has already been fattened, then this has no effect.

The reason for this is that it smooths out performance differences in
time zone runtime lookups between pre-fattened TZif data and "slim"
TZif data. It is unpredictable whether `/usr/share/zoneinfo` is
actually fat or not, so this helps make performance more predictable
regardless of what the source TZif data looks like.

This uses about 25% more heap memory in my experiments. For a single
time zone, this is, in an absolute sense, likely insignificant. But if
you have thousands of time zones loaded into memory, it can add up. But
that's a somewhat niche use case. However, this can make binary sizes
bigger when the `jiff-static` proc macro is used.

So while unlikely to matter too much, the `tz-fat` feature can be
disabled if you want to prioritize memory usage and binary size.

Fixes #271
2025-03-05 18:48:44 -05:00
Andrew Gallant
092f05ff9f tz/posix: move most code to shared
This was yet another absolutely brutal refactor. But in order to
"fatten" up TZif data after parsing, we need to be able to actually use
POSIX time zones in order to compute missing transitions. And in order
to do that, basically the entire POSIX time zone implementation needs to
be in `shared`. And that means no ranged integers. Which in turn means
implementing several datetime algorithms on just primitives.

This was just overall brutal, and I am getting very close to ripping
out ranged integers.
2025-03-05 18:48:44 -05:00
Andrew Gallant
5817ed298b tz: add back POSIX time zone consistency check
This got lost in the most recent POSIX time zone refactor. We add
it back here.
2025-03-05 18:48:44 -05:00
Andrew Gallant
951a9a1567 span: fix TODO comment in error message
It looks like I never circled back around to fix the error message here
when I added the `SpanRelativeTo::days_are_24_hours()` functionality.
So fix that here.

It's hard to keep `SpanRelativeToKind`'s `Display` impl, because the
error message kinda needs to be rejiggered at a higher level.
Thankfully, this was only used in one place.
2025-03-05 18:48:44 -05:00
Andrew Gallant
fc993ca79a shared: move more date algorithms to itime
This isn't all of them, but I specifically prioritized the ones I
think I'll need for handling POSIX time zones.
2025-03-05 18:48:44 -05:00
Andrew Gallant
843adbea82 shared: define one Error type in shared
I think we're going to use this more. So we might as well put it in
one place.

And also, make it core-only compatible via degraded error messages.
2025-03-05 18:48:44 -05:00
Andrew Gallant
41cd2b8643 shared: move core datetime algorithms
Instead of just free functions operating on tuples, we actually give
them names.

Ranged integers keep pissing me off. The fact that I have "primitive"
datetime types and "ranged" datetime types is just absolutely
infuriating and creating a lot of dissonance.

But at least the new composite stuff makes moving back-and-forth a
little easier now. Of course, the composite stuff is also write-once
and read-never. *heavy sigh*

In the next commit, we're going to start moving some more of our
datetime algorithms to `shared`.

The ultimate goal here is to have enough in `shared` that we can handle
POSIX time zones.
2025-03-05 18:48:44 -05:00
Andrew Gallant
6275d26530 rangeint: add new Composite ranged type
This takes some brutalness out of writing routines for converting
to and from composite types over ranged integers (like `civil::Date`).

This whole mess is a consequence of using ranged integers and
simultaneously implementing our low level datetime algorithms on
primitive integers instead of ranged integers. It's a fucking mess.

I think I am steadily marching toward ripping out ranged integers.
Sigh. Very unfortunate.
2025-03-05 18:48:44 -05:00
Andrew Gallant
da67afe6f4 tz: remove use of String when parsing POSIX time zones
This was just really bugging me. And if we're going to move more of the
POSIX time zone implementation into `shared`, we might as well just bite
the bullet and do this too.

Now I believe the only parts of POSIX time zones that require `alloc`
are parsing the `:blah` implementation defined strings for the `TZ`
environment variable and error messages. Not that it really matters
much I think.
2025-03-05 18:48:44 -05:00
Andrew Gallant
7b724ba380 shared: move ArrayStr into shared module
So that we can use it in `jiff` and `jiff-static`.

This will avoid using a `String` at all when creating or using POSIX
time zones.
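
For readers unfamiliar with the pattern, a fixed-capacity inline string can be sketched as below. This is an assumption-laden illustration, not the real `ArrayStr`: the fields, the `Option` return type, and the `as_str` helper are all hypothetical.

```rust
/// A string of at most `N` bytes stored inline, with no heap allocation.
#[derive(Clone, Copy)]
struct ArrayStr<const N: usize> {
    bytes: [u8; N],
    len: usize,
}

impl<const N: usize> ArrayStr<N> {
    /// Copies `s` into inline storage, or returns `None` if it does not fit.
    /// Being `const` lets this run at compile time.
    const fn new(s: &str) -> Option<ArrayStr<N>> {
        let src = s.as_bytes();
        if src.len() > N {
            return None;
        }
        let mut bytes = [0u8; N];
        let mut i = 0;
        // `for` loops are not available in `const fn`, so use `while`.
        while i < src.len() {
            bytes[i] = src[i];
            i += 1;
        }
        Some(ArrayStr { bytes, len: src.len() })
    }

    fn as_str(&self) -> &str {
        // The bytes were copied from a complete `&str`, so they are valid UTF-8.
        std::str::from_utf8(&self.bytes[..self.len]).expect("valid UTF-8")
    }
}

/// Built entirely at compile time.
const EST: Option<ArrayStr<8>> = ArrayStr::new("EST");

fn main() {
    assert_eq!(EST.unwrap().as_str(), "EST");
    assert!(ArrayStr::<2>::new("EST").is_none());
}
```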
2025-03-05 18:48:44 -05:00
Andrew Gallant
cb75cb7a57 shared: move some things around
We're likely going to be moving more `crate::util` code into
`crate::shared::util` (unfortunately), so split `shared/util` apart
to make room for it.
2025-03-05 18:48:44 -05:00
Andrew Gallant
bf01f266ea tz: simplify POSIX time zone types
Now that we eagerly reject unreasonable POSIX time zones, we can
simplify our type definitions. There's no more split between
`PosixTimeZone` and `ReasonablePosixTimeZone`. Everything is
just reasonable.
2025-03-05 18:48:44 -05:00
Andrew Gallant
8c1517b308 tz: reject "unreasonable" POSIX time zones outright
This makes the POSIX time zone parser reject strings like `EST5EDT`.
That is, a time zone with daylight saving time, but without an explicit
rule stating when daylight saving time becomes active/inactive.
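
To make "unreasonable" concrete, here is a deliberately simplified check. It is not Jiff's parser: the function name is hypothetical, it assumes unquoted alphabetic abbreviations only (no `<+03>` form), and it performs no validation of the offset or the rule.

```rust
/// Returns true when `tz` names a DST abbreviation but has no transition rule.
fn has_dst_without_rule(tz: &str) -> bool {
    // Skip the standard abbreviation, e.g. `EST`.
    let rest = tz.trim_start_matches(|c: char| c.is_ascii_alphabetic());
    // Skip the standard offset, e.g. `5` or `-03:30`.
    let rest = rest
        .trim_start_matches(|c: char| c.is_ascii_digit() || matches!(c, '+' | '-' | ':'));
    // Anything alphabetic at this point is a DST abbreviation, e.g. `EDT`.
    let has_dst = rest.starts_with(|c: char| c.is_ascii_alphabetic());
    // A rule, when present, follows the first comma.
    has_dst && !rest.contains(',')
}

fn main() {
    assert!(has_dst_without_rule("EST5EDT"));
    assert!(!has_dst_without_rule("EST5EDT,M3.2.0,M11.1.0"));
    assert!(!has_dst_without_rule("EST5"));
}
```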

We were already doing this, but more explicitly by calling
`PosixTimeZone::reasonable`, so there is no public API breakage here.
The only difference is that `EST5EDT` is now treated as an invalid POSIX
time zone string, so Jiff will instead try to use it as an IANA time zone
identifier. (Which, incidentally enough, actually exists. Odd, but I
suppose technically more correct than the current behavior of just
rejecting it outright.)

I did this because it makes the type definitions simpler. There was a
lot of cognitive energy on my part being devoted to parsing unreasonable
POSIX time zones successfully and only later asserting that they are
reasonable through a fallible API. But I don't think this was really
buying us anything, and we should just reject them outright.

Interestingly, PostgreSQL does support these "unreasonable" POSIX time
zones[1]:

> If a daylight-savings abbreviation is given but the transition
> rule field is omitted, the fallback behavior is to use the rule
> M3.2.0,M11.1.0, which corresponds to USA practice as of 2020 (that
> is, spring forward on the second Sunday of March, fall back on
> the first Sunday of November, both transitions occurring at 2AM
> prevailing time). Note that this rule does not give correct USA
> transition dates for years before 2007.

But POSIX has literally nothing to say about it[2], despite providing a
grammar that clearly makes the DST transition rule optional even when a
DST abbreviation is provided. Like it doesn't even mention that it's
unspecified, despite bloviating about how certain abbreviation lengths
lead to unspecified behavior. Why does POSIX suck so bad?

Anyway, it seems like there are really only two choices here. We could
either reject unreasonable time zones as invalid POSIX time zone
strings, or we could just "helpfully" assume a particular DST transition
rule. Jiff isn't legacy software (yet), so maybe don't try to be so
helpful that we assume one country's DST transition rules silently for
everyone in the world.

This commit does the bare minimum to reject these time zones.
The next commit will be the payoff.

[1]: https://www.postgresql.org/docs/current/datetime-posix-timezone-specs.html
[2]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html
2025-03-05 18:48:44 -05:00
Andrew Gallant
26719d539f tz/posix: remove Offset on DstInfo
This became superfluous with a prior refactor and can just be removed.
Previously, we weren't filling in default values eagerly, so this was
set to the default value when the DST offset was absent. But now we fill
in the default values eagerly, so this just is not needed any more.
2025-03-05 18:48:44 -05:00
daalfox
12ca7aaba7 signed_duration: implement Sum
Fixes #274, PR #282
2025-02-28 08:05:34 -05:00
Andrew Gallant
9aa9afe532 doc: clarify error conditions in Timestamp constructors
... and also add a "Crate features" section to the crate docs of
`jiff-sqlx` and `jiff-diesel`.

Fixes #273, Fixes #277
2025-02-27 17:17:44 -05:00
Andrew Gallant
ba44975f23 jiff-tzdb: switch to rearguard IANA data
This also documents _why_ we do this and other relevant settings
involved in generating the bundled tzdb data.

To give a sense of how this changes things, consider this Rust program:

```rust
use jiff::{tz::TimeZoneDatabase, Timestamp};

fn main() -> anyhow::Result<()> {
    let winter: Timestamp = "2524-01-05T00Z".parse()?;
    let summer: Timestamp = "2524-07-05T00Z".parse()?;

    let tzdb = TimeZoneDatabase::from_dir("/usr/share/zoneinfo")?;
    let tz = tzdb.get("Europe/Dublin")?;
    let info = tz.to_offset_info(winter);
    println!("winter time from zoneinfo: {info:?}");
    let info = tz.to_offset_info(summer);
    println!("summer time from zoneinfo: {info:?}");

    let tzdb = TimeZoneDatabase::bundled();
    let tz = tzdb.get("Europe/Dublin")?;
    let info = tz.to_offset_info(winter);
    println!(" winter time from bundled: {info:?}");
    let info = tz.to_offset_info(summer);
    println!(" summer time from bundled: {info:?}");

    Ok(())
}
```

Before this PR, on my Arch Linux system, I get this output:

```
winter time from zoneinfo: TimeZoneOffsetInfo { offset: 00:00:00, dst: Yes, abbreviation: Borrowed("GMT") }
summer time from zoneinfo: TimeZoneOffsetInfo { offset: 01:00:00, dst: No, abbreviation: Borrowed("IST") }
 winter time from bundled: TimeZoneOffsetInfo { offset: 00:00:00, dst: Yes, abbreviation: Borrowed("GMT") }
 summer time from bundled: TimeZoneOffsetInfo { offset: 01:00:00, dst: No, abbreviation: Borrowed("IST") }
```

That is, the tzdb from `/usr/share/zoneinfo` on my system matches what
the tzdb from `jiff-tzdb` does. However, on my macOS system, I get
this output:

```
winter time from zoneinfo: TimeZoneOffsetInfo { offset: 00:00:00, dst: No, abbreviation: Borrowed("GMT") }
summer time from zoneinfo: TimeZoneOffsetInfo { offset: 01:00:00, dst: Yes, abbreviation: Borrowed("IST") }
 winter time from bundled: TimeZoneOffsetInfo { offset: 00:00:00, dst: Yes, abbreviation: Borrowed("GMT") }
 summer time from bundled: TimeZoneOffsetInfo { offset: 01:00:00, dst: No, abbreviation: Borrowed("IST") }
```

That's because `/usr/share/zoneinfo` on macOS (2025-02-27) uses
rearguard data. This PR makes `jiff-tzdb` match macOS, so that the
output of the above program _with_ this PR on my Linux system is now:

```
winter time from zoneinfo: TimeZoneOffsetInfo { offset: 00:00:00, dst: Yes, abbreviation: Borrowed("GMT") }
summer time from zoneinfo: TimeZoneOffsetInfo { offset: 01:00:00, dst: No, abbreviation: Borrowed("IST") }
 winter time from bundled: TimeZoneOffsetInfo { offset: 00:00:00, dst: No, abbreviation: Borrowed("GMT") }
 summer time from bundled: TimeZoneOffsetInfo { offset: 01:00:00, dst: Yes, abbreviation: Borrowed("IST") }
```

The reason that Jiff is switching to rearguard data is a bit subtle and has to
do with a difference in how the IANA Time Zone Database treats its internal
"daylight saving time" flag and what people in the "real world" consider
"daylight saving time." For example, in the standard distribution of the IANA
Time Zone Database, `Europe/Dublin` has its daylight saving time flag set to
_true_ during Winter and set to _false_ during Summer. The actual time shifts
are the same as, e.g., `Europe/London`, but which one is actually labeled
"daylight saving time" is not.

The IANA Time Zone Database does this for `Europe/Dublin`, presumably, because
_legally_, time during the Summer in Ireland is called `Irish Standard Time`,
and time during the Winter is called `Greenwich Mean Time`. These legal names
are reversed from what is typically the case, where "standard" time is during
the Winter and daylight saving time is during the Summer. The IANA Time Zone
Database implements this tweak in legal language via a "negative daylight
saving time offset." This is somewhat odd, and some consumers of the IANA Time
Zone Database cannot handle it. Thus, the rearguard format was born for,
seemingly, legacy programs.

Jiff can handle negative daylight saving time offsets just fine, but we use the
rearguard format anyway so that the underlying data more accurately reflects
on-the-ground reality for humans living in `Europe/Dublin`. In particular,
using the rearguard data enables [localization of time zone names] to be done
correctly.

Closes #258

[localization of time zone names]: https://github.com/BurntSushi/jiff/issues/258
2025-02-27 16:16:35 -05:00
Andrew Gallant
426a9b6111 tz: switch to tagged pointer representation
Previously, a `TimeZone` was represented by, essentially, an
`Option<Arc<TimeZoneKind>>`. The `None` value was used to represent
the special `UTC` time zone and the niche created by `Arc` allowed the
`TimeZone` to use only a single word of memory. All time zone kinds
other than `UTC` had to allocate an `Arc`.
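
The single-word claim can be checked with a stand-in type. `TimeZoneKind` below is a hypothetical placeholder, not Jiff's real enum; the point is only that `None` reuses the null pointer value that `Arc` can never hold.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical placeholder for the real time zone data.
#[allow(dead_code)]
enum TimeZoneKind {
    Fixed(i32),
    Named(String),
}

/// Size in bytes of the old-style handle.
fn handle_size() -> usize {
    size_of::<Option<Arc<TimeZoneKind>>>()
}

fn main() {
    // `None` occupies the niche left by `Arc`'s non-null pointer,
    // so the whole handle is one machine word.
    assert_eq!(handle_size(), size_of::<usize>());
}
```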

With #256, this representation has become agitated. In particular, to
support the use of TZif data in core-only environments, we need to
represent variable length data without dynamic memory allocation. And,
moreover, without the indirection of `Arc` that permits a `TimeZone`
to be a single word. Keeping `TimeZone` small is important because a
`Zoned` embeds a `TimeZone`, and a `Zoned` is already somewhat chonky.

So... we have no way to allocate to create indirection. But we want
`TimeZone` to stay small. What do we do? Well, despite time zone data
being variable length, it is usually invariant. And it is, in many
cases, acceptable to embed it into your binary. Hence why the previous
commits spent a bunch of effort doing exactly that: to make all data we
need organized and accessible as `static` data.

But, we still need a way to stuff this new Arc-less representation for
a time zone into our `TimeZone`. Well, we can do pointer tagging! This
is a little tricky to get right, but the recent strict provenance
stabilization (despite us needing a polyfill here for MSRV reasons) has
crystallized this to a point where I'm pretty comfortable with it. The
one hiccup here is that we can't actually soundly look at the address of
a `&'static` pointer in a `const` context. We work around that by making
the one `&'static` pointer correspond to the tag `0`, so that it doesn't
require any explicit tagging.
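
The general technique can be sketched as follows. This is an illustration under stated assumptions, not Jiff's code: `Payload`, `Tagged`, and the tag values are hypothetical, and the sketch always owns a heap allocation rather than mixing `static` and `Arc` data. It relies on the payload's alignment leaving the low bits of its address free, and it moves the pointer with `wrapping_add`/`wrapping_sub` so that the pointer's provenance is preserved.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical payload. 8-byte alignment means the low three bits of any
/// valid pointer to it are zero and can hold a tag.
#[repr(align(8))]
struct Payload {
    offset_seconds: i32,
}

const TAG_MASK: usize = align_of::<Payload>() - 1;

/// One word holding both a pointer and a small tag.
struct Tagged {
    ptr: *const u8,
}

impl Tagged {
    fn new(payload: Box<Payload>, tag: usize) -> Tagged {
        assert!(tag <= TAG_MASK, "tag must fit in the alignment bits");
        let raw = Box::into_raw(payload) as *const u8;
        // Offset by `tag` bytes; the pointer keeps its provenance.
        Tagged { ptr: raw.wrapping_add(tag) }
    }

    fn tag(&self) -> usize {
        (self.ptr as usize) & TAG_MASK
    }

    fn payload(&self) -> &Payload {
        let untagged = self.ptr.wrapping_sub(self.tag()) as *const Payload;
        // SAFETY: `untagged` is exactly the pointer produced by
        // `Box::into_raw` in `new`, and it stays valid until `drop`.
        unsafe { &*untagged }
    }
}

impl Drop for Tagged {
    fn drop(&mut self) {
        let untagged = self.ptr.wrapping_sub(self.tag()) as *mut Payload;
        // SAFETY: same pointer as in `new`, and it is freed only once.
        unsafe { drop(Box::from_raw(untagged)) }
    }
}

fn main() {
    let tz = Tagged::new(Box::new(Payload { offset_seconds: -18_000 }), 3);
    assert_eq!(tz.tag(), 3);
    assert_eq!(tz.payload().offset_seconds, -18_000);
    assert_eq!(size_of::<Tagged>(), size_of::<usize>());
}
```

Giving one variant the tag `0` means its pointer is stored unmodified, which is why no address arithmetic is needed to construct it.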

Moreover, we get other benefits. While `UTC` previously did not require
`Arc` shenanigans, the `Etc/Unknown` and fixed offset time zones did.
But they no longer do. This makes fixed offset time zones faster to work
with, since they don't require the `Arc`-clone/drop dance. More to the
point, time zones embedded with the proc macro don't require the
`Arc`-clone/drop dance either, which makes them faster in some
circumstances as well.

While Jiff did previously use `unsafe`, I think this is its first
helping of non-trivial `unsafe`. Previously, we only used it for FFI and
in one case for skipping a UTF-8 validity check that was pretty easy to
reason about. I did my best to follow std's strict provenance docs and
documented SAFETY conditions as best as I could.
2025-02-26 16:55:17 -05:00
Andrew Gallant
04d895345a jiff-tz-static: add initial version of new proc macro
This isn't quite done, but it does parse TZif and emits the correct
Jiff code to construct a `TimeZone` in a const context.

The main thing missing here is a fair bit of polish and a change
to the TimeZone internals to actually support this method of
construction in core-only environments without increasing the size
of `TimeZone` (i.e., pointer tagging).
2025-02-26 16:55:17 -05:00
Andrew Gallant
68af809598 shared: add new exposed-but-internal jiff::shared module
See the module comments in `shared` for a bit more of an explanation
for why I ended up with this design. The summary is that this new
module will be copied to the proc macro, which will enable jiff to
depend on and re-export the proc macro.

This was a pretty gnarly refactor, because this required separating
the TZif and POSIX time zone parsers out from their internal data
types.
2025-02-26 16:55:17 -05:00
Andrew Gallant
89f4946780 util: make more utility functions const
I believe I'm going to need these in order to maintain
some semblance of encapsulation and convert data handed
to Jiff by the macro into the internal representations
used by Jiff.
2025-02-26 16:55:17 -05:00
Andrew Gallant
4b245db34a tz: remove storage of leap second corrections
Usually TZif files don't even have these and Jiff doesn't use
them anyway. Previously, there wasn't much cost. But since we're
embarking on statically embedding *parsed* TZif data in core-only
environments, we might as well just get rid of them.
2025-02-26 16:55:17 -05:00
Andrew Gallant
e364c9f46c civil: optimize Date addition with -1 or +1 days
I originally didn't do this because it seemed kinda bush
league, but this turns out to be a pretty common case via
POSIX time zones. So let's just use `yesterday()` and
`tomorrow()`, which will avoid a round-trip through Unix
epoch days.
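
A sketch of why the special case is cheap, using plain tuples rather than Jiff's types (the function names here are hypothetical, and valid Gregorian input is assumed). Stepping forward one day only needs a comparison against the month length; no conversion to and from epoch days is required.

```rust
fn is_leap_year(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of days in `month` (1..=12) of `year`.
fn days_in_month(year: i32, month: u8) -> u8 {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if is_leap_year(year) => 29,
        _ => 28,
    }
}

/// Returns the day after `(year, month, day)`.
fn tomorrow(year: i32, month: u8, day: u8) -> (i32, u8, u8) {
    if day < days_in_month(year, month) {
        (year, month, day + 1)
    } else if month < 12 {
        (year, month + 1, 1)
    } else {
        (year + 1, 1, 1)
    }
}

fn main() {
    assert_eq!(tomorrow(2024, 2, 28), (2024, 2, 29));
    assert_eq!(tomorrow(2025, 2, 28), (2025, 3, 1));
    assert_eq!(tomorrow(2024, 12, 31), (2025, 1, 1));
}
```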
2025-02-26 16:55:17 -05:00
Andrew Gallant
9151fb1653 tz: simplify and optimize ReasonablePosixTimeZone
It's still not quite as fast as I would hope, but both of the
primary TZ operations are now about 2x as fast.

We also simplify the data structure internals. This helped
make it faster since more stuff is now pre-computed. But now
we can't roundtrip a parsed POSIX TZ exactly. But I think this
is not a big deal and there is no specific requirement for
needing a POSIX TZ to be formatted exactly how it was given.
(Of course, semantically, the POSIX TZ will round-trip.) For
example, a POSIX time zone that uses `+` explicitly is allowed,
but Jiff will never format a POSIX TZ with a `+` anywhere.

This was partially motivated by optimization, but I was initially
moved to do this because I plan to expose some of these internals
for use with proc macros. And I wanted the exposed type definitions
to be simpler. A `ReasonablePosixTz` was pretty unwieldy before.
But now it's much tighter.

Ref #256
2025-02-26 16:55:17 -05:00
Andrew Gallant
fd3eeecbf5 rangeint: add const constructor
I believe this will be useful for shunting data between Jiff and proc
macros.
2025-02-26 16:55:17 -05:00
Andrew Gallant
1ef1715b35 util: make ArrayStr::new be const
This is prep for experimenting with a proc macro to create
a static `TimeZone`.

We're going to need to expose a bunch of internal data types
in order to permit a proc macro to create a `TimeZone`. I am
very annoyed by this, but it's probably the least bad option.
In theory, if Rust `const` were better, we wouldn't need to do
this. But `const` Rust is just nowhere near where we need it
to be.

Ref #256
2025-02-26 16:55:17 -05:00
Andrew Gallant
6f1e421c7d tests: sprinkle cfg(not(miri)) in various spots
This doesn't even make it possible to run `cargo miri test` for
Jiff, but you can at least now do `cargo miri test tz`. I want
this because I anticipate writing some tricky `unsafe` to do
pointer tagging for `TimeZone`.

The main problem with miri is unfortunately that `insta` doesn't
support it. I feel like it should work fine for inline snapshots,
but I guess it doesn't. Most of our insta usage is inline snapshots.

This is borderline a reason to switch off of insta (in favor of
some other snapshotting thing, possibly homegrown) if it proves
impractical for insta to support it.

Ref https://github.com/mitsuhiko/insta/issues/429
2025-02-26 16:55:17 -05:00
Andrew Gallant
23ea9559b7 error: make it possible to build an Error in a const context
The resulting error is total shit, but at least one can be
constructed. We could improve the messages to arbitrary strings
using pointer tagging, but let's not get ahead of ourselves.
2025-02-26 16:55:17 -05:00
Andrew Gallant
475a2ce010 tz: tighten up posix module
When I initially wrote the POSIX time zone parser, I had assumed
that we would use both strict POSIX time zones and IANA v3+ POSIX
time zones. But in practice, we just use the latter.

So, let's bake that into the API and tighten things up a bit. We
remove `IanaTz` and centralize everything on `ReasonablePosixTimeZone`.
2025-02-26 16:55:17 -05:00
Andrew Gallant
5059d3f973 doc: beef up docs for methods that set sub-second units
For example, there is `Time::nanosecond` and `Time::subsec_nanosecond`.
The former sets the individual nanosecond component and does not change
the millisecond or microsecond components. But the latter sets the
entire fractional second component to nanosecond precision.
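
The relationship between the two kinds of accessors can be shown with plain arithmetic. The helper names below are hypothetical and this is only a sketch of the decomposition, not Jiff's code.

```rust
/// Combines the individual components into the whole fractional second,
/// expressed in nanoseconds.
fn subsec_nanosecond(millisecond: i32, microsecond: i32, nanosecond: i32) -> i32 {
    millisecond * 1_000_000 + microsecond * 1_000 + nanosecond
}

/// Splits a whole fractional second into (millisecond, microsecond, nanosecond).
fn components(subsec_nanosecond: i32) -> (i32, i32, i32) {
    (
        subsec_nanosecond / 1_000_000,
        (subsec_nanosecond / 1_000) % 1_000,
        subsec_nanosecond % 1_000,
    )
}

fn main() {
    // A time of day with fraction .123456789 seconds.
    assert_eq!(components(123_456_789), (123, 456, 789));
    // Replacing only the nanosecond component leaves the others alone.
    assert_eq!(subsec_nanosecond(123, 456, 1), 123_456_001);
    // Replacing the whole fraction with 1 ns resets every component.
    assert_eq!(components(1), (0, 0, 1));
}
```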

Fixes #261
2025-02-17 09:55:03 -05:00
Andrew Gallant
3c0c560cef tz: break apart src/tz/mod.rs
This file was getting a bit big. And it was also responsible for
assembling a public API for `jiff::tz`. Mixing these two together just
made it a little hard for me to deal with.

I'm doing this with an eye toward #256, which will likely require
rejiggering the representation of `TimeZone` to use pointer tagging.
This will require `unsafe`, so I'd like to lock down its representation
to a smaller scope.
2025-02-16 18:50:27 -05:00
Andrew Gallant
1d66fa9e68 perf: do a ton of optimizations
The big ones here are:

1. Using Neri-Schneider to convert to-and-from Unix epoch days.
2. Add a bit set to `Span` to make it cheap to determine which units
   are non-zero. We then use this bit set to enable fast paths in
   routines that do arithmetic with `Span`.
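
The bit set in item 2 can be sketched as below. This is a simplified, hypothetical `Span` with four units, not Jiff's type: one bit per unit records whether that unit is non-zero, so a fast path can be selected with a single mask test instead of inspecting every field.

```rust
#[derive(Clone, Copy)]
enum Unit {
    Seconds = 0,
    Hours = 1,
    Days = 2,
    Months = 3,
}

/// Hypothetical, simplified span.
#[derive(Default)]
struct Span {
    /// Unit values, indexed by `Unit as usize`.
    units: [i64; 4],
    /// Bit `i` is set exactly when `units[i] != 0`.
    nonzero: u8,
}

impl Span {
    fn set(&mut self, unit: Unit, value: i64) {
        let i = unit as usize;
        self.units[i] = value;
        if value == 0 {
            self.nonzero &= !(1u8 << i);
        } else {
            self.nonzero |= 1u8 << i;
        }
    }

    /// True when no calendar unit is set, so arithmetic can skip
    /// calendar-aware logic entirely.
    fn is_time_only(&self) -> bool {
        const CALENDAR: u8 = (1u8 << Unit::Days as u8) | (1u8 << Unit::Months as u8);
        self.nonzero & CALENDAR == 0
    }
}

fn main() {
    let mut span = Span::default();
    span.set(Unit::Hours, 3);
    span.set(Unit::Seconds, 30);
    assert!(span.is_time_only());
    span.set(Unit::Days, 1);
    assert!(!span.is_time_only());
}
```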

There's a host of other junk here too. For example, `Timestamp::series`
now converts its `Span` to a `SignedDuration` and uses that to do
arithmetic instead of `Span`. And adding a `SignedDuration` to a
`Timestamp` is now faster because we avoid doing redundant checks by
skipping `Timestamp`'s constructor.

... and a lot more of that in a similar vein.

This overall results in better performance than both `chrono` and `time`
in *most* cases.

Fixes #235, Fixes #255
2025-02-16 15:50:26 -05:00
Josh Triplett
fd884445e1 doc: fix typo
PR #264
2025-02-14 14:58:34 -05:00
Andrew Gallant
e862edd884 timestamp: add Timestamp::constant constructor
Closes #263
2025-02-13 22:18:36 -05:00
Andrew Gallant
0ba2c63c85 doc: final touch-ups before 0.2.0
2025-02-10 21:25:29 -05:00
Andrew Gallant
c587fddbb7 fmt: add table of different formats supported by Jiff
I think someone asked for this when `jiff 0.1` was released, and I
thought it was a good idea then. But I hadn't gotten around to it. With
the number of formats Jiff supports having expanded a bit since then, I
think now is a good time to get this documented.
2025-02-10 21:24:55 -05:00
Andrew Gallant
6661b0ca3f doc: beef up the ICU4X docs a bit
Basically, anywhere we linked to the `icu` crate, we now also include a
mention of `jiff-icu` to facilitate conversions. And we update our
comparison example to include use of this crate.
2025-02-10 21:24:55 -05:00
Andrew Gallant
7a7cbbedb5 tz: add TimeZone::unknown() as a fallback in TimeZone::system()
Basically, instead of just logging a warning and falling back to `UTC`,
we now still log a warning and fall back to a time zone that _behaves_
as `UTC`, but is identified as `Etc/Unknown`. This avoids unrecoverable
failure while still also surfacing the fact that "something" has gone
wrong somewhere.

While doing this, I also fixed a bug where Jiff would perform offset
conflict resolution for `Z` offsets. But it shouldn't do that, since `Z`
reflects an unknown offset. In that case, we always respect the `Z`
offset (as numerically equivalent to `+00:00`) in order to resolve the
instant, and then use the time zone to compute the offset for that
instant unambiguously.

Ref https://github.com/tc39/proposal-canonical-tz/pull/25
Closes #230
2025-02-02 13:39:15 -05:00
Andrew Gallant
20e29e66c1 tz/db/bundled: remove unused imports
2025-02-02 13:39:15 -05:00
Andrew Gallant
ad53f14f92 span: treat weeks as invariant when days are invariant
Basically, if callers opt into days being invariant---and thus no
relative date is required to resolve its length---then weeks are also
treated as invariant.

Temporal doesn't do this. As I understand it, I think the reasoning is
that they might some day support calendars that don't use 7 day weeks.
However, I think it's still pretty unlikely for Jiff to support
non-Gregorian calendars (other than things like the ISO 8601 week date
calendar). And moreover, I believe the only calendars to use weeks that
aren't 7 days are ancient calendars. I believe all actively used
calendars use 7 day weeks. So for this assumption to be wrong, Jiff
would not only need to support non-Gregorian calendars but _ancient_
non-Gregorian calendars. I think that's probably never going to happen
and is best left to specialty crates.

Because of that, I think it's safe to support invariant weeks *when* the
caller opts into invariant days.

Closes #136
2025-02-02 13:39:15 -05:00
Andrew Gallant
5db4a0f2d5 span: require explicit opt-in for invariant 24-hour days
This makes the `Span` APIs more consistent with the rest of Jiff in that
days are never silently assumed to be 24 hours. Instead, callers are
forced to either provide a relative date or a special
`SpanRelativeTo::days_are_24_hours()` marker to opt into invariant
24-hour days.

This is meant to make the API more uniform and reduce the possibility of
bugs. In particular, in #48, it was brought up that it is very easy to
assume that Jiff APIs will never let you silently assume that days are
always 24 hours. This is actually quite easy to do if you're just
dealing with `Span` APIs.

Fixes #48
2025-02-02 13:39:15 -05:00
Andrew Gallant
bf3d31d40d tz: split up TimeZone::to_offset
This started with wanting to get rid of the `(offset, dst, abbrev)`
triple in the `to_offset` API. Specifically, I thought that returning
a `&str` tied to the time zone for the abbreviation was somewhat
limiting from an API evolution perspective, particularly in core-only
environments. Specifically, in core-only environments, we only support
fixed offset time zones, and in order to provide the `&str` API, we had
to pre-compute the string representation of that offset and store it on
the `TimeZone`. Since there's no indirection in `TimeZone` in core-only
environments, this bloated the size of it, and consequently `Zoned` as
well.

By changing this to an opaque type with a more restrictive lifetime, we
can compute the string "later."

But... we really don't want to be doing this every single time we want
the offset for an instant. And moreover, even in non-core environments,
this code path was doing more work than is necessary (albeit not much)
to return the DST status and time zone abbreviation strings. It's also
just more data to shuffle around.

So I split `to_offset` into two APIs: one for common use cases and one
that returns the data that `to_offset` returned previously. This
required a fair bit of rejiggering, but nothing substantial changed.

Also, I changed the lifetime of the abbreviation returned by
`TimeZoneTransition::abbreviation` to be tied to the transition. This
also affords us some API flexibility in the future for similar reasons
as above.

Fixes #222
2025-02-02 13:39:15 -05:00
Andrew Gallant
738540ef8b tz: change item yielded by TimeZoneNameIter to opaque type
This should be more friendly to future API evolution here. I previously
did the simplest thing possible and wasn't really thinking about
core-only environments.

We still don't support any tzdb in core only environments, but we
theoretically could in the future. (Although there are a number of
issues to figure out. This merely fixes one of them.)

Fixes #221
2025-02-02 13:39:15 -05:00
Andrew Gallant
c8d145b585 tz: change arrangement of internal time zone databases
Jiff has three different kinds of databases that it can draw time zone
transitions from: the standard Unix zoneinfo database, a bundled or
embedded zoneinfo database (compiled into the binary), or the special
Android concatenated zoneinfo database. Previously, we would try to
create as many of these databases as possible, and then look all time
zones up in each.

I think that this type of semantic is very messy, because you can wind
up drawing from one db for one time zone and another db for another
(although in theory this shouldn't happen if they're all in sync). It
also requires that you always look for all three, which feels wrong.

Instead, we now just look for one and stop when we find it. Effectively,
we changed the internals from a product to a sum.
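
A sketch of the "sum" arrangement, with hypothetical names. The booleans stand in for real probing such as checking whether a directory exists, and the priority order on non-Android platforms is an assumption made for this example only.

```rust
/// Exactly one database kind is selected.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Database {
    ZoneInfo,
    Concatenated,
    Bundled,
}

/// Picks the first available database and stops looking.
fn pick(
    is_android: bool,
    has_zoneinfo: bool,
    has_concatenated: bool,
    has_bundled: bool,
) -> Option<Database> {
    // On Android the concatenated database is tried first.
    let order = if is_android {
        [Database::Concatenated, Database::ZoneInfo, Database::Bundled]
    } else {
        [Database::ZoneInfo, Database::Concatenated, Database::Bundled]
    };
    order.iter().copied().find(|db| match *db {
        Database::ZoneInfo => has_zoneinfo,
        Database::Concatenated => has_concatenated,
        Database::Bundled => has_bundled,
    })
}

fn main() {
    assert_eq!(pick(false, true, true, true), Some(Database::ZoneInfo));
    assert_eq!(pick(true, true, true, true), Some(Database::Concatenated));
    assert_eq!(pick(false, false, false, true), Some(Database::Bundled));
    assert_eq!(pick(false, false, false, false), None);
}
```

With a product (three optional databases held at once), every lookup has to decide which source wins; with a sum, that decision is made once at construction.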

On Android, we check for a concatenated tzdb first, since that's likely
what we'll find.

Fixes #213
2025-02-02 13:39:15 -05:00