Commit graph

308 commits

Author SHA1 Message Date
Martin von Zweigbergk
da6c4b61b3 backend: remove unused TreeValue::Conflict and read/write methods
Some checks are pending
binaries / Build binary artifacts (push) Waiting to run
website / prerelease-docs-build-deploy (ubuntu-24.04) (push) Waiting to run
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run
We no longer use the `TreeValue::Conflict` constructor or the
`Backend::read_conflict()` and `Backend::write_conflict()` methods.
2025-09-04 16:26:44 +00:00
Graham Christensen
1b150d12bd test: use a fifo in test_local_working_copy to avoid too-long unix domain sockets on macOS 2025-08-23 13:48:25 +00:00
Yuya Nishihara
ed931996d0 fileset: switch to globset
Unlike string patterns, backslash-escape syntax isn't forcibly enabled. Fileset
globs are constructed from platform-native path inputs.
2025-07-25 08:22:35 +00:00
Yuya Nishihara
182daa1dfa str_util: switch to globset
Since we already have globset in transitive dependencies, this change helps
reduce the amount of dependencies. Another reason is that globset provides a
function to convert glob to regex. This is nice because we use globs to match
against strings or internal repository paths instead of platform-native paths.

The GlobPattern wrapper is boxed because globset::Glob type is relatively big.
2025-07-25 08:22:35 +00:00
Yuya Nishihara
68ead52c5c hex_util: roll our own decode/encode_hex() functions
Since we have "reverse hex" functions, it's easy to implement the same set of
functions for "forward hex". I believe our implementation is slower than
highly-optimized versions provided by e.g. faster-hex, but we don't use hex
encoding/decoding where the performance matters.
2025-07-02 01:56:40 +00:00
Ilya Grigoriev
6323764f69 lib: add rustversion, turn off a clippy lint in nightly in tests
This turns off the `clippy::cloned_ref_to_slice_refs` lint in some tests
and fixes it in others, for Rust 1.89+. This seems to make `cargo clippy
--workspace --all-targets --all-features` work in stable, beta, and
nightly (1.89).

This depends on the `rustversion` crate. Other than that, it's based on
Austin's https://github.com/jj-vcs/jj/pull/6705.

Co-authored-by:  Austin Seipp <aseipp@pobox.com>
2025-06-24 01:01:25 +00:00
Yuya Nishihara
d3e5f9d827 tests: add proptest that compares optimized revset output
Only the expression trees are generated, which I just needed to copy the doc
example. The commit DAG could also be generated, but doing that would mean the
commits would have to be created within a proptest loop, I think, and would make
shrinking process super slow.
2025-06-22 01:55:50 +00:00
Martin von Zweigbergk
b970939804 backend: make read_file() return a AsyncRead
The `Backend::read_file()` method is async but it returns a `Box<dyn
Read>` and reading from that trait is blocking. That's fine with the
local Git backend but it can be slow for remote backends. For example,
our backend at Google reads file chunks 1 MiB at a time from the
server. What that means is that reading lots of small files
concurrently works fine since the whole file contents are returned by
the first `Read::read()` call (it was fetched when
`Backend::read_file()` was issued). However, when reading files that
are larger than one chunk, we end up blocking on the next
`Read::read()` call. I haven't verified that this actually is a
problem at Google, but fixing this blocking is something we should do
eventually anyway.

This patch makes `Backend::read_file()` return a `Pin<Box<dyn
AsyncRead>>` instead, so implementations can be async in the read part
too.

Since `AsyncRead` is not yet standardized, we have to choose between
the one from `futures` and the one from `tokio`. I went with the one
from `tokio`. I picked that because an earlier version of this patch
used `tokio::fs` for some reads. Then I realized that doing that means
that we have to use a tokio runtime, meaning that we can't safely keep
our existing `pollster::FutureExt::block_on()` calls. If we start
depending on tokio's specific runtime, I think we would first want to
remove all the `block_on()` calls. I'll leave that for later. I think
at this point, we could equally well use `futures::io::AsyncRead`, but
I also don't know if there's a reason to prefer that.
2025-05-20 13:23:36 +00:00
Emily
7bcc5cca8f git: remove git2 feature 2025-05-07 19:29:20 +00:00
Emily
e5478bbf7b cargo: use gix/zlib-rs feature
This uses `zlib-rs`, a native Rust library that is comparable in
performance to `zlib-ng`. Since there’s no complicated C build
and gitoxide only has one hashing backend now, this lets us drop our
`packaging` feature without adding any awkward build requirements.

`zlib-rs` is generally faster at decompression than
`zlib-ng`, and faster at compression on levels 6 and 9; see
<https://trifectatech.org/blog/zlib-rs-is-faster-than-c/>
for details.

I couldn’t get reliable‐looking benchmark results out of my
temperamental laptop; `hyperfine` seemed to think that some random
`jj` workloads I tested might be slightly slower than with `zlib-ng`,
but it wasn’t unambiguously distinguishable from noise, so I’d
like to see measurements from others.

It’s certainly a lot faster than the previous default, and I
think it’s likely that `zlib-rs` will continue to get faster
and that it’s more than worth avoiding the headaches of a native
library with a CMake build dependency. (Though on the other hand,
if distributions move in the direction of shipping `zlib-ng` by
default, maybe there will be more motivation to make `libz-ng-sys`
support system libraries.)
2025-04-08 22:12:25 +00:00
Emily
71a619c19f tests: don’t use git2 in testutils 2025-04-03 19:03:44 +00:00
Martin von Zweigbergk
e392448288 time_util: replace use of chrono-english by interim
https://rustsec.org/advisories/RUSTSEC-2024-0395 recommends switching
from `chrono-english` to `interim` because the former is unmaintained.
2025-03-24 23:59:21 +00:00
Emily
a56b78bdb6 git: make git2 support optional
This helps us prepare for removing the functionality down the line and
makes things easier for people building or packaging their own Jujutsu.
2025-03-16 06:07:28 +00:00
Austin Seipp
992066c60c lib: remove use of zstd
`zstd` is only used to write files in the native backend right now. For now,
jettison it, to unbundle some C code that we don't really need.

(Ideally, a future compression library would be pure Rust, but we'll cross that
bridge when we get to it...)

Signed-off-by: Austin Seipp <aseipp@pobox.com>
2025-01-08 22:02:21 +00:00
Yuya Nishihara
fcac7ed39c config: load system host/user name by CLI and insert as env-base layer
Since the default now falls back to "", we can simply override the default by
CLI. We don't have to touch the environment in jj-lib.
2025-01-04 17:54:28 +09:00
Waleed Khan
c7b677b86d build: switch back to gix/max-performance-safe feature by default
Using `gix/max-performance` requires `cmake` as a build-time dependency, which could be a significant barrier for contributors (including existing ones, like me, who already work on jj but didn't have `cmake` installed thus far).

This commit switches back to using `gix/max-performance-safe`, which doesn't have the `cmake` dependency, and adds `gix/max-performance` behind the `gix-max-performance` feature for `jj-lib`.

It also adds `gix-max-performance` to the `packaging` feature group, since I'm assuming that packagers will want maximum performance, and are more likely to have `cmake` at hand.

Tested with

```
$ cargo build --workspace
$ cargo build --workspace --features packaging
```

(and the `--features packaging` build failed until I installed `cmake`)
2024-12-31 17:07:52 -06:00
Yuya Nishihara
78b5766993 repo, workspace: use dunce::canonicalize() to normalize paths
These paths may be printed, compared with user inputs, or passed to external
programs. It's probably better to avoid unusual "\\?\C:\" paths on Windows.

Fixes #5143
2024-12-22 09:45:37 +09:00
Yuya Nishihara
0975cb5374 cargo: drop dependency on config crate 2024-12-10 16:08:50 +09:00
Yuya Nishihara
5ad0f3dcf6 config: extract ConfigNamePathBuf to jj-lib
I'm planning to rewrite config store layer by leveraging toml_edit instead of
the config crate. It will allow us to merge config overlays in a way that
deprecated keys are resolved within a layer prior to merging, for example.

This patch moves ConfigNamePathBuf to jj-lib where new config API will be
hosted. We'll probably extract LayeredConfigs to this module, but we'll first
need to split environment dependencies from it.
2024-11-23 10:20:27 +09:00
Yuya Nishihara
eed32954b2 cargo: import fix_filter from gix::filter::plumbing
I don't think we need to declare these dependencies separately because both
are the optional dependencies enabled by the git feature.

We'll also need the gix's "attributes" feature at some point.
> attributes - Query attributes and excludes. Enables access to pathspecs,
> worktree checkouts, filter-pipelines and submodules.
https://docs.rs/gix/0.67.0/gix/index.html#feature-flags
2024-11-14 22:38:04 +09:00
Yuya Nishihara
168f07283a cargo: whitespace cleanup, sort dependencies 2024-11-14 22:38:04 +09:00
Martin von Zweigbergk
5b844f630c cargo: upgrade to sapling-renderdag
There is now an updated version of `esl01-renderdag` published from
the Sapling repo, so let's use that. That lets us remove the
`bitflags` 1.x dependency and the `itertools` 0.10.x non-dev
dependency.
2024-11-13 09:38:25 -08:00
Martin von Zweigbergk
02486dc064 fallback: replace use of backoff crate of by own implementation
https://rustsec.org/advisories/RUSTSEC-2024-0384 says to migrate off
of the `instant` crate because it's unmaintained. We depend on it only
via the `backoff` crate. That crate also seems unmaintained. So this
patch replaces our use of `backoff` by a custom implementation.

I initially intended to migrate to the `backon` crate, but that made
`lock::tests::lock_concurrent` tests fail. The test case spawns 72
threads (on my machine) and lets them all lock a file, and then it
waits 1 millisecond before releasing the file lock. I think the
problem is that their version of jitter is implemented as a random
addition of up to the initial backoff value. In our case, that means
we would add at most a millisecond. The `backoff` crate, on the other
hand does it by adding -50% to +50% of the current backoff value. I
think that leads to a much better distribution of backoffs, while
`backon`'s implementation means that only a few threads can lock the
file for each backoff exponent.
2024-11-11 07:04:21 -08:00
Yuya Nishihara
eaafde7119 cargo: add same-file dependency 2024-11-06 15:03:41 -08:00
Yuya Nishihara
dfaa52c88a cargo: add hashbrown dependency
We'll use low-level HashTable to customize Eq/Hash without implementing newtype
wrappers.

Unneeded default features are disabled for now. Note that the new default
hasher, foldhash, is released under the Zlib license, which isn't currently
included in the allow list.
2024-10-05 08:12:30 +09:00
Samuel Tardieu
3f0703ca2c cargo: inherit lints configuration from workspace 2024-10-04 22:29:13 +02:00
Yuya Nishihara
424623ba91 cargo: add "clru" dependency 2024-08-29 23:33:37 +09:00
Benjamin Tan
9c1b627f9b jj_lib: include indexmap as dependency
This is in preparation for shifting of `move_commits` function to
`jj_lib::rewrite`.
2024-08-12 21:48:17 +08:00
Stephen Jennings
ff9e739798 revset: create DatePattern type
Creates a DatePattern type that can be created by parsing a string in any
format supported by the chrono-english crate, including:

- 2024-03-25
- 2024-03-25T00:00:00
- 2024-03-25T00:00:00-08:00
- 2 weeks ago
- 5 minutes ago
- yesterday
- yesterday 5pm
- yesterday 10:30
- yesterday 15:30
- tomorrow

A `kind` can be specified to indicate whether the pattern should match dates at
or after (`after`) or strictly before (`before`) the given instant.

chrono-english supports US and UK dialects to disambiguate mm/dd/yy from
dd/mm/yy, but for now we default to US. This should probably be a config
setting.
2024-08-01 09:04:07 -07:00
Yuya Nishihara
5601fb40f8 cargo: add "bstr" dependency
I'm going to replace some Debug impls with BStr, and we already depend on
"bstr" through "gix".
2024-07-14 23:26:29 +09:00
Yuya Nishihara
ac2bddbc3d cargo: remove unused "bytes" dependency 2024-07-14 23:26:29 +09:00
Matt Kulukundis
067d37aa3c copy-tracking: cargo add gix-format and gix diff-blob feature 2024-07-03 20:26:30 -04:00
Jonathan Tan
4cca84515a In lib, make "testing" feature depend on "git"
If not, `cargo build --no-default-features --features testing` will fail.
2024-06-11 17:14:57 -07:00
Martin von Zweigbergk
0fd1969e8f lib: make git support optional via crate feature
I've wanted to make the Git support optional for a long time. However,
since everyone uses the Git backend (and we want to support it even in
the custom binary at Google), there hasn't been much practical reason
to make Git support optional.

Since we now use jj-lib on the server at Google, it does make sense to
have the server not include Git support. In addition to making the
server binary smaller, it would make it easier for us (jj team at
Googlle) to prove that our server is not affected by some libgit2 or
Gitoxide vulnerability. But to be honest, neither of those problems
have come up, so it's more of an excuse to make the Git support
optional at this point.

It turned out to be much simpler than I expected to make Git support
in the lib crate optional. We have done a pretty good job of keeping
Git-related logic separated there.

If we make Git support optional in the lib crate, it's going to make
it a bit harder to move logic from the CLI crate into the lib crate
(as we have planned to do). Maybe that's good, though, since it helps
remind us to keep Git-related logic separated.
2024-06-11 22:03:20 +09:00
Martin von Zweigbergk
8ce099470b cargo: explicitly indicate paths to publish
Running `cargo publish` from a non-colocated repo (such as my usual
repo) is currently quite scary because it uploads all non-hidden
files, even if they're ignored by `.gitignore`
(https://github.com/rust-lang/cargo/issues/2063). I noticed this a
while ago and have always run the command from a fresh clone since
then. To avoid the need for that, let's use the workaround mentioned
on the bug, which is to explicitly list patterns we want to publish.
2024-04-15 20:37:00 -07:00
Ilya Grigoriev
02a04d0d37 test_conflicts and test_resolve_command: use indoc! to indent conflict markers in tests
Apart from (IMO) looking nicer, this will also sidestep the potential problem
that if the file contains actual jj conflict markers (`>>>>>>>` in the beginning
of a line, for example), jj would currently have trouble materializing and
subsequently parsing conflicts in the file if it actually became conflicted.

I'll demo this bug in either this or a subsequent PR. It's the kind of bug that
sounds serious in theory but might never cause a problem in practice.

After this PR, only `docs/tutorial.md` has a conflict marker that's not indented.
There's only one there, so hopefully it won't be too much of a pain to deal with.

I also indented other strings in `test_conflicts.rs`. IMO, this looks nice and
more consistent with the `insta::assert_snapshot` output. I didn't spend the
time to do the same for `test_resolve_command`.
2024-03-22 23:27:25 -07:00
Thomas Castiglione
d661f59f9d working_copy: implement symlinks on windows with a helper function
enables symlink tests on windows, ignoring failures due to disabled developer mode,
and updates windows.md
2024-03-05 15:16:38 +08:00
Daehyeok Mun
a9f489ccdf Switch to ignore crate for gitignore handling.
Co-authored-by: Waleed Khan <me@waleedkhan.name>
2024-02-20 09:12:46 -08:00
Evan Mesterhazy
965d6ce4e4 Implement a procedural macro to derive the ContentHash trait for structs
This is a no-op in terms of function, but provides a nicer way to derive the
ContentHash trait for structs using the `#[derive(ContentHash)]` syntax used
for other traits such as `Debug`.

This commit only adds the macro. A subsequent commit will replace uses of
`content_hash!{}` with `#[derive(ContentHash)]`.

The new macro generates nice error messages, just like the old macro:

```
error[E0277]: the trait bound `NotImplemented: content_hash::ContentHash` is not satisfied
   --> lib/src/content_hash.rs:265:16
    |
265 |             z: NotImplemented,
    |                ^^^^^^^^^^^^^^ the trait `content_hash::ContentHash` is not implemented for `NotImplemented`
    |
    = help: the following other types implement trait `content_hash::ContentHash`:
              bool
              i32
              i64
              u8
              u32
              u64
              std::collections::HashMap<K, V>
              BTreeMap<K, V>
            and 38 others
```

This commit does two things to make proc macros re-exported by jj_lib useable
by deps:

1. jj_lib needs to be able refer to itself as `jj_lib` which it does
   by adding an `extern crate self as jj_lib` declaration.

2. jj_lib::content_hash needs to re-export the `digest::Update` type so that
   users of jj_lib can use the `#[derive(ContentHash)]` proc macro without
   directly depending on the digest crate. This is done by re-exporting it
   as `DigestUpdate`.


#3054
2024-02-20 11:29:05 -05:00
jyn
d66fcf2ca0 compile integration tests as a single binary
this greatly speeds up the time to run all tests, at the cost of slightly larger recompile times for individual tests.

this unfortunately adds the requirement that all tests are listed in `runner.rs` for the crate.
to avoid forgetting, i've added a new test that ensures the directory is in sync with the file.

 ## benchmarks

before this change, recompiling all tests took 32-50 seconds and running a single test took 3.5 seconds:

```
; hyperfine 'touch lib/src/lib.rs && cargo t --test test_working_copy'
  Time (mean ± σ):      3.543 s ±  0.168 s    [User: 2.597 s, System: 1.262 s]
  Range (min … max):    3.400 s …  3.847 s    10 runs
```

after this change, recompiling all tests take 4 seconds:
```
;  hyperfine 'touch lib/src/lib.rs ; cargo t --test runner --no-run'
  Time (mean ± σ):      4.055 s ±  0.123 s    [User: 3.591 s, System: 1.593 s]
  Range (min … max):    3.804 s …  4.159 s    10 runs
```
and running a single test takes about the same:
```
; hyperfine 'touch lib/src/lib.rs && cargo t --test runner -- test_working_copy'
  Time (mean ± σ):      4.129 s ±  0.120 s    [User: 3.636 s, System: 1.593 s]
  Range (min … max):    3.933 s …  4.346 s    10 runs
```

about 1.4 seconds of that is the time for the runner, of which .4 is the time for the linker. so
there may be room for further improving the times.
2024-02-06 18:19:41 -08:00
Jonathan Tan
0bc1341fd0 revset: add count_estimate() to Revset trait
The count() function in this trait is used by "jj branch" to determine
(and then report) how many commits a certain branch is ahead/behind
another branch. This is currently implemented by walking all commits
in the revset, counting how many were encountered. But this could be
improved: if the number is large, it is probably sufficient to report
"at least N" (instead of walking all the way), and this does not scale
well to jj backends that may not have all commits present locally (which
may prefer to return an estimate, rather than access the network).

Therefore, add a function that is explicitly documented to be O(1)
and that can return a range of values if the backend so chooses.

Also remove count(), as it is not immediately obvious that it is an
expensive call, and callers that are willing to pay the cost can obtain
the exact same functionality through iter().count() anyway. (In this
commit, all users of count() are migrated to iter().count() to preserve
all existing functionality; they will be migrated to count_estimate() in
a subsequent commit.)

"branch" needed to be updated due to this change. Although jj
is currently only available in English, I have attempted to keep
user-visible text from being assembled piece by piece, so that if we
later decide to translate jj into other languages, things will be easier
for translators.
2024-01-22 15:07:00 -08:00
Yuya Nishihara
7a44e590dc lock: remove byteorder dependency from tests, use fs helper functions
This is the last use of Read/WriteBytesExt. The byteorder crate is great, but
we don't need an abstraction of endianness. Let's simply use the std functions.
2023-12-23 00:14:17 +09:00
Yuya Nishihara
b5b01f4dd7 cargo: add ref-cast dependency
It helps to implement transparent conversion from &str to &Wrapped(str). We
could instead wrap the reference as Wrapped<'a>(&'a str), but it has various
drawbacks. Notably we can't implement Borrow and Deref because these traits
require a reference in return position.

Since the unsafe bits are pretty small, we can instead implement cast functions
without using the ref-cast crate. However, I believe we'll trust ref-cast more
than hand-crafted unsafe code.

https://crates.io/crates/ref-cast
https://docs.rs/ref-cast/1.0.20/ref_cast/attr.ref_cast_custom.html
2023-11-26 18:21:40 +09:00
Martin von Zweigbergk
9be24db051 tree: make TreeEntriesDirItem not self-referential
This removes the last use of `ouroboros`. Since `TreeEntriesDirItem`
is only used in "legacy trees" (before tree-level conflicts), I didn't
bother to check the performance impact. I also didn't bother to check
the matcher before adding the entries to the list, instead leaving
that where it is in `Iterator::next()`.
2023-11-17 03:50:34 -08:00
Waleed Khan
a60733f632 tree: remove unsafe with ouroboros for self-referential iterators 2023-11-09 21:50:29 -08:00
Yuya Nishihara
59640496aa cargo: sort dependencies list alphabetically 2023-11-07 23:46:05 +09:00
Martin von Zweigbergk
24b706641f async: switch to pollster's block_on()
During the transition to using more async code, I keep running into
https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want
to convert `MergedTree::diff()` into a `Stream`. I don't want to
update all call sites at once, so instead I'm adding a
`MergedTree::diff_stream()` method, which just wraps
`MergedTree::diff()` in a `Stream. However, since the iterator is
synchronous, it needs to block on the async `Backend::read_tree()`
calls. If we then also block on the `Stream` in the CLI, we run into
the panic.
2023-11-03 08:15:10 -07:00
Yuya Nishihara
b48569b104 cargo: add gitoxide (or gix) dependency
I've enabled the "index" component from the "basic" feature set, which would
be needed to implement colocated repo functionality. The doc suggests that
a library shouldn't activate "max-performance-safe", but our crate is also
an application so it would be okay to enable the feature. We'll need "parallel"
anyway to make GitBackend Sync.

https://docs.rs/gix/latest/gix/#feature-flags
2023-11-02 19:33:06 +09:00
Yuya Nishihara
cfcc76571c revset: add support for glob:pattern 2023-10-21 09:55:01 +09:00
Martin von Zweigbergk
5174489959 backend: make read functions async
The commit backend at Google is cloud-based (and so are the other
backends); it reads and writes commits from/to a server, which stores
them in a database. That makes latency much higher than for disk-based
backends. To reduce the latency, we have a local daemon process that
caches and prefetches objects. There are still many cases where
latency is high, such as when diffing two uncached commits. We can
improve that by changing some of our (jj's) algorithms to read many
objects concurrently from the backend. In the case of tree-diffing, we
can fetch one level (depth) of the tree at a time. There are several
ways of doing that:

 * Make the backend methods `async`
 * Use many threads for reading from the backend
 * Add backend methods for batch reading

I don't think we typically need CPU parallelism, so it's wasteful to
have hundreds of threads running in order to fetch hundreds of objects
in parallel (especially when using a synchronous backend like the Git
backend). Batching would work well for the tree-diffing case, but it's
not as composable as `async`. For example, if we wanted to fetch some
commits at the same time as we were doing a diff, it's hard to see how
to do that with batching. Using async seems like our best bet.

I didn't make the backend interface's write functions async because
writes are already async with the daemon we have at Google. That
daemon will hash the object and immediately return, and then send the
object to the server in the background. I think any cloud-based
solution will need a similar daemon process. However, we may need to
reconsider this if/when jj gets used on a server with a custom backend
that writes directly to a database (i.e. no async daemon in between).

I've tried to measure the performance impact. That's the largest
difference I've been able to measure was on `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0` in the Linux repo,
which increases from 749 ms to 773 ms (3.3%). In most cases I've
tested, there's no measurable difference. I've tried diffing from the
root commit, as well as `jj --ignore-working-copy log --no-graph -r
'::v3.0 & author(torvalds)' -T 'commit_id ++ "\n"'` (to test a
commit-heavy load).
2023-10-08 23:36:49 -07:00