Commit graph

3638 commits

Author SHA1 Message Date
Ian Wrzesinski
2768441d80 working_copy: Factor out new_state function
This constructor is just large enough to be worth extracting.
2025-08-18 10:52:12 +00:00
Ivan Petkov
1df3c5d3b7 lib/git: support temporarily fetching all tags as part of a fetch
By default `git clone` will fetch all tags on a remote (unless
`--no-tags` is specified). Eventually we'll want to support the same
behavior, though since we first configure a remote and then fetch, we'll
need to configure the remote with the correct *general* tag fetching
behavior, but still perform the first fetch with all tags included.
2025-08-17 15:48:36 +00:00
Remo Senekowitsch
36ee36ce78 op: undo: rename to revert
This paves the way for the semantics of `jj undo` and `jj op revert` to
evolve independently. `jj op revert` is going to stay the low-level
command to apply the inverse of any operation. The new name is
consistent with `jj revert`, which applies the inverse of a commit.

`jj undo` on the other hand is planned to become a higher-level command,
which is more similar to, say, Ctrl+Z in typical GUI applications.
Running `jj undo` repeatedly will revert progressively older operations,
allowing the user to walk backwards in time. At the same time, `jj undo`
will lose the ability to revert arbitrary operations, to keep its
semantics simple and intuitive.

Related feature request "jj undo ergonomics":
https://github.com/jj-vcs/jj/issues/3700
2025-08-15 21:31:15 +00:00
Yuya Nishihara
2391764ab7 revset: integrate change-path index
The current implementation does linear search, which I think is good assuming
the size of the changed paths set is usually small. The next costly part of "jj
log PATH" is commit.empty() template on merge commits (#5411). We'll need a
public API to query changed-path index to optimize it.
2025-08-15 11:46:49 +00:00
Yuya Nishihara
8999e08d2e index: add function to build changed-path index at certain operation
2025-08-15 11:46:49 +00:00
Gaëtan Lehmann
db0e43608d lib: rename mut_repo() to repo_mut() in the rewriter
to be consistent with repo_mut() in the transaction.
2025-08-15 09:15:35 +00:00
Martin von Zweigbergk
05a74963cd protos: rename op_store to simple_op_store
Similar to the previous commit, `op_store` is specific to
`SimpleOpStore`, so its name should match that.
2025-08-14 14:15:17 +00:00
Martin von Zweigbergk
e13a8649f3 protos: rename working_copy to local_working_copy
The `working_copy` proto is specific to `LocalWorkingCopy`, so I think
it should match that name.
2025-08-14 14:15:17 +00:00
Yuya Nishihara
ad14c7bcc0 index: concatenate changed-path index on merge_in()
The logic is different from MutableCommitIndexSegment::merge_in() because we
need to deduplicate changed-path entries based on the commit entries.
2025-08-14 08:44:55 +00:00
Yuya Nishihara
a2155dcc95 index: squash changed-path segments before saving
The logic is the same as MutableCommitIndexSegment.
2025-08-14 08:44:55 +00:00
Yuya Nishihara
01e2a85555 commit_builder: make .generate_new_change_id() not imply commit is duplicated
This will allow us to "touch" change id without duplicating commits.

The caller should also do .generate_new_change_id() to not make commits
divergent. This function could do that automatically, but I'm not sure if that's
good. Alternatively, we can add mut_repo.duplicate_commit(predecessor), but
we'll need to refactor CommitRewriter.
2025-08-14 08:44:32 +00:00
Ian Wrzesinski
1255eb6143 working_copy: Remove expect(dead_code) by adding underscore to field name
2025-08-14 05:29:12 +00:00
Martin von Zweigbergk
823f4fccb4 PathError: rename error field to source
`source` seems to be the more idiomatic name for the error source.
2025-08-13 17:45:52 +00:00
Kaiyi Li
f8eeed2b67 eol conversion: treat files with lone CRs as binary
When checking whether a file is binary, we check whether it contains a CR
character that isn't followed by an LF character. If so, we treat the file
as binary and skip EOL conversion for it unconditionally.

This commit is related to #7010.
2025-08-12 18:00:10 +00:00
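The lone-CR check described for f8eeed2b67 can be sketched roughly as follows. This is a minimal illustration rather than the code from the commit: the function name `has_lone_cr` is hypothetical, and treating a CR at the very end of the buffer as "lone" is an assumption made here for simplicity.

```rust
/// Returns true if `data` contains a CR byte that is not immediately
/// followed by an LF byte. (Hypothetical helper, not jj's actual API.)
fn has_lone_cr(data: &[u8]) -> bool {
    data.iter()
        .enumerate()
        // A CR is "lone" when the next byte is missing or is not LF.
        // Assumption: a trailing CR at the end of the buffer counts as lone.
        .any(|(i, &byte)| byte == b'\r' && data.get(i + 1) != Some(&b'\n'))
}
```

Under this sketch, a caller would treat the file as binary and leave its line endings alone whenever `has_lone_cr` returns true.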
Ilya Grigoriev
9aeed9a39d clippy: run autofix to use Self more often
For some reason, running `cargo update` makes clippy notice these can
be replaced with `Self`. I put this commit first, though, since it is
helpful regardless of the update.

This is the result of the following command after the `cargo update`
from the next commit (even though the refactor is valid regardless of
the `cargo update`).

```
cargo +nightly clippy  --workspace --all-targets --all-features --fix
```
2025-08-12 09:47:39 +00:00
Yuya Nishihara
e31069cd3c index: index changed paths by add_commit()
Tests of the indexed contents will be added with the revset engine integration.
We don't have a public interface to get the indexed changed paths right now.
2025-08-11 11:50:39 +00:00
Yuya Nishihara
6bb26d8e73 index: include changed-path index in stats()
2025-08-11 11:50:39 +00:00
Yuya Nishihara
132d74a234 index: read/write changed-path index segments
2025-08-11 11:50:39 +00:00
Yuya Nishihara
3902d88b3e index: migrate operation link file to protobuf
I'm going to add a changed-path index, and the operation link file will store a
list of segment files and a starting commit position. Assuming the link file is
small, we don't need our own serialization format.

This patch adds a new directory for proto-based operation link files. We could
reuse the existing directory, but that would make debugging a bit harder.
2025-08-11 11:50:39 +00:00
Yuya Nishihara
3f0126191d index: don't use persist_content_addressed_temp_file() to write operation link
A link file isn't addressed by its content but by the associated operation id.
There's no guarantee that the stored content never changes.
2025-08-11 11:50:39 +00:00
Ilya Grigoriev
3d91d6e21b clippy: auto-fixes of clippy::implicit-clone with latest nightly
2025-08-09 03:44:26 +00:00
Ivan Petkov
bf974d6911 git: support specifying tag fetch behavior when adding a remote
2025-08-08 15:44:22 +00:00
Ivan Petkov
c3574f99f1 lib/git: treat tagOpt as a standard option
In a future commit we'll add support for controlling a git remote's tag
fetch behavior, which may result in a `tagOpt = --[no-]tags` entry in
the remote configuration.

Doing this change here means we will avoid incorrectly flagging remote
configurations as unsupported.
2025-08-08 15:44:22 +00:00
Austin Seipp
afb1c1446d git: add git.colocate to colocate repos by default
Most users colocate all of their repositories or none of them. A config
option is more convenient in that situation.

There are also plans to make colocated repos the default. This change
paves the way to flip the default easily.

Closes #2507.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
2025-08-08 07:08:46 +00:00
Yuya Nishihara
36134a5f96 index: ensure that bit set panics on integer underflow
We'd get an out-of-bounds read error later anyway, but that panic message would
be a bit difficult to reason about.
2025-08-07 14:46:08 +00:00
Yuya Nishihara
331bd4e859 revset_graph: fix out-of-bounds bit set lookup in remove_transitive_edges()
The problem was spotted by Martin. Since we've made remove_transitive_edges()
omit "missing" edges from the set of nodes to visit at ad7c42e04b
"revset_graph: ignore missing edges thoroughly in remove_transitive_edges()", we
should also skip them in the input set.

This patch updates all test cases to run at bit-set boundary to detect other
potential issues.
2025-08-07 14:46:08 +00:00
Yuya Nishihara
1edcd79af7 index: add data structure for changed-path index
This patch implements data format, serialization, and deserialization. Actual
indexing functions and disk I/O will be added later.

The data format is simple. It's basically a sorted table of paths + pointers to
the table entries per commit. Git employs Bloom filter for this purpose, but I
don't think we need a probabilistic data structure. The size of the serialized
index segments isn't big compared to the commit index segments, and the lookup
performance seems good. It's also important that we don't need to merge parent
trees when a path matches the indexed changed paths.

With git repo (77410 commits):
- changed-path segment file size: ~1.1MB
- jj log --ignore-working-copy README.md: ~0.2sec wall

Indexing takes minutes. That's not surprising because we have to merge parent
trees to get diffs.

#4674
2025-08-07 02:04:23 +00:00
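The "sorted table of paths + pointers to the table entries per commit" layout described for 1edcd79af7 can be pictured with the in-memory sketch below. It only illustrates the idea and is not the serialized format from the commit: the type `ChangedPathIndex`, its field names, and the use of `String` paths are all assumptions.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory model of a changed-path index.
struct ChangedPathIndex {
    /// Sorted, deduplicated table of every path changed by any commit.
    paths: Vec<String>,
    /// Commit `i` owns `entries[commit_offsets[i]..commit_offsets[i + 1]]`.
    commit_offsets: Vec<usize>,
    /// Pointers (indices) into `paths`.
    entries: Vec<u32>,
}

impl ChangedPathIndex {
    /// Builds the index from the list of changed paths of each commit.
    fn build(commits: &[Vec<&str>]) -> Self {
        let sorted: BTreeSet<&str> = commits.iter().flatten().copied().collect();
        let paths: Vec<String> = sorted.into_iter().map(str::to_owned).collect();
        let mut commit_offsets = vec![0];
        let mut entries = Vec::new();
        for changed in commits {
            for &path in changed {
                // Every path was inserted into the table above, so this succeeds.
                let pos = paths
                    .binary_search_by(|p| p.as_str().cmp(path))
                    .expect("path is in the table");
                entries.push(pos as u32);
            }
            commit_offsets.push(entries.len());
        }
        Self { paths, commit_offsets, entries }
    }

    /// Exact (non-probabilistic) test of whether `commit` changed `path`.
    fn commit_touches(&self, commit: usize, path: &str) -> bool {
        let pos = match self.paths.binary_search_by(|p| p.as_str().cmp(path)) {
            Ok(pos) => pos,
            Err(_) => return false, // path was never changed by any commit
        };
        let range = self.commit_offsets[commit]..self.commit_offsets[commit + 1];
        // Linear scan over this commit's (usually small) set of entries.
        self.entries[range].iter().any(|&entry| entry as usize == pos)
    }
}
```

Unlike a Bloom filter, a lookup in a table like this never yields a false positive, which fits the commit's remark that parent trees need not be merged when a path matches.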
Nigthknight
ad5a4b3172 copy-tracking: fix path tracking for nested directories with equal names
The problem was that the calculation of the suffix was overlapping into
the prefix for nested directories with the same name. We skipped the
prefix to avoid this issue.

Issue: #6853

Co-authored-by: Tobias Markus <tobias@miglix.eu>
2025-08-06 06:14:57 +00:00
Martin von Zweigbergk
3a7ce87f44 rewrite: make duplicate_commits() async
2025-08-06 03:12:05 +00:00
Martin von Zweigbergk
073a1dea74 rewrite: make CommitRewriter::rebase() async
2025-08-06 03:12:05 +00:00
Martin von Zweigbergk
82d7182ef7 repo: take async callback to transform_descendants()
Maybe we can make `transform_descendants()` transform siblings
concurrently later.
2025-08-06 03:12:05 +00:00
Martin von Zweigbergk
6a7c0fb5ea merged_tree: respect Backend::concurrency() in merge_trees()
2025-08-05 14:29:57 +00:00
Martin von Zweigbergk
b14fad00be merged_tree: rewrite merge_trees() to be non-recursive and concurrent
We ran into a stack overflow in `merge_trees()` at Google due to
limited stack space and large stack frames caused by async code. This
patch fixes that by making `merge_trees()` non-recursive. Since I was
rewriting the algorithm anyway, I also made it concurrent, addressing
a TODO.
2025-08-05 14:29:57 +00:00
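The general technique behind b14fad00be, replacing recursion over a tree with an explicit work list so that depth no longer consumes machine stack, can be sketched as below. This is a generic illustration under stated assumptions: `Node` and `total_size` are hypothetical, and the sketch performs no merging and no concurrency.

```rust
/// Hypothetical tree: a file with a size, or a directory of children.
enum Node {
    File(u64),
    Dir(Vec<Node>),
}

/// Sums all file sizes without recursion. Pending directories live in a
/// heap-allocated `Vec`, so deep nesting cannot overflow the call stack.
fn total_size(root: &Node) -> u64 {
    let mut pending = vec![root];
    let mut total = 0;
    while let Some(node) = pending.pop() {
        match node {
            Node::File(size) => total += size,
            // Instead of recursing, queue the children for later processing.
            Node::Dir(children) => pending.extend(children),
        }
    }
    total
}
```

A work list like this is also a natural starting point for concurrency, since several pending items could be processed at once, although that part is not shown here.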
Martin von Zweigbergk
ca839f9d50 merged_tree: separate out trivially resolved entries when resolving
The goal of this change is to have the cheap, non-async code separated
from the async code.
2025-08-05 14:29:57 +00:00
Martin von Zweigbergk
2e249babe0 merged_tree: move construction of backend trees onto new type
Once we make the tree-merging concurrent, we are going to keep one
instance of this type per unfinished directory merge.
2025-08-05 14:29:57 +00:00
Martin von Zweigbergk
97ee86d513 merged_tree: build conflicted resolved trees from sorted conflicts
This addresses the TODO about creating the full tree entries by
merging the sorted list of conflicts with (one side of) the sorted
list of conflicts.
2025-08-05 14:29:57 +00:00
Yuya Nishihara
1447791a74 revset_graph: use Rc to propagate edges without cloning
This reduces memory usage.

In small repo:
- jj-1: with original remove_transitive_edges()
- jj-4: previous patch
- jj-5: this patch
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-1,jj-2,jj-3,jj-4,jj-5 \
  'target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy log -r "tags()"'
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
  Time (mean ± σ):      1.511 s ±  0.091 s    [User: 1.296 s, System: 0.214 s]
  Range (min … max):    1.432 s …  1.674 s    10 runs

Benchmark 4: target/release-with-debug/jj-4 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
  Time (mean ± σ):      1.142 s ±  0.055 s    [User: 1.043 s, System: 0.099 s]
  Range (min … max):    1.106 s …  1.287 s    10 runs

Benchmark 5: target/release-with-debug/jj-5 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
  Time (mean ± σ):      1.201 s ±  0.082 s    [User: 1.101 s, System: 0.100 s]
  Range (min … max):    1.095 s …  1.299 s    10 runs

Relative speed comparison
        1.32 ±  0.10  target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
        1.00          target/release-with-debug/jj-4 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
        1.05 ±  0.09  target/release-with-debug/jj-5 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
```

In mid-size repo:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-1,jj-2,jj-3,jj-4,jj-5 \
  'target/release-with-debug/{bin} -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"'
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
  Time (mean ± σ):     937.7 ms ±  68.8 ms    [User: 672.1 ms, System: 265.4 ms]
  Range (min … max):   868.3 ms … 1025.4 ms    10 runs

Benchmark 4: target/release-with-debug/jj-4 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
  Time (mean ± σ):     979.9 ms ±  65.4 ms    [User: 802.6 ms, System: 177.2 ms]
  Range (min … max):   905.6 ms … 1055.0 ms    10 runs

Benchmark 5: target/release-with-debug/jj-5 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
  Time (mean ± σ):     879.4 ms ±  62.1 ms    [User: 754.0 ms, System: 125.4 ms]
  Range (min … max):   823.1 ms … 960.4 ms    10 runs

Relative speed comparison
        1.07 ±  0.11  target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
        1.11 ±  0.11  target/release-with-debug/jj-4 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
        1.00          target/release-with-debug/jj-5 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
```
2025-08-05 01:31:32 +00:00
Yuya Nishihara
d8166b46f0 revset_graph: remove transitive edges of both internal and external commits
With the changed-path index, I noticed that "jj log PATH" in the Linux repo
freezes because it runs out of memory. The problem can be mitigated by sharing
edge values via Rc, but the computation is still slow. It seems better to keep
the set of edges to propagate small.

In small repo:
- jj-1: with original remove_transitive_edges()
- jj-4: this patch
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-1,jj-2,jj-3,jj-4,jj-5 \
  'target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy log -r "tags()"'
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
  Time (mean ± σ):      1.511 s ±  0.091 s    [User: 1.296 s, System: 0.214 s]
  Range (min … max):    1.432 s …  1.674 s    10 runs

Benchmark 4: target/release-with-debug/jj-4 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
  Time (mean ± σ):      1.142 s ±  0.055 s    [User: 1.043 s, System: 0.099 s]
  Range (min … max):    1.106 s …  1.287 s    10 runs

Relative speed comparison
        1.32 ±  0.10  target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
        1.00          target/release-with-debug/jj-4 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
```

In mid-size repo:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-1,jj-2,jj-3,jj-4,jj-5 \
  'target/release-with-debug/{bin} -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"'
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
  Time (mean ± σ):     937.7 ms ±  68.8 ms    [User: 672.1 ms, System: 265.4 ms]
  Range (min … max):   868.3 ms … 1025.4 ms    10 runs

Benchmark 4: target/release-with-debug/jj-4 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
  Time (mean ± σ):     979.9 ms ±  65.4 ms    [User: 802.6 ms, System: 177.2 ms]
  Range (min … max):   905.6 ms … 1055.0 ms    10 runs

Relative speed comparison
        1.07 ±  0.11  target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
        1.11 ±  0.11  target/release-with-debug/jj-4 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
```
2025-08-05 01:31:32 +00:00
Yuya Nishihara
a523657525 revset_graph: add fast path for linear ranges
This patch itself isn't a significant win, but will help us optimize both CPU and
memory use.

In small repo:
- jj-2: previous patch
- jj-3: this patch
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-1,jj-2,jj-3,jj-4,jj-5 \
  'target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy log -r "tags()"'
Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
  Time (mean ± σ):      2.467 s ±  0.142 s    [User: 2.271 s, System: 0.196 s]
  Range (min … max):    2.246 s …  2.588 s    10 runs

Benchmark 3: target/release-with-debug/jj-3 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
  Time (mean ± σ):      2.254 s ±  0.169 s    [User: 2.070 s, System: 0.184 s]
  Range (min … max):    2.055 s …  2.395 s    10 runs

Relative speed comparison
        2.16 ±  0.16  target/release-with-debug/jj-2 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
        1.97 ±  0.18  target/release-with-debug/jj-3 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
```

In mid-size repo:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-1,jj-2,jj-3,jj-4,jj-5 \
  'target/release-with-debug/{bin} -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"'
Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
  Time (mean ± σ):      1.185 s ±  0.066 s    [User: 0.920 s, System: 0.265 s]
  Range (min … max):    1.132 s …  1.310 s    10 runs

Benchmark 3: target/release-with-debug/jj-3 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
  Time (mean ± σ):     909.7 ms ±  46.7 ms    [User: 683.6 ms, System: 226.1 ms]
  Range (min … max):   868.3 ms … 997.3 ms    10 runs

Relative speed comparison
        1.35 ±  0.12  target/release-with-debug/jj-2 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
        1.03 ±  0.09  target/release-with-debug/jj-3 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
```
2025-08-05 01:31:32 +00:00
Yuya Nishihara
b9cbfdedb8 revset_graph: make remove_transitive_edges() not recurse into edges_from*()
I'll make RevsetGraphWalk eliminate transitive edges from intermediate edges to
mitigate out-of-memory issue. This means that edges_from*() functions will call
remove_transitive_edges(). Maybe we could rewrite the whole process to not use
machine stack, but doing that would be complicated. The process wouldn't be
trivially-deterministic on where the look-ahead can terminate either.

The performance problem introduced by this patch will be fixed by later patches.

In small repo:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-1,jj-2,jj-3,jj-4,jj-5 \
  'target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy log -r "tags()"'
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
  Time (mean ± σ):      1.511 s ±  0.091 s    [User: 1.296 s, System: 0.214 s]
  Range (min … max):    1.432 s …  1.674 s    10 runs

Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
  Time (mean ± σ):      2.467 s ±  0.142 s    [User: 2.271 s, System: 0.196 s]
  Range (min … max):    2.246 s …  2.588 s    10 runs

Relative speed comparison
        1.32 ±  0.10  target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
        2.16 ±  0.16  target/release-with-debug/jj-2 -R ~/mirrors/git --ignore-working-copy log -r "tags()"
```

In mid-size repo:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-1,jj-2,jj-3,jj-4,jj-5 \
  'target/release-with-debug/{bin} -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"'
Benchmark 1: target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
  Time (mean ± σ):     937.7 ms ±  68.8 ms    [User: 672.1 ms, System: 265.4 ms]
  Range (min … max):   868.3 ms … 1025.4 ms    10 runs

Benchmark 2: target/release-with-debug/jj-2 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
  Time (mean ± σ):      1.185 s ±  0.066 s    [User: 0.920 s, System: 0.265 s]
  Range (min … max):    1.132 s …  1.310 s    10 runs

Relative speed comparison
        1.07 ±  0.11  target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
        1.35 ±  0.12  target/release-with-debug/jj-2 -R ~/mirrors/linux --ignore-working-copy log -r "tags(v5)"
```
2025-08-05 01:31:32 +00:00
Yuya Nishihara
b2d109634c revset_graph: make remove_transitive_edges() not visit uninteresting ancestors
This is similar to "entry.generation_number() < min_generation", but is cheaper
to test.
2025-08-04 11:20:49 +00:00
Yuya Nishihara
f386e9fa29 revset_graph: omit missing edges early in remove_transitive_edges()
We aren't interested in edge types here.
2025-08-04 11:20:49 +00:00
Yuya Nishihara
ad7c42e04b revset_graph: ignore missing edges thoroughly in remove_transitive_edges()
2025-08-04 11:20:49 +00:00
Yuya Nishihara
13a2bbe1b8 graph: add GraphEdge::is_<edge_type>() for convenience
2025-08-04 11:20:49 +00:00
Theo Buehler
525e795889 git: avoid division by zero in overall progress
When no progress has been made, indicate that as 0.0 progress rather
than dividing by zero. Avoids in particular a display issue where the
progress bar indicates NaN% progress when fetching from slow remotes.

Fixes #7155

Change-Id: I6a6a6964d6572d1c98b5fa25285d26c07ee27a40
2025-08-03 13:48:12 +00:00
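The guard described for 525e795889 amounts to special-casing a zero denominator. A minimal sketch, assuming a hypothetical function `overall_progress` with `done` and `total` counters (these names are not taken from the commit):

```rust
/// Returns the completed fraction in the range 0.0..=1.0.
/// Hypothetical helper: reports 0.0 instead of dividing by zero when
/// nothing is known yet, since 0.0 / 0.0 would otherwise produce NaN.
fn overall_progress(done: u64, total: u64) -> f32 {
    if total == 0 {
        0.0
    } else {
        done as f32 / total as f32
    }
}
```

Without the guard, the floating-point division `0.0 / 0.0` evaluates to NaN, which is what a progress bar would then render as "NaN%".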
Scott Taylor
c5b0b0b68f revset: insert HeadsRange in Ancestors for nested range/filter
This allows evaluating ancestors/ranges involving filters significantly
faster. For instance, naively evaluating `::mine()` requires reading
every commit in the repo to check `mine()` on each commit, then finding
the ancestors of that set. This optimization rewrites `::mine()` to
`::heads(mine())`, since `heads(mine())` can be evaluated more
efficiently by only reading commits until the first successful match.

If someone is unaware of how revsets are implemented, this case can come
up pretty easily, such as by including `~mine()` in `immutable_heads()`
to make other people's commits immutable. In my local `jj` repo, this
optimization reduces the runtime of `jj log` with `~mine()` in
`immutable_heads()` from about 800ms down to about 50ms.

Benchmark results on Git repo:

```
::(v1.0.0..v2.40.0)     55.5% faster
author(peff)..          96.8% faster
```
2025-08-02 01:40:23 +00:00
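The reason `::heads(mine())` is cheaper than `::mine()`, as described for c5b0b0b68f, can be illustrated with a toy graph walk. This is a sketch of the idea only, not the revset engine: the function `ancestors_of_filter` and the representation of commits as indices whose parents always have smaller indices are assumptions.

```rust
/// Computes the ancestors (inclusive) of all commits matching `matches`.
/// `parents[i]` lists the parents of commit `i`; parents always have
/// smaller indices, so walking from the end visits children first.
/// Returns the membership flags and the number of filter evaluations.
fn ancestors_of_filter(
    parents: &[Vec<usize>],
    matches: impl Fn(usize) -> bool,
) -> (Vec<bool>, usize) {
    let mut included = vec![false; parents.len()];
    let mut filter_calls = 0;
    for commit in (0..parents.len()).rev() {
        if !included[commit] {
            // Only commits not already known to be ancestors of a match
            // need the (potentially expensive) filter evaluated.
            filter_calls += 1;
            if !matches(commit) {
                continue;
            }
            included[commit] = true;
        }
        // Everything reachable from an included commit is included too.
        for &parent in &parents[commit] {
            included[parent] = true;
        }
    }
    (included, filter_calls)
}
```

On a linear chain of five commits where only the fourth matches, this walk evaluates the filter twice, whereas a naive evaluation would test all five commits before taking ancestors.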
Scott Taylor
a60b2c6025 revset: extract to_heads_range() function
2025-08-02 01:40:23 +00:00
Scott Taylor
f9476426d5 revset: fold ancestor unions before filters
Folding ancestor unions creates new unions, and the filter pass lifts
`AsFilter` through unions, so this order allows better optimizations.
2025-08-02 01:40:23 +00:00
Yuya Nishihara
756078803a index: add segment type field to ReadonlyIndexLoadError
ReadonlyIndexLoadError could be a wrapper of PathError instead, but that isn't
compatible with the abstraction of the current load functions. The inner load
functions may be called with an in-memory buffer.
2025-08-02 01:27:55 +00:00
Yuya Nishihara
dbb660acf0 index: attach file path to DefaultIndexStoreError
2025-08-02 01:27:55 +00:00