Commit graph

4043 commits

Author SHA1 Message Date
Martin von Zweigbergk
721daef0b4 store: inline tree_builder() function to callers
`Store::tree_builder()` returns a `TreeBuilder`. Almost all callers
should be using the `MergedTreeBuilder` these days. This patch
therefore removes `tree_builder()` to reduce the risk of accidentally
using it.
2025-07-18 21:36:13 +00:00
Kaiyi Li
f1f1556731 local working copy: add support for EOL conversion 2025-07-17 15:36:28 +00:00
Kaiyi Li
74fb5a6096 working copy: pass UserSettings to WorkingCopyFactory
... so that later `TreeState` can query the EOL settings on
construction.
2025-07-17 15:36:28 +00:00
Martin von Zweigbergk
e982db8fd0 test_merged_tree: clear store caches before calling diff_stream()
This catches the bug introduced in 1b1edc7a90 (fixed in the patch just
before this one).
2025-07-14 16:09:41 +00:00
Martin von Zweigbergk
25723d6956 merged_tree: finish polling tree before emitting path
In 1b1edc7a90, I missed the importance of this comment:

```
// Whenever we add an entry to `self.pending_trees`, we also add an Ok() entry
// to `self.items`.
```

The `self.items` entry was there to make sure that we wait for the
pending tree to be polled to completion, thus resulting in its entries
getting added to `self.items`. After my commit, we no longer always
add an entry to `items`, which meant that we can end up emitting
entries from a parent tree before entries in a child tree, such as
`foo/baz` before `foo/bar/qux` even though `baz` comes after `bar`.
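
As a rough illustration of the expected emission order (not the actual
`TreeDiffStream` code), the sketch below sorts paths component by
component. The helper name `sorted_by_components` and the use of plain
`/`-separated strings are assumptions made for this example only.

```rust
/// Sorts `/`-separated paths by comparing their components one at a time.
/// With this order, every entry under `foo/bar/` comes before `foo/baz`,
/// because `bar` sorts before `baz`.
fn sorted_by_components(mut paths: Vec<&str>) -> Vec<&str> {
    // `Iterator::cmp` compares the two component sequences lexicographically.
    paths.sort_by(|a, b| a.split('/').cmp(b.split('/')));
    paths
}
```

Calling `sorted_by_components(vec!["foo/baz", "foo/bar/qux"])` puts
`foo/bar/qux` first, which is the order the stream is supposed to emit.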

This patch fixes the bug by instead checking in `self.pending_trees`
that there are no directories that we need to emit first. Thanks to
@yuja for the suggestion to do it this way instead.

The next patch will update the tests to catch regressions.
2025-07-14 16:09:41 +00:00
Martin von Zweigbergk
5d35eadd4e tests: make TestBackend async for more realistic testing
The `TestBackend` methods currently return their data immediately (on
the first poll), which means that if multiple futures are created and
then they're polled "concurrently", they will always return their data
in the order they're being polled. That leads to poor testing of
algorithms that poll futures concurrently, such as `TreeDiffStream`.

This patch makes `TestBackend` spawn async work to run in a tokio
runtime instead. That's enough to show a bug I introduced with my
recent refactoring of `TreeDiffStream`, except that it's also covered
up by the caching we do in `Store`. I'll fix the bug and update tests
to work around the caching next.

This slows down the jj-lib tests from 2.8 s to 3.1 s. I don't think
that matters much, given that the jj-cli tests take > 30 s.

I tried to add a small `tokio::time::sleep()` (random up to 5 ms) but
that slowed down the property-based tests of the diff editor very
significantly (took over a minute). Maybe we could have two different
kinds of test backend or maybe make the sleep configurable in some
way. We can improve that later. The async-ness added in this patch is
sufficient for catching the diff-stream bug.
2025-07-14 16:09:41 +00:00
Martin von Zweigbergk
1a89ac8d53 merged_tree: poll trees in the order we're going to emit them
It should generally be better to prioritize polling trees in the
order we're going to emit their entries. For example, if we have
pending trees `zzz/` and `dir/aaa/`, it's better to poll the latter
even though we inserted the former first.
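
A minimal sketch of the idea, assuming the pending trees are kept in an
ordered map keyed by directory path. `DirPath`, `next_dir_to_poll()`, and
`example_pending()` are hypothetical names for this illustration, and the
string values stand in for the real pending futures.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a directory path: its components in order.
type DirPath = Vec<&'static str>;

/// Picks the pending directory whose entries would be emitted first.
/// A `BTreeMap` iterates in key order, regardless of insertion order.
fn next_dir_to_poll<T>(pending: &BTreeMap<DirPath, T>) -> Option<&DirPath> {
    pending.keys().next()
}

/// Builds the situation from the message: `zzz/` is inserted before `dir/aaa/`.
fn example_pending() -> BTreeMap<DirPath, &'static str> {
    let mut pending = BTreeMap::new();
    pending.insert(vec!["zzz"], "pending read of zzz/");
    pending.insert(vec!["dir", "aaa"], "pending read of dir/aaa/");
    pending
}
```

With this map, `next_dir_to_poll(&example_pending())` selects `dir/aaa`
even though `zzz` was inserted first.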

This also prepares for fixing a bug related to the order we emit. We
will then want to look up in `pending_trees` by key found in `items`.
2025-07-14 16:09:41 +00:00
Martin von Zweigbergk
ed6fa71835 merged_tree: avoid destructuring only to construct the same values 2025-07-14 16:09:41 +00:00
Martin von Zweigbergk
1fa3010807 merged_tree: remove an unnecessary async block 2025-07-14 16:09:41 +00:00
Yuya Nishihara
dfddf6c431 tests: use forwarding DefaultMutableIndex::num_commits() function in tests 2025-07-12 00:04:02 +00:00
Yuya Nishihara
794d4bade9 index: remove redundant named lifetime parameter from Index::evaluate_revset() 2025-07-12 00:04:02 +00:00
Yuya Nishihara
468c18a850 object_id: make from_hex() not require utf-8 string
Also removed redundant .as_ref() from ChangeId::try_from_reverse_hex().
2025-07-12 00:03:52 +00:00
Yuya Nishihara
642bedd67f object_id: fix name resolution of derive(ContentHash) in macro expansion 2025-07-12 00:03:52 +00:00
Yuya Nishihara
ea3c1791a6 revset: soft-deprecate "all:" modifier syntax 2025-07-11 17:15:26 +00:00
Yuya Nishihara
87aff55e7c index: inline MutableIndexSegment::add_commit()
If we add changed-files index, MutableIndex::add_commit() will have to update
both MutableIndexSegment of commits and MutableChangedFileIndexSegment or
something. It won't make much sense to add high-level .add_commit() function to
MutableIndexSegment.
2025-07-10 12:40:13 +00:00
Yuya Nishihara
f93ec31a29 index: hide CompositeIndex, IndexEntry, and IndexPosition types
These types are implementation details, and it was relatively easy to remove
test dependencies.
2025-07-10 12:40:13 +00:00
Yuya Nishihara
84a456cc1f index: replace external users of revset_engine, make it private 2025-07-10 12:40:13 +00:00
Yuya Nishihara
e4f0942f26 index: replace external users of CompositeIndex 2025-07-10 12:40:13 +00:00
Bryce Berger
a471eb9cb1 template/revset: mention tree-sitter grammar in pest file 2025-07-10 09:54:27 +00:00
Yuya Nishihara
362ab91972 index: refactor merge_in() to use monadic iterator methods
I think this makes it clear that we visit a bigger side to find the common
ancestor segment.
2025-07-10 00:29:35 +00:00
Yuya Nishihara
39bb608a43 cleanup: remove &'static lifetime that can be implied 2025-07-08 01:52:44 +00:00
Pavan Kumar Sunkara
c038ef4bc3 workspaces: Add templating support to workspace list 2025-07-07 19:14:07 +00:00
Martin von Zweigbergk
af621c7dbe cleanup: replace once_cell::sync::Lazy by std::sync::LazyLock
This makes it a little easier to switch away from the `once_cell`
crate once `get_or_try_init()` has been stabilized.
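
For reference, a small sketch of what the replacement looks like in
general. The static `GREETING` and the helper `greeting()` are made up for
this example and do not come from the jj code base.

```rust
use std::sync::LazyLock;

/// Hypothetical global that is initialized on first access.
/// With `once_cell` this would have been a `once_cell::sync::Lazy<String>`.
static GREETING: LazyLock<String> = LazyLock::new(|| format!("hello, {}", "jj"));

/// Dereferencing the `LazyLock` runs the initializer at most once.
fn greeting() -> &'static str {
    GREETING.as_str()
}
```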
2025-07-07 14:24:25 +00:00
Yuya Nishihara
be094ef76e revset: don't resolve symbol expression to multiple revisions
It's surprising that a symbol expression may be resolved to multiple revisions,
and that's one of the reasons we require the all: modifier in some places. Let's make
a symbol resolution fail in that case so we can deprecate the all: syntax.

The new error hints are a bit less informative, but I don't want to implement
ad-hoc formatting for resolve_some_revsets_default_single(). The user will have
to review the graph anyway in order to resolve divergence/conflicts.

Closes #5632
2025-07-07 14:11:29 +00:00
Yuya Nishihara
34820cf2d7 index: pack length parameters into struct
Struct with named fields should be better than usize parameters.
2025-07-07 14:08:37 +00:00
Yuya Nishihara
cb19fc5a82 index: simplify scope of private methods and constants
We no longer need pub(crate) since the revset engine has been internalized.
2025-07-07 14:08:37 +00:00
Yuya Nishihara
d378e011fe cleanup: use .is_sorted_by() to check if elements are sorted and unique 2025-07-07 14:08:25 +00:00
Yuya Nishihara
38fb765fe2 backend: store tree entries in sorted Vec instead of BTreeMap
This reduces the overhead when loading a Tree object. I was doing an experiment
on indexing changed paths per commit, and noticed that backend::Tree::set() had
significant cost comparable to zlib decompression of Git tree objects.
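
The trade-off can be sketched as follows. `SortedTree` is a hypothetical
stand-in, not the real `backend::Tree`; it assumes entry names are unique
and uses a `u32` in place of the real tree value type.

```rust
/// Hypothetical tree whose entries are kept in a `Vec` sorted by name.
struct SortedTree {
    entries: Vec<(String, u32)>,
}

impl SortedTree {
    /// Sorts the entries once up front instead of paying for a
    /// `BTreeMap` insertion per entry. Names are assumed to be unique.
    fn from_entries(mut entries: Vec<(String, u32)>) -> Self {
        entries.sort_by(|a, b| a.0.cmp(&b.0));
        Self { entries }
    }

    /// Looks up an entry by name with a binary search over the sorted `Vec`.
    fn get(&self, name: &str) -> Option<u32> {
        let index = self
            .entries
            .binary_search_by(|(entry_name, _)| entry_name.as_str().cmp(name))
            .ok()?;
        Some(self.entries[index].1)
    }
}
```

Lookups stay logarithmic, while loading becomes a single allocation plus
one sort (or no sort at all if the caller already provides sorted data).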
2025-07-07 08:17:18 +00:00
Yuya Nishihara
e56a989755 backend: make backend::Tree immutable, collect entries by caller
This helps change the underlying type from BTreeMap to sorted Vec.
2025-07-07 08:17:18 +00:00
Yuya Nishihara
63246018df merged_tree: clone non-conflicting tree data first when merging trees
This makes it clearer that the base non-conflicting tree data is the same. Since
there should be no duplicated entries, we don't need to remove "absent" entry
from the tree data.
2025-07-07 08:17:18 +00:00
Yuya Nishihara
5bda87e480 content_hash: implement ContentHash for tuples up to 4 items
I'll add Vec<(K, V)> field to backend::Tree. The tuple_impls() macro is copied
from templater.rs.
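
For readers unfamiliar with the pattern, here is a self-contained sketch
of a tuple-implementing macro. The `ByteHash` trait and `hash_bytes()`
helper are hypothetical stand-ins for `ContentHash`, and this macro is
written for the example rather than copied from templater.rs.

```rust
/// Hypothetical stand-in trait: feeds a value's bytes into a buffer.
trait ByteHash {
    fn feed(&self, out: &mut Vec<u8>);
}

impl ByteHash for u8 {
    fn feed(&self, out: &mut Vec<u8>) {
        out.push(*self);
    }
}

/// Implements `ByteHash` for each listed tuple shape by feeding the
/// fields in order. Each item pairs a field index with a type parameter.
macro_rules! tuple_impls {
    ($( ( $($n:tt $T:ident),+ ) )+) => {
        $(
            impl<$($T: ByteHash),+> ByteHash for ($($T,)+) {
                fn feed(&self, out: &mut Vec<u8>) {
                    $( self.$n.feed(out); )+
                }
            }
        )+
    };
}

tuple_impls! {
    (0 T0)
    (0 T0, 1 T1)
    (0 T0, 1 T1, 2 T2)
    (0 T0, 1 T1, 2 T2, 3 T3)
}

/// Convenience helper that collects the fed bytes.
fn hash_bytes<T: ByteHash>(value: &T) -> Vec<u8> {
    let mut out = Vec::new();
    value.feed(&mut out);
    out
}
```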
2025-07-07 08:17:18 +00:00
Yuya Nishihara
fa145d395e git_backend: cache shallow roots
git_repo.shallow_commits() had measurable cost when reindexing. I don't think
read_commit() should strip off parents matching the shallow roots, but we'll
need some caching to guarantee stable outputs.
2025-07-07 08:13:27 +00:00
Martin von Zweigbergk
4a02ff27ca async: introduce async version of Commit::tree()
I updated the callers that were already async.
2025-07-04 13:26:08 +00:00
Martin von Zweigbergk
cceafcf297 async: add async version of get_root_tree()
This introduces an async version of `get_root_tree()` which reads
trees that are part of a conflict concurrently. I haven't noticed a
need for that at Google or elsewhere but it could be significant.

I updated the only caller I could find that was already in an async
context.
2025-07-04 13:26:08 +00:00
Martin von Zweigbergk
bf0026a67d rewrite: make merge_commit_trees_no_resolve_without_repo() async 2025-07-04 13:26:08 +00:00
Martin von Zweigbergk
3211e9c05c merged_tree: make diff_stream() sort foo before foo/bar
For most callers, the special sorting of directories before paths for
directory->file transitions is not needed. This patch changes
`diff_stream()` to not do that, and instead adds a new method
specifically for that behavior. Only `local_working_copy` uses it.
2025-07-04 01:12:18 +00:00
Martin von Zweigbergk
1b1edc7a90 merged_tree: move directory->file handling in diff to stream adapter
This refactors `MergedTree::diff_stream()` so `TreeDiffIterator` and
`TreeDiffStreamImpl` no longer handle directory->file transitions
specially. That's instead handled by a stream adapter afterwards. To
allow the adapter to handle such transitions, the inner streams now
include tree entries if the other side of the diff is a non-tree. I
think this makes the code easier to follow. It also makes it easier to
add a version of `diff_stream()` that doesn't emit directories after
files. I think that's a more natural order for most use cases.

I timed this patch by running `jj diff --ignore-working-copy -s --from
v5.0 --to v6.0` in the Linux repo. I didn't see any significant
difference.
2025-07-04 01:12:18 +00:00
Martin von Zweigbergk
470a6e65c6 merge: add a method for checking for present non-tree value
I'm going to add some more callers of this method next.
2025-07-04 01:12:18 +00:00
Martin von Zweigbergk
98af8dcf97 merged_tree: consider paths inside conflicted directories hidden
This makes `path_value()` and related functions more consistent with
how the merged tree is presented to the user in the working copy and
when showing a diff.
2025-07-04 01:12:18 +00:00
Yuya Nishihara
ea8aa1e17c index: don't preserve commits not referred to by operations/views
In this implementation, we assume that predecessor commits created by old jj are
reachable from at least one of the historical views. However, there are a couple
of commands which create transitive predecessors. For example, "jj squash" into
grandparent will rebase a rewritten source, so the pre-rebase source commit
won't be visible to any views. To work around the problem, all immediate
predecessors of historically visible commits are also preserved.

Note that this change should be considered a forward-incompatible change. The
stored commits may have unreachable predecessors once we run "jj op abandon &&
jj util gc".

WalkPredecessors::flush_commits() doesn't need to guard against unreachable
commits. I was wondering whether values (or old ids) of op.commit_predecessors
map should be preserved, and I decided to keep both keys and values. It's nice
that we can get rid of index.has_id() calls when we drop support for legacy
commit.predecessors.
2025-07-03 09:06:21 +00:00
Yuya Nishihara
68ead52c5c hex_util: roll our own decode/encode_hex() functions
Since we have "reverse hex" functions, it's easy to implement the same set of
functions for "forward hex". I believe our implementation is slower than
highly-optimized versions provided by e.g. faster-hex, but we don't use hex
encoding/decoding where the performance matters.
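
A simple, unoptimized pair of functions along these lines might look like
the sketch below. The signatures are assumptions made for this example and
may not match the ones in hex_util.

```rust
/// Encodes bytes as lowercase hex, two digits per byte.
fn encode_hex(bytes: &[u8]) -> String {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let mut encoded = String::with_capacity(bytes.len() * 2);
    for &byte in bytes {
        encoded.push(DIGITS[(byte >> 4) as usize] as char);
        encoded.push(DIGITS[(byte & 0xf) as usize] as char);
    }
    encoded
}

/// Decodes a hex string, returning `None` on odd length or a non-hex digit.
fn decode_hex(hex: &str) -> Option<Vec<u8>> {
    let hex = hex.as_bytes();
    if hex.len() % 2 != 0 {
        return None;
    }
    hex.chunks_exact(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some(((hi << 4) | lo) as u8)
        })
        .collect()
}
```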
2025-07-02 01:56:40 +00:00
Yuya Nishihara
f1b29510d3 object_id: rename HexPrefix::new() to ::try_from_hex() for consistency 2025-07-02 01:56:40 +00:00
Yuya Nishihara
3103369681 object_id: add HexPrefix::from_id() for convenience 2025-07-02 01:56:40 +00:00
Cyril Plisko
f5a470af49 Update trait Backend::name() comment to match reality
`.jj/repo/store/backend` -> `.jj/repo/store/type`
2025-07-01 20:46:56 +00:00
Yuya Nishihara
8db40c7fa2 revset: add change_id/commit_id(prefix) predicates
Basically, these functions work in the same way as bookmarks()/tags(). They
restrict the namespace to search the specified symbol, and unmatched symbol
isn't an error. One major difference is that ambiguous prefix triggers an error.
That's because ambiguous prefix should logically select all matching entries,
whereas the underlying functions don't provide this behavior. It's also unclear
whether we would want to get all matching commits by commit_id(prefix:'').

#5632
2025-06-30 14:38:50 +00:00
Yuya Nishihara
16d50929b1 revset: extract prefix resolution functions
These functions will be called from change_id/commit_id() predicates, which will
decode hex strings while parsing the revset expression.

AmbiguousCommit/ChangeIdPrefix errors now include lowercase prefix strings
instead of the user-specified texts, which I don't think would matter.
2025-06-30 14:38:50 +00:00
Yuya Nishihara
18e985f892 object_id: implement custom Debug for HexPrefix
This is useful in snapshot tests.
2025-06-30 14:38:50 +00:00
Yuya Nishihara
20ed25f8c4 object_id: add HexPrefix::reverse_hex()
This will be used in order to format ChangeId prefix.
2025-06-30 14:38:50 +00:00
Yuya Nishihara
65255444a2 hex_util: add encode_reverse_hex*() instead of to_forward_hex()
Since we have to implement our own hex -> u8 translation, it doesn't make much
sense to convert u8 back to hex char. Let's directly construct decoded bytes.
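
To illustrate decoding straight to bytes, the sketch below assumes the
"reverse hex" alphabet runs from `z` (for 0) down to `k` (for 15). The
function names and signatures are hypothetical and chosen for this
example only.

```rust
/// Maps a reverse-hex digit to its value: 'z' is 0, 'y' is 1, ..., 'k' is 15.
fn reverse_hex_value(digit: u8) -> Option<u8> {
    match digit {
        b'k'..=b'z' => Some(b'z' - digit),
        _ => None,
    }
}

/// Decodes reverse hex directly into bytes, without building a
/// forward-hex string first. Returns `None` on odd length or bad digits.
fn decode_reverse_hex(reverse_hex: &str) -> Option<Vec<u8>> {
    let digits = reverse_hex.as_bytes();
    if digits.len() % 2 != 0 {
        return None;
    }
    digits
        .chunks_exact(2)
        .map(|pair| Some((reverse_hex_value(pair[0])? << 4) | reverse_hex_value(pair[1])?))
        .collect()
}

/// Encodes bytes as reverse hex, two digits per byte.
fn encode_reverse_hex(bytes: &[u8]) -> String {
    let mut encoded = String::with_capacity(bytes.len() * 2);
    for &byte in bytes {
        encoded.push((b'z' - (byte >> 4)) as char);
        encoded.push((b'z' - (byte & 0xf)) as char);
    }
    encoded
}
```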
2025-06-30 14:38:50 +00:00
Yuya Nishihara
e6914c2459 object_id: extract ChangeId/HexPrefix constructors for "reverse" hex
I'm going to reimplement to_forward_hex() as (&str) -> Vec<u8> function, and the
current HexPrefix::new() isn't compatible with odd-length decoded bytes.
2025-06-30 14:38:50 +00:00