If we add a changed-files index, MutableIndex::add_commit() will have to update
both the MutableIndexSegment of commits and a MutableChangedFileIndexSegment or
something. It won't make much sense to add a high-level .add_commit() function
to MutableIndexSegment.
It's surprising that a symbol expression may be resolved to multiple revisions,
and that's one of the reasons we require the all: modifier in some places. Let's
make symbol resolution fail in that case so we can deprecate the all: syntax.
The new error hints are a bit less informative, but I don't want to implement
ad-hoc formatting for resolve_some_revsets_default_single(). The user will have
to review the graph anyway in order to resolve divergence/conflicts.
Closes #5632
This reduces the overhead when loading a Tree object. I was doing an experiment
on indexing changed paths per commit, and noticed that backend::Tree::set() had
significant cost comparable to zlib decompression of Git tree objects.
This makes it clearer that the base non-conflicting tree data is the same. Since
there should be no duplicated entries, we don't need to remove "absent" entries
from the tree data.
git_repo.shallow_commits() had measurable cost when reindexing. I don't think
read_commit() should strip off parents matching the shallow roots, but we'll
need some caching to guarantee stable outputs.
This introduces an async version of `get_root_tree()` which reads
trees that are part of a conflict concurrently. I haven't noticed a
need for that at Google or elsewhere but it could be significant.
I updated the only caller I could find that was already in an async
context.
For most callers, the special sorting of directories before paths for
directory->file transitions is not needed. This patch changes
`diff_stream()` to not do that, and instead adds a new method
specifically for that behavior. Only `local_working_copy` uses it.
This refactors `MergedTree::diff_stream()` so `TreeDiffIterator` and
`TreeDiffStreamImpl` no longer handle directory->file transitions
specially. That's instead handled by a stream adapter afterwards. To
allow the adapter to handle such transitions, the inner streams now
include tree entries if the other side of the diff is a non-tree. I
think this makes the code easier to follow. It also makes it easier to
add a version of `diff_stream()` that doesn't emit directories after
files. I think that's a more natural order for most use cases.
I timed this patch by running `jj diff --ignore-working-copy -s --from
v5.0 --to v6.0` in the Linux repo. I didn't see any significant
difference.
This makes `path_value()` and related functions more consistent with
how the merged tree is presented to the user in the working copy and
when showing a diff.
In this implementation, we assume that predecessor commits created by old jj are
reachable from at least one of the historical views. However, there are a couple
of commands which create transitive predecessors. For example, "jj squash" into
a grandparent will rebase a rewritten source, so the pre-rebase source commit
won't be visible in any view. To work around the problem, all immediate
predecessors of historically visible commits are also preserved.
Note that this should be considered a forward-incompatible change. The
stored commits may have unreachable predecessors once we run "jj op abandon &&
jj util gc".
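The preservation rule above can be sketched as follows. This is only an illustration under stated assumptions: commit ids are plain `u32` values, `commits_to_preserve` is a hypothetical name, and the predecessor map stands in for whatever the operation log actually records.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper: keeps every historically visible commit plus its
/// immediate predecessors. Predecessors of predecessors are NOT followed.
fn commits_to_preserve(
    historically_visible: &BTreeSet<u32>,
    predecessors: &BTreeMap<u32, Vec<u32>>,
) -> BTreeSet<u32> {
    let mut keep = historically_visible.clone();
    for id in historically_visible {
        if let Some(preds) = predecessors.get(id) {
            // Only one hop: the direct predecessors of a visible commit.
            keep.extend(preds.iter().copied());
        }
    }
    keep
}
```

With commit 3 visible, 2 as its predecessor, and 1 as the predecessor of 2, the sketch keeps 2 and 3 but not 1, which is the kind of unreachable predecessor the note above refers to.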
WalkPredecessors::flush_commits() doesn't need to guard against unreachable
commits. I was wondering whether values (or old ids) of the
op.commit_predecessors map should be preserved, and I decided to keep both keys
and values. It's nice
that we can get rid of index.has_id() calls when we drop support for legacy
commit.predecessors.
Since we have "reverse hex" functions, it's easy to implement the same set of
functions for "forward hex". I believe our implementation is slower than
highly-optimized versions provided by e.g. faster-hex, but we don't use hex
encoding/decoding where the performance matters.
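A minimal sketch of why the two sets of functions come almost for free: both encodings can share one implementation and differ only in the digit table. The names below are hypothetical, and the reverse alphabet ('z' for 0 down to 'k' for 15) is an assumption of this sketch, not something stated in this message.

```rust
/// Digit tables; the reverse alphabet here is an assumption of this sketch.
const FORWARD: &[u8; 16] = b"0123456789abcdef";
const REVERSE: &[u8; 16] = b"zyxwvutsrqponmlk";

/// Decodes one forward hex digit to its 4-bit value.
fn forward_hex_value(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Decodes one reverse hex digit ('z' is 0, 'k' is 15) to its 4-bit value.
fn reverse_hex_value(b: u8) -> Option<u8> {
    match b {
        b'k'..=b'z' => Some(b'z' - b),
        b'K'..=b'Z' => Some(b'Z' - b),
        _ => None,
    }
}

/// Shared decoder: returns None on odd length or on an invalid digit.
fn decode_with(hex: &str, digit: fn(u8) -> Option<u8>) -> Option<Vec<u8>> {
    let bytes = hex.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks_exact(2)
        .map(|pair| Some((digit(pair[0])? << 4) | digit(pair[1])?))
        .collect()
}

/// Shared encoder: two output characters per input byte.
fn encode_with(data: &[u8], alphabet: &[u8; 16]) -> String {
    let mut out = String::with_capacity(data.len() * 2);
    for &b in data {
        out.push(alphabet[usize::from(b >> 4)] as char);
        out.push(alphabet[usize::from(b & 0xf)] as char);
    }
    out
}
```

This is a plain byte-at-a-time loop with no SIMD, which is consistent with the expectation above that it is slower than a tuned crate.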
Basically, these functions work in the same way as bookmarks()/tags(). They
restrict the namespace in which the specified symbol is searched, and an
unmatched symbol isn't an error. One major difference is that an ambiguous
prefix triggers an error. That's because an ambiguous prefix should logically
select all matching entries, whereas the underlying functions don't provide this
behavior. It's also unclear
whether we would want to get all matching commits by commit_id(prefix:'').
#5632
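The behavior described above (no match yields an empty set, ambiguity is an error) might look roughly like this. It is a sketch over string ids only; `Resolution`, `resolve_prefix`, and `commit_id_predicate` are hypothetical names, and the error text is made up for illustration.

```rust
/// Hypothetical result of looking up an id prefix.
#[derive(Debug, PartialEq, Eq)]
enum Resolution<'a> {
    NoMatch,
    Single(&'a str),
    Ambiguous,
}

/// Stand-in for the underlying lookup, which reports ambiguity but does not
/// return all matching entries.
fn resolve_prefix<'a>(ids: &[&'a str], prefix: &str) -> Resolution<'a> {
    let mut matches = ids.iter().copied().filter(|id| id.starts_with(prefix));
    match (matches.next(), matches.next()) {
        (None, _) => Resolution::NoMatch,
        (Some(id), None) => Resolution::Single(id),
        (Some(_), Some(_)) => Resolution::Ambiguous,
    }
}

/// Predicate-style lookup: an unmatched prefix is an empty set, while an
/// ambiguous prefix is an error because the full match set is unavailable.
fn commit_id_predicate<'a>(ids: &[&'a str], prefix: &str) -> Result<Vec<&'a str>, String> {
    match resolve_prefix(ids, prefix) {
        Resolution::NoMatch => Ok(vec![]),
        Resolution::Single(id) => Ok(vec![id]),
        Resolution::Ambiguous => Err(format!("Commit ID prefix `{prefix}` is ambiguous")),
    }
}
```

Note that an empty prefix matches every id in this sketch, so it is reported as ambiguous whenever there is more than one commit.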
These functions will be called from change_id/commit_id() predicates, which will
decode hex strings while parsing the revset expression.
AmbiguousCommit/ChangeIdPrefix errors now include lowercase prefix strings
instead of the user-specified text, which I don't think matters.
Since we have to implement our own hex -> u8 translation, it doesn't make much
sense to convert u8 back to hex char. Let's directly construct decoded bytes.
I'm going to reimplement to_forward_hex() as a (&str) -> Vec<u8> function, and the
current HexPrefix::new() isn't compatible with odd-length decoded bytes.
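One way to represent an odd-length prefix as decoded bytes is to pad the last half byte and remember that it is partial. The following is a sketch of that idea only, not the actual HexPrefix layout; `DecodedPrefix`, `decode_hex_prefix`, and `prefix_matches` are hypothetical names.

```rust
/// Hypothetical decoded prefix: bytes plus a flag for a trailing half byte.
#[derive(Debug, PartialEq, Eq)]
struct DecodedPrefix {
    /// If `has_odd_nibble` is set, the low 4 bits of the last byte are padding.
    bytes: Vec<u8>,
    has_odd_nibble: bool,
}

/// Decodes a forward hex prefix of any length; None on an invalid digit.
fn decode_hex_prefix(prefix: &str) -> Option<DecodedPrefix> {
    let mut bytes = Vec::with_capacity((prefix.len() + 1) / 2);
    let mut has_odd_nibble = false;
    let mut chars = prefix.chars();
    while let Some(hi) = chars.next() {
        let hi = hi.to_digit(16)? as u8;
        match chars.next() {
            Some(lo) => bytes.push((hi << 4) | (lo.to_digit(16)? as u8)),
            None => {
                // Trailing single digit: store it in the high half.
                bytes.push(hi << 4);
                has_odd_nibble = true;
            }
        }
    }
    Some(DecodedPrefix { bytes, has_odd_nibble })
}

/// Checks whether `id` starts with the decoded prefix.
fn prefix_matches(prefix: &DecodedPrefix, id: &[u8]) -> bool {
    if prefix.bytes.len() > id.len() {
        return false;
    }
    match prefix.bytes.split_last() {
        Some((&last, full)) if prefix.has_odd_nibble => {
            // Compare whole bytes, then only the high half of the next byte.
            id.starts_with(full) && (id[full.len()] & 0xf0) == last
        }
        _ => id.starts_with(&prefix.bytes),
    }
}
```

For example, the prefix "abc" decodes to the bytes 0xab 0xc0 with the odd flag set, so it matches an id starting with 0xab 0xcd but not one starting with 0xab 0xdc.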
In merge-heavy history, the size of collected transitive parents can easily
explode. I simply changed the data type to IndexSet as we use IndexSet for the
mapping of external parents.
Fixes #6850
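To illustrate why a set helps here: in merge-heavy history the same ancestor is reached through many paths, and a plain list would record it once per path. The sketch below uses a std-only stand-in for `indexmap::IndexSet` (insertion order kept, duplicates dropped); `OrderedSet`, `transitive_parents`, and the `u32` ids are hypothetical, and the real traversal is not shown in this message.

```rust
use std::collections::{HashMap, HashSet};

/// Minimal insertion-ordered set, a std-only stand-in for `indexmap::IndexSet`.
#[derive(Default)]
struct OrderedSet {
    items: Vec<u32>,
    seen: HashSet<u32>,
}

impl OrderedSet {
    /// Returns true if the value was newly inserted.
    fn insert(&mut self, value: u32) -> bool {
        let is_new = self.seen.insert(value);
        if is_new {
            self.items.push(value);
        }
        is_new
    }
}

/// Collects all transitive parents of `start`, each one only once.
fn transitive_parents(parents: &HashMap<u32, Vec<u32>>, start: u32) -> Vec<u32> {
    let mut collected = OrderedSet::default();
    let mut stack = vec![start];
    while let Some(id) = stack.pop() {
        if let Some(direct) = parents.get(&id) {
            for &parent in direct {
                // Already-seen parents are neither stored nor revisited.
                if collected.insert(parent) {
                    stack.push(parent);
                }
            }
        }
    }
    collected.items
}
```

With a merge commit 4 whose parents 2 and 3 share parent 1, commit 1 is collected once even though it is reachable through both sides of the merge.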
These functions existed because we had to deal with AliasExpanded nodes, which I
think is better handled by catch_aliases() + expect_<construct>() without
callback.
Some redundant error nodes are flattened.