This is slower than traversing commit.predecessor_ids, but allows us to show
associated operations alongside the commits. Since operations are (mostly)
traversed in chronological order, the evolution graph is also emitted in that
order.
Operation ids in test_op_abandon*() are changed because reparented operations
now preserve the predecessors mapping.
Since this patch will affect whether "jj evolog" can show the associated
operations, I added a changelog entry. It's still a bit vague, but I think we
can add a more detailed explanation later.
There are two major goals:
1. garbage-collect predecessor commits referenced by immutable commits.
2. show operations alongside predecessors in "jj evolog".
The predecessors field will be removed from the Commit object. Maybe we can
also remove (writer side of) the extras table from GitBackend.
"jj evolog" will traverse the operation history to build an evolution
graph. This will be slower than the current implementation, but it seems
tolerable for mid-size repository stored in a local disk.
(when the page cache is warm)
% time jj op log -T'"."' --no-graph | wc -c
50459
jj op log -T'"."' --no-graph 0.12s user 0.31s system 103% cpu 0.418 total
Suppose we're interested in recent modifications, the traversal can often be
terminate early. I also have an idea for indexing originating operations.
https://github.com/jj-vcs/jj/pull/6405
For old operations which didn't record predecessors, "jj evolog" will fall back
to commit.predecessor_ids(). That's why commit_predecessors is Option<_>.
This is mostly for consistency with `read_file()` at this point. I'm
not sure if we need this for Google in the near future.
For now, I wrapped the file-reading in `local_working_copy.rs` in a
new `BlockingAsyncReader`. We should switch to using async I/O in the
future.
Instead of calling `move_commits`, callers now have the option to call
`compute_move_commits` and then customize the rebase before calling
`ComputedMoveCommits::apply`. `ComputedMoveCommits::record_to_abandon`
can be used to abandon commits while rebasing.
The `Backend::read_file()` method is async but it returns a `Box<dyn
Read>` and reading from that trait is blocking. That's fine with the
local Git backend but it can be slow for remote backends. For example,
our backend at Google reads file chunks 1 MiB at a time from the
server. What that means is that reading lots of small files
concurrently works fine since the whole file contents are returned by
the first `Read::read()` call (it was fetched when
`Backend::read_file()` was issued). However, when reading files that
are larger than one chunk, we end up blocking on the next
`Read::read()` call. I haven't verified that this actually is a
problem at Google, but fixing this blocking is something we should do
eventually anyway.
This patch makes `Backend::read_file()` return a `Pin<Box<dyn
AsyncRead>>` instead, so implementations can be async in the read part
too.
Since `AsyncRead` is not yet standardized, we have to choose between
the one from `futures` and the one from `tokio`. I went with the one
from `tokio`. I picked that because an earlier version of this patch
used `tokio::fs` for some reads. Then I realized that doing that means
that we have to use a tokio runtime, meaning that we can't safely keep
our existing `pollster::FutureExt::block_on()` calls. If we start
depending on tokio's specific runtime, I think we would first want to
remove all the `block_on()` calls. I'll leave that for later. I think
at this point, we could equally well use `futures::io::AsyncRead`, but
I also don't know if there's a reason to prefer that.
It doesn't make sense to use the `Read` instance after calling
`write_file()`, so the caller shouldn't need to hold on to the
instance.
This slightly simplifies a later patch replacing `Read` by
`AsyncRead`.
This way `read_conflict()` more clearly doesn't need to worry about
non-utf8 data. It also makes a subsequent change to make `read_file()`
return an `AsyncRead` simpler.
We don't want to derive `PartialEq` on `OpsetResolutionError` because we may need an
`Other` variant in the future that wraps a type without a `PartialEq` implementation.
Pattern-matching is used instead to assert the contents.
This serves a similar purpose to Git's patch ID mechanism, however it is
slightly different in that it only compares commits which have the same
change ID as each other (divergent changes), and it does a full
comparison of the commits to see if they would have identical trees if
rebased onto the same parents. Since most changes aren't divergent, I
believe this should have a negligible performance cost in most cases.
I think skipping these commits by default makes sense for `jj rebase`,
since usually this will be a helpful behavior for the user. With this
behavior, a safe first step when encountering divergent changes would be
to rebase one branch on top of the other, since that will abandon any
divergent changes that have identical contents to existing commits,
leaving behind any non-trivial divergent changes for the user to resolve
manually.
the feature reused as much as possible of the `jj rebase` code, with
minimum modifications, in order to avoid reimplementing or complexifying
the rebase code.
It leads to some cases where some commits may be rewritten twice.
For example:
❯ jj -s
@ ylrkuwto (empty) (no description set)
○ qsorlqym file4
│ A file4
○ mmkvzzpw file3
│ A file3
○ uttlrpzl file1&2
│ A file1
│ A file2
◆ zzzzzzzz root()
❯ jj split -r u -A m -m file1 file1
Rebased 3 descendant commits
First part: ulpvupsv aaf064bc file1
Second part: uttlrpzl 0d4358c2 file1&2
Working copy (@) now at: ylrkuwto 60d905c5 (empty) (no description set)
Parent commit (@-) : qsorlqym 8cfbe2d8 file4
❯ jj -s
@ ylrkuwto (empty) (no description set)
○ qsorlqym file4
│ A file4
○ ulpvupsv file1
│ A file1
○ mmkvzzpw file3
│ A file3
○ uttlrpzl file1&2
│ A file2
◆ zzzzzzzz root()
There is an intermediate state created as
@ ylrkuwto (empty) (no description set)
○ qsorlqym file4
│ A file4
○ mmkvzzpw file3
│ A file3
○ uttlrpzl file1&2
│ A file2
○ ulpvupsv file1
│ A file1
◆ zzzzzzzz root()
where the commits in `uttlrpzl::` have been rewritten, then `ulpvupsv`
is moved, effectively rewriting the commits in `ulpvupsv::`.
The number of user visible rewritten commits is accurately reported.
`jj split --parallel` is still done as a special case: implementing
it with the rebase code would put the first commits on the right
branch.
I'd like to make the custom Google binary log which backend is used
(typically the Google-specific backend, but can also be the Git
backend). It would be nice to able to use
`workspace.repo_loader().store().backend().name()` for that.
Apparently `gpg-agent` will fail to start if the `--homedir` path is too
long:
```
> gpg-agent --homedir=/private/tmp/nix-build-jujutsu-0.29.0-unstable-8769c5c.drv-0/jj-gpg-signing-test-4I0aI0/ --daemon
gpg-agent[6617]: directory '/private/tmp/nix-build-jujutsu-0.29.0-unstable-8769c5c.drv-0/jj-gpg-signing-test-4I0aI0/private-keys-v1.d' created
gpg-agent[6617]: socket name '/private/tmp/nix-build-jujutsu-0.29.0-unstable-8769c5c.drv-0/jj-gpg-signing-test-4I0aI0/S.gpg-agent.extra' is too long
```
I only discovered this after reading a comment on StackOverflow [0].
[0]: https://superuser.com/questions/1087554/when-adding-a-gpg-key-with-homedir-parameter-error#comment1684440_1130507
This makes it clear that the repo state isn't mutated during computation of the
destination and descendants. Still compute_move_commits() is big, so we might
want to split it further. If needed, maybe we can add variants of
apply_move_commits() to fix up intermediate predecessor chains?
compute_move_commits() needs (immutable) &MutableRepo because it depends on
repo.find_descendants_for_rebase().
It's tedious to update Windows code that isn't compiled locally.
In order to ensure conversion between bool and platform-native type, this patch
changes the type alias to real struct. I think this is also good if we decide to
retain executable bits on all platforms.
I was wondering whether this is presentation issue or not, and I think it is a
matter of DiffLineIterator. For matching hunks, DiffLineIterator flushes the
current_line buffer and bumps the line numbers for the next line. This should
guarantee that there are no blank DiffLine to be queued. However, for different
hunks, only the left line number can be bumped in the first loop, so there may
be an empty-looking hunk having the same right line number.
Closes#6471
We haven't had any reports of problems from people who opted in. Since
it's early in the release cycle now, let's now test it on everyone who
builds from head, so we get almost a month of testing from those
people before it's enabled by default in a released version.
This impacts lots of test cases because the change-id header is added
to the Git commit. Most are uninteresting. `test_git_fetch` now sees
some divergent changes where it used to see only divergent bookmarks,
which makes sense.