CommitRef now stores raw author/committer headers and parses them when needed.
Since parsing errors would have been detected at .try_to_commit_ref(), this
patch makes new decode errors propagate as before.
Fixes #8350, #8214
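As a rough illustration of the "store raw, parse when needed" idea, here is a minimal sketch. It is not the real `CommitRef` API: `RawCommitRef`, `Signature`, `DecodeError` and `parse_signature` are hypothetical names, and the header format is simplified. The point is only that a malformed header now surfaces as an error at the accessor, which the caller has to propagate.

```rust
/// Hypothetical, simplified stand-in for a parsed author/committer signature.
#[derive(Debug, PartialEq)]
struct Signature<'a> {
    name: &'a str,
    email: &'a str,
    timestamp: i64,
}

/// Hypothetical error type for a header that cannot be decoded.
#[derive(Debug, PartialEq)]
struct DecodeError(String);

/// Hypothetical commit reference that keeps the raw header text.
struct RawCommitRef<'a> {
    author: &'a str,
}

impl<'a> RawCommitRef<'a> {
    /// Parsing happens here, on access, so this is where errors appear.
    fn author(&self) -> Result<Signature<'a>, DecodeError> {
        parse_signature(self.author)
    }
}

/// Parses a simplified header of the shape `Name <email> 1700000000 +0000`.
fn parse_signature(raw: &str) -> Result<Signature<'_>, DecodeError> {
    let err = || DecodeError(format!("malformed signature: {raw:?}"));
    let (name, rest) = raw.split_once(" <").ok_or_else(err)?;
    let (email, rest) = rest.split_once("> ").ok_or_else(err)?;
    let timestamp = rest
        .split_whitespace()
        .next()
        .ok_or_else(err)?
        .parse::<i64>()
        .map_err(|_| err())?;
    Ok(Signature { name, email, timestamp })
}
```

A caller would write `commit.author()?` where it previously read an already-parsed field.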
Now that we can put the "from" and "to" sides on separate lines, we can
use the normal "(no terminating newline)" comment on each side
separately. This should also be clearer, since previously the
"(no terminating newline)" comment could possibly be confused for
"(removes terminating newline)".
Before:
```
<<<<<<< conflict 1 of 1
+++++++ rtsqusxu 2768b0b9 "commit A" (no terminating newline)
grapefruit
%%%%%%% diff from: vpxusssl 38d49363 "merge base"
\\\\\\\ to: ysrnknol 7a20f389 "commit B" (adds terminating newline)
-grape
+grape
>>>>>>> conflict 1 of 1 ends
```
After:
```
<<<<<<< conflict 1 of 1
+++++++ rtsqusxu 2768b0b9 "commit A" (no terminating newline)
grapefruit
%%%%%%% diff from: vpxusssl 38d49363 "merge base" (no terminating newline)
\\\\\\\ to: ysrnknol 7a20f389 "commit B"
-grape
+grape
>>>>>>> conflict 1 of 1 ends
```
Since conflict labels are generally going to be long, this introduces a
new "note" conflict marker (`\\\\\\\`), which can be used to split the
"diff" conflict marker (`%%%%%%%`) across two lines:
```
<<<<<<< conflict 1 of 1
%%%%%%% diff from: vpxusssl 38d49363 "description of base"
\\\\\\\ to: rtsqusxu 2768b0b9 "description of left"
-base
+left
+++++++ ysrnknol 7a20f389 "description of right"
right
>>>>>>> conflict 1 of 1 ends
```
Conflict labels will generally start with lowercase change IDs, so
making all of the text lowercase makes it more consistent. I also
removed the "contents of" text, since conflict labels will already be
long enough, and this text doesn't add anything. Similarly, I removed
the "(conflict 1 of 1)" note from the Git conflict markers since Git
doesn't include this information, and including it would result in extra
long lines once we add conflict labels.
This is an easy part of Git extra table GC. The implementation is quite similar
to SimpleOpStore::gc(). Since we don't delete unreachable "commit" entries from
the table segments, this wouldn't improve runtime performance. Directory lookup
might get slightly faster thanks to fewer file entries, though.
#12, #8312
It's usually going to be easier for a user to run the same command again
but with a change offset appended, so I think these are more helpful
than commit IDs.
I'm going to add a way to select a specific commit from the set of
commits with a given change ID. We could sort these by committer
timestamp or something else, but index position is convenient because it
doesn't require reading the commits. The exact order isn't too
important, but giving newer commits lower offset numbers is nice because
newer commits are used more frequently.
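The ordering could be sketched roughly as follows. This is only an illustration: the `(index position, commit ID)` pairs and both function names are hypothetical, and it assumes that a higher index position means the commit was added to the index more recently.

```rust
/// Orders the commits sharing one change ID so that offset 0 is the newest.
/// Input pairs are (index position, commit ID); both are stand-ins.
fn order_by_offset(mut commits: Vec<(u32, &str)>) -> Vec<&str> {
    // Assumption: higher index position = added later, so sort descending.
    // No commit objects are read; only the index positions are compared.
    commits.sort_by(|a, b| b.0.cmp(&a.0));
    commits.into_iter().map(|(_, id)| id).collect()
}

/// Picks the commit at the given offset, if there is one.
fn resolve_offset<'a>(commits: Vec<(u32, &'a str)>, offset: usize) -> Option<&'a str> {
    order_by_offset(commits).get(offset).copied()
}
```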
The auto-tracking-bookmarks settings will be parsed as string matcher
expressions. Since it isn't trivial to parse string expressions, the [remotes]
table should be loaded only when needed.
The new GitImportOptions type has no from_settings() constructor. That's mainly
because the parsing function can emit warnings, and these warnings will have to
be printed.
The old method is renamed to `MergedTree::merge_unlabeled` to make it
easy to find unmigrated callers. The goal is that almost all callers
will eventually use `MergedTree::merge` to add labels, unless the
resulting tree is never visible to the user.
To implement simplification of conflict labels, I decided to add more
functions such as `zip` and `unzip` to `Merge`. I think these functions
could be useful in other situations so I thought this was a nice
solution, but an alternative solution could be to make
`get_simplified_mapping` and `apply_simplified_mapping` public and
manually apply the same mapping to both merges.
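To show the shape of the `zip`/`unzip` idea, here is a minimal sketch on a toy `Merge` type. It is an assumption-laden stand-in, not the real `Merge` implementation: the struct layout and `from_vec` are made up for the example. Zipping pairs each term of one merge with the term at the same position in the other, so a single transformation can be applied to both at once and then split back apart.

```rust
/// Toy stand-in for a merge: an odd-length list of terms.
#[derive(Debug, Clone, PartialEq)]
struct Merge<T> {
    values: Vec<T>,
}

impl<T> Merge<T> {
    fn from_vec(values: Vec<T>) -> Self {
        assert!(values.len() % 2 == 1, "a merge has an odd number of terms");
        Merge { values }
    }

    /// Pairs up terms at the same position in two merges of equal size.
    fn zip<U>(self, other: Merge<U>) -> Merge<(T, U)> {
        assert_eq!(self.values.len(), other.values.len());
        Merge {
            values: self.values.into_iter().zip(other.values).collect(),
        }
    }
}

impl<T, U> Merge<(T, U)> {
    /// Splits a merge of pairs back into two merges.
    fn unzip(self) -> (Merge<T>, Merge<U>) {
        let (left, right): (Vec<T>, Vec<U>) = self.values.into_iter().unzip();
        (Merge { values: left }, Merge { values: right })
    }
}
```

With this shape, simplifying the zipped merge and then unzipping keeps tree IDs and labels aligned without exposing the mapping functions.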
The default patterns are still saved to and loaded from .git/config. Maybe we
can add default fetch patterns to jj's configuration, but I'm not sure whether
we should deprecate .git/config fallback.
If a submodule was created in a commit C on a remote repo, switching from any
commit after C to any commit before C (e.g. `jj new C-`) will result in jj
starting to track the files introduced in the submodule.
This issue has popped up very frequently for chromium developers, who
get issues when attempting to check out an older version of chromium.
Fixes #4349
`test_init_load_non_utf8_path` and
`test_init_additional_workspace_non_utf8_path` now early-return on
strict UTF-8 filesystems because there's no way to report a test as
"skipped" at runtime.
Closes https://github.com/jj-vcs/jj/issues/8118
After filing https://github.com/jj-vcs/jj/issues/7685 I ran some perf traces to try to understand just what was taking so long during these slow operations. The changes in this PR reduce clone time for my large repo from about 10 minutes to 4m30s.
You can see my thought process in the comments of the above issue, but to summarize:
During checkout we check files/directories being created to ensure that we are not attempting to write to a reserved directory (`.jj/`, `.git/`). `same_file::is_same_file()` is an expensive check that invokes _at least 4_ syscalls when called in a naive manner (`open()` and `close()` for each path -- plus possibly more for getting file info? I haven't counted).
There are a few optimization gaps here that are causing significant slowdowns. The following checklist reflects what I've optimized in this PR, and what still remains:
- [x] `create_parent_dirs` is called for each file/directory and, for each parent dir in the path, **tries to create it and checks whether the dir has an illegal name via `reject_reserved_existing_path()`**. There is no caching of directories which have already been created.
- [ ] `reject_reserved_existing` calls `same_file::is_same_file()` in a loop for all reserved names, but the path which _has maybe been created_ isn't going to change, so its handle could be cached.
- [ ] `can_create_new_file` attempts to create the file, then just uses the result as an indicator of whether or not the file was created. However, since we _have a `File`_, that `File` can be converted directly to a `same_file::Handle`, avoiding the syscall that currently occurs when converting the `Path` to a `same_file::Handle`.
- [ ] `can_create_new_file` deletes the file immediately after. There's probably an opportunity here to **not** delete the file and re-use it for file write operations.
- [ ] Say we have 1000 files in `foo/`. For each file that's written, `reject_reserved_existing` is going to make at least `RESERVED_DIR_NAMES.len() * 1000` syscalls constructing `foo/{reserved_dir_name}` paths, testing their existence, etc. Maybe `jj` might create this dir? But I don't think that should ever happen -- so why not cache the handle **if** it's created and use a lookup table in `reject_reserved_existing` to only conduct these types of checks if the handle is resolved? Or alternatively cache that the file _does not_ exist after the first check.
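The directory caching in the first item can be sketched like this. It is a simplified illustration rather than the code in this PR: `CheckedDirs` and `ensure_parents` are hypothetical names, and the `check` closure stands in for the expensive "create the directory and reject reserved names" step.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Remembers which parent directories have already been created and checked.
#[derive(Default)]
struct CheckedDirs {
    seen: HashSet<PathBuf>,
}

impl CheckedDirs {
    /// Runs `check` once per distinct ancestor directory of `file_path`,
    /// outermost directory first. Directories seen before are skipped.
    fn ensure_parents<E>(
        &mut self,
        file_path: &Path,
        mut check: impl FnMut(&Path) -> Result<(), E>,
    ) -> Result<(), E> {
        let mut pending = Vec::new();
        // `ancestors()` yields the path itself first, so skip it.
        for dir in file_path.ancestors().skip(1) {
            // A cached directory implies all of its ancestors are cached too.
            if dir.as_os_str().is_empty() || self.seen.contains(dir) {
                break;
            }
            pending.push(dir);
        }
        for dir in pending.into_iter().rev() {
            check(dir)?;
            self.seen.insert(dir.to_path_buf());
        }
        Ok(())
    }
}
```

With 1000 files under `foo/`, the expensive check for `foo` then runs once instead of 1000 times.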
Here are some perf traces of running a `jj git clone` of my large repo before:
Release: https://share.firefox.dev/4oiSTBw
Debug: https://share.firefox.dev/4qmJBX1
And after:
Release: https://share.firefox.dev/4nK66mH
Debug: https://share.firefox.dev/470W1ed
Glob patterns will be enabled by default globally. Since this will be a big
breaking change in revsets, this patch adds a config knob to turn the new
default on/off.
Deprecation warnings will be emitted for default "substring:" patterns. This
change will suppress them. Since "glob:" will be the new default, I made these
tests use "glob:" when both "exact:" and "glob:" work.
Tests for the revset filter functions aren't updated.
If the default is changed to "glob:", literal strings will be parsed by the
glob() function. It's still better to treat trivial strings as "exact" patterns.
str_util::is_glob_char() includes backslash unconditionally because we enable
backslash escapes in string patterns.
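A minimal sketch of that decision follows. The set of characters in `is_glob_char` here is an assumption (only the backslash is confirmed above), and `Pattern` and `parse_default_pattern` are hypothetical names used for illustration.

```rust
/// Hypothetical result of parsing a pattern with no explicit prefix.
#[derive(Debug, PartialEq)]
enum Pattern<'a> {
    Exact(&'a str),
    Glob(&'a str),
}

/// Assumed set of glob metacharacters. Backslash is included unconditionally
/// because backslash escapes are enabled in string patterns.
fn is_glob_char(c: char) -> bool {
    matches!(c, '?' | '*' | '[' | ']' | '{' | '}' | '\\')
}

/// Treats strings without any glob metacharacter as exact patterns.
fn parse_default_pattern(s: &str) -> Pattern<'_> {
    if s.chars().any(is_glob_char) {
        Pattern::Glob(s)
    } else {
        Pattern::Exact(s)
    }
}
```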
This paves the way to deprecate `git.auto-local-bookmark` without
adding lots of deprecation warnings to test output snapshots.
The behavior of some tests is slightly changed, because
auto-track-bookmarks also tracks bookmarks that were created locally.
I think it just shows up in output snapshots as absent-tracked
bookmarks, without affecting what the test is about.
After the previous commit, `MergedTree` and `MergedTreeId` are almost
identical, with the only difference being that `MergedTree` is attached
to a `Store` instance. `MergedTreeId` is also equivalent to
`Merge<TreeId>`, since it is just a wrapper around it.
In the future, `MergedTree` might contain additional metadata like
conflict labels. Therefore, I replaced `MergedTreeId` with `MergedTree`
wherever I think it would be required to pass this additional metadata,
or where the additional methods provided by `MergedTree` would be
useful. In any remaining places, I replaced it with `Merge<TreeId>`.
I also renamed some of the `tree_id()` methods to `tree_ids()` for
consistency, since now they return a merge of individual tree IDs
instead of a single "merged tree ID". Similarly, `MergedTree` no longer
has an `id()` method, since tree IDs won't fully identify a `MergedTree`
once it contains additional metadata.
Currently, creating a `MergedTree` requires reading all of its root
trees from the store. However, this is often not actually required. For
instance, if the only reason to read the trees is to call
`MergedTree::merge`, and the merge is trivial, then there was no need to
read the trees. Changing `MergedTree` to only require a `Merge<TreeId>`
instead of a `Merge<Tree>` will make it possible to avoid reading trees
unnecessarily in these cases.
One benefit of this approach is that `Commit::tree` no longer requires
reading from the store, so it can be made synchronous and infallible,
which simplifies a lot of code.
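The "trivial merge needs no tree reads" case can be illustrated with a toy example. Everything here is hypothetical (`TreeId`, `Store`, `resolve_trivial`, `merge_ids`), and the non-trivial branch only counts reads instead of merging real trees.

```rust
use std::cell::Cell;

/// Toy tree ID.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TreeId(String);

/// Toy store that counts how many trees were read.
struct Store {
    reads: Cell<usize>,
}

impl Store {
    fn read_tree(&self, id: &TreeId) -> String {
        self.reads.set(self.reads.get() + 1);
        format!("tree {}", id.0)
    }
}

/// Resolves a 3-way merge from the IDs alone when one side is unchanged
/// or both sides made the same change.
fn resolve_trivial<'a>(
    left: &'a TreeId,
    base: &'a TreeId,
    right: &'a TreeId,
) -> Option<&'a TreeId> {
    if left == right || right == base {
        Some(left)
    } else if left == base {
        Some(right)
    } else {
        None
    }
}

/// Only touches the store when the merge cannot be resolved from IDs.
fn merge_ids(store: &Store, left: &TreeId, base: &TreeId, right: &TreeId) -> TreeId {
    if let Some(id) = resolve_trivial(left, base, right) {
        return id.clone();
    }
    // Placeholder for a real content merge: read all three trees.
    let contents = [left, base, right].map(|id| store.read_tree(id));
    TreeId(format!("merged({})", contents.join(", ")))
}
```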
Colocation is about sharing the working copy between jj and git. It's
less important where the repo is stored. I therefore think we should
not call it "colocated repo". I considered renaming it to "colocated
working copy" but that sounded awkward in many places because we often
talk about the whole workspace (repo + working copy), so "In colocated
workspaces with a very large number of branches or other refs" sounds
better than "In colocated working copies with a very large number of
branches or other refs".
Once we support colocated workspaces in non-main Git worktrees, I think
this rename will be even more relevant because then all those
workspaces share the same repo but only some of them may be colocated.
The default remote parameter of remote_bookmarks() will be derived from this
parameter. It doesn't make sense to exclude @git bookmarks if the backend isn't
Git. It's also nice that parsing tests don't depend on the feature flag.
A typical use case is to query bookmarked revisions ignoring auto-generated
bookmarks. `bookmarks() ~ bookmarks(x)` doesn't work because a revision may have
multiple bookmarks. It's also nice that we can document the default of
`remote_bookmarks()` as `remote_bookmarks(remote=~exact:"git")`.
Closes #7665
This is part of a series of changes to make most methods on index traits
(i.e. `ChangeIdIndex`, `MutableIndex`, `ReadonlyIndex`, `Index`)
fallible. This will enable networked implementations of these traits,
which may produce I/O errors during operation. See #7825 for more
information.
- Introduced a few more instances of the existing anti-pattern, `TODO:
indexing error shouldn't be a "BackendError"`. We're tracking this
known issue in #7849.
- Converted `MutableRepo::merge_view` to return a `RepoLoaderError`
instead of a `BackendError`. The only caller, `MutableRepo::merge`,
already returns a `RepoLoaderError`.
- Added three "`fallible_`" iterator helpers to reduce the amount of
noise at `is_ancestor` call sites due to the method now being
  fallible. Using these helpers seems to produce code that's a little
more readable than using `process_results` from itertools. One
consideration in this trade-off is that these helpers do not
themselves return iterators: if we find that we need more support for
fallible combinators mid-"chain" of iterator combinators, we might
either want to use `process_results` only in those cases, or switch to
use of `process_results` across the board (in lieu of these new
helpers).
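For readers unfamiliar with the pattern, helpers of this kind might look roughly like the sketch below. The names `fallible_any` and `fallible_find` and their signatures are assumptions made for illustration, not necessarily the helpers added in this change, and `is_even` merely stands in for a fallible predicate such as `is_ancestor`.

```rust
/// Like `Iterator::any`, but the predicate may fail.
/// Stops at the first `true` result or the first error.
fn fallible_any<T, E>(
    iter: impl IntoIterator<Item = T>,
    mut pred: impl FnMut(T) -> Result<bool, E>,
) -> Result<bool, E> {
    for item in iter {
        if pred(item)? {
            return Ok(true);
        }
    }
    Ok(false)
}

/// Like `Iterator::find`, but the predicate may fail.
fn fallible_find<T, E>(
    iter: impl IntoIterator<Item = T>,
    mut pred: impl FnMut(&T) -> Result<bool, E>,
) -> Result<Option<T>, E> {
    for item in iter {
        if pred(&item)? {
            return Ok(Some(item));
        }
    }
    Ok(None)
}

/// Stand-in for a fallible predicate: errors on negative input.
fn is_even(n: i32) -> Result<bool, String> {
    if n < 0 {
        Err(format!("negative input: {n}"))
    } else {
        Ok(n % 2 == 0)
    }
}
```

As noted above, these return a plain `Result` rather than an iterator, so they only fit at the end of a combinator chain.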