The default patterns are still saved to and loaded from .git/config. Maybe we
can add default fetch patterns to jj's configuration, but I'm not sure whether
we should deprecate .git/config fallback.
If a submodule was created in a commit C on a remote repo, switching from any
commit after C to any commit before C (e.g. `jj new C-`) will result in jj
starting to track the files introduced in the submodule.
This issue has popped up very frequently for chromium developers, who
get issues when attempting to check out an older version of chromium.
Fixes #4349
`test_init_load_non_utf8_path` and
`test_init_additional_workspace_non_utf8_path` now early-return on
strict UTF-8 filesystems because there's no way to report a test as
"skipped" at runtime.
Closes https://github.com/jj-vcs/jj/issues/8118
After filing https://github.com/jj-vcs/jj/issues/7685 I ran some perf traces to try to understand just what was taking so long during these slow operations. The changes in this PR reduce clone time for my large repo from about 10 minutes to 4m30s.
You can see my thought process in the comments of the above task but to summarize:
During checkout we check files/directories being created to ensure that we are not attempting to write to a reserved directory (`.jj/`, `.git/`). `same_file::is_same_file()` is an expensive check that invokes _at least 4_ syscalls when called in a naive manner (`open()` and `close()` for each path -- plus possibly more for getting file info? I haven't counted).
There are a few optimization gaps here that are causing significant slowdowns. The following checklist reflects what I've optimized in this PR, and what still remains:
- [x] `create_parent_dirs` is called for each file/directory, and for each parent dir in the path it will **try to create it and check if the dir is an illegal name via `reject_reserved_existing_path()`**. There is no caching of directories which have already been created.
- [ ] `reject_reserved_existing` calls `same_file::is_same_file()` in a loop for all reserved names, but the path which _has maybe been created_ isn't going to change, so its handle could be cached.
- [ ] `can_create_new_file` attempts to create the file, then just uses the result as an indicator of whether or not the file was created. However, since we _have a `File`_, that `File` can be converted directly to a `same_file::Handle`, avoiding the syscall that currently occurs when converting the `Path` to a `same_file::Handle`.
- [ ] `can_create_new_file` deletes the file immediately after. There's probably an opportunity here to **not** delete the file and re-use it for file write operations.
- [ ] Say we have 1000 files in `foo/`. For each file that's written, `reject_reserved_existing` is going to make at least `RESERVED_DIR_NAMES.len() * 1000` syscalls constructing `foo/{reserved_dir_name}` paths, testing their existence, etc. Maybe `jj` might create this dir? But I don't think that should ever happen -- so why not cache the handle **if** it's created and use a lookup table in `reject_reserved_existing` to only conduct these types of checks if the handle is resolved? Or alternatively cache that the file _does not_ exist after the first check.
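
The directory-caching idea from the first checklist item can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `CheckedDirs`, `ensure_parents`, and the `check` callback are hypothetical names, and `check` stands in for the expensive "create the dir and reject reserved names" step.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Hypothetical cache of directories that were already created and validated.
#[derive(Default)]
struct CheckedDirs {
    seen: HashSet<PathBuf>,
    /// Counts how often the expensive check succeeded (for illustration only).
    expensive_checks: usize,
}

impl CheckedDirs {
    /// Runs `check` for every ancestor directory of `file` that is not cached
    /// yet, shallowest first. Cached ancestors are skipped entirely.
    fn ensure_parents<E>(
        &mut self,
        file: &Path,
        mut check: impl FnMut(&Path) -> Result<(), E>,
    ) -> Result<(), E> {
        let Some(parent) = file.parent() else {
            return Ok(());
        };
        // Collect uncached ancestors from deepest to shallowest. A cached
        // directory implies that all of its ancestors are cached as well,
        // because directories are inserted shallowest first below.
        let mut pending = Vec::new();
        for dir in parent.ancestors() {
            if dir.as_os_str().is_empty() || self.seen.contains(dir) {
                break;
            }
            pending.push(dir);
        }
        for dir in pending.into_iter().rev() {
            check(dir)?;
            self.expensive_checks += 1;
            self.seen.insert(dir.to_path_buf());
        }
        Ok(())
    }
}

/// Stand-in for the real (syscall-heavy) validation; it always succeeds here.
fn always_ok(_dir: &Path) -> Result<(), String> {
    Ok(())
}

fn main() {
    let mut cache = CheckedDirs::default();
    for i in 0..1000 {
        let file = PathBuf::from(format!("foo/bar/file{i}.txt"));
        cache.ensure_parents(&file, always_ok).unwrap();
    }
    println!("expensive checks: {}", cache.expensive_checks);
}
```

With a cache like this, the expensive check runs once per distinct directory rather than once per file per path component.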
Here are some perf traces of running a `jj git clone` of my large repo before:
Release: https://share.firefox.dev/4oiSTBw
Debug: https://share.firefox.dev/4qmJBX1
And after:
Release: https://share.firefox.dev/4nK66mH
Debug: https://share.firefox.dev/470W1ed
Glob patterns will be enabled by default globally. Since this will be a big
breaking change in revsets, this patch adds a config knob to turn the new
default on/off.
Deprecation warnings will be emitted for default "substring:" patterns. This
change will suppress them. Since "glob:" will be the new default, I made these
tests use "glob:" when both "exact:" and "glob:" work.
Tests for the revset filter functions aren't updated.
Once the default is changed to "glob:", literal strings would be parsed by the
glob() function. It's still better to treat trivial strings as "exact" patterns.
str_util::is_glob_char() includes backslash unconditionally because we enable
backslash escapes in string patterns.
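
A minimal sketch of the "trivial strings stay exact" idea, under assumptions: the set of metacharacters below is a guess for illustration and may not match the real `str_util::is_glob_char()`, and `PatternKind` and `classify` are hypothetical names. Backslash is included because an escaped string still needs the glob parser to unescape it.

```rust
/// Assumed set of glob metacharacters. The real `str_util::is_glob_char()`
/// may use a different set; backslash is included unconditionally because
/// backslash escapes are enabled in string patterns.
fn is_glob_char(c: char) -> bool {
    matches!(c, '*' | '?' | '[' | ']' | '{' | '}' | '\\')
}

/// Hypothetical classification of a pattern string.
#[derive(Debug, PartialEq)]
enum PatternKind {
    Exact,
    Glob,
}

/// Treats a string without any glob metacharacter as an exact pattern.
fn classify(pattern: &str) -> PatternKind {
    if pattern.chars().any(is_glob_char) {
        PatternKind::Glob
    } else {
        PatternKind::Exact
    }
}

fn main() {
    for pattern in ["main", "release-*", r"a\*b"] {
        println!("{pattern:?} -> {:?}", classify(pattern));
    }
}
```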
This paves the way to deprecate `git.auto-local-bookmark` without
adding lots of deprecation warnings to test output snapshots.
The behavior of some tests is slightly changed, because
auto-track-bookmarks also tracks bookmarks that were created locally.
I think it just shows up in output snapshots as absent-tracked
bookmarks, without affecting what the test is about.
After the previous commit, `MergedTree` and `MergedTreeId` are almost
identical, with the only difference being that `MergedTree` is attached
to a `Store` instance. `MergedTreeId` is also equivalent to
`Merge<TreeId>`, since it is just a wrapper around it.
In the future, `MergedTree` might contain additional metadata like
conflict labels. Therefore, I replaced `MergedTreeId` with `MergedTree`
wherever I think it would be required to pass this additional metadata,
or where the additional methods provided by `MergedTree` would be
useful. In any remaining places, I replaced it with `Merge<TreeId>`.
I also renamed some of the `tree_id()` methods to `tree_ids()` for
consistency, since now they return a merge of individual tree IDs
instead of a single "merged tree ID". Similarly, `MergedTree` no longer
has an `id()` method, since tree IDs won't fully identify a `MergedTree`
once it contains additional metadata.
Currently, creating a `MergedTree` requires reading all of its root
trees from the store. However, this is often not actually required. For
instance, if the only reason to read the trees is to call
`MergedTree::merge`, and the merge is trivial, then there was no need to
read the trees. Changing `MergedTree` to only require a `Merge<TreeId>`
instead of a `Merge<Tree>` will make it possible to avoid reading trees
unnecessarily in these cases.
One benefit of this approach is that `Commit::tree` no longer requires
reading from the store, so it can be made synchronous and infallible,
which simplifies a lot of code.
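
To illustrate why a trivial merge needs no tree reads, here is a simplified three-way sketch that works on IDs alone. It is an assumption-laden illustration: jj's actual `Merge<T>` handles more than three terms, and `IdMerge` and `merge_ids` are hypothetical names.

```rust
/// Result of trying to merge three tree IDs without reading any tree.
#[derive(Debug, PartialEq)]
enum IdMerge<T> {
    /// The merge resolved by comparing IDs alone.
    Resolved(T),
    /// Both sides changed differently; tree contents must be read and merged.
    NeedsTrees,
}

/// Three-way merge on IDs only (a simplification for illustration).
fn merge_ids<T: Eq + Clone>(base: &T, left: &T, right: &T) -> IdMerge<T> {
    if left == right || right == base {
        // Both sides agree, or only the left side changed.
        IdMerge::Resolved(left.clone())
    } else if left == base {
        // Only the right side changed.
        IdMerge::Resolved(right.clone())
    } else {
        IdMerge::NeedsTrees
    }
}

fn main() {
    // String stand-ins for tree IDs (hypothetical values).
    let (base, changed) = ("aaa", "bbb");
    println!("{:?}", merge_ids(&base, &base, &changed));
    println!("{:?}", merge_ids(&base, &changed, &"ccc"));
}
```

Only the `NeedsTrees` case would have to load trees from the store. The other cases are decided from the IDs.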
Colocation is about sharing the working copy between jj and git. It's
less important where the repo is stored. I therefore think we should
not call it "colocated repo". I considered renaming it to "colocated
working copy" but that sounded awkward in many places because we often
talk about the whole workspace (repo + working copy), so "In colocated
workspaces with a very large number of branches or other refs" sounds
better than "In colocated working copies with a very large number of
branches or other refs".
Once we support colocated workspaces in non-main Git worktrees, I think
this rename will be even more relevant because then all those
workspaces share the same repo but only some of them may be colocated.
The default remote parameter of remote_bookmarks() will be derived from this
parameter. It doesn't make sense to exclude @git bookmarks if the backend isn't
Git. It's also nice that parsing tests don't depend on the feature flag.
A typical use case is to query bookmarked revisions ignoring auto-generated
bookmarks. `bookmarks() ~ bookmarks(x)` doesn't work because a revision may have
multiple bookmarks. It's also nice that we can document the default of
`remote_bookmarks()` as `remote_bookmarks(remote=~exact:"git")`.
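
The difference between subtracting revision sets and negating the pattern can be shown with a toy model. Everything here is hypothetical (the bookmark names, the revision labels, and the `bookmarked` helper); it only models "a revision may have multiple bookmarks".

```rust
use std::collections::BTreeSet;

/// Revisions that carry at least one bookmark whose name satisfies `matches`.
/// `bookmarks` is a list of (bookmark name, revision) pairs.
fn bookmarked<'a>(
    bookmarks: &[(&str, &'a str)],
    matches: impl Fn(&str) -> bool,
) -> BTreeSet<&'a str> {
    let mut revs = BTreeSet::new();
    for &(name, rev) in bookmarks {
        if matches(name) {
            revs.insert(rev);
        }
    }
    revs
}

fn main() {
    // Revision r1 has both a regular and an auto-generated bookmark.
    let bookmarks = [
        ("main", "r1"),
        ("auto-1", "r1"),
        ("feature", "r2"),
        ("auto-2", "r3"),
    ];
    let is_auto = |name: &str| name.starts_with("auto-");

    // Analogous to `bookmarks() ~ bookmarks(x)`: subtracts whole revisions.
    let all = bookmarked(&bookmarks, |_| true);
    let auto = bookmarked(&bookmarks, is_auto);
    let by_difference: BTreeSet<_> = all.difference(&auto).copied().collect();

    // Analogous to a negated pattern: filters bookmark names first.
    let by_negation = bookmarked(&bookmarks, |name| !is_auto(name));

    println!("set difference:  {by_difference:?}");
    println!("negated pattern: {by_negation:?}");
}
```

In this model the set difference drops `r1` even though it still has the non-auto bookmark `main`, while the negated pattern keeps it.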
Closes #7665
This is part of a series of changes to make most methods on index traits
(i.e. `ChangeIdIndex`, `MutableIndex`, `ReadonlyIndex`, `Index`)
fallible. This will enable networked implementations of these traits,
which may produce I/O errors during operation. See #7825 for more
information.
- Introduced a few more instances of the existing anti-pattern, `TODO:
indexing error shouldn't be a "BackendError"`. We're tracking this
known issue in #7849.
- Converted `MutableRepo::merge_view` to return a `RepoLoaderError`
instead of a `BackendError`. The only caller, `MutableRepo::merge`,
already returns a `RepoLoaderError`.
- Added three "`fallible_`" iterator helpers to reduce the amount of
noise at `is_ancestor` call sites due to the method now being
  fallible. Using these helpers seems to produce code that's a little
more readable than using `process_results` from itertools. One
consideration in this trade-off is that these helpers do not
themselves return iterators: if we find that we need more support for
fallible combinators mid-"chain" of iterator combinators, we might
either want to use `process_results` only in those cases, or switch to
use of `process_results` across the board (in lieu of these new
helpers).
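
For readers unfamiliar with the pattern, a helper of this kind might look roughly like the sketch below. `fallible_any` is a hypothetical name chosen for illustration and is not necessarily one of the three helpers added here; the `is_ancestor` stub is a toy that only mimics a fallible index lookup.

```rust
/// Hypothetical helper: like `Iterator::any`, but the predicate may fail.
/// Stops at the first `true` or the first error. Note that it returns a
/// plain `Result`, not an iterator, so it only fits at the end of a chain.
fn fallible_any<T, E>(
    iter: impl IntoIterator<Item = T>,
    mut predicate: impl FnMut(T) -> Result<bool, E>,
) -> Result<bool, E> {
    for item in iter {
        if predicate(item)? {
            return Ok(true);
        }
    }
    Ok(false)
}

/// Toy stand-in for a fallible `is_ancestor`: position 0 simulates an I/O
/// error, otherwise smaller-or-equal positions count as ancestors.
fn is_ancestor(ancestor: u32, descendant: u32) -> Result<bool, String> {
    if ancestor == 0 || descendant == 0 {
        return Err("index lookup failed".to_string());
    }
    Ok(ancestor <= descendant)
}

fn main() {
    let heads = vec![5, 3, 9];
    let result = fallible_any(heads, |head| is_ancestor(head, 4));
    println!("{result:?}");
}
```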
Although rare, it would be very confusing to a user if they encountered
a conflict with a `-------` marker while using the "diff" style. This
addresses a TODO in the test.
We were sometimes seeing crashes at Google when a commit was rewritten
very quickly without any changes. This should prevent the crashes and
return an error to the user instead.
I'm planning to try to add conflict labels to `MergedTree` and
`MergedTreeId`, and it will be easier to add them if both are structs
with similar methods. Since we don't support reading/writing legacy
conflicts anymore (as far as I'm aware), I think it should be safe to
delete the `MergedTreeId::Legacy` variant now.
When cloning with the branch option:
- Only the specified branch will be fetched
- The trunk alias is only set if the specified branch happens to be the default branch
- The clone fails if the branch does not exist in the remote
This will help "jj bookmark track" know whether an absent remote ref can be created
for the specified remote. "jj bookmark" subcommands shouldn't depend on
gix::Repository API.
Although we don't have "jj git tag set"/"delete" commands, this fixes the weird undo
behavior reported in #6325. The discussion in #6325 is derailed, but there would be another
issue for the "abandon unreachable" behavior.
Fixes #6325
Since "tracked" state doesn't exist in Git, git::import_refs() should ignore
absent->absent "changes". git::export_refs() just works because it compares only
ref targets.
The migration logic is basically the same as 717d0d3d6d "git: on
deserialize/import/export, copy refs/heads/* to remote named git." Now
git::import_refs() processes bookmarks and tags in the same way.
git::export_refs() is unchanged because we don't have any commands that would
move local tags internally.
Git-tracking tags will be stored there. I don't have a concrete plan for proper
remote tags support, but I think tags fetched/pushed internally can be recorded
as remote tags.
I think this helps readability a bit. I updated `TreeDiffEntry` and a
few related places. I think there are many other places where we could
use the new type.