I originally considered adding a deny-list-based implementation, but the
Windows compatibility rules are super confusing and I don't have a machine to
find out the possible aliases. This patch instead adds directory equivalence
tests.
In order to test file entity equivalence, we first need to create a file or
directory of the requested name. It's harmless to create an empty .jj or .git
directory, but materializing a .git file or symlink can temporarily set up an
RCE situation. That's why a new empty file is created to test the path
validity. We might want to add some optimization for safe names (e.g. ASCII,
not containing "git" or "jj", not containing "~", etc.)
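A minimal sketch of what such a safe-name fast path could look like. The
function name is_obviously_safe_name() is hypothetical, and the criteria are
only the ones listed above; this patch does not implement it.

```rust
/// Hypothetical fast path: returns true if `name` obviously cannot alias
/// ".git" or ".jj", so the file equivalence test could be skipped.
/// Assumed criteria: ASCII only, no "~" (Windows short names), and no
/// "git" or "jj" substring in any letter case.
fn is_obviously_safe_name(name: &str) -> bool {
    if !name.is_ascii() || name.contains('~') {
        return false;
    }
    let lower = name.to_ascii_lowercase();
    !lower.contains("git") && !lower.contains("jj")
}
```

The check is deliberately conservative: a name like "digit.txt" is not
considered safe and would still go through the equivalence test.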
That being said, I'm not quite sure whether .git/.jj in a subdirectory must be
checked. It's not safe to cd into the directory and run "jj", but the same
can be said of other tools such as "cargo". Perhaps our minimum requirement is
to protect our metadata (= the root .jj and .git) directories.
Despite the crate name (and internal use of std::fs::File),
same_file::is_same_file() can test equivalence of directories. This is
documented and tested, so I've removed my custom implementation, which was
slightly simpler but lacked Windows support.
If a new file would overwrite an existing regular file, the file path is
skipped. It makes sense to apply the same rule to existing symlinks. Without
this patch, checkout would fail if an existing path was a dead symlink or a
symlink to a directory.
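A minimal sketch of an existence check that treats symlinks like regular
files. The helper name path_is_occupied() is hypothetical; the point is that
std::fs::symlink_metadata() does not follow symlinks, so a dead symlink or a
symlink to a directory is still reported as an existing entry.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: returns true if anything (file, directory, or
/// symlink, dead or not) already exists at `path`. The symlink itself is
/// inspected, never its target.
fn path_is_occupied(path: &Path) -> io::Result<bool> {
    match fs::symlink_metadata(path) {
        Ok(_) => Ok(true),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(err) => Err(err),
    }
}
```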
I'm not sure if this was attackable before, but it should be better not to
try to remove a file across symlinks.
The disk_path is now returned from create_parent_dirs() to clarify that the
path is identical.
This should be safer than relying on a file open error. It's scary to continue
processing if the file was a symlink.
I'll add a few more sanity checks to remove_old_file(), so it's extracted as a
function.
I'm going to add a "checked" version of to_fs_path(), but not all callers can
be migrated to it. For example, an error message should be produced even if the
path is malformed.
This patch also adds error variants to propagate InvalidRepoPathError. They
don't use ::Other { .. } so the errors can be distinguished in tests.
I'm going to replace the current .evaluate_programmatic() which does minimal
commit-ref resolution. The new .evaluate_programmatic() will be implemented on
a "resolved" expression.
For the same reason as the previous patch. It's nice if root() is considered
a "resolved" expression. With this change, most of the evaluate_programmatic()
callers won't have to do symbol resolution at all.
I'm going to add a RevsetExpression<State = Resolved|User> type parameter to
detect API misuse at compile time. VisibleHeads is similar to All, and appears
in a generic expression substitution function where the concrete State type
shouldn't be known.
This ensures that a symbol-resolved at_operation() expression won't be resolved
again when it's intersected with another expression, for example.
    # in CLI
    let expr1 = parse("at_operation(..)").resolve_user_symbol();
    # in library
    let expr2 = RevsetExpression::ancestors().intersection(&expr1);
    expr2.evaluate_programmatic()
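A minimal sketch of the type-state idea behind the State parameter. Expr is a
simplified stand-in for RevsetExpression, and its fields and method bodies are
made up for illustration; only the User/Resolved split follows the
description above.

```rust
use std::marker::PhantomData;

/// Marker: the expression may still contain unresolved user symbols.
#[derive(Debug)]
struct User;
/// Marker: all symbols in the expression have been resolved.
#[derive(Debug)]
struct Resolved;

/// Simplified stand-in for RevsetExpression<State>.
#[derive(Debug)]
struct Expr<State> {
    text: String,
    _state: PhantomData<State>,
}

impl Expr<User> {
    fn parse(text: &str) -> Self {
        Expr { text: text.to_owned(), _state: PhantomData }
    }

    /// Consumes the user expression and returns a resolved one, so the same
    /// expression cannot be resolved twice.
    fn resolve_user_symbol(self) -> Expr<Resolved> {
        Expr { text: format!("resolved({})", self.text), _state: PhantomData }
    }
}

impl Expr<Resolved> {
    /// Only defined for the Resolved state.
    fn evaluate_programmatic(&self) -> String {
        format!("evaluate {}", self.text)
    }
}
```

With this shape, calling evaluate_programmatic() directly on the result of
Expr::<User>::parse(..) is a compile error, because the method only exists on
Expr<Resolved>.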
This will help construct file content based on diff hunks. For example, "jj
absorb" will first calculate the annotation of the source parent (within
mutable ancestors), calculate the diff, then "squash" hunks into ancestor
commits of the surrounding ranges.
This unblocks the use of TestBackend in long-running processes such as a
fuzzer. It should also be safer because TempDir doesn't guarantee that the path
is never reused.
TestBackendData instances persist in memory right now, but they should be
discarded when the corresponding temp_dir gets dropped. The added struct will
manage the TestBackendData mapping.
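A minimal sketch of a struct that owns the path-to-data mapping, so the data
goes away when the owner is dropped instead of living in a process-wide
table. BackendData and BackendDataMap are hypothetical stand-ins, not the
names or fields used in this patch.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Stand-in for TestBackendData; the real type stores commits, trees, etc.
#[derive(Default)]
struct BackendData {
    commits: Vec<String>,
}

/// Hypothetical owner of the mapping. It is meant to live next to the
/// temp_dir, so dropping the environment drops all backend data with it.
#[derive(Default)]
struct BackendDataMap {
    data: Mutex<HashMap<PathBuf, Arc<Mutex<BackendData>>>>,
}

impl BackendDataMap {
    /// Returns the shared data for `store_path`, creating it on first use.
    fn get_or_init(&self, store_path: &Path) -> Arc<Mutex<BackendData>> {
        let mut map = self.data.lock().unwrap();
        map.entry(store_path.to_owned()).or_default().clone()
    }
}
```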
We had documented that we support `git.auto-local-bookmark` but we
don't. The documentation has been incorrect since d9c68e08b1. This
patch fixes it by adding support for `git.auto-local-bookmark` with
fallback to the old/current `git.auto-local-branch`.
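A minimal sketch of the fallback lookup. The plain HashMap stands in for the
real settings object, and both the function name and the default of false
when neither key is set are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Looks up the new key first and falls back to the old one.
/// `settings` is a stand-in for the real config type.
fn auto_local_bookmark(settings: &HashMap<&str, bool>) -> bool {
    settings
        .get("git.auto-local-bookmark")
        .or_else(|| settings.get("git.auto-local-branch"))
        .copied()
        .unwrap_or(false) // assumed default when neither key is set
}
```

If both keys are set, the new key wins.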
We might want to calculate (commit_id, range) pairs of consecutive lines in
order to "absorb" changes, for example.
This should also be cheaper since Vec<u8> doesn't have to be allocated per line.
Perhaps get_same_line_map() could return an iterator, but implementing an
iterator to be "pull"-ed is much harder than writing a function to "push",
especially when lifetimes are involved.
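A minimal sketch of the "push" style applied to the (commit_id, range) use
case mentioned above. The function name for_each_commit_range() and the use
of &str as the commit id type are assumptions; the real annotation code works
on different types.

```rust
use std::ops::Range;

/// Calls `emit(commit_id, start..end)` once for each maximal run of
/// consecutive lines that originate from the same commit. The caller
/// "receives" pushed items instead of pulling them from an iterator.
fn for_each_commit_range<'a>(
    line_commit_ids: &[&'a str],
    mut emit: impl FnMut(&'a str, Range<usize>),
) {
    let mut start = 0;
    for end in 1..=line_commit_ids.len() {
        // Close the current run at the end of input or when the id changes.
        if end == line_commit_ids.len() || line_commit_ids[end] != line_commit_ids[start] {
            emit(line_commit_ids[start], start..end);
            start = end;
        }
    }
}
```

The callback can push into a Vec, build a map, or stop caring about some
items, without the function having to define an iterator type that borrows
from its input.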
This function was short, and this change makes it clear that !.is_empty() was
redundant. The duplicated doc comment is also removed. I feel the inline
comment is easier to follow here.
It no longer makes sense to initialize Source line_map and build
HashMap<Commit, Source> in one function. Let's extract the line_map
initialization to a function instead.
All intermediate nodes are changed to RevWalk of Result<IndexPosition, _> type
to pass BackendError around from filter predicates. Leaf ancestors/descendants
computation is unchanged, and is mapped to Result at the revset_engine layer.
This is simpler than converting all RevWalk impls to Result<_, _>.
We'll need to propagate errors from the predicate function, so .filter() will
no longer be usable. .map() will be used in order to wrap the infallible
ancestry lookup with Ok(_).
Some RevsetImpl methods are migrated to .map() as an example.
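A minimal sketch of why .filter() stops working once the predicate can fail,
written against plain std iterators rather than RevWalk. The helper name
filter_fallible() and the use of u32 as the position type are assumptions,
and filter_map() is just one way to express it here.

```rust
/// Filters `positions` with a fallible predicate. Items rejected by the
/// predicate are dropped, while errors (from upstream or from the
/// predicate) are passed through as `Err` items.
fn filter_fallible<'a, E: 'a>(
    positions: impl Iterator<Item = Result<u32, E>> + 'a,
    mut predicate: impl FnMut(u32) -> Result<bool, E> + 'a,
) -> impl Iterator<Item = Result<u32, E>> + 'a {
    positions.filter_map(move |item| match item {
        Ok(pos) => match predicate(pos) {
            Ok(true) => Some(Ok(pos)),
            Ok(false) => None,
            Err(err) => Some(Err(err)),
        },
        Err(err) => Some(Err(err)),
    })
}
```

An infallible source is adapted with .map(Ok) before it is passed in, which
corresponds to wrapping the ancestry lookup with Ok(_).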
I don't see any measurable performance difference, but VecDeque is
theoretically simpler than BTreeSet. The input is sorted, so we never do random
insertion.
This also allows some minor optimizations to be performed, such as
avoiding recomputation of the connected target set when
`MoveCommitsTarget::Roots` is used since the connected target set is
identical to the target set (all descendants of the roots).
is_empty() could also return Result<bool, _>, but I think the current definition
is also good. If an error occurred, revset.iter() would return at least one
item, so it's not empty.
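A minimal sketch of that definition over a plain iterator of Result items.
The free function is_empty() here is a stand-in for the revset method, not
its actual signature.

```rust
/// Returns true only if the iterator yields nothing at all. An `Err` item
/// still counts as an item, so a revset that fails is reported as
/// non-empty rather than as an error.
fn is_empty<T, E>(mut iter: impl Iterator<Item = Result<T, E>>) -> bool {
    iter.next().is_none()
}
```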