Commit graph

195 commits

Author SHA1 Message Date
Yuya Nishihara
3890c28493 merge: implement IntoIterator for references of Merge<T>
It seemed odd that owned values can be looped over with "for", but borrowed
values can't.
2025-12-09 23:28:23 +00:00
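
The change in the entry above can be illustrated with a minimal sketch. The `Merge<T>` below is a hypothetical stand-in backed by a `Vec<T>` (an assumption, not jj's actual type or layout); it only shows how implementing `IntoIterator` for `&Merge<T>` lets borrowed values be looped over with `for`.

```rust
/// Hypothetical stand-in for a merge container; not jj's actual `Merge<T>`.
#[derive(Debug, Clone, PartialEq)]
pub struct Merge<T> {
    values: Vec<T>,
}

impl<T> Merge<T> {
    pub fn from_vec(values: Vec<T>) -> Self {
        Merge { values }
    }
}

// Owned iteration: `for x in merge` consumes the container.
impl<T> IntoIterator for Merge<T> {
    type Item = T;
    type IntoIter = std::vec::IntoIter<T>;

    fn into_iter(self) -> Self::IntoIter {
        self.values.into_iter()
    }
}

// Borrowed iteration: `for x in &merge` yields `&T` and leaves `merge` usable.
impl<'a, T> IntoIterator for &'a Merge<T> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;

    fn into_iter(self) -> Self::IntoIter {
        self.values.iter()
    }
}
```

With both impls in place, `for x in &merge { ... }` and `for x in merge { ... }` both compile, and the borrowed form does not move the container.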
Scott Taylor
2e18f59bc9 backend: add conflict labels to Commit 2025-12-09 14:23:43 +00:00
Scott Taylor
d4668bf3fc merged_tree: add conflict labels to MergedTree 2025-12-09 14:23:43 +00:00
xtqqczze
6d8bf975f6 rustc_lint: enable redundant-imports lint
https://doc.rust-lang.org/rustc/lints/listing/allowed-by-default.html#redundant-imports
2025-12-09 13:30:44 +00:00
xtqqczze
c49a60e5eb clippy: enable unnecessary_literal_bound lint
2025-12-05 17:07:44 +00:00
Mitchell Skaggs
2af8bf94a3 testutils: add check for strict UTF-8 filesystems
`test_init_load_non_utf8_path` and
`test_init_additional_workspace_non_utf8_path` now early-return on
strict UTF-8 filesystems because there's no way to report a test as
"skipped" at runtime.

Closes https://github.com/jj-vcs/jj/issues/8118
2025-11-27 00:36:39 +00:00
Lucio Franco
d9f2772988 cli: add jj file track --include-ignored flag
This adds support for tracking ignored and oversized files with `jj file track`.

Previously, `jj file track` would silently fail to track files that were ignored by
`.gitignore` or larger than `snapshot.max-new-file-size`. This commit introduces an
`--include-ignored` flag that allows users to explicitly track these files.

## Implementation

Added a `force_tracking_matcher` field to `SnapshotOptions` that overrides ignore rules
and size limits. When `--include-ignored` is specified, the file pattern matcher is
passed as `force_tracking_matcher`, allowing three checks in `FileSnapshotter` to bypass
their usual restrictions for directory ignores, file ignores, and file size limits.

## Tests

- `test_track_ignored_with_flag`: Verifies `.gitignore`d files can be tracked
- `test_track_large_file_with_flag`: Verifies oversized files can be tracked
- `test_track_ignored_directory`: Verifies ignored directories can be tracked recursively

# Checklist

If applicable:

- [ ] I have updated `CHANGELOG.md`
- [x] I have updated the documentation (`README.md`, `docs/`, `demos/`)
- [ ] I have updated the config schema (`cli/src/config-schema.json`)
- [x] I have added/updated tests to cover my changes
2025-11-14 03:14:37 +00:00
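
The override described in the entry above can be sketched roughly as follows. Only the names `SnapshotOptions`, `force_tracking_matcher`, and the idea of a maximum new-file size come from the commit message; the field types, the `should_track` helper, and the single combined check are assumptions made for illustration (the real `FileSnapshotter` is described as having three separate checks).

```rust
/// Hypothetical, simplified stand-in for the options named in the commit message.
pub struct SnapshotOptions<'a> {
    /// Files larger than this are normally not tracked.
    pub max_new_file_size: u64,
    /// Paths matched here bypass ignore rules and the size limit.
    pub force_tracking_matcher: &'a dyn Fn(&str) -> bool,
}

/// Hypothetical helper: decides whether a new file should be tracked.
pub fn should_track(
    path: &str,
    size: u64,
    is_ignored: bool,
    options: &SnapshotOptions<'_>,
) -> bool {
    // A forced match wins over both the ignore rules and the size limit.
    if (options.force_tracking_matcher)(path) {
        return true;
    }
    !is_ignored && size <= options.max_new_file_size
}
```

Passing a matcher that matches nothing reproduces the previous behavior, which is why a flag such as `--include-ignored` can be layered on without changing the default path.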
Scott Taylor
da1921ead5 testutils: dump_tree: print contents of conflicted trees
Previously, `assert_tree_eq!` would give a confusing panic message if
one of the trees had a conflict.
2025-11-09 17:48:25 +00:00
Scott Taylor
5aa71d59a9 lib: replace MergedTreeId with MergedTree and Merge<TreeId>
After the previous commit, `MergedTree` and `MergedTreeId` are almost
identical, with the only difference being that `MergedTree` is attached
to a `Store` instance. `MergedTreeId` is also equivalent to
`Merge<TreeId>`, since it is just a wrapper around it.

In the future, `MergedTree` might contain additional metadata like
conflict labels. Therefore, I replaced `MergedTreeId` with `MergedTree`
wherever I think it would be required to pass this additional metadata,
or where the additional methods provided by `MergedTree` would be
useful. In any remaining places, I replaced it with `Merge<TreeId>`.

I also renamed some of the `tree_id()` methods to `tree_ids()` for
consistency, since now they return a merge of individual tree IDs
instead of a single "merged tree ID". Similarly, `MergedTree` no longer
has an `id()` method, since tree IDs won't fully identify a `MergedTree`
once it contains additional metadata.
2025-11-08 14:06:58 +00:00
Scott Taylor
79fc20e856 merged_tree: use Merge<TreeId> instead of Merge<Tree>
Currently, creating a `MergedTree` requires reading all of its root
trees from the store. However, this is often not actually required. For
instance, if the only reason to read the trees is to call
`MergedTree::merge`, and the merge is trivial, then there was no need to
read the trees. Changing `MergedTree` to only require a `Merge<TreeId>`
instead of a `Merge<Tree>` will make it possible to avoid reading trees
unnecessarily in these cases.

One benefit of this approach is that `Commit::tree` no longer requires
reading from the store, so it can be made synchronous and infallible,
which simplifies a lot of code.
2025-11-08 14:06:58 +00:00
Martin von Zweigbergk
d8acbec3fc dag_walk: propagate error when cycle is detected
A sibling team of my team sometimes runs into panics caused by cycles
in the commit graph. This patch removes the panics from `dag_walk` by
having all the callers pass in a `cycle_fn`. For now, the callers
panic instead.
2025-10-27 14:16:34 +00:00
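
A rough sketch of the idea in the entry above: a graph walk that reports a cycle through a caller-supplied `cycle_fn` instead of panicking. The function name `topo_order`, the `u32` node type, and the adjacency-map representation are assumptions for illustration and do not reflect the actual `dag_walk` signatures.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical walk: returns nodes with neighbors listed before the nodes
/// that point at them, or the error produced by `cycle_fn` if a cycle is found.
pub fn topo_order<E>(
    start: u32,
    neighbors: &HashMap<u32, Vec<u32>>,
    cycle_fn: impl Fn(u32) -> E,
) -> Result<Vec<u32>, E> {
    fn visit<E>(
        node: u32,
        neighbors: &HashMap<u32, Vec<u32>>,
        cycle_fn: &dyn Fn(u32) -> E,
        in_progress: &mut HashSet<u32>,
        done: &mut HashSet<u32>,
        out: &mut Vec<u32>,
    ) -> Result<(), E> {
        if done.contains(&node) {
            return Ok(());
        }
        // Seeing a node that is still on the current path means a cycle.
        if !in_progress.insert(node) {
            return Err(cycle_fn(node));
        }
        if let Some(nexts) = neighbors.get(&node) {
            for &next in nexts {
                visit(next, neighbors, cycle_fn, in_progress, done, out)?;
            }
        }
        in_progress.remove(&node);
        done.insert(node);
        out.push(node);
        Ok(())
    }

    let mut out = Vec::new();
    visit(
        start,
        neighbors,
        &cycle_fn,
        &mut HashSet::new(),
        &mut HashSet::new(),
        &mut out,
    )?;
    Ok(out)
}
```

A caller that still wants the old behavior can pass a `cycle_fn` that panics, which matches the commit's note that callers panic for now.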
Scott Taylor
141832414a backend: remove MergedTreeId::Legacy variant
I'm planning to try to add conflict labels to `MergedTree` and
`MergedTreeId`, and it will be easier to add them if both are structs
with similar methods. Since we don't support reading/writing legacy
conflicts anymore (as far as I'm aware), I think it should be safe to
delete the `MergedTreeId::Legacy` variant now.
2025-10-19 13:14:27 +00:00
Martin von Zweigbergk
beda1381bd working_copy: make potentially slow methods async
There's no pressing need, but we should do this eventually.
2025-10-15 03:27:06 +00:00
Yuya Nishihara
46d5555be4 cleanup: leverage trait upcasting, delete as_any*()
This patch also adds `.downcast*()` wrappers to prevent misuse of `as &Any` casting.
The compiler wouldn't help detect &Arc<T> as &Any, for example.

https://blog.rust-lang.org/2025/04/03/Rust-1.86.0/#trait-upcasting
2025-09-20 01:22:47 +00:00
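
A minimal sketch of the pattern in the entry above, assuming Rust 1.86 or later for trait upcasting. The `Backend` trait and the two unit structs are hypothetical placeholders; the wrapper shows one way a `.downcast_ref()` helper on the trait object can replace a hand-written `as_any()` method.

```rust
use std::any::Any;

/// Hypothetical trait; `Any` as a supertrait enables upcasting to `&dyn Any`.
pub trait Backend: Any {
    fn name(&self) -> &str;
}

impl dyn Backend {
    /// Upcasts `&dyn Backend` to `&dyn Any`, then downcasts to the concrete type.
    pub fn downcast_ref<T: Backend>(&self) -> Option<&T> {
        (self as &dyn Any).downcast_ref::<T>()
    }
}

pub struct GitBackend;

impl Backend for GitBackend {
    fn name(&self) -> &str {
        "git"
    }
}

pub struct TestBackend;

impl Backend for TestBackend {
    fn name(&self) -> &str {
        "test"
    }
}
```

The wrapper matters because a plain cast is easy to get wrong: coercing a `&Arc<dyn Backend>` to `&dyn Any` compiles, but the resulting `Any` describes the `Arc`, so downcasting it to `GitBackend` yields `None`.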
Yuya Nishihara
ccd1373f1b working_copy: move SnapshotOptions::empty_for_test() to testutils 2025-09-14 03:55:09 +00:00
Martin von Zweigbergk
da6c4b61b3 backend: remove unused TreeValue::Conflict and read/write methods
We no longer use the `TreeValue::Conflict` constructor or the
`Backend::read_conflict()` and `Backend::write_conflict()` methods.
2025-09-04 16:26:44 +00:00
Yuya Nishihara
7e80eb4dfa cargo: bump toml_edit to 0.23.3
Also updated deprecated ImDocument references to Document.
2025-08-23 03:46:11 +00:00
Martin von Zweigbergk
28562f1b10 tests: remove CommitGraphBuilder
The `CommitGraphBuilder` type doesn't seem to carry its weight
anymore.
2025-07-31 04:56:34 +00:00
Martin von Zweigbergk
ca6edfaab0 tests: add a helper for writing random commit with given parents 2025-07-31 04:56:34 +00:00
Austin Seipp
ba24140f1d cli, lib: move to Rust 2024 language edition
This applies a `cargo fmt` and fixes clippy lints to keep the build
properly working.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
2025-07-28 17:05:41 +00:00
Austin Seipp
99ab453790 testutils: add + use<> bounds for impl Strategy
Signed-off-by: Austin Seipp <aseipp@pobox.com>
2025-07-28 17:05:41 +00:00
Austin Seipp
bee574354b testutils: set_env is unsafe in Rust 2024
`set_env` is, for various reasons, fundamentally unsafe on approximately
all modern Unices, and seems like it will never ever be fixed. The
long and short of this is that it will result in segfaults or UB. Rust
2024 therefore marks this function (correctly) as `unsafe`.

The correct solution for 98% of use cases is just to use `envp` during
calls to `execv`, but for our simple cases here of making Git hermetic
there shouldn't be an issue, and a larger refactoring would be needed
for an alternative anyway.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
2025-07-28 17:05:41 +00:00
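
The alternative mentioned in the entry above (passing the environment to the child rather than mutating the current process) can be sketched with the standard library. In Rust 2024, `std::env::set_var` must be called inside an `unsafe` block; `Command::env` avoids that by setting variables only for the spawned process. The helper name `hermetic_git_command` and the directory layout are assumptions for illustration.

```rust
use std::process::Command;

/// Hypothetical helper: builds a `git` command whose configuration paths are
/// set on the child process only, leaving the current process untouched.
pub fn hermetic_git_command(config_dir: &str) -> Command {
    let mut cmd = Command::new("git");
    cmd.env("GIT_CONFIG_SYSTEM", format!("{config_dir}/system"))
        .env("GIT_CONFIG_GLOBAL", format!("{config_dir}/global"));
    cmd
}
```

This only helps when Git runs as a subprocess; code that reads the environment in-process still needs the variables set globally, which is the case the commit addresses.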
adamnemecek
8a26df2897 cli lib: make use of Self consistent
Mostly done via `cargo clippy --fix -- -A clippy::all -W clippy::use_self`. Added a rule to clippy rules.
2025-07-27 00:12:02 +00:00
Martin von Zweigbergk
721daef0b4 store: inline tree_builder() function to callers
`Store::tree_builder()` returns a `TreeBuilder`. Almost all callers
should be using the `MergedTreeBuilder` these days. This patch
therefore removes `tree_builder()` to reduce the risk of accidentally
using it.
2025-07-18 21:36:13 +00:00
Kaiyi Li
f1f1556731 local working copy: add support for EOL conversion 2025-07-17 15:36:28 +00:00
Martin von Zweigbergk
5d35eadd4e tests: make TestBackend async for more realistic testing
The `TestBackend` methods currently return their data immediately (on
the first poll), which means that if multiple futures are created and
then they're polled "concurrently", they will always return their data
in the order they're being polled. That leads to poor testing of
algorithms that poll futures concurrently, such as `TreeDiffStream`.

This patch makes `TestBackend` spawn async work to run in a tokio
runtime instead. That's enough to show a bug I introduced with my
recent refactoring of `TreeDiffStream`, except that it's also covered
up by the caching we do in `Store`. I'll fix the bug and update tests
to work around the caching next.

This slows down the jj-lib tests from 2.8 s to 3.1 s. I don't think
that matters much, given that the jj-cli tests take > 30 s.

I tried to add a small `tokio::time::sleep()` (random up to 5 ms) but
that slowed down the property-based tests of the diff editor very
significantly (took over a minute). Maybe we could have two different
kinds of test backend or maybe make the sleep configurable in some
way. We can improve that later. The async-ness added in this patch is
sufficient for catching the diff-stream bug.
2025-07-14 16:09:41 +00:00
Yuya Nishihara
68ead52c5c hex_util: roll our own decode/encode_hex() functions
Since we have "reverse hex" functions, it's easy to implement the same set of
functions for "forward hex". I believe our implementation is slower than
highly-optimized versions provided by e.g. faster-hex, but we don't use hex
encoding/decoding where the performance matters.
2025-07-02 01:56:40 +00:00
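
For illustration, a straightforward "forward hex" pair might look like the sketch below. The function names follow the commit subject, but the signatures, the lowercase output, and the `Option` return type are assumptions; this is not the code from `hex_util`.

```rust
/// Encodes bytes as lowercase hexadecimal.
pub fn encode_hex(bytes: &[u8]) -> String {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let mut out = String::with_capacity(bytes.len() * 2);
    for &b in bytes {
        out.push(DIGITS[(b >> 4) as usize] as char);
        out.push(DIGITS[(b & 0x0f) as usize] as char);
    }
    out
}

/// Decodes hexadecimal (either case); returns `None` on odd length or bad digits.
pub fn decode_hex(hex: &str) -> Option<Vec<u8>> {
    fn nibble(c: u8) -> Option<u8> {
        match c {
            b'0'..=b'9' => Some(c - b'0'),
            b'a'..=b'f' => Some(c - b'a' + 10),
            b'A'..=b'F' => Some(c - b'A' + 10),
            _ => None,
        }
    }
    let bytes = hex.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks_exact(2)
        .map(|pair| Some((nibble(pair[0])? << 4) | nibble(pair[1])?))
        .collect()
}
```

A byte-at-a-time loop like this is simple rather than fast, which fits the commit's remark that hex encoding is not used where performance matters.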
Ilya Grigoriev
6323764f69 lib: add rustversion, turn off a clippy lint in nightly in tests
This turns off the `clippy::cloned_ref_to_slice_refs` lint in some tests
and fixes it in others, for Rust 1.89+. This seems to make `cargo clippy
--workspace --all-targets --all-features` work in stable, beta, and
nightly (1.89).

This depends on the `rustversion` crate. Other than that, it's based on
Austin's https://github.com/jj-vcs/jj/pull/6705.

Co-authored-by:  Austin Seipp <aseipp@pobox.com>
2025-06-24 01:01:25 +00:00
Jonas Greitemann
94ba95bb4c merge-tools builtin: add property-based testing
This adds the proptest crate for property-based testing as well as the
proptest-state-machine crate as direct dev dependencies of jj-cli and as
dependencies of the internal testutils crate. 

Within testutils, a `proptest` module provides a reference state
machine which models the working copy as a map from path to `DirEntry`.
Directories are not represented explicitly, but are implicit in the
ancestors of entries.

The possible transitions of this state machine are for now limited to
the creation of new files (including replacements of existing files
or directories) and a `Commit` operation which the SUT can use to
snapshot a reference state. Additional transitions (moving files,
modifying file contents incrementally, ...) and states (symlinks,
submodules, conflicts, ...) may be added in the future.

This reference state machine is then applied to the builtin merge-tool's
test suite:
- The initial state is always an empty root directory.
- The `Commit` operation creates `MergedTree` from the current state.
- Each step of the way, the same test logic as in the manual
  `test_edit_diff_builtin*` tests is run to check that splitting off
  none or all of the changes results in the left or right tree,
  respectively. The "right" tree corresponds to the current state,
  whereas the "left" tree refers to the last "committed" tree.

Co-authored-by: Waleed Khan <me@waleedkhan.name>
2025-06-18 20:45:56 +00:00
Jonas Greitemann
da2835b86c merge-tools builtin: introduce assert_tree_eq!
This macro in the style of `assert_eq!` compares two trees based
on their `MergedTreeId`s. In case they do not compare equal, the
corresponding trees are dumped in the panic message.

Like `assert_eq!`, the macro accepts a custom format string which will
be included in the panic message.
2025-06-18 20:45:56 +00:00
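
The general shape of such a macro can be sketched as follows. The name `assert_dump_eq!` is hypothetical and the macro compares any `PartialEq + Debug` values rather than trees; it only demonstrates how an `assert_eq!`-style macro can accept an optional custom format string and dump both sides in the panic message.

```rust
/// Hypothetical macro: like `assert_eq!`, but pretty-prints both sides on failure
/// and accepts an optional custom message with format arguments.
macro_rules! assert_dump_eq {
    ($left:expr, $right:expr $(,)?) => {
        assert_dump_eq!($left, $right, "values differ")
    };
    ($left:expr, $right:expr, $($arg:tt)+) => {{
        let (left, right) = (&$left, &$right);
        if left != right {
            panic!(
                "{}\n left: {:#?}\nright: {:#?}",
                format_args!($($arg)+),
                left,
                right
            );
        }
    }};
}
```

A call such as `assert_dump_eq!(actual, expected, "after step {}", step)` puts the custom message first, followed by the dumps of both values.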
Martin von Zweigbergk
c43ca3c07b address new mismatched_lifetime_syntaxes Clippy lint 2025-06-17 07:40:05 +00:00
Benjamin Brittain
97c7edcafc store: Add Send bounds to the AsyncRead future returned by read_file
This change enables the jj-lib api to be used in multi-threaded executor
contexts.
2025-06-09 21:08:10 +00:00
Ilya Grigoriev
ff63a7e9b3 git_backend: update gix, adapt to breaking changes
Update `gix` to 0.72.1, and adapt to its breaking changes.

1. The signature of `gix::reference::iter::Platform::prefixed` changed
   in a way that seems to confuse the Rust compiler (and does confuse me).

2. `git_object::Tree::EntryMode` API changed; `entry.mode()` now has a
   `value()` method.

3. Most significantly, the meaning and API of `gix::actor::SignatureRef`
   changed.

## Details about `gix::actor::SignatureRef`

The API for `gix::actor::Signature` and `gix::actor::SignatureRef`
changed. The latter now contains an unparsed string time field, while
the former still contains a parsed time.  So, the conversions between
`gix::actor::SignatureRef` and either `gix::actor::Signature` or jj's
`Signature` types can now fail.

We use the epoch for the time if the timestamp is unreadable, like gix
did before.

Cc: https://github.com/GitoxideLabs/gitoxide/pull/1935,
https://github.com/GitoxideLabs/gitoxide/pull/2038
2025-06-06 18:33:56 +00:00
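
The epoch fallback described in the entry above can be sketched without `gix`. The functions below are hypothetical and assume Git's raw `<seconds> <+/-hhmm>` time format; they return the seconds since the epoch and the timezone offset in minutes, and fall back to `(0, 0)` when the string cannot be parsed.

```rust
/// Hypothetical helper: parses `"<seconds> <+/-hhmm>"` into
/// (seconds since epoch, timezone offset in minutes).
fn parse_git_time(raw: &str) -> Option<(i64, i32)> {
    let (secs, offset) = raw.trim().split_once(' ')?;
    let seconds = secs.parse::<i64>().ok()?;
    let (sign, hhmm) = match offset.as_bytes().first()? {
        b'+' => (1, &offset[1..]),
        b'-' => (-1, &offset[1..]),
        _ => return None,
    };
    if hhmm.len() != 4 || !hhmm.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let hours: i32 = hhmm[..2].parse().ok()?;
    let minutes: i32 = hhmm[2..].parse().ok()?;
    Some((seconds, sign * (hours * 60 + minutes)))
}

/// Falls back to the epoch (with a zero offset) if the time is unreadable.
pub fn parse_git_time_or_epoch(raw: &str) -> (i64, i32) {
    parse_git_time(raw).unwrap_or((0, 0))
}
```

Keeping the fallible parser separate from the fallback makes it easy for a caller to choose a different policy, such as reporting the error.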
Martin von Zweigbergk
a363e6b1e6 backend: add CopyId to TreeValue::File
This patch adds a `TreeValue::File::copy_id` field. The copy ids are
always empty for now. I preserved the copy id where it was easy to do
so, plus in a few non-trivial cases. In other places, however, I made
the code use a new copy id.  I added a `CopyId::placeholder()`
function for creating a new copy id where we need one. We should
eventually fix all callers to either preserve an existing copy or to
generate a new one.
2025-06-03 01:11:32 +00:00
Martin von Zweigbergk
c90e1396b2 backend: add methods for reading and writing copy objects
This patch implements the methods only in the test backend. That
should be enough for us to start implementing diffing and merging on top,
including tests for that functionality. Support in the Git backend can
come once we've seen that the model works.
2025-06-03 01:11:32 +00:00
Jonas Greitemann
9c165d6db3 tests: supplement create_tree() with a builder-style API
The original form of `create_tree()` is limited to creating (valid
UTF-8) text files but cannot create binary files, executable
files, or symlinks. Dedicated helpers like `write_executable_file()` or
`write_symlink()` partially compensated for this, but required manually
assembling the tree in the test code.

This commit introduces `TestTreeBuilder` which provides an API to
successively add entries to a tree which can represent all of the above.
`TestTreeBuilder` can then create either a single `Tree`, or a resolved
`MergedTree`.

In addition to using `TestTreeBuilder` directly, `create_tree_with()`
and `create_single_tree_with()` accept a closure which receives a
`TestTreeBuilder`. This allows test code to quickly describe the tree
without requiring a named builder at caller scope. Riffing off
the familiar function names should help in discovering the new builder
facilities. However, it is completely possible to use `TestTreeBuilder`
directly, if preferred.
2025-06-02 17:40:26 +00:00
Martin von Zweigbergk
1d1e0c9e2b backend: make write_file() take an AsyncRead
This is mostly for consistency with `read_file()` at this point. I'm
not sure if we need this for Google in the near future.

For now, I wrapped the file-reading in `local_working_copy.rs` in a
new `BlockingAsyncReader`. We should switch to using async I/O in the
future.
2025-05-22 15:33:33 +00:00
Martin von Zweigbergk
b970939804 backend: make read_file() return an AsyncRead
The `Backend::read_file()` method is async but it returns a `Box<dyn
Read>` and reading from that trait is blocking. That's fine with the
local Git backend but it can be slow for remote backends. For example,
our backend at Google reads file chunks 1 MiB at a time from the
server. What that means is that reading lots of small files
concurrently works fine since the whole file contents are returned by
the first `Read::read()` call (it was fetched when
`Backend::read_file()` was issued). However, when reading files that
are larger than one chunk, we end up blocking on the next
`Read::read()` call. I haven't verified that this actually is a
problem at Google, but fixing this blocking is something we should do
eventually anyway.

This patch makes `Backend::read_file()` return a `Pin<Box<dyn
AsyncRead>>` instead, so implementations can be async in the read part
too.

Since `AsyncRead` is not yet standardized, we have to choose between
the one from `futures` and the one from `tokio`. I went with the one
from `tokio`. I picked that because an earlier version of this patch
used `tokio::fs` for some reads. Then I realized that doing that means
that we have to use a tokio runtime, meaning that we can't safely keep
our existing `pollster::FutureExt::block_on()` calls. If we start
depending on tokio's specific runtime, I think we would first want to
remove all the `block_on()` calls. I'll leave that for later. I think
at this point, we could equally well use `futures::io::AsyncRead`, but
I also don't know if there's a reason to prefer that.
2025-05-20 13:23:36 +00:00
Martin von Zweigbergk
23de072c14 store: drop "_async" suffix from read_file()/read_symlink()
There's no sync version anymore.
2025-05-18 02:45:43 +00:00
Martin von Zweigbergk
12bcd04459 store: delete read_file(), update callers to use async version 2025-05-18 02:45:43 +00:00
Emily
7542fe94bc testutils: remove obsolete mention of libgit2 in comment 2025-05-07 19:29:20 +00:00
Emily
73739791be testutils: remove obsolete git2 hermeticity code 2025-05-07 19:29:20 +00:00
Yuya Nishihara
105c892ce4 tests: do not shell out taplo to gather forgotten test files
It would be annoying if forgotten tests wouldn't be reported locally.
2025-04-27 01:33:23 +00:00
Jonas Greitemann
7bb8e17e88 tests: factor out utility function is_external_tool_installed
A pattern has emerged where integration tests check for the
availability of an external tool (`git`, `taplo`, `gpg`, ...) and skip
the test (by simply passing it) when it is not available. To check this,
the program is run with the `--version` flag.

Some tests require that the program be available at least when running
in CI, by calling `ensure_running_outside_ci` conditionally on the
outcome. The decision is up to each test, though, the utility merely
returns a `bool`.
2025-04-24 15:48:08 +00:00
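
A minimal sketch of the check described in the entry above, using only the standard library. The function name comes from the commit subject; the exact signature and the decision to silence the child's output are assumptions.

```rust
use std::process::{Command, Stdio};

/// Returns `true` if `program --version` can be run and exits successfully.
/// A missing binary (spawn error) or a non-zero exit status both yield `false`.
pub fn is_external_tool_installed(program: &str) -> bool {
    Command::new(program)
        .arg("--version")
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```

Because the helper only returns a `bool`, each test remains free to decide whether a missing tool means skipping or failing, for example by failing only when running in CI.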
Jonas Greitemann
8882f0016d tests: allow multiple integration tests in check for forgotten test files
The previous implementation of `assert_no_forgotten_test_files`
hard-coded the name of the `runner` integration test and required all
other source files to appear in matching `mod` declarations. Thus, this
approach cannot handle multiple integration tests.

However, additional integration tests may be desirable
- to support tests using a custom test harness (see upcoming commits)
- to balance the trade-off between test run time and compile time as
  the test suite grows in the future.

The new implementation first uses `taplo` to parse the `[[test]]`
sections of the manifest to identify integration test main modules,
and then searches in those for `mod` declarations. This is then compared
to the list of source files in the tests directory. Like the previous
implementation, the new one does not attempt to recurse into submodules
or to handle directory-style modules; just like before it only treats
source files without a module declaration as an error and relies on the
compiler to complain about the other way around.

When `taplo` is not installed, the check is skipped unless it is running
in CI where we require `taplo` to be available.
2025-04-24 15:48:08 +00:00
Martin von Zweigbergk
7cab444313 repo_path: remove assertion from constructors
We ran into a crash on our server at Google today because we
accidentally called `RepoPathBuf::from_internal_string()` with a
string starting with a '/', which resulted in the assertion in that
function failing. This patch changes that constructor and its siblings
to return a `Result` instead.
2025-04-15 14:42:23 +00:00
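
The change in the entry above, replacing an assertion with a `Result`, can be sketched as follows. `RepoPathBuf` and `from_internal_string` are named in the commit message, but the struct layout, the `InvalidRepoPath` error type, and the exact validation rule (no empty components, with the empty string as the root) are assumptions for illustration.

```rust
/// Hypothetical error type carrying the rejected input.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidRepoPath(pub String);

/// Hypothetical stand-in for a repository-relative path.
#[derive(Debug, PartialEq, Eq)]
pub struct RepoPathBuf(String);

impl RepoPathBuf {
    /// Returns an error instead of panicking when the input is malformed,
    /// for example when it starts or ends with '/'.
    pub fn from_internal_string(value: impl Into<String>) -> Result<Self, InvalidRepoPath> {
        let value = value.into();
        let valid = value.is_empty() || value.split('/').all(|c| !c.is_empty());
        if valid {
            Ok(RepoPathBuf(value))
        } else {
            Err(InvalidRepoPath(value))
        }
    }
}
```

With this shape, a server can turn a bad path into an error response, while tests can keep unwrapping through small helpers.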
Martin von Zweigbergk
f0545ee25c test: introduce test helpers for creating repo path types
I'm about to make the constructors return a `Result`. The helpers will
hide the unwrapping.
2025-04-15 14:42:23 +00:00
Emily
350da7d013 cargo: bump gix to 0.71.0
Fix GHSA-794x-2rpg-rfgr.

`gix::Repository::work_dir` was renamed to `workdir` (though strangely
not the `gix::ThreadSafeRepository` version), and `lossy_config`
is now off by default in all configurations.
2025-04-04 04:28:42 +00:00
Emily
71a619c19f tests: don’t use git2 in testutils 2025-04-03 19:03:44 +00:00
Emily
f14b3bf9a8 git: respect GIT_* environment variables in git2 tests
Previously, this was calling `git_repository_open()`,
which is equivalent to `git_repository_open_ext()` with
`flags = GIT_REPOSITORY_OPEN_NO_SEARCH` and `ceiling_dirs =
NULL`. This changes `ceiling_dirs` to an empty string, and adds
`GIT_REPOSITORY_OPEN_FROM_ENV` to `flags` when we’re in test code.

`GIT_REPOSITORY_OPEN_FROM_ENV` is used to respect the Git configuration
path environment variables, which is what we want for the test
hermeticity code. It works like this:

* `config_path_system` will use `$GIT_CONFIG_SYSTEM` because `use_env`
  will be set.
  
* `config_path_global` will use `$GIT_CONFIG_GLOBAL` because `use_env`
  will be set.
  
* `git_config__find_xdg` and `git_config__find_programdata` will find
  impure system paths and load them even when `$GIT_CONFIG_GLOBAL` is
  set, contrary to Git behaviour, so we need to set `$XDG_CONFIG_HOME`
  and `$PROGRAMDATA`.

It has a few other effects, which I will exhaustively enumerate to
show that they are benign:

* It respects `$GIT_WORK_TREE` and `$GIT_COMMON_DIR`. These would
  already break our tests, I think, so we’re assuming they’re
  not set. (Possibly we should set them explicitly.)

* When opening a repository, it will:

  * Set the starting path for the search to `$GIT_DIR` if it’s
    `NULL`, but we do set it, so no change.

  * Initialize `ceiling_dirs` to `$GIT_CEILING_DIRECTORIES` if it’s
    `NULL`, but we do set it, so no change.

  * Respect `$GIT_DISCOVERY_ACROSS_FILESYSTEM` and set the
    `GIT_REPOSITORY_OPEN_CROSS_FS` flag appropriately. However,
    this is only checked on subsequent iterations of the loop in
    `find_repo_traverse`, and we set `GIT_REPOSITORY_OPEN_NO_SEARCH`
    which causes it to never enter a second iteration.

  * Use `ceiling_dirs` in `find_ceiling_dir_offset`, but the result is
    ignored when `GIT_REPOSITORY_OPEN_NO_SEARCH` is set, so changing
    from `NULL` to the empty string doesn’t affect behaviour. (It
    also would always return the same result for either value, anyway.)
2025-04-03 19:03:44 +00:00