Commit graph

32 commits

Author SHA1 Message Date
Yuya Nishihara
46d5555be4 cleanup: leverage trait upcasting, delete as_any*()
This patch also adds .downcast*() wrappers to prevent misuse of as &Any casting.
The compiler wouldn't help detect &Arc<T> as &Any, for example.

https://blog.rust-lang.org/2025/04/03/Rust-1.86.0/#trait-upcasting
2025-09-20 01:22:47 +00:00
Martin von Zweigbergk
da6c4b61b3 backend: remove unused TreeValue::Conflict and read/write methods
Some checks are pending
binaries / Build binary artifacts (push) Waiting to run
website / prerelease-docs-build-deploy (ubuntu-24.04) (push) Waiting to run
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run
We no longer use the `TreeValue::Conflict` constructor or the
`Backend::read_conflict()` and `Backend::write_conflict()` methods.
2025-09-04 16:26:44 +00:00
adamnemecek
8a26df2897 cli lib: make use of Self consistent
Mostly done via `cargo clippy --fix -- -A clippy::all -W clippy::use_self`. Added a rule to clippy rules.
2025-07-27 00:12:02 +00:00
Martin von Zweigbergk
c43ca3c07b address new mismatched_lifetime_syntaxes Clippy lint 2025-06-17 07:40:05 +00:00
Benjamin Brittain
97c7edcafc store: Add Send bounds to the AsyncRead future returned by read_file
This change enables the jj-lib api to be used in multi-threaded executor
contexts.
2025-06-09 21:08:10 +00:00
Martin von Zweigbergk
c90e1396b2 backend: add methods for reading and writing copy objects
This patch implements the methods only in the test backend. That
should enough for us to start implementing diffing and merging on top,
including tests for that functionality. Support in the Git backend can
come once we've seen that the model works.
2025-06-03 01:11:32 +00:00
Martin von Zweigbergk
1d1e0c9e2b backend: make write_file() take an AsyncRead
This is mostly for consistency with `read_file()` at this point. I'm
not sure if we need this for Google in the near future.

For now, I wrapped the file-reading in `local_working_copy.rs` in a
new `BlockingAsyncReader`. We should switch to using async I/O in the
future.
2025-05-22 15:33:33 +00:00
Martin von Zweigbergk
b970939804 backend: make read_file() return a AsyncRead
The `Backend::read_file()` method is async but it returns a `Box<dyn
Read>` and reading from that trait is blocking. That's fine with the
local Git backend but it can be slow for remote backends. For example,
our backend at Google reads file chunks 1 MiB at a time from the
server. What that means is that reading lots of small files
concurrently works fine since the whole file contents are returned by
the first `Read::read()` call (it was fetched when
`Backend::read_file()` was issued). However, when reading files that
are larger than one chunk, we end up blocking on the next
`Read::read()` call. I haven't verified that this actually is a
problem at Google, but fixing this blocking is something we should do
eventually anyway.

This patch makes `Backend::read_file()` return a `Pin<Box<dyn
AsyncRead>>` instead, so implementations can be async in the read part
too.

Since `AsyncRead` is not yet standardized, we have to choose between
the one from `futures` and the one from `tokio`. I went with the one
from `tokio`. I picked that because an earlier version of this patch
used `tokio::fs` for some reads. Then I realized that doing that means
that we have to use a tokio runtime, meaning that we can't safely keep
our existing `pollster::FutureExt::block_on()` calls. If we start
depending on tokio's specific runtime, I think we would first want to
remove all the `block_on()` calls. I'll leave that for later. I think
at this point, we could equally well use `futures::io::AsyncRead`, but
I also don't know if there's a reason to prefer that.
2025-05-20 13:23:36 +00:00
Martin von Zweigbergk
275504235b cli: replace ExitCode by u8
I'd like to make the internal `jj` binary at Google log the exit code,
but `ExitCode` doesn't seem to provide any way of getting the numeric
code out of it. This patch therefore replaces `ExitCode` by `u8` and
lets the `main()` function convert it to `ExitCode`. It's a bit
unfortunate but I don't see a better solution. It doesn't seem worth
it to create our own newtype for it.
2025-05-17 05:45:59 +00:00
Yuya Nishihara
127f7e23ac examples: use settings_for_new_workspace() when initializing workspace
Spotted while replacing callers of CommandHelper::settings().
2025-01-06 21:15:28 +09:00
Martin von Zweigbergk
8eb3d85b1c backend: make write methods async
This doesn't provide any benefit yet bit I think we've known for a
while that we want to make the backend write methods async. It's just
not been important to Google because we have the local daemon process
that makes our writes pretty fast. Regardless, this first commit just
changes the API and all callers immediately block for now, so it won't
help even on slow backends.
2024-09-04 18:34:11 -07:00
Matt Kulukundis
8ead72e99f formatting only: switch to Item level import ganularity 2024-08-22 14:52:54 -04:00
Matt Kulukundis
ee6b922144 copy-tracking: create CopyRecordMap and add it to diff summaries 2024-08-11 17:01:45 -04:00
Matt Kulukundis
e667a2b403 copy-tracking: adjust backend signature
- use a single commit instead of an array of them.  This simplifies the
  implementation.  A higher level api can wrap this when an array of
  commits is desired and those semantics are figured out.
- since this API is directly 1-1 on parents, there are no conflicts
- if we introduce a higher level API that handles lists of commits, we
  may need to restore the conflict/resolved distinction, but for now
  simplify
2024-08-11 17:01:45 -04:00
Matt Kulukundis
dab8a29683 copy-tracking: stub get_copy_records
- add the method and types for all backends
2024-07-03 20:26:30 -04:00
dploch
57a5d7dd64 cli_util: support multiple extensions consistently
If we ever implement some sort of ABI for dynamic extension loading, we'll need these underlying APIs to support multiple extensions, so we might as well do that first.
2024-04-12 14:07:33 -04:00
Yuya Nishihara
97024e5be4 cli: extract CommandError and helper functions to new module
The cli_util module is big enough to slow down Emacs, so let's split it up.
This change is an easy one.
2024-03-03 01:11:46 +09:00
Yuya Nishihara
351487b9f5 backend: pass Index and keep_newer timestamp parameters to gc()
GitBackend::gc() will need to check if a commit is reachable from any
historical operations. This could be calculated from the view and commit
objects, but the Index will do a better job.
2024-01-27 10:18:11 +09:00
Yuya Nishihara
4e54021930 backend: have gc() return BackendError instead of opaque error type
The gc() implementation is likely to call other backend functions, which
return BackendError.
2024-01-27 10:18:11 +09:00
Martin von Zweigbergk
1cc271441f gc: implement basic GC for Git backend
This adds an initial `jj util gc` command, which simply calls `git gc`
when using the Git backend. That should already be useful in
non-colocated repos because it's not obvious how to GC (repack) such
repos. In my own jj repo, it shrunk `.jj/repo/store/` from 2.4 GiB to
780 MiB, and `jj log --ignore-working-copy` was sped up from 157 ms to
86 ms.

I haven't added any tests because the functionality depends on having
`git` binary on the PATH, which we don't yet depend on anywhere
else. I think we'll still be able to test much of the future parts of
garbage collection without a `git` binary because the interesting
parts are about manipulating the Git repo before calling `git gc` on
it.
2023-12-03 07:40:12 -08:00
Martin von Zweigbergk
a0cbe7ced0 cli: rename *Commands enums to *Command
Each instance of the enum represents a single command, so singular
`*Command` seems better. That also seems to match the examples in
clap's documentation.
2023-12-01 16:53:54 -08:00
Yuya Nishihara
d747879aee signing: pass SigningFn by reference
write_commit() doesn't need ownership of the signing function.
2023-12-01 22:55:04 +09:00
Anton Bulakh
eb1c0ab4a2 sign: Implement a test signing backend and add a few basic tests 2023-11-30 23:36:56 +02:00
Anton Bulakh
5c3c0e9f6e sign: Implement generic commit signing on the backend 2023-11-23 22:52:20 +02:00
Yuya Nishihara
ea32c0cb9e git_backend: pass UserSettings to GitBackend constructors 2023-11-11 22:35:54 +09:00
Yuya Nishihara
8a2048a0e5 repo: pass UserSettings to store factories and initializers
GitBackend will use it to configure gix::Repository. I think UserSettings
is generally useful to pass store-specific parameters, so I've updated all
factory functions.
2023-11-11 22:35:54 +09:00
Martin von Zweigbergk
d989d4093d merged_tree: let backend influence whether to use new diff algo
Since the concurrent diff algorithm is significantly slower when using
the Git backend, I think we'll have to use switch between the two
algorithms depending on backend. Even if the concurrent version always
performed as well as the sequential version, exactly how concurrent it
should be probably still depends on the backend. This commit therefore
adds a function to the `Backend` trait, so each backend can say how
much concurrency they deal well with. I then use that number for
choosing between the sequential and concurrent versions in
`MergedTree::diff_stream()`, and also to decide the number of
concurrent reads to do in the concurrent version.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
cfcdd71865 backend: make read_conflict synchronous again
This avoids https://github.com/rust-lang/futures-rs/issues/2090. I
don't think we need to worry about reading legacy conflicts
asynchronously - async is really only useful for Google's backend
right now, and we don't use the legacy format at Google. In
particular, I don't want `MergedTree::value()` to have to be async.
2023-10-28 16:45:40 -07:00
Martin von Zweigbergk
7c8a0a18f9 repo: define types for backend initializer functions
`ReadonlyRepo::init()` takes callbacks for initializing each kind of
backend. We called these things like `op_store_initializer`. I found
that confusing because it is not a `OpStoreFactory` (which is for
loading an existing backend). This patch tries to clarify that by
renaming the arguments and adding types for each kind of callback
function.
2023-10-16 22:33:44 -07:00
Martin von Zweigbergk
5174489959 backend: make read functions async
The commit backend at Google is cloud-based (and so are the other
backends); it reads and writes commits from/to a server, which stores
them in a database. That makes latency much higher than for disk-based
backends. To reduce the latency, we have a local daemon process that
caches and prefetches objects. There are still many cases where
latency is high, such as when diffing two uncached commits. We can
improve that by changing some of our (jj's) algorithms to read many
objects concurrently from the backend. In the case of tree-diffing, we
can fetch one level (depth) of the tree at a time. There are several
ways of doing that:

 * Make the backend methods `async`
 * Use many threads for reading from the backend
 * Add backend methods for batch reading

I don't think we typically need CPU parallelism, so it's wasteful to
have hundreds of threads running in order to fetch hundreds of objects
in parallel (especially when using a synchronous backend like the Git
backend). Batching would work well for the tree-diffing case, but it's
not as composable as `async`. For example, if we wanted to fetch some
commits at the same time as we were doing a diff, it's hard to see how
to do that with batching. Using async seems like our best bet.

I didn't make the backend interface's write functions async because
writes are already async with the daemon we have at Google. That
daemon will hash the object and immediately return, and then send the
object to the server in the background. I think any cloud-based
solution will need a similar daemon process. However, we may need to
reconsider this if/when jj gets used on a server with a custom backend
that writes directly to a database (i.e. no async daemon in between).

I've tried to measure the performance impact. That's the largest
difference I've been able to measure was on `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0` in the Linux repo,
which increases from 749 ms to 773 ms (3.3%). In most cases I've
tested, there's no measurable difference. I've tried diffing from the
root commit, as well as `jj --ignore-working-copy log --no-graph -r
'::v3.0 & author(torvalds)' -T 'commit_id ++ "\n"'` (to test a
commit-heavy load).
2023-10-08 23:36:49 -07:00
Martin von Zweigbergk
d575aaeca8 backend: move constant functions first
`root_commit_id()`, `root_change_id()`, and `empty_tree_id()` were
strangely ordered between `write_symlink()` and `read_tree().
2023-09-19 05:24:51 -07:00
Martin von Zweigbergk
cc335a9970 cargo: move examples/ into cli/ so they are part of the build again 2023-08-07 21:49:45 +00:00