Commit graph

7823 commits

Author SHA1 Message Date
Sylvestre Ledru
1ee8092826
Merge pull request #7525 from sylvestre/thiserror4
Move more programs to thiserror
2025-03-24 19:06:13 +01:00
Sylvestre Ledru
36231f7551
Merge pull request #7562 from drinkcat/seq-perf-1
seq: Directly write separator string, instead of using format
2025-03-24 19:05:14 +01:00
Nicolas Boichat
66745427cb seq: Directly write separator string, instead of using format
Doing `stdout.write_all(separator.as_bytes())?` is quite a bit
faster than using format to do the same operation:
`write!(stdout, "{separator}")?`.

This speeds up by about 10% on simple cases.

We do the same for the terminator even though this has no measurable
performance impact.
2025-03-24 18:02:06 +01:00
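The difference described in 66745427cb can be illustrated with a small stand-alone sketch. The helper name `write_joined` and its parameters are hypothetical and are not taken from the `seq` source; only the `write_all(separator.as_bytes())` call mirrors the commit message.

```rust
use std::io::{self, Write};

/// Writes `items` joined by `separator`, followed by `terminator`.
/// (Hypothetical helper, not the actual `seq` code.)
fn write_joined<W: Write>(
    out: &mut W,
    items: &[&str],
    separator: &str,
    terminator: &str,
) -> io::Result<()> {
    for (i, item) in items.iter().enumerate() {
        if i > 0 {
            // Direct byte write, instead of `write!(out, "{separator}")?`,
            // which goes through the formatting machinery.
            out.write_all(separator.as_bytes())?;
        }
        out.write_all(item.as_bytes())?;
    }
    // Same treatment for the terminator.
    out.write_all(terminator.as_bytes())
}
```

Both forms produce identical bytes; the direct write only skips the `fmt::Arguments` handling that `write!` performs on every call.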
lbellomo
d561ee8f16 doc: escape RE with '`' 2025-03-24 12:01:16 -03:00
Sylvestre Ledru
ffe8762ee6 Fix the GNU test
Co-authored-by: Dorian Péron <72708393+RenjiSann@users.noreply.github.com>
2025-03-24 14:22:25 +01:00
Sylvestre Ledru
305be09403 ls: move to thiserror 2025-03-24 14:22:25 +01:00
Sylvestre Ledru
9d123febb3 install: move to thiserror 2025-03-24 14:22:25 +01:00
Sylvestre Ledru
c1bb57fd1e ln: move to thiserror 2025-03-24 14:22:25 +01:00
Sylvestre Ledru
d0e6a6271c join: move to thiserror 2025-03-24 14:22:25 +01:00
Sylvestre Ledru
8931d2c26e
Merge pull request #7521 from usamoi/ptx
ptx: fixes
2025-03-23 09:28:20 +01:00
Sylvestre Ledru
eed5c81060
Merge pull request #7463 from blyxxyz/clean-shuf
Make `shuf` OsStr-compliant and bring newline handling in line with GNU
2025-03-22 22:29:49 +01:00
Sylvestre Ledru
b540e18dec
Merge pull request #7519 from karlmcdowall/cat_perf
cat: Improve performance of formatting.
2025-03-22 21:54:13 +01:00
Nicolas Boichat
d678e5320f uucore: format: Fix uppercase hex floating point printing
Accidentally broke this use case when refactoring.

Added a test as well.
2025-03-22 21:13:18 +01:00
Nicolas Boichat
e6c24b245a uucore: format: Small optimizations in num_format for seq
In most common use cases:
 - We can bypass a lot of `write_output` when width == 0.
 - Simplify format_float_decimal when the input is an integer.

Also document another interesting case in src/uu/seq/BENCHMARKING.md.
2025-03-22 21:13:18 +01:00
Nicolas Boichat
f31ba2bd28 seq: Make use of uucore::format to print in all cases
Now that uucore format functions take in an ExtendedBigDecimal,
we can use those in all cases.
2025-03-22 21:13:18 +01:00
Nicolas Boichat
25c492ee19 uucore: format: Pad non-finite numbers with spaces, not zeros
`printf "%05.2f" inf` should print `  inf`, not `00inf`.

Add a test to cover that case, too.
2025-03-22 21:13:18 +01:00
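The padding rule from 25c492ee19 can be sketched as follows. `pad_number` is a hypothetical function, not uucore's API, and it assumes an unsigned digit string (sign handling is left out).

```rust
/// Left-pads `digits` to `width`. Zero padding is only honoured for finite
/// values; "inf" and "nan" are always padded with spaces.
/// (Hypothetical helper; assumes `digits` carries no sign.)
fn pad_number(digits: &str, width: usize, zero_pad: bool, finite: bool) -> String {
    if zero_pad && finite {
        format!("{digits:0>width$}")
    } else {
        format!("{digits:>width$}")
    }
}
```

With this rule, a `%05.2f`-style request pads `inf` to `  inf`, while a finite `1.50` still becomes `01.50`.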
Nicolas Boichat
ec450d602a uucore: format: format_float_hexadecimal: Take in &BigDecimal
Display hexadecimal floats with arbitrary precision.

Note that some of the logic will produce extremely large
BigInt as intermediate values: there is some optimization
possible here, but the current implementation appears to work
fine for reasonable numbers (e.g. whatever would previously
fit in a f64, and even with somewhat large precision).
2025-03-22 21:13:18 +01:00
Nicolas Boichat
f0e9b8621f uucore: format: format_float_shortest: Take in &BigDecimal
Similar logic to scientific printing. Also add a few more tests
around corner cases where we switch from decimal to scientific
printing.
2025-03-22 21:13:18 +01:00
Nicolas Boichat
7f0e5eb473 uucore: format: format_float_scientific: Take in &BigDecimal
No more f64 operations needed, we just trim (or extend) BigDecimal to
appropriate precision, get the digits as a string, then add the
decimal point.

Similar to what BigDecimal::write_scientific_notation does, but
we need a little bit more control.
2025-03-22 21:13:18 +01:00
Nicolas Boichat
edaccc88b9 uucore: format: format_float_decimal: Take in &BigDecimal
Also add a few unit tests to make sure precision is not lost anymore.
2025-03-22 21:13:18 +01:00
Nicolas Boichat
ce14d01da5 uucore: format: format_float_non_finite: Take in &ExtendedBigDecimal
First modify Format.fmt to extract absolute value and sign, then
modify printing on non-finite values (inf or nan).
2025-03-22 21:13:18 +01:00
Nicolas Boichat
8e11dab995 uucore: format: Change Formatter to take an &ExtendedBigDecimal
Only changes the external interface; for now the number is
cast back to f64 for printing. We'll update that in a follow-up.
2025-03-22 21:13:18 +01:00
Nicolas Boichat
241e2291bd uucore: format: extendedbigdecimal: Implement From<f64>
Allows easier conversion.
2025-03-22 21:13:18 +01:00
Nicolas Boichat
9355200901 uucore: format: extendedbigdecimal: Add MinusNan
Some test cases require handling "negative" NaN. Handle it
similarly to "positive" NaN.
2025-03-22 21:13:18 +01:00
Nicolas Boichat
69164688ad uucore: format: Make Formatter a generic
Using an associated type in Formatter trait was quite nice, but, in
a follow-up change, we'd like to pass a _reference_ to the Float
Formatter, while just passing i64/u64 as a value to the Int
formatters. Associated type doesn't allow for that, so we turn
it into a generic instead.

This makes Format<> a bit more complicated though, as we need
to specify both the Formatter, _and_ the type to be formatted.
2025-03-22 21:13:18 +01:00
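The design change in 69164688ad (a generic trait parameter instead of an associated type) might look roughly like the sketch below. The trait shape, the `SignedInt` and `Float` structs, and the use of `&f64` as a stand-in for `&ExtendedBigDecimal` are assumptions made for illustration; the real uucore definitions may differ.

```rust
use std::io::{self, Write};

/// Generic over the value type `T`, so each implementor decides whether the
/// value is passed by copy or by reference. (Illustrative trait only.)
trait Formatter<T> {
    fn fmt(&self, out: &mut dyn Write, value: T) -> io::Result<()>;
}

struct SignedInt;
struct Float {
    precision: usize,
}

// Integers are cheap to copy, so they are taken by value.
impl Formatter<i64> for SignedInt {
    fn fmt(&self, out: &mut dyn Write, value: i64) -> io::Result<()> {
        write!(out, "{value}")
    }
}

// The float formatter takes a reference; the reference lifetime is
// introduced on the impl itself. `f64` stands in for a big-decimal type.
impl<'a> Formatter<&'a f64> for Float {
    fn fmt(&self, out: &mut dyn Write, value: &'a f64) -> io::Result<()> {
        write!(out, "{:.*}", self.precision, value)
    }
}
```

The cost mentioned in the commit message shows up at the use site: code that is generic over a formatter now has to name both the formatter and the value type.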
Nicolas Boichat
2103646ff7 seq: Move extendedbigdecimal.rs to uucore/features/format
Will make it possible to directly print ExtendedBigDecimal in `seq`,
and gradually get rid of limited f64 precision in other tools
(e.g. `printf`).

Changes are mostly mechanical, we reexport ExtendedBigDecimal directly
in format to keep the imports slightly shorter.
2025-03-22 21:13:18 +01:00
usamoi
412d2b3b1f ptx: fixes 2025-03-22 19:25:19 +08:00
Karl McDowall
c84ee0ae0f cat: Improve performance of formatting.
Issue #7518
Add a BufWriter over stdout when cat outputs any kind of formatted
data. This improves performance considerably.
2025-03-21 19:03:34 -06:00
jmjoy
7bd90bb663
Implement Default for Options of mv and cp (#7506) 2025-03-20 16:08:44 +01:00
Sylvestre Ledru
187d3e58b5
Merge pull request #7495 from karlmcdowall/wc_perf
wc: Perf gains with the bytecount crate.
2025-03-20 09:52:15 +01:00
Karl McDowall
eea6c82305 wc: Perf gains with the bytecount crate.
Issue #7494
Improve performance of the wc app.
 - Use the bytecount::num_chars API to count UTF-8 characters in a file.
 - Enable runtime-dispatch-simd feature in the bytecount crate.
2025-03-19 16:45:19 -06:00
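For context on eea6c82305: counting UTF-8 characters does not require decoding them. A scalar sketch of the idea, using only the standard library, is shown below; `count_utf8_chars` is a hypothetical name, and the SIMD-accelerated implementation inside the bytecount crate is not reproduced here.

```rust
/// Counts UTF-8 encoded code points by counting every byte that is not a
/// continuation byte (continuation bytes have the bit pattern 10xxxxxx).
/// (Hypothetical scalar sketch; assumes well-formed UTF-8 input.)
fn count_utf8_chars(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| (b & 0xC0) != 0x80).count()
}
```

Because the test is a simple per-byte mask and compare, it lends itself to vectorization, which is presumably what the `runtime-dispatch-simd` feature exploits.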
Nicolas Boichat
6d3c0bee68 seq: Buffer writes to stdout
Use a BufWriter to wrap stdout: reduces the number of system calls,
improves performance drastically (2x in some cases).

Also document use cases in src/uu/seq/BENCHMARKING.md, and the
optimization we have just done here.
2025-03-19 10:36:48 +01:00
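The buffering technique from 6d3c0bee68 (and from the `cat` change c84ee0ae0f) can be sketched like this. `print_range` is a hypothetical helper and not the actual `seq` code.

```rust
use std::io::{self, BufWriter, Write};

/// Prints the integers `first..=last`, one per line, into `out`.
/// (Hypothetical helper for illustration.)
fn print_range<W: Write>(out: W, first: u64, last: u64) -> io::Result<()> {
    // BufWriter collects many small writes and passes them on in large
    // chunks, which reduces the number of underlying write calls.
    let mut out = BufWriter::new(out);
    for n in first..=last {
        writeln!(out, "{n}")?;
    }
    // Flush explicitly so that a write error is reported to the caller
    // instead of being lost when the BufWriter is dropped.
    out.flush()
}
```

A caller would typically pass a locked handle, for example `print_range(io::stdout().lock(), 1, 1_000_000)`.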
Dorian Péron
2e3da88b78
Merge pull request #7438 from karlmcdowall/head_perf2
head: rework handling of non-seekable files
2025-03-18 18:15:01 +01:00
karlmcdowall
e1275f4ccd
Update src/uu/head/src/take.rs
Co-authored-by: Dorian Péron <72708393+RenjiSann@users.noreply.github.com>
2025-03-18 09:08:21 -06:00
Terakomari
ae6d4dec28
base32/base64/basenc: add -D flag (#7479)
* base32/base64/basenc: add -D flag

* base32/base64/basenc: add test for -D flag

* update extensions.md

* remove redundant parameters

* merge  into a single category

* Update docs/src/extensions.md

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>

---------

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
2025-03-18 14:39:53 +01:00
Karl McDowall
29875312a1 head: rework handling of non-seekable files
Fix issue #7372
Rework logic for handling all-but-last-lines and all-but-last-bytes
for non-seekable files. The changes give a large performance improvement.
2025-03-17 13:24:10 -06:00
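One common way to implement "all but the last N lines" on a non-seekable input is to keep a sliding window of N lines, as sketched below. This is an assumption about the general technique, not a description of the code in 29875312a1; the function name `all_but_last_lines` is hypothetical.

```rust
use std::collections::VecDeque;
use std::io::{self, BufRead, Write};

/// Copies every line of `input` to `output` except the last `n`.
/// (Hypothetical sketch; the real `head` implementation may differ.)
fn all_but_last_lines<R: BufRead, W: Write>(
    mut input: R,
    output: &mut W,
    n: usize,
) -> io::Result<()> {
    // Holds at most `n` lines that may still turn out to be among the last `n`.
    let mut pending: VecDeque<Vec<u8>> = VecDeque::new();
    loop {
        let mut line = Vec::new();
        if input.read_until(b'\n', &mut line)? == 0 {
            return Ok(()); // EOF: whatever is still pending is dropped.
        }
        pending.push_back(line);
        if pending.len() > n {
            // The oldest line now has `n` lines after it, so it is safe to emit.
            if let Some(oldest) = pending.pop_front() {
                output.write_all(&oldest)?;
            }
        }
    }
}
```

The input is read exactly once and memory use is bounded by the size of the last `n` lines, so no seeking is needed.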
Etienne Cordonnier
8f17113d61 fsext.rs: use type inference fsid_t / __fsid_t
Commit 2a0d58d060 (part of https://github.com/uutils/coreutils/pull/3396 which contains a description of the changes) changed this line from libc::fsid_t to nix::sys::statfs::fsid_t.
The pull-request description at https://github.com/uutils/coreutils/pull/3396 indicates that this was done in order to fix the android build, and indeed using a cast to nix::sys::statfs::fsid_t
takes advantage of the definition of nix::sys::statfs::fsid_t which abstracts away the different name on Android:

```
/// Identifies a mounted file system
#[cfg(target_os = "android")]
pub type fsid_t = libc::__fsid_t;
/// Identifies a mounted file system
#[cfg(not(target_os = "android"))]
pub type fsid_t = libc::fsid_t;
```

This cast works as long as the libc version used by nix is the same as the libc version used by coreutils.

This cast becomes invalid when using a local libc version for local debugging, and changing Cargo.toml to point to it:
```
-libc = "0.2.153"
+libc = { path = "../path/to/libc" }
```

The cast becomes invalid because self.f_fsid is of type libc::fsid_t (local version of
libc), whereas nix::sys::statfs::fsid_t still uses the libc version downloaded
by cargo from crates.io in this case.

I was getting this error:

```
coreutils$ cargo build
   Compiling libc v0.2.171 (/home/ecordonnier/dev/libc)
   Compiling uucore v0.0.30 (/home/ecordonnier/dev/coreutils/src/uucore)
error[E0606]: casting `&libc::fsid_t` as `*const nix::libc::fsid_t` is invalid
   --> src/uucore/src/lib/features/fsext.rs:816:25
    |
816 |             unsafe { &*(&self.f_fsid as *const nix::sys::statfs::fsid_t as *const [u32; 2]) };
    |                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For more information about this error, try `rustc --explain E0606`.
error: could not compile `uucore` (lib) due to 1 previous error
```

Let's rather use type inference to deal with libc::fsid_t vs libc::__fsid_t.

Signed-off-by: Etienne Cordonnier <ecordonnier@snap.com>
2025-03-17 11:05:02 +01:00
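The type-inference approach named in 8f17113d61 could look like the following sketch. `FsId` and `StatFs` are hypothetical stand-ins with an assumed layout, so that the example compiles without the libc or nix crates; the exact expression used in `fsext.rs` is not shown in the commit message and may differ.

```rust
/// Stand-in for `libc::fsid_t` / `libc::__fsid_t` (assumed layout).
#[repr(C)]
struct FsId {
    val: [u32; 2],
}

/// Stand-in for the statfs structure holding the file system id.
struct StatFs {
    f_fsid: FsId,
}

impl StatFs {
    fn fsid_words(&self) -> [u32; 2] {
        // `*const _` lets the compiler infer the pointee type from
        // `self.f_fsid`, so the concrete type name never appears here.
        let ptr = &self.f_fsid as *const _ as *const [u32; 2];
        // SAFETY: `FsId` is `repr(C)` and consists of exactly one
        // `[u32; 2]`, so size and alignment match the target type.
        unsafe { *ptr }
    }
}
```

Because no type path is spelled out, the cast no longer depends on which copy of libc defines the field's type.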
Jan Verbeek
e830dd45f0 shuf: Use impl return type in trait now that MSRV is high enough 2025-03-16 14:10:31 +01:00
Jan Verbeek
cdd1052cea shuf: Move more file operations into main()
This removes the need for some manually duplicated code and keeps
shuf_exec() (which is generic) smaller, for less binary bloat and
better build times.
2025-03-16 14:10:31 +01:00
Jan Verbeek
f562543b6c shuf: Use OS strings, don't split individual arguments, cleanup
- shuf now uses OS strings, so it can read from filenames that are
  invalid Unicode and it can shuffle arguments that are invalid
  Unicode. `uucore` now has an `OsWrite` trait to support this without
  platform-specific boilerplate.

- shuf no longer tries to split individual command line arguments,
  only bulk input from a file/stdin. (This matches GNU and busybox.)

- More values are parsed inside clap instead of manually, leading to
  better error messages and less code.

- Some code has been simplified or made more idiomatic.
2025-03-16 14:10:31 +01:00
Etienne Cordonnier
f084b7f168 make cargo fmt happy 2025-03-16 00:21:45 +01:00
Etienne Cordonnier
591bef3759 utmpx.rs: use correct constant names for musl libc
Unfortunately, the names of those constants are not standardized:
glibc uses __UT_HOSTSIZE, __UT_LINESIZE, __UT_NAMESIZE
musl uses UT_HOSTSIZE, UT_LINESIZE, UT_NAMESIZE

See:
1. https://git.musl-libc.org/cgit/musl/tree/include/utmpx.h
2. https://github.com/bminor/glibc/blob/master/sysdeps/gnu/bits/utmpx.h#L35

This is a partial fix for https://github.com/uutils/coreutils/issues/1361

Signed-off-by: Etienne Cordonnier <ecordonnier@snap.com>
2025-03-15 22:39:35 +01:00
Daniel Hofstetter
d34eb25251 all: use crate_version! from uucore 2025-03-15 16:03:17 +01:00
Daniel Hofstetter
12ab9c2c21 uucore: add crate_version macro 2025-03-15 16:03:16 +01:00
Nicolas Boichat
b7dcaa34da uucore: format: print absolute value of float, then add sign
Simplifies the code, but also fixes printing of negative and positive `NaN`:
`cargo run printf "%f %f\n" nan -nan`

Fixes part 2 of #7412.
2025-03-14 12:42:00 +01:00
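The "print the absolute value, then add the sign" approach from b7dcaa34da, together with the lowercase `nan` from e3872e8e8f, can be sketched as below. `format_signed` is a hypothetical function working on `f64` with a fixed precision of six digits, not the uucore implementation.

```rust
/// Formats `value` with six decimals by printing its magnitude and then
/// prepending the sign. (Hypothetical sketch.)
fn format_signed(value: f64) -> String {
    let magnitude = value.abs();
    let body = if magnitude.is_nan() {
        "nan".to_string() // always lowercase
    } else if magnitude.is_infinite() {
        "inf".to_string()
    } else {
        format!("{magnitude:.6}")
    };
    // `is_sign_negative` inspects the sign bit, so it also works for NaN.
    if value.is_sign_negative() {
        format!("-{body}")
    } else {
        body
    }
}
```

Treating the sign separately means the non-finite and finite branches never have to deal with a minus sign themselves.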
Nicolas Boichat
e3872e8e8f uucore: format: force NaN back to lowercase
Fixes formatting of `NaN` to `nan`.

Fixes part 1 of #7412.
2025-03-14 12:42:00 +01:00
Nicolas Boichat
bfa8bf72c7 uucore: format: Fix default Float precision in try_from_spec
The default precision is 6, no matter the format. This applies
to all float formats, not just "%g" (aka FloatVariant::Shortest).

Fixes #7361.
2025-03-14 12:05:16 +01:00
Nicolas Boichat
0a8155b5c2 uucore: format: Fix capitalization of 0 in scientific formatting
0.0E+00 was not capitalized properly when using `%E` format.

Fixes #7382.

Test: cargo test --package uucore --all-features float
Test: cargo run printf "%E\n" 0 => 0.000000E+00
2025-03-14 12:05:16 +01:00
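The expected `%E` output from 0a8155b5c2 can be reproduced with a small sketch built on Rust's own exponent formatting. `format_scientific` is a hypothetical function operating on `f64`; it is not how uucore formats numbers internally.

```rust
/// Formats a finite `value` in printf-style scientific notation, e.g.
/// `1.50e+03` or, with `upper` set, `1.50E+03`. (Hypothetical sketch.)
fn format_scientific(value: f64, precision: usize, upper: bool) -> String {
    // Rust's `e` formatting yields e.g. "1.50e3": no sign, no zero padding.
    let raw = format!("{value:.precision$e}");
    let Some((mantissa, exponent)) = raw.split_once('e') else {
        return raw; // non-finite input such as "inf" has no exponent part
    };
    let exponent: i32 = exponent.parse().unwrap_or(0);
    let sign = if exponent < 0 { '-' } else { '+' };
    // The case of the marker depends only on the conversion, never on the
    // value, so zero gets an uppercase `E` like any other number.
    let marker = if upper { 'E' } else { 'e' };
    format!("{mantissa}{marker}{sign}{:02}", exponent.abs())
}
```

With six digits of precision and `upper` set, an input of zero yields `0.000000E+00`, matching the test line in the commit message.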
M Bussonnier
7632acfc90 Fix touch -t with 2 digit years when YY > 68
When using `touch -t` with a 2 digit year, the year is interpreted as
a relative year to 2000.

When the year is 68 or less, it should be interpreted as 20xx.
When the year is 69 or more, it should be interpreted as 19xx.

This is the behavior of GNU `touch`.

fixes gh-7280

Arguably, 2-digit years should be deprecated, as we
are already closer to 2069 than to 1969.
2025-03-14 11:00:08 +01:00
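The two-digit year rule stated in 7632acfc90 amounts to a single pivot at 68/69. A minimal sketch follows; the function name `expand_two_digit_year` and the `Option` return type are assumptions for illustration.

```rust
/// Expands a two-digit year as described in the commit message:
/// 00..=68 map to 2000..=2068 and 69..=99 map to 1969..=1999.
/// Returns `None` for values that are not two-digit years.
fn expand_two_digit_year(yy: u32) -> Option<u32> {
    match yy {
        0..=68 => Some(2000 + yy),
        69..=99 => Some(1900 + yy),
        _ => None,
    }
}
```

For example, `68` expands to 2068 while `69` expands to 1969.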
Sylvestre Ledru
d570512bdc
Merge pull request #7439 from dezgeg/ficlone
cp: Use FICLONE ioctl constant from linux-raw-sys
2025-03-14 09:59:36 +01:00