mirror of
https://github.com/uutils/coreutils.git
synced 2025-07-07 21:45:01 +00:00
document how to do good performance work (#7541)
* document how to do good performance work
* doc: spell, ignore "taskset"

Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com>
This commit is contained in:
parent
ebe77c2555
commit
105042fb70
1 changed file with 100 additions and 0 deletions
100
docs/src/performance.md
Normal file
@@ -0,0 +1,100 @@
<!-- spell-checker:ignore taskset -->

# Performance Profiling Tutorial

## Effective Benchmarking with Hyperfine

[Hyperfine](https://github.com/sharkdp/hyperfine) is a command-line benchmarking tool that measures and compares the execution times of commands with statistical rigor.

### Benchmarking Best Practices

When evaluating performance improvements, always set up your benchmarks to compare:

1. The GNU implementation, as the reference
2. The implementation without your change
3. The implementation with your change

This three-way comparison provides clear insight into:

- How your implementation compares to the standard (GNU) one
- The actual performance impact of your specific change

### Example Benchmark

First, build the binary in release mode; debug builds are significantly slower:

```bash
cargo build --features unix --release
```

```bash
# Three-way comparison benchmark
hyperfine \
    --warmup 3 \
    "/usr/bin/ls -R ." \
    "./target/release/coreutils.prev ls -R ." \
    "./target/release/coreutils ls -R ."

# The same comparison, simplified with a parameter list:
hyperfine \
    --warmup 3 \
    -L ls /usr/bin/ls,"./target/release/coreutils.prev ls","./target/release/coreutils ls" \
    "{ls} -R ."
```

```bash
# To improve the reproducibility of the results, pin the benchmark to a
# single CPU core by prefixing the command with:
taskset -c 0
```
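`taskset` is simply a prefix in front of whatever you run; a minimal runnable sketch, using `echo` as a stand-in for the real benchmark invocation:

```shell
# Pin the child process to CPU core 0. In real use, the command after
# "taskset -c 0" would be the hyperfine invocation shown above.
taskset -c 0 echo "pinned to core 0"
```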

### Interpreting Results

Hyperfine provides summary statistics including:

- Mean execution time
- Standard deviation
- Min/max times
- Relative performance comparison

Look for consistent patterns rather than focusing on individual runs, and be aware of system noise that might affect results.
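For deeper analysis, hyperfine can dump these statistics as JSON with `--export-json`. A sketch of checking run-to-run noise (stddev relative to mean) from such a file — the numbers below are made-up sample values, not real measurements, and the 5% threshold is an illustrative choice, but the field names (`results`, `command`, `mean`, `stddev`) match hyperfine's JSON schema:

```shell
# Made-up sample data shaped like hyperfine's --export-json output
cat > results.json <<'EOF'
{"results": [
  {"command": "/usr/bin/ls -R .", "mean": 0.0521, "stddev": 0.0013},
  {"command": "coreutils ls -R .", "mean": 0.0468, "stddev": 0.0011}
]}
EOF

# Flag any command whose stddev exceeds 5% of its mean: such runs are
# too noisy to trust and should be redone on a quieter system.
python3 - <<'EOF'
import json

with open("results.json") as f:
    results = json.load(f)["results"]

for r in results:
    noise = r["stddev"] / r["mean"]
    verdict = "ok" if noise < 0.05 else "too noisy"
    print(f'{r["command"]}: mean {r["mean"]:.4f}s, noise {noise:.1%} ({verdict})')
EOF
```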

## Using Samply for Profiling

[Samply](https://github.com/mstange/samply) is a sampling profiler that helps you identify performance bottlenecks in your code.

### Basic Profiling

```bash
# Generate a flame graph for your application
samply record ./target/debug/coreutils ls -R

# Profile with a higher sampling frequency
samply record --rate 1000 ./target/debug/coreutils seq 1 1000
```
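Samply relies on debug symbols to resolve function names. If you would rather profile an optimized binary than the debug build shown above, one common option is keeping debug info in release builds — a `Cargo.toml` sketch (assumption: the project does not already configure this):

```toml
# Keep debug symbols in optimized builds so the profiler can show
# readable function names instead of raw addresses.
[profile.release]
debug = true
```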

## Workflow: Measuring Performance Improvements

1. **Establish baselines**:
   ```bash
   hyperfine --warmup 3 \
       "/usr/bin/sort large_file.txt" \
       "our-sort-v1 large_file.txt"
   ```

2. **Identify bottlenecks**:
   ```bash
   samply record ./our-sort-v1 large_file.txt
   ```

3. **Make targeted improvements** based on profiling data

4. **Verify improvements**:
   ```bash
   hyperfine --warmup 3 \
       "/usr/bin/sort large_file.txt" \
       "our-sort-v1 large_file.txt" \
       "our-sort-v2 large_file.txt"
   ```

5. **Document performance changes** with concrete numbers:
   ```bash
   hyperfine --export-markdown file.md [...]
   ```