
Conversation

@aemerson
Contributor

This commit introduces a new --drop-exec option that allows users to
drop the first N execution samples when running with --exec-multisample.
This is useful for mitigating warmup effects that can skew performance
measurements.

The option accepts an integer N specifying how many initial samples to
drop, and works with all execution modes (normal, --exec, and
--exec-interleaved-builds).

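Conceptually, the option amounts to slicing the warmup runs off the head of the collected sample list before aggregation. A minimal sketch of that behavior (the helper name and error handling are illustrative, not the patch's actual code):

def drop_warmup_samples(samples, drop_exec):
    """Discard the first drop_exec execution samples so that warmup
    effects do not skew the aggregated statistics."""
    if drop_exec >= len(samples):
        raise ValueError("--drop-exec would discard every sample")
    return samples[drop_exec:]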
@aemerson marked this pull request as ready for review October 25, 2025 05:14
Contributor Author

aemerson commented Oct 25, 2025

This stack of pull requests is managed by Graphite.

@aemerson requested a review from ldionne October 25, 2025 05:41
Member

@ldionne left a comment


General question: one could argue that the handling of warmups should be done at the benchmark level. For example, Google Benchmark does this: it discards some number of warmup runs and then iterates until the result is stable. That way, when you run a benchmark, you get a result that is immediately usable.

Is that not the case for the benchmarks that are part of the LLVM test suite? Do you see an actual difference between results with and without dropping "warmup" runs?
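For reference, a harness-level version of the policy described above might look like the following Python sketch; the function names and the stability criterion are illustrative assumptions, not Google Benchmark's actual algorithm:

import statistics

def measure_until_stable(run_once, warmup=2, min_samples=5,
                         max_samples=50, rel_tolerance=0.02):
    """Throw away `warmup` runs, then keep sampling until the relative
    standard deviation of the timings drops below `rel_tolerance`."""
    for _ in range(warmup):
        run_once()  # warmup runs; results are discarded
    samples = []
    while len(samples) < max_samples:
        samples.append(run_once())
        if len(samples) >= min_samples:
            mean = statistics.mean(samples)
            if mean > 0 and statistics.stdev(samples) / mean < rel_tolerance:
                break
    return samples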

# Check for incompatible options
if opts.only_compile:
    self._fatal("--drop-exec cannot be used with --only-compile")
if opts.build:
Member

I don't think --build means that we're necessarily skipping the tests, right? Can't you have both --build and --exec, in which case it would make sense to have --build and --drop-exec at the same time?

Contributor Author

I think --build combined with --exec is redundant as a concept, because running both phases is the default behavior; these options are only useful as stopping points.

That said, there's an inconsistency that I don't think the current implementation resolves: --exec (née --test-prebuilt) is supposed to skip the preceding steps, which is fine, but --build doesn't skip the --configure phase (and making it do so seems like bad UX).

@aemerson
Contributor Author

> General question: one could argue that the handling of warmups should be done at the benchmark level. For example, Google Benchmark does this: it discards some number of warmup runs and then iterates until the result is stable. That way, when you run a benchmark, you get a result that is immediately usable.
>
> Is that not the case for the benchmarks that are part of the LLVM test suite? Do you see an actual difference between results with and without dropping "warmup" runs?

For some benchmark suites that are plugged into the test-suite as an "external" suite, the current best practice is to drop the first run. I don't have evidence of whether this makes a meaningful difference, but it is the recommendation I'm hearing.

Intuitively it makes some sense: for example, when I run some industry-standard benchmarks I build with high parallelism (e.g. --build-threads 10) and then execute serially (-j1), so there's a chance that during the first execution the machine is still thermally recovering from the build phase.
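As a concrete illustration, that workflow might be invoked roughly like this (the runner name is a placeholder and the flag values are arbitrary; only --build-threads, -j, --exec-multisample, and --drop-exec are taken from this discussion):

<runner> --build-threads 10 -j1 --exec-multisample 5 --drop-exec 1

i.e. build in parallel, execute serially, take multiple samples, and discard the first as warmup.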
