Skip to content

Conversation

@cburgard
Copy link
Contributor

This Pull request:

Adds the ability to control the ToyMCSampler to emit binned toys.
Makes use of this new capability in the HypoTestCalculator in cases where the observed data is binned.

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

This PR fixes #

@cburgard cburgard changed the title [RF] ToyMCSampler improvements for binned toys [RF] ToyMCSampler improvements for binned toys [WIP] Oct 22, 2025
@cburgard
Copy link
Contributor Author

This PR is still WIP. This is because the use case relies on #20097 being merged first (even though the functionalities are not closely related).

@guitargeek guitargeek marked this pull request as draft October 22, 2025 16:08
@guitargeek guitargeek self-assigned this Oct 22, 2025
@guitargeek
Copy link
Contributor

I have merged #20097. Could you please rebase this PR?

Also, could you please add a commit where you add both of you to the list of contributors for ROOT 6.38 (insert in alphabetical order)?
https://github.com/root-project/root/blob/master/README/ReleaseNotes/v638/index.md

Thanks a lot!

@github-actions
Copy link

github-actions bot commented Oct 22, 2025

Test Results

    22 files      22 suites   3d 18h 55m 45s ⏱️
 3 696 tests  3 688 ✅ 0 💤   8 ❌
79 361 runs  79 185 ✅ 0 💤 176 ❌

For more details on these failures, see this check.

Results for commit b2f731f.

♻️ This comment has been updated with latest results.

@cburgard cburgard force-pushed the toymcsampler-bugfix branch from fbb3468 to c5dddab Compare October 23, 2025 09:19
@cburgard cburgard force-pushed the toymcsampler-bugfix branch from c5dddab to 6dbedd7 Compare October 23, 2025 09:23
@cburgard
Copy link
Contributor Author

I have merged #20097. Could you please rebase this PR?

Also, could you please add a commit where you add both of you to the list of contributors for ROOT 6.38 (insert in alphabetical order)? https://github.com/root-project/root/blob/master/README/ReleaseNotes/v638/index.md

Thanks a lot!

Thanks a lot!

Done and done. I've removed the WIP marker now.

@cburgard cburgard changed the title [RF] ToyMCSampler improvements for binned toys [WIP] [RF] ToyMCSampler improvements for binned toys Oct 23, 2025
@cburgard cburgard marked this pull request as ready for review October 23, 2025 09:25
@cburgard cburgard requested a review from dpiparo as a code owner October 23, 2025 09:25
Copy link
Contributor

@guitargeek guitargeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much! This change makes sense to me.

Comment on lines +67 to +68
// --- Ensure the ToyMCSampler generates toys with the same structure as the observed data
toymcs->SetProtoData(&data);
Copy link
Contributor

@guitargeek guitargeek Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a change in behavior that likely causes the RooStats tests to fail. I'll take a look.

Copy link
Contributor

@guitargeek guitargeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR actually introduces a behavioral change that causes the RooStats tests like stressRooStats to fail. Even if the warning about duplicate args is silenced in this line by doing allVars.add(*prototype->get(), true).

@cburgard, did you intend the behavior change?

@cburgard
Copy link
Contributor Author

This PR actually introduces a behavioral change that causes the RooStats tests like stressRooStats to fail. Even if the warning about duplicate args is silenced in this line by doing allVars.add(*prototype->get(), true).

@cburgard, did you intend the behavior change?

This is a bit of a hard question. I'll try to explain my reasoning here.

The core issue that triggered this PR is this:

When running the ToyMCSampler on a binned dataset, before this fix, this would happen:

cppyy.gbl.std.runtime_error: RooStats::HypoTestResult* RooStats::HypoTestCalculatorGeneric::GetHypoTest() => runtime_error: Error in RooAbsReal::setData(): only resetting with same-structured data is supported.

That is because the ToyMCSampler would always default to an unbinned dataset, and the way the HypoTestCalculator is setup it would not communicate the shape of the dataset to the underlying sampler, triggering this error.

In that sense, the change in behavior is intentional and is the main reason for this fix.

I would, however, have hoped that it would not affect the behavior on datasets that are already unbinned, where the ToyMCSampler (supposedly?) worked correctly already before.

The change of the behavior for binned datasets is intended and the point of this fix.
I do not currently have a test case to check whether there is any significant change in behavior also for unbinned datsets, and would certainly have hoped that any change there is minor or perhaps also to be considered a "fix" (perhaps there are some other properties of the toys that are more reflective of the example dataset now than they were before?)
But if there is an actual change in the quality of the estimate (that is, the resulting p-values of the test change for a scenario that is recognized as a unit test that was working correctly before my fix), that is probably then to be considerend "non-intentional".

So perhaps to fling this question back to you: Is there a unit test that produces notably different results than it did before, or is this just a case where the numerics changed a bit due to the toys now being more reflective of the example dataset than they were before?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants