@oetr (Contributor) commented Oct 10, 2025

This PR lets users annotate fuzz test methods or any type with @DictionaryProvider to provide values directly to the type mutators. Currently, only the String and integral mutators use this feature, but it makes sense for other types as well.

The main motivation is to work around the 64-byte limit of libFuzzer's TORC (table of recent compares); libFuzzer dictionaries suffer from the same limit. With this PR, the fuzzer finds the issue below almost immediately.

  public static Stream<?> myDict() {
    return Stream.of(
        "0123456789abcdef".repeat(50),
        "sitting duck suprime".repeat(53),
        // We can mix all kinds of values in the same dictionary.
        // Each mutator only takes the values it can use.
        123);
  }

  @FuzzTest
  // Just propagate the dictionary to all types of the fuzz test method that can use it.
  @DictionaryProvider(
      value = {"myDict"},
      // Don't want to wait, force String mutators to use dictionary values every other time.
      pInv = 2)
  public static void fuzzerTestOneInput(
      @NotNull @WithUtf8Length(max = 10000) String data,
      @NotNull @WithUtf8Length(max = 10000) String data2) {
    /*
     * libFuzzer's table of recent compares only allows 64 bytes, so asking the fuzzer to construct
     * these long strings would run for a very very long time without finding them. With a
     * DictionaryProvider this problem is trivial, because we can directly provide these long strings to
     * the fuzzer, and also force that they are used more often by setting pInv to a low value.
     */
    if (data.equals("0123456789abcdef".repeat(50))
        && data2.equals("sitting duck suprime".repeat(53))) {
      throw new FuzzerSecurityIssueLow("Found the long string!");
    }
  }

@oetr oetr force-pushed the CIF-1785-dictionary-provider branch 15 times, most recently from 958d54a to d164e3c Compare October 16, 2025 23:50
@oetr oetr force-pushed the CIF-1785-dictionary-provider branch 3 times, most recently from 5eac7ec to 5069fb3 Compare October 28, 2025 19:20
@oetr oetr changed the title from "feat: dictionary provider for selected types" to "feat: dictionary provider for Strings and Integrals" Oct 28, 2025
@oetr oetr marked this pull request as ready for review October 28, 2025 19:25
Copilot AI review requested due to automatic review settings October 28, 2025 19:25
Copilot AI left a comment

Pull Request Overview

This PR adds dictionary provider support to the mutation framework, enabling users to provide custom dictionary values for fuzzing through the @DictionaryProvider annotation. The implementation:

  • Introduces MutatorRuntime class to provide runtime information (including the fuzz test method) to mutators
  • Adds @DictionaryProvider annotation that references static methods returning Stream<?> of dictionary values
  • Updates all MutatorFactory implementations to accept MutatorRuntime parameter
  • Implements dictionary support for String and integral mutators using weighted sampling
  • Adds SamplingUtils with Vose's alias method for efficient O(1) weighted sampling
  • Includes comprehensive tests for the new functionality
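
For readers unfamiliar with Vose's alias method, here is a minimal, self-contained sketch of the technique. The class `AliasSampler` and its method names are hypothetical stand-ins for illustration and are not taken from the PR's `SamplingUtils`. Building the tables takes O(n); each sample then needs one uniform column pick and one biased coin flip.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

/** Hypothetical illustration of Vose's alias method; not the PR's SamplingUtils. */
final class AliasSampler {
  private final double[] keepProbability; // chance of keeping the picked column
  private final int[] alias; // index returned when the column is not kept

  AliasSampler(double[] weights) {
    int n = weights.length;
    if (n == 0) {
      throw new IllegalArgumentException("weights must not be empty");
    }
    double sum = 0;
    for (double w : weights) {
      if (w < 0) {
        throw new IllegalArgumentException("weights must be non-negative");
      }
      sum += w;
    }
    if (sum <= 0) {
      throw new IllegalArgumentException("at least one weight must be positive");
    }
    keepProbability = new double[n];
    alias = new int[n];
    // Scale the weights so that the average column height is exactly 1.
    double[] scaled = new double[n];
    Deque<Integer> small = new ArrayDeque<>();
    Deque<Integer> large = new ArrayDeque<>();
    for (int i = 0; i < n; i++) {
      scaled[i] = weights[i] * n / sum;
      if (scaled[i] < 1.0) {
        small.push(i);
      } else {
        large.push(i);
      }
    }
    // Top up each underfull column with mass taken from an overfull one.
    while (!small.isEmpty() && !large.isEmpty()) {
      int s = small.pop();
      int l = large.pop();
      keepProbability[s] = scaled[s];
      alias[s] = l;
      scaled[l] = (scaled[l] + scaled[s]) - 1.0;
      if (scaled[l] < 1.0) {
        small.push(l);
      } else {
        large.push(l);
      }
    }
    // Leftover columns are full, up to floating-point rounding.
    while (!large.isEmpty()) {
      int l = large.pop();
      keepProbability[l] = 1.0;
      alias[l] = l;
    }
    while (!small.isEmpty()) {
      int s = small.pop();
      keepProbability[s] = 1.0;
      alias[s] = s;
    }
  }

  /** O(1) per sample: pick a column uniformly, then keep it or jump to its alias. */
  int sample(Random rng) {
    int column = rng.nextInt(keepProbability.length);
    return rng.nextDouble() < keepProbability[column] ? column : alias[column];
  }
}
```

With weights `{1, 0, 3}`, index 1 should never be returned, and index 0 should come up in roughly a quarter of the samples.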

Reviewed Changes

Copilot reviewed 85 out of 85 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| DictionaryProvider.java | New annotation for specifying dictionary provider methods with probability control |
| MutatorRuntime.java | New runtime info class providing fuzz test method to mutators |
| DictionaryProviderSupport.java | Helper methods to extract dictionary values from provider methods |
| IgnoreRecursiveConflicts.java | Meta-annotation to allow duplicate annotations during type hierarchy propagation |
| SamplingUtils.java | Weighted sampling utilities using Vose's alias method |
| StringMutatorFactory.java | Implements dictionary support for String mutators |
| IntegralMutatorFactory.java | Implements dictionary support for integral type mutators |
| MutatorFactory.java | Updated interface to accept MutatorRuntime parameter |
| ArgumentsMutator.java | Forwards method-level @DictionaryProvider annotations to parameters |
| TestSupport.java | Adds helper methods for creating dummy MutatorRuntime in tests |
| All other factory files | Updated to pass through MutatorRuntime parameter |

@oetr oetr force-pushed the CIF-1785-dictionary-provider branch 3 times, most recently from 87f2aba to 41633c8 Compare October 30, 2025 12:29
oetr added 5 commits October 30, 2025 13:30
In addition to primitive arrays, these types are now also supported:
- List<Integer> []
- List<Integer> [][][]
Allow annotations to be inherited at multiple levels of the type hierarchy,
enabling both broad and specific configuration of mutators.

Use case: Configure mutators that share common types. For example, annotate
a fuzz test method to apply default settings to all String mutators, while
still allowing individual String parameters to override those settings with
different values.

Without this feature, an annotation could only appear once in the inheritance
chain, preventing this layered configuration approach.
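
The layered configuration described above can be modeled roughly as follows. This is a simplified sketch, not the PR's implementation: it flattens the type hierarchy to a single method and its parameters, and `@MaxLength`, `fuzz`, and `effectiveMaxLength` are hypothetical names invented for the example. The idea is only that the most specific annotation wins and the method-level one acts as a default.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

final class LayeredAnnotationSketch {
  /** Hypothetical setting that may appear on a method and again on a parameter. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target({ElementType.METHOD, ElementType.PARAMETER})
  @interface MaxLength {
    int value();
  }

  // Method-level default of 100; the second parameter overrides it with 10.
  @MaxLength(100)
  static void fuzz(String a, @MaxLength(10) String b) {}

  /** The parameter's own annotation wins; otherwise fall back to the method's. */
  static int effectiveMaxLength(Method method, Parameter parameter) {
    MaxLength own = parameter.getAnnotation(MaxLength.class);
    if (own != null) {
      return own.value();
    }
    MaxLength inherited = method.getAnnotation(MaxLength.class);
    if (inherited == null) {
      throw new IllegalStateException("no @MaxLength found on parameter or method");
    }
    return inherited.value();
  }

  /** Resolves the effective value for every parameter of fuzz(). */
  static int[] effectiveValues() {
    try {
      Method method =
          LayeredAnnotationSketch.class.getDeclaredMethod("fuzz", String.class, String.class);
      Parameter[] parameters = method.getParameters();
      int[] result = new int[parameters.length];
      for (int i = 0; i < parameters.length; i++) {
        result[i] = effectiveMaxLength(method, parameters[i]);
      }
      return result;
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException("fuzz method not found", e);
    }
  }
}
```

Here `effectiveValues()` yields 100 for the first parameter (inherited from the method) and 10 for the second (overridden on the parameter).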
Enables easy tweaking of probabilities for individual mutation functions
in the future.
For now, MutatorRuntime only stores the fuzz test method.
@oetr oetr force-pushed the CIF-1785-dictionary-provider branch from 41633c8 to 0044fb9 Compare October 30, 2025 12:30
oetr added 6 commits October 30, 2025 13:42
This is just the enabling work. Methods and types annotated with
@DictionaryProvider recursively propagate this annotation down the
type hierarchy by default (propagation can be limited to the annotated type only).

Any mutator can now be adapted to use the user-provided values
this annotation points to.
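
As a rough sketch of how a mutator could consume such a provider: look up the static, zero-argument method by name, invoke it, and keep only the values of the type the mutator handles. The class `ProviderLookupSketch` and the helper `valuesOfType` are hypothetical; the PR's `DictionaryProviderSupport` may resolve and filter providers differently.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ProviderLookupSketch {
  // Same shape as the provider in the PR description: static, no arguments, Stream<?>.
  public static Stream<?> myDict() {
    return Stream.of("alpha", 123, "beta", 4.5);
  }

  /** Invokes the named static provider and keeps only values of the wanted type. */
  static <T> List<T> valuesOfType(Class<?> holder, String providerName, Class<T> wanted) {
    try {
      Method provider = holder.getDeclaredMethod(providerName);
      if (!Modifier.isStatic(provider.getModifiers())) {
        throw new IllegalArgumentException(providerName + " must be static");
      }
      if (!Stream.class.isAssignableFrom(provider.getReturnType())) {
        throw new IllegalArgumentException(providerName + " must return a Stream");
      }
      Stream<?> values = (Stream<?>) provider.invoke(null);
      // Each consumer picks out only the values it can use and ignores the rest.
      return values.filter(wanted::isInstance).map(wanted::cast).collect(Collectors.toList());
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("cannot invoke provider " + providerName, e);
    }
  }
}
```

Calling `valuesOfType(ProviderLookupSketch.class, "myDict", String.class)` keeps the two strings, while asking for `Integer.class` keeps only `123`.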
The StringMutatorFactory now extracts applicable Strings from the
@DictionaryProvider and uses them during mutation according to
the pInv of the last @DictionaryProvider annotation it found on this type.
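
The role of pInv can be sketched as an inverse probability: a dictionary value is chosen roughly once every pInv mutations, which matches the "every other time" comment for `pInv = 2` in the PR description. The class `PInvSketch`, the `fallback` parameter, and the uniform choice among dictionary entries are assumptions made for this example, not the PR's actual logic.

```java
import java.util.List;
import java.util.Random;
import java.util.function.UnaryOperator;

final class PInvSketch {
  /**
   * With probability 1/pInv, returns a dictionary value (chosen uniformly here, which is an
   * assumption); otherwise applies the regular mutation passed in as fallback.
   */
  static String mutate(
      String current,
      List<String> dictionary,
      int pInv,
      Random rng,
      UnaryOperator<String> fallback) {
    if (pInv < 1) {
      throw new IllegalArgumentException("pInv must be at least 1");
    }
    if (!dictionary.isEmpty() && rng.nextInt(pInv) == 0) {
      return dictionary.get(rng.nextInt(dictionary.size()));
    }
    return fallback.apply(current);
  }
}
```

A lower pInv makes the dictionary values appear more often; `pInv = 1` would use them on every mutation as long as the dictionary is not empty.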
After adding @DictionaryProvider support to IntegralMutatorFactory, the
selection of mutation functions now takes an additional step: the
weightedSampler decides whether to keep the initially selected function
or to switch to its alias.
Some tests have overly strict expectations on mutator output that are far
from the true probabilities. Simply running the stress test for more
iterations or with a different seed makes these tests fail due to
variance.
Changing how the mutators use the PRNG can affect the duration of some
tests. Slow GitHub runners are especially affected.
@oetr oetr force-pushed the CIF-1785-dictionary-provider branch from 0044fb9 to 84707f2 Compare October 30, 2025 12:42