Intro to OSS-Fuzz-Gen

A Framework for Fuzz Target Generation and Evaluation

Konstantinos Chousos

Department of Informatics & Telecommunications, University of Athens

April 11, 2025

Overview

  1. Intro to fuzzing
  2. OSS-Fuzz
  3. OSS-Fuzz-Gen
    1. from_scratch branch
  4. Future work

Fuzzing

What is fuzzing?

Fuzzing is the execution of a Program Under Test (PUT) using input(s) sampled from an input space (the “fuzz input space”) that protrudes the expected input space of the PUT [1].

Overview of a fuzz campaign.

  • These inputs are often generated or mutated automatically.

    Generational fuzzing: inputs are generated from scratch, e.g. randomly from a BNF grammar.
    Mutational fuzzing: inputs are produced by mutating inputs from a pre-existing corpus.
  • Goal: trigger unexpected behavior (e.g., crashes, hangs, memory errors).
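
To make the mutational approach concrete, here is a minimal sketch of a single mutation step. The helper name `mutate_once` and its three operators are illustrative assumptions, not the implementation of any particular fuzzer; real fuzzers use many more strategies and steer them with coverage feedback.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <random>
#include <vector>

// Hypothetical helper: apply one random mutation to a copy of a seed input.
// Operators (illustrative only): flip a bit, overwrite a byte, insert a byte.
std::vector<uint8_t> mutate_once(std::vector<uint8_t> input, std::mt19937 &rng) {
    if (input.empty()) {
        // Nothing to flip or overwrite, so grow the input instead.
        input.push_back(static_cast<uint8_t>(rng() & 0xFF));
        return input;
    }
    const size_t pos = rng() % input.size();
    switch (rng() % 3) {
    case 0:  // flip a single bit
        input[pos] ^= static_cast<uint8_t>(1u << (rng() % 8));
        break;
    case 1:  // overwrite a byte with a random value
        input[pos] = static_cast<uint8_t>(rng() & 0xFF);
        break;
    default:  // insert a random byte before position pos
        input.insert(input.begin() + static_cast<ptrdiff_t>(pos),
                     static_cast<uint8_t>(rng() & 0xFF));
        break;
    }
    return input;
}
```

A fuzzing loop would repeatedly pick a seed from the corpus, call `mutate_once` on it, and run the PUT on the result.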

Why fuzz?

The purpose of fuzzing relies on the assumption that there are bugs within every program, which are waiting to be discovered. Therefore, a systematic approach should find them sooner or later.

OWASP Foundation

Fuzz testing is valuable for:

  • Software that receives inputs from untrusted sources (security);
  • Sanity checking the equivalence of two complex algorithms (correctness);
  • Verifying the stability of a high-volume API that takes complex inputs (stability), e.g. a decompressor, even if all the inputs are trusted.

Google

Success stories

  • Heartbleed vulnerability, OpenSSL [2] (CVE-2014-0160)
    • Easily found with fuzzing ⇒ Preventable
  • Shellshock vulnerabilities, Bash (CVE-2014-6271; the follow-up bugs CVE-2014-6277 and CVE-2014-6278 were found with AFL)
  • Mayhem (FKA ForAllSecure) [3]
    1. Cloudflare
    2. OpenWRT

Fuzzer implementations

  • LibFuzzer [4].
    • In-process, coverage-guided, mutation-based fuzzer.
  • American Fuzzy Lop (AFL) [5].
    • Instrumented binaries for edge coverage.
    • Superseded by AFL++ [6], which adds more fuzzing strategies, better speed, and QEMU/Unicorn support.

LibFuzzer

LibFuzzer is an in-process, coverage-guided, evolutionary fuzzing engine. LibFuzzer is linked with the library under test, and feeds fuzzed inputs to the library via a specific fuzzing entrypoint (fuzz target).

LibFuzzer is used to fuzz library functions: the programmer writes a fuzz target that exercises the implementation under test.

Fuzz target

A function that accepts an array of bytes and does something interesting with these bytes using the API under test [4].

AKA fuzz driver, fuzzer entry point, harness.

Fuzz target structure

  • Entry point called repeatedly with mutated inputs.
  • Feedback-driven: uses coverage to guide mutations.
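  • Best for libraries, not full programs.

#include <stddef.h>
#include <stdint.h>

// Entry point that libFuzzer calls once per generated input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  DoSomethingWithData(Data, Size);  // placeholder for the API under test
  return 0;
}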
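
As a more concrete illustration, the sketch below fills in the placeholder with a toy function. `ParseKeyValue` is a hypothetical API invented for this example; only the `LLVMFuzzerTestOneInput` signature comes from libFuzzer itself.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string>

// Hypothetical API under test: parses "key=value" and returns the length of
// the value, or -1 if the text contains no '='.
int ParseKeyValue(const std::string &text) {
    const std::string::size_type eq = text.find('=');
    if (eq == std::string::npos) return -1;
    return static_cast<int>(text.size() - eq - 1);
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // Data is not NUL-terminated, so copy exactly Size bytes into a string.
    const std::string text(reinterpret_cast<const char *>(Data), Size);
    ParseKeyValue(text);  // crashes and sanitizer reports are what we look for
    return 0;
}
```

With Clang, a target like this is typically built with `clang++ -g -fsanitize=fuzzer,address target.cc`, which links in libFuzzer's own `main`.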

AFL++

AFL++ fuzzes whole programs (binaries). Inputs are the seed files in seeds_dir and mutations of them.

$ ./afl-fuzz -i seeds_dir -o output_dir -- /path/to/tested/program

  • Works on black-box or instrumented binaries.
  • Uses fork-server model for speed.
  • Supports persistent mode, QEMU, and Unicorn modes.

OSS-Fuzz

Continuous fuzzing for open source software

Scalable, distributed, CI fuzzing solution for open-source projects [7].

  • Supports LibFuzzer, AFL++, Honggfuzz and Centipede fuzzing engines.
  • Supports C/C++, Rust, Go, Python and Java/JVM projects.
  • Based on ClusterFuzz [8].
  • Started in 2016, in response to the Heartbleed vulnerability [2].

The vulnerability had the potential to affect almost every internet user, yet was caused by a relatively simple memory buffer overflow bug that could have been detected by fuzzing [9].

Problems

  • Upfront cost of writing fuzz targets.
  • Integration specifications:
    • project.yaml
    • Dockerfile
    • build.sh
  • Only “big” projects are accepted (judged by stars or lines of code).
  • Requires a Google developer account.

OSS-Fuzz-Gen

This framework generates fuzz targets for real-world C/C++, Java, Python projects with various Large Language Models (LLM) and benchmarks them via the OSS-Fuzz platform [10].

  • Goal: Take as input a GitHub repository and output an OSS-Fuzz project as well as a ClusterFuzzLite project with a meaningful fuzz harness [11].

Architecture

Warning

The project must come with preexisting fuzz targets. Fuzz-Introspector gives the LLM info about the harnesses, not the main program/functions.

LLM Prompting

  1. Input: Fuzz-Introspector JSON code reports.
  2. Include the above in prompt templates → send to LLM.
  3. Result: Harness returned from LLM.
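
The templating step can be sketched as follows. The template text, the `{LANGUAGE}` and `{SIGNATURE}` placeholders, and the `Fill` and `BuildPrompt` helpers are all hypothetical; OSS-Fuzz-Gen's real templates and report parsing differ. The sketch assumes the function signature has already been extracted from the report.

```cpp
#include <string>

// Hypothetical prompt template; not taken from OSS-Fuzz-Gen.
const std::string kTemplate =
    "Write a libFuzzer fuzz target in {LANGUAGE} for the function:\n"
    "{SIGNATURE}\n"
    "Return only the source code.";

// Replace every occurrence of `key` in `text` with `value`.
std::string Fill(std::string text, const std::string &key,
                 const std::string &value) {
    if (key.empty()) return text;
    for (std::string::size_type pos = text.find(key);
         pos != std::string::npos;
         pos = text.find(key, pos + value.size())) {
        text.replace(pos, key.size(), value);
    }
    return text;
}

// Build the final prompt that would be sent to the LLM.
std::string BuildPrompt(const std::string &language,
                        const std::string &signature) {
    return Fill(Fill(kTemplate, "{LANGUAGE}", language),
                "{SIGNATURE}", signature);
}
```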

Results

One of our sample projects, tinyxml2, went from 38% line coverage to 69% without any interventions from our team.

Problems

  • Project needs to be part of OSS-Fuzz to use OSS-Fuzz-Gen’s capabilities.
    • Same hindrances as OSS-Fuzz.
  • Project needs preexisting harnesses.
  • The quality of the generated harnesses varies from good to bad.

from_scratch Branch

Future plans for OSS-Fuzz-Gen include bootstrapping fuzzing for a project, i.e. generating harnesses for a codebase that has none.

The work for this feature is located in https://github.com/google/oss-fuzz-gen/blob/main/experimental/from_scratch. The latest commits do not work. Known working commit: 171aac2.

Demo Time

  1. Clone and install Fuzz-Introspector.
  2. Clone and set up OSS-Fuzz-Gen.
    1. Check out the working commit: $ git checkout 171aac2.
    2. Export API key.
  3. Prepare a target project. README uses dvhar/dateparse.
  4. Execute the script:

     $ python3 -m experimental.from_scratch.generate \
           --language c++ \
           --model gpt-4 \
           --function dateparse \
           --target-dir ../../dvhar/dateparse/ \
           --out-dir out

Result

// out/01.rawoutput
<code>
#include <stdint.h>
#include <stddef.h>

typedef struct {
    int sec;     /* seconds after the minute - [ 0 to 59 ] */
    int min;     /* minutes after the hour - [ 0 to 59 ] */
    int hour;    /* hours since midnight - [ 0 to 23] */
    int mday;    /* day of the month - [ 1 to 31 ] */
    int mon;     /* months since January - [ 0 to 11 ] */
    int year;    /* years */
    int wday;    /* days since Sunday - [ 0 to 6 ] */
    int yday;    /* days since January 1 - [ 0 to 365 ] */
} date_t;

extern int dateparse(const char* datestr, date_t* t, int *offset, int stringlen);

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Ignore input if it is less than 1
    if (size < 1) {
        return 0;
    }

    // Convert data to string
    char *datestr = (char *)data;

    // Initialize a date_t struct and an offset integer
    date_t t;
    int offset = 0;

    // Call the function-under-test
    dateparse(datestr, &t, &offset, (int)size);

    return 0;
}
</code>

Problems

  1. Response wrapped in <code> tags.
  2. Even without them, the harness does not compile.
  3. Missing headers.
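
The first problem can be handled with a small post-processing step. The sketch below is a hypothetical helper, not part of OSS-Fuzz-Gen, and it only removes the wrapper; it does nothing about the compilation failure or the missing headers.

```cpp
#include <string>

// Hypothetical post-processing step: return the text between the first
// "<code>" and the last "</code>", or the input unchanged if either tag
// is absent.
std::string StripCodeTags(const std::string &response) {
    const std::string open = "<code>";
    const std::string close = "</code>";
    const std::string::size_type begin = response.find(open);
    const std::string::size_type end = response.rfind(close);
    if (begin == std::string::npos || end == std::string::npos ||
        end < begin + open.size()) {
        return response;
    }
    std::string body =
        response.substr(begin + open.size(), end - begin - open.size());
    // Drop the newline that usually follows the opening tag.
    if (!body.empty() && body.front() == '\n') body.erase(0, 1);
    return body;
}
```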

Where do we go from here?

Future work

High-level goal

A GitHub Action that, when integrated into a C/C++ project, will:

  1. Use LLMs to create fuzz targets from scratch.
  2. Build and run them, then evaluate them based on runtime, coverage, etc.
  3. Create PRs to integrate them into the project.

“Good to have” features

  1. No strict prerequisites.
    • E.g. project structure, build system.
  2. Support for Python projects using the Atheris [12] fuzzer.

Flowchart

%%{ init: { "flowchart": { "nodeSpacing": 30, "rankSpacing": 30 } } }%%
flowchart LR
    n1(["Start"]) --> n2["Add action"]
    n2 --> n8["Project info"]
    n3["LLM"] --> n4[/"Gen harnesses"/]
    n4 --> n5{"Pass?"}
    n5 -- False --> n3
    n5 -- True --> n6["PR"]
    n6 --> n7(["End"])
    n8 --> n3
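
The generate-and-check loop in the flowchart could be driven by something like the sketch below. Everything here is hypothetical: `GenerateUntilPass` and its callbacks stand in for the LLM call and for the build-and-run check, and the `max_attempts` cap is an added assumption so the loop cannot run forever.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical driver: ask a generator (the LLM) for a harness, check it
// (build and run), and retry until it passes or max_attempts is reached.
std::optional<std::string> GenerateUntilPass(
    const std::function<std::string(int)> &generate,
    const std::function<bool(const std::string &)> &passes,
    int max_attempts) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        std::string harness = generate(attempt);
        if (passes(harness)) return harness;  // next step would be the PR
    }
    return std::nullopt;  // no passing harness within the attempt budget
}
```

Passing the attempt number to the generator leaves room for feeding earlier failures back into the next prompt. The sketch needs C++17 for `std::optional`.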

References

[1]
V. J. M. Manes et al., “The Art, Science, and Engineering of Fuzzing: A Survey.” [Online]. Available: http://arxiv.org/abs/1812.00140
[2]
“Heartbleed Bug.” [Online]. Available: https://heartbleed.com/
[3]
T. Simonite, “This Bot Hunts Software Bugs for the Pentagon,” Wired, Jun. 01, 2020. Available: https://www.wired.com/story/bot-hunts-software-bugs-pentagon/
[4]
“libFuzzer – a library for coverage-guided fuzz testing. — LLVM 21.0.0git documentation.” [Online]. Available: https://llvm.org/docs/LibFuzzer.html
[5]
“American fuzzy lop.” [Online]. Available: https://lcamtuf.coredump.cx/afl/
[6]
M. Heuse, H. Eißfeldt, A. Fioraldi, and D. Maier, AFL++. (Jan. 2022). Available: https://github.com/AFLplusplus/AFLplusplus
[7]
A. Arya, O. Chang, J. Metzman, K. Serebryany, and D. Liu, OSS-Fuzz. (Apr. 08, 2025). Available: https://github.com/google/oss-fuzz
[8]
Google/clusterfuzz. (Apr. 09, 2025). Google. Available: https://github.com/google/clusterfuzz
[9]
“OSS-Fuzz Documentation.” [Online]. Available: https://google.github.io/oss-fuzz/
[10]
D. Liu, O. Chang, J. Metzman, M. Sablotny, and M. Maruseac, OSS-Fuzz-Gen: Automated fuzz target generation. (May 2024). Available: https://github.com/google/oss-fuzz-gen
[11]
OSS-Fuzz Maintainers, “Introducing LLM-based harness synthesis for unfuzzed projects.” [Online]. Available: https://blog.oss-fuzz.com/posts/introducing-llm-based-harness-synthesis-for-unfuzzed-projects/
[12]
Google/atheris. (Apr. 09, 2025). Google. Available: https://github.com/google/atheris

These slides can be found at: https://kchousos.github.io/ofg-presentation/

Thank you!