Intro to OSS-Fuzz-Gen

A Framework for Fuzz Target Generation and Evaluation

Konstantinos Chousos

sdi2000215@di.uoa.gr

Department of Informatics & Telecommunications, University of Athens

April 11, 2025

Overview

Intro to fuzzing
OSS-Fuzz
OSS-Fuzz-Gen
1. from_scratch branch
Future work

Fuzzing

What is fuzzing?

Fuzzing is the execution of a Program Under Test (PUT) using input(s) sampled from an input space (the “fuzz input space”) that protrudes the expected input space of the PUT [1].

Fuzzing

What is fuzzing?

These inputs are often generated or mutated automatically.

Generational fuzzing

Inputs generated from scratch from a model of the input format, e.g. a BNF grammar.

Mutational fuzzing

Inputs produced by mutating entries of a pre-existing corpus.
Goal: trigger unexpected behavior (e.g., crashes, hangs, memory errors).
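
As a rough illustration of the mutational approach, the sketch below flips a single random bit of a seed input. The names `mutate_bit_flip` and `differing_bits` are made up for this example; real fuzzers such as LibFuzzer and AFL++ combine many more mutation strategies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Return a copy of `seed` with one randomly chosen bit flipped.
std::vector<std::uint8_t> mutate_bit_flip(std::vector<std::uint8_t> seed,
                                          std::mt19937 &rng) {
  assert(!seed.empty());  // there must be at least one byte to mutate
  std::uniform_int_distribution<std::size_t> pick_byte(0, seed.size() - 1);
  std::uniform_int_distribution<int> pick_bit(0, 7);
  const std::size_t byte_index = pick_byte(rng);
  const int bit_index = pick_bit(rng);
  seed[byte_index] ^= static_cast<std::uint8_t>(1u << bit_index);
  return seed;
}

// Count the bits in which two equally sized inputs differ.
int differing_bits(const std::vector<std::uint8_t> &a,
                   const std::vector<std::uint8_t> &b) {
  assert(a.size() == b.size());
  int count = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    // Clear the lowest set bit until no differing bits remain.
    for (unsigned diff = a[i] ^ b[i]; diff != 0; diff &= diff - 1) ++count;
  }
  return count;
}
```

A mutant produced this way stays close to a valid input, which is why mutation from a good corpus tends to reach deeper code than purely random bytes.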

Fuzzing

Why fuzz?

The purpose of fuzzing relies on the assumption that there are bugs within every program, which are waiting to be discovered. Therefore, a systematic approach should find them sooner or later.

— OWASP Foundation

Fuzzing

Why fuzz?

Fuzz testing is valuable for:

Software that receives inputs from untrusted sources (security);

Sanity checking the equivalence of two complex algorithms (correctness);

Verifying the stability of a high-volume API that takes complex inputs (stability), e.g. a decompressor, even if all the inputs are trusted.

— Google

Fuzzing

Success stories

Heartbleed vulnerability, OpenSSL [2] (CVE-2014-0160)
- Easily found with fuzzing ⇒ Preventable
Shellshock vulnerabilities, Bash (CVE-2014-6271)
Mayhem (FKA ForAllSecure) [3]
1. Cloudflare
2. OpenWRT

Fuzzing

Fuzzer implementations

LibFuzzer [4].
- In-process, coverage-guided, mutation-based fuzzer.
American Fuzzy Lop (AFL) [5].
- Instruments binaries to collect edge coverage.
- Superseded by AFL++ [6], which adds more fuzzing strategies, better speed, and QEMU/Unicorn support.

LibFuzzer

LibFuzzer is an in-process, coverage-guided, evolutionary fuzzing engine. LibFuzzer is linked with the library under test, and feeds fuzzed inputs to the library via a specific fuzzing entrypoint (fuzz target).

Used to fuzz library functions. The programmer writes a fuzz target to test their implementation.

LibFuzzer

Fuzz target

A function that accepts an array of bytes and does something interesting with these bytes using the API under test [4].

AKA fuzz driver, fuzzer entry point, harness.

LibFuzzer

Fuzz target structure

Entry point called repeatedly with mutated inputs.
Feedback-driven: uses coverage to guide mutations.
Best for libraries, not full programs.

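#include <stddef.h>
#include <stdint.h>

// LibFuzzer calls this entry point once for every input it generates.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  DoSomethingWithData(Data, Size);  // placeholder for the API under test
  return 0;
}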
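
To make the skeleton more concrete, here is a minimal sketch with a toy API under test. `count_pairs` is a hypothetical function invented for this example, not part of any real library. The point is the harness pattern: the input buffer is not NUL-terminated, so it is copied into a `std::string` together with its size before it is handed to the API.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>
#include <string>

// Hypothetical API under test: counts the ';'-separated fields of `text`
// that contain a '=' (e.g. "a=1;b=2;c" has two such fields).
int count_pairs(const std::string &text) {
  int pairs = 0;
  size_t start = 0;
  while (start <= text.size()) {
    size_t end = text.find(';', start);
    if (end == std::string::npos) end = text.size();
    // The field [start, end) counts only if a '=' lies inside it.
    if (text.find('=', start) < end) ++pairs;
    start = end + 1;
  }
  return pairs;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  assert(Data != nullptr || Size == 0);
  // Copy exactly Size bytes; never assume a terminating NUL byte.
  const std::string text(reinterpret_cast<const char *>(Data), Size);
  count_pairs(text);
  return 0;
}
```

With Clang, a file like this is typically built with `-fsanitize=fuzzer,address`, which links in LibFuzzer's own `main`.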

AFL++

AFL++ fuzzes whole programs/binaries. Inputs are the seed files in seeds_dir and mutations of them.

$ ./afl-fuzz -i seeds_dir -o output_dir -- /path/to/tested/program

Works on black-box or instrumented binaries.
Uses fork-server model for speed.
Supports persistent mode, QEMU, and Unicorn modes.

OSS-Fuzz

Continuous fuzzing for open source software

Scalable, distributed, CI fuzzing solution for open-source projects [7].

Supports LibFuzzer, AFL++, Honggfuzz and Centipede fuzzing engines.
Supports C/C++, Rust, Go, Python and Java/JVM projects.
Based on ClusterFuzz [8].
Started in 2016, in response to the Heartbleed vulnerability [2].

The vulnerability had the potential to affect almost every internet user, yet was caused by a relatively simple memory buffer overflow bug that could have been detected by fuzzing [9].

OSS-Fuzz

Problems

Upfront cost of writing fuzz targets.
Integration specifications:
- project.yaml
- Dockerfile
- build.sh
Only accepts “big” projects (by stars/LoC).
Requires a Google developer account.

OSS-Fuzz-Gen

This framework generates fuzz targets for real-world C/C++, Java, Python projects with various Large Language Models (LLM) and benchmarks them via the OSS-Fuzz platform [10].

Goal: Take as input a GitHub repository and output an OSS-Fuzz project as well as a ClusterFuzzLite project with a meaningful fuzz harness [11].

OSS-Fuzz-Gen

Architecture

Warning

The project must come with preexisting fuzz targets. Fuzz-Introspector gives the LLM info about the harnesses, not the main program/functions.

Given a GitHub repository link, the following steps take place:

Compile the project using predefined generic build scripts and other “build heuristics”.
Compile it again with Fuzz Introspector for program analysis -> JSON report file with statistics for each function, as well as information about its signature, arguments, etc.
The report is used in a prompt given to the LLM, which produces a harness for a specific function.
Each harness is tested to check that it works and does not crash immediately. The harnesses are then integrated into OSS-Fuzz/ClusterFuzzLite projects.

OSS-Fuzz-Gen

LLM Prompting

Input: Fuzz-Introspector JSON reports.
The report data is filled into prompt templates → sent to the LLM.
Result: Harness returned from LLM.
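
The template-filling step could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `FunctionInfo` struct is a simplified stand-in for one function entry of the report, and the `{SIGNATURE}` placeholder is invented. OSS-Fuzz-Gen's real templates and report fields differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified view of one function entry in an analysis report.
struct FunctionInfo {
  std::string return_type;
  std::string name;
  std::vector<std::string> params;  // e.g. "const char* datestr"
};

// Render the function's signature as a single line of C.
std::string render_signature(const FunctionInfo &fn) {
  assert(!fn.name.empty());
  std::string sig = fn.return_type + " " + fn.name + "(";
  for (std::size_t i = 0; i < fn.params.size(); ++i) {
    if (i > 0) sig += ", ";
    sig += fn.params[i];
  }
  return sig + ")";
}

// Replace every "{SIGNATURE}" placeholder in the template with the signature.
std::string build_prompt(const std::string &tmpl, const FunctionInfo &fn) {
  const std::string placeholder = "{SIGNATURE}";
  const std::string sig = render_signature(fn);
  std::string prompt = tmpl;
  std::size_t pos = prompt.find(placeholder);
  while (pos != std::string::npos) {
    prompt.replace(pos, placeholder.size(), sig);
    pos = prompt.find(placeholder, pos + sig.size());
  }
  return prompt;
}
```

The resulting string would be sent to the LLM, and the returned text treated as a candidate harness.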

OSS-Fuzz-Gen

Results

One of our sample projects, tinyxml2, went from 38% line coverage to 69% without any interventions from our team.

OSS-Fuzz-Gen

Problems

Project needs to be part of OSS-Fuzz to use OSS-Fuzz-Gen’s capabilities.
- Same hindrances as OSS-Fuzz.
Project needs preexisting harnesses.
Results range from good to bad.

`from_scratch` Branch

Future plans for OSS-Fuzz-Gen include bootstrapping a project fuzz-wise, meaning generating harnesses for a codebase without harnesses.

The work for this feature is located in https://github.com/google/oss-fuzz-gen/blob/main/experimental/from_scratch. The latest commits do not work. Known working commit: 171aac2.

Demo Time

Clone and install Fuzz-Introspector.
Clone and setup OSS-Fuzz-Gen.
1. Check out the known working commit: $ git checkout 171aac2.
2. Export API key.
Prepare a target project. The README uses dvhar/dateparse as its example.
Execute the script:

$ python3 -m experimental.from_scratch.generate \
              --language c++ \
              --model gpt-4 \
              --function dateparse \
              --target-dir ../../dvhar/dateparse/ \
              --out-dir out

Demo Time

Result

// out/01.rawoutput
<code>
#include <stdint.h>
#include <stddef.h>

typedef struct {
    int sec;     /* seconds after the minute - [ 0 to 59 ] */
    int min;     /* minutes after the hour - [ 0 to 59 ] */
    int hour;    /* hours since midnight - [ 0 to 23] */
    int mday;    /* day of the month - [ 1 to 31 ] */
    int mon;     /* months since January - [ 0 to 11 ] */
    int year;    /* years */
    int wday;    /* days since Sunday - [ 0 to 6 ] */
    int yday;    /* days since January 1 - [ 0 to 365 ] */
} date_t;

extern int dateparse(const char* datestr, date_t* t, int *offset, int stringlen);

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Ignore input if it is less than 1
    if (size < 1) {
        return 0;
    }

    // Convert data to string
    char *datestr = (char *)data;

    // Initialize a date_t struct and an offset integer
    date_t t;
    int offset = 0;

    // Call the function-under-test
    dateparse(datestr, &t, &offset, (int)size);

    return 0;
}
</code>

Demo Time

Problems

Response wrapped in <code> tags.
Even without them, harness does not compile.
Missing headers.
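
The first problem can be handled by post-processing the response before it is written to a source file. The helper below is a minimal sketch of that idea; `strip_code_tags` is a hypothetical name and is not part of OSS-Fuzz-Gen. It does nothing about the compilation errors, which need separate fixes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return the text between the first "<code>" and the last "</code>",
// trimmed of surrounding line breaks. If the tags are missing, the
// response is returned unchanged.
std::string strip_code_tags(const std::string &response) {
  const std::string open_tag = "<code>";
  const std::string close_tag = "</code>";
  const std::size_t open_pos = response.find(open_tag);
  const std::size_t close_pos = response.rfind(close_tag);
  if (open_pos == std::string::npos || close_pos == std::string::npos ||
      close_pos < open_pos + open_tag.size()) {
    return response;
  }
  const std::size_t begin = open_pos + open_tag.size();
  assert(begin <= close_pos);
  const std::string body = response.substr(begin, close_pos - begin);
  const std::size_t first = body.find_first_not_of("\r\n");
  if (first == std::string::npos) return "";  // only line breaks inside the tags
  const std::size_t last = body.find_last_not_of("\r\n");
  return body.substr(first, last - first + 1);
}
```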

Where do we go from here?

Future work

High-level goal

A GitHub Action that, when integrated into a C/C++ project, will:

Use LLMs to create fuzz targets from scratch.
Build and run them, then evaluate them based on runtime, coverage, etc.
Create PRs to integrate them into the project.

Future work

“Good to have” features

No strict prerequisites.
- E.g. project structure, build system.
Support for Python projects using the Atheris [12] fuzzer.

Future work

Flowchart

%%{ init: { "flowchart": { "nodeSpacing": 30, "rankSpacing": 30 } } }%%
flowchart LR
    n1(["Start"]) --> n2["Add action"]
    n2 --> n8["Project info"]
    n3["LLM"] --> n4[/"Gen harnesses"/]
    n4 --> n5{"Pass?"}
    n5 -- False --> n3
    n5 -- True --> n6["PR"]
    n6 --> n7(["End"])
    n8 --> n3
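
The generate/check loop in the flowchart can be sketched as follows. Both callbacks are hypothetical stand-ins: `generate` would query an LLM and `passes` would build and briefly run the harness. The `max_attempts` bound is an assumption added here so the loop always terminates; the flowchart itself does not specify one. The sketch needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Ask for harnesses until one passes the check or the attempts run out.
// Returns the accepted harness, or std::nullopt if every attempt failed.
std::optional<std::string> generate_until_pass(
    const std::function<std::string(int)> &generate,
    const std::function<bool(const std::string &)> &passes,
    int max_attempts) {
  assert(max_attempts > 0);
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    std::string harness = generate(attempt);  // "LLM" -> "Gen harnesses"
    if (passes(harness)) return harness;      // "Pass?" is true: ready for a PR
    // "Pass?" is false: loop back to the LLM for another attempt.
  }
  return std::nullopt;
}
```

In a real action the attempt number would likely be replaced by feedback from the failed build or run, so that each retry gives the LLM more context.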

References

[1]

V. J. M. Manes et al., “The Art, Science, and Engineering of Fuzzing: A Survey.” [Online]. Available: http://arxiv.org/abs/1812.00140

[2]

“Heartbleed Bug.” [Online]. Available: https://heartbleed.com/

[3]

T. Simonite, “This Bot Hunts Software Bugs for the Pentagon,” Wired, Jun. 01, 2020. Available: https://www.wired.com/story/bot-hunts-software-bugs-pentagon/

[4]

“libFuzzer – a library for coverage-guided fuzz testing. — LLVM 21.0.0git documentation.” [Online]. Available: https://llvm.org/docs/LibFuzzer.html

[5]

“American fuzzy lop.” [Online]. Available: https://lcamtuf.coredump.cx/afl/

[6]

M. Heuse, H. Eißfeldt, A. Fioraldi, and D. Maier, AFL++. (Jan. 2022). Available: https://github.com/AFLplusplus/AFLplusplus

[7]

A. Arya, O. Chang, J. Metzman, K. Serebryany, and D. Liu, OSS-Fuzz. (Apr. 08, 2025). Available: https://github.com/google/oss-fuzz

[8]

Google/clusterfuzz. (Apr. 09, 2025). Google. Available: https://github.com/google/clusterfuzz

[9]

“OSS-Fuzz Documentation.” [Online]. Available: https://google.github.io/oss-fuzz/

[10]

D. Liu, O. Chang, J. Metzman, M. Sablotny, and M. Maruseac, OSS-Fuzz-Gen: Automated fuzz target generation. (May 2024). Available: https://github.com/google/oss-fuzz-gen

[11]

OSS-Fuzz Maintainers, “Introducing LLM-based harness synthesis for unfuzzed projects.” [Online]. Available: https://blog.oss-fuzz.com/posts/introducing-llm-based-harness-synthesis-for-unfuzzed-projects/

[12]

Google/atheris. (Apr. 09, 2025). Google. Available: https://github.com/google/atheris

These slides can be found at: https://kchousos.github.io/ofg-presentation/

Thank you!
