How to fuzz test an API as a whole and not with file inputs?
I'm learning my way around fuzz testing C applications. As I understand it, most of the time when fuzzing, one has a C function that takes/reads files. The fuzzer is given a valid sample file, mutates it randomly or with coverage heuristics, and executes the function with this new input.
Now, however, I want to fuzz not a single function that takes file input, but a few functions that together make up an API. For example:
int setState(int state);
int run(void); // crashes when previous set state was == 123
The idea is to test the API as a whole and detect whether misuse, such as calling functions in the wrong order (here: calling setState(123) followed by run()), crashes something somewhere.
How could one do such a thing? I'm searching for fuzzing frameworks (does not have to be C), general concepts and examples.
I tried to use libFuzzer from LLVM and "consumed" its fuzzer-data byte by byte. I read a single byte to determine what function to call, then read a parameter if needed, and finally call the function. Then I repeat until no more fuzzer-input-data is left. It looked something like this:
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    while (/* not end of fuzzer data reached */) {
        switch (fuzzerConsumeByte()) {
        case 0:
            setState(fuzzerConsumeInt());
            break;
        case 1:
            run();
            break;
        default:
            break;
        }
    }
    return 0;
}
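The loop above can also be written without the helper functions by tracking an offset into the buffer. The following is a minimal sketch, not the original harness: `ByteCursor`, `replaySequence` and the stub bodies of `setState` and `run` are hypothetical, and `run` reports the bad state through its return value instead of crashing so that the sequence logic can be checked on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the API under test.
static int g_state = 0;
int setState(int state) { g_state = state; return 0; }
int run(void) { return g_state == 123 ? -1 : 0; }  // -1 marks the "crash" state

// Hypothetical cursor over the fuzzer input.
struct ByteCursor {
    const uint8_t *data;
    size_t size;
    size_t pos;

    bool empty() const { return pos >= size; }
    uint8_t byte() { return data[pos++]; }  // caller checks empty() first
    // Reads a native int; missing trailing bytes are treated as zero.
    int integer() {
        int value = 0;
        size_t n = size - pos < sizeof(value) ? size - pos : sizeof(value);
        std::memcpy(&value, data + pos, n);
        pos += n;
        return value;
    }
};

// Interprets the input as a call sequence.
// Returns true if run() was reached in the bad state.
bool replaySequence(const uint8_t *data, size_t size) {
    g_state = 0;  // start every input from the same state
    ByteCursor in{data, size, 0};
    while (!in.empty()) {
        switch (in.byte()) {
        case 0: setState(in.integer()); break;
        case 1: if (run() != 0) return true; break;
        default: break;  // unknown selector: skip the byte
        }
    }
    return false;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (replaySequence(data, size)) std::abort();  // surface the failure to the fuzzer
    return 0;
}
```

Keeping the decoding in a separate function means a saved crash input can be replayed in an ordinary unit test or debugger session, without the fuzzer runtime.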
One source I found that mentions this fuzzing style (code-intelligence):
"[...] randomly select functions from your Public API and call them in random order with random parameters."
This does not seem like a good or efficient use of a fuzzer built around file inputs. libFuzzer does find the bug after a few seconds, but I suspect that once the API grows by several more functions, it will take much longer.
Solution 1:[1]
To answer my own question:
Yes, that's how API fuzzing can be done.
To consume the data byte by byte, the helpers that libFuzzer provides in FuzzedDataProvider (C++, #include <fuzzer/FuzzedDataProvider.h>) can be used. The drawback: the crash dump and the fuzzer corpus won't be human-readable.
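One way to soften the readability problem, sketched here as an assumption rather than a libFuzzer feature, is a small decoder that prints a crashing input as the call sequence it encodes. It assumes a hypothetical input format of one selector byte (0 for `setState` followed by a native `int`, 1 for `run`); `describeSequence` is an illustrative name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Renders fuzzer bytes as the API calls they would trigger.
std::string describeSequence(const uint8_t *data, size_t size) {
    std::string out;
    size_t pos = 0;
    while (pos < size) {
        uint8_t selector = data[pos++];
        if (selector == 0) {
            // Read the argument; missing trailing bytes are treated as zero.
            int value = 0;
            size_t n = size - pos < sizeof(value) ? size - pos : sizeof(value);
            std::memcpy(&value, data + pos, n);
            pos += n;
            out += "setState(" + std::to_string(value) + ");\n";
        } else if (selector == 1) {
            out += "run();\n";
        }  // other selector values map to no call
    }
    return out;
}
```

The decoder has to mirror the harness exactly: if the two disagree on how bytes are consumed, the printed sequence will not match what was executed.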
For a more readable fuzzer, implementing a structure-aware custom data mutator for libFuzzer is beneficial.
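As a rough illustration of what such a mutator can look like, here is a hand-written sketch built on libFuzzer's `LLVMFuzzerCustomMutator` hook. The record layout (one opcode byte plus a 32-bit argument), the constants and the small argument range are assumptions chosen for the toy API; this is not how libprotobuf-mutator works internally.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>

// Hypothetical wire format: a sequence of fixed-size records.
constexpr size_t kRecordSize = 1 + sizeof(int32_t);  // opcode byte + argument
constexpr uint8_t kNumOpcodes = 2;                   // 0 = setState, 1 = run

// Called by libFuzzer instead of its default byte-level mutation.
extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *data, size_t size,
                                          size_t max_size, unsigned int seed) {
    std::minstd_rand rng(seed);
    size_t count = size / kRecordSize;  // drop any trailing partial record
    size_t max_count = max_size / kRecordSize;
    if (max_count == 0) return 0;
    if (count > max_count) count = max_count;

    // Keep the opcodes of existing records valid.
    for (size_t i = 0; i < count; ++i) data[i * kRecordSize] %= kNumOpcodes;

    // Writes a fresh random record into slot `index`.
    auto randomize = [&](size_t index) {
        uint8_t *record = data + index * kRecordSize;
        record[0] = static_cast<uint8_t>(rng() % kNumOpcodes);
        int32_t arg = static_cast<int32_t>(rng() % 256);  // assumed small range
        std::memcpy(record + 1, &arg, sizeof(arg));
    };

    unsigned choice = rng() % 3;
    if (count == 0 || (choice == 0 && count < max_count)) {
        randomize(count++);  // append a call
    } else if (choice == 1 && count > 1) {
        size_t victim = rng() % count;  // remove a call
        std::memmove(data + victim * kRecordSize,
                     data + (victim + 1) * kRecordSize,
                     (count - victim - 1) * kRecordSize);
        --count;
    } else {
        randomize(rng() % count);  // replace a call
    }
    return count * kRecordSize;
}
```

Because every mutation adds, removes or replaces a whole record, each input the fuzzer tries decodes to a well-formed call sequence, and no executions are spent on inputs that fall apart halfway through parsing.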
I used the premade data mutator libprotobuf-mutator (C++) to fuzz the example API. It generates valid input data based on a protocol buffer definition rather than just (semi-)random bytes. This does make fuzzing a bit slower: the bug in the contrived example API was found after about 2 minutes, compared to about 30 seconds with the basic byte-consuming setup. But I do think it would scale much better for larger (real) APIs.
Solution 2:[2]
One more thing to keep in mind when fuzzing the API of a stateful application this way: either reset your application between fuzz inputs, or use AFL instead of libFuzzer so that a fresh process is forked for every input. Otherwise the crashes you find might not be reproducible from the crash dump, because the crash depends on changes to the target's state made by an earlier test case.
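The effect of a missing reset can be shown with a small sketch. The target below is entirely hypothetical: it "fails" on the third call to `step` since the last reset, and `executeInput` stands in for one execution of the fuzz target.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stateful target: fails on the third call since the last reset.
static int g_calls = 0;
void resetTarget() { g_calls = 0; }
bool step() { return ++g_calls >= 3; }  // true means "would crash"

// Runs one fuzz input: one step() per input byte.
// With reset_first == false, state leaks in from earlier inputs.
bool executeInput(const uint8_t *data, size_t size, bool reset_first) {
    (void)data;  // only the length matters for this toy target
    if (reset_first) resetTarget();
    for (size_t i = 0; i < size; ++i) {
        if (step()) return true;
    }
    return false;
}
```

Running the same two-byte input twice without a reset reports a failure on the second execution, even though that input never fails when run from a clean state. That is exactly the kind of crash that cannot be reproduced from its crash dump.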
I would also like to mention that we use the "[...] randomly select functions from your Public API and call them in random order with random parameters." approach on larger real-life APIs too (up to a couple of hundred functions), achieving good code coverage and finding results within reasonable time.
You are right that the crash dumps are not human-readable, but some feedback-based fuzzing tools give you not only the crashing input but also additional information, such as a stack trace, that helps you analyze the root cause.
Edit:
Here is an example fuzz test that uses this approach and makes use of the FuzzedDataProvider:
#include <stdint.h>
#include <stddef.h>
#include <cstring>  // std::memset
#include <vector>   // std::vector
#include "FuzzedDataProvider.h"
#include "GPS_module_1.h"
#include "crypto_module_1.h"
#include "crypto_module_2.h"
#include "key_management_module_1.h"
#include "time_module_1.h"
// Wrapper function for FuzzedDataProvider.h
// Writes |num_bytes| of input data to the given destination pointer. If there
// is not enough data left, writes all remaining bytes and fills the rest with zeros.
void ConsumeDataAndFillRestWithZeros(void *destination, size_t num_bytes, FuzzedDataProvider *fuzz_data) {
    if (destination != nullptr) {
        size_t num_bytes_with_fuzz_data = fuzz_data->ConsumeData(destination, num_bytes);
        if (num_bytes > num_bytes_with_fuzz_data) {
            size_t num_bytes_with_zeros = num_bytes - num_bytes_with_fuzz_data;
            std::memset((char*)destination+num_bytes_with_fuzz_data, 0, num_bytes_with_zeros);
        }
    }
}

extern "C" int FUZZ(const uint8_t *Data, size_t Size) {
    // Ensure a minimum data length
    if(Size < 100) return 0;

    // Setup FuzzedDataProvider
    FuzzedDataProvider fuzz_data_provider(Data, Size);
    FuzzedDataProvider *fuzz_data = &fuzz_data_provider;

    // Reset the state of the target software
    // to ensure that crashes are reproducible
    crypto::init();

    int number_of_functions = fuzz_data->ConsumeIntegralInRange<int>(1,100);
    for (int i=0; i<number_of_functions; i++) {
        int func_id = fuzz_data->ConsumeIntegralInRange<int>(0, 15);
        switch(func_id) {
            case 0: {
                // Create a struct and fill it with fuzz data
                GPS::position struct_0 = {0};
                ConsumeDataAndFillRestWithZeros(&struct_0, sizeof(struct_0), fuzz_data);
                GPS::get_current_position(&struct_0);
                break;
            }
            case 1: {
                GPS::get_destination_position();
                break;
            }
            case 2: {
                GPS::init_crypto_module();
                break;
            }
            case 3: {
                GPS::position struct_0 = {0};
                ConsumeDataAndFillRestWithZeros(&struct_0, sizeof(struct_0), fuzz_data);
                GPS::set_destination_position(struct_0);
                break;
            }
            case 4: {
                // Create a vector of "random" size
                // and fill it with fuzz data
                std::vector<uint8_t> fuzz_data_0 = fuzz_data->ConsumeBytes<uint8_t>(fuzz_data->ConsumeIntegral<uint8_t>());
                size_t fuzz_size_0 = fuzz_data_0.size();
                crypto::hmac struct_0 = {0};
                ConsumeDataAndFillRestWithZeros(&struct_0, sizeof(struct_0), fuzz_data);
                crypto::calculate_hmac(fuzz_data_0.data(), fuzz_size_0, &struct_0);
                break;
            }
            case 5: {
                crypto::get_state();
                break;
            }
            case 6: {
                crypto::init();
                break;
            }
            case 7: {
                crypto::key struct_0 = {0};
                ConsumeDataAndFillRestWithZeros(&struct_0, sizeof(struct_0), fuzz_data);
                crypto::set_key(struct_0);
                break;
            }
            case 8: {
                crypto::nonce struct_0 = {0};
                ConsumeDataAndFillRestWithZeros(&struct_0, sizeof(struct_0), fuzz_data);
                crypto::set_nonce(struct_0);
                break;
            }
            case 9: {
                std::vector<uint8_t> fuzz_data_0 = fuzz_data->ConsumeBytes<uint8_t>(fuzz_data->ConsumeIntegral<uint8_t>());
                size_t fuzz_size_0 = fuzz_data_0.size();
                crypto::hmac struct_0 = {0};
                ConsumeDataAndFillRestWithZeros(&struct_0, sizeof(struct_0), fuzz_data);
                crypto::verify_hmac(fuzz_data_0.data(), fuzz_size_0, &struct_0);
                break;
            }
            case 10: {
                crypto::key struct_0 = {0};
                ConsumeDataAndFillRestWithZeros(&struct_0, sizeof(struct_0), fuzz_data);
                crypto::verify_key(struct_0);
                break;
            }
            case 11: {
                crypto::nonce struct_0 = {0};
                ConsumeDataAndFillRestWithZeros(&struct_0, sizeof(struct_0), fuzz_data);
                crypto::verify_nonce(&struct_0);
                break;
            }
            case 12: {
                std::vector<uint8_t> fuzz_data_0 = fuzz_data->ConsumeBytes<uint8_t>(fuzz_data->ConsumeIntegral<uint8_t>());
                size_t fuzz_size_0 = fuzz_data_0.size();
                key_management::create_key(fuzz_data_0.data(), fuzz_size_0);
                break;
            }
            case 13: {
                std::vector<uint8_t> fuzz_data_0 = fuzz_data->ConsumeBytes<uint8_t>(fuzz_data->ConsumeIntegral<uint8_t>());
                size_t fuzz_size_0 = fuzz_data_0.size();
                key_management::create_nonce(fuzz_data_0.data(), fuzz_size_0);
                break;
            }
            case 14: {
                std::vector<uint8_t> fuzz_data_0 = fuzz_data->ConsumeBytes<uint8_t>(fuzz_data->ConsumeIntegral<uint8_t>());
                size_t fuzz_size_0 = fuzz_data_0.size();
                key_management::generate_random_bytes(fuzz_data_0.data(), fuzz_size_0);
                break;
            }
            case 15: {
                time_management::current_time();
                break;
            }
        }
    }
    return 0;
}
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | NikLeberg |
| Solution 2 | |
