Open source init
corydu committed May 9, 2023
1 parent 5b28e16 commit a442cd3
Showing 183 changed files with 48,601 additions and 6 deletions.
53 changes: 53 additions & 0 deletions Cargo.toml
@@ -0,0 +1,53 @@
[package]
name = "snapchange"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
vmm-sys-util = "0.11.0"
kvm-bindings = { version = "0.6.0", features = ["fam-wrappers"] }
kvm-ioctls = "0.12.0"
anyhow = "1.0.40"
libc = "0.2.98"
bitflags = "1.2.1"
x86_64 = { version = "0.14.10", default-features = false, features = ["instructions"] }
clap = { version = "4.1.8", features = ["derive"] }
serde_json = "1.0"
serde = { version = "1.0", features = ["derive"] }
serde-hex = "0.1.0"
nix = "0.22.2"
iced-x86 = "1.13.0"
thiserror = "1.0"
ctrlc = "3.2.0"
core_affinity = "0.5.10"
lazy_static = "1.4.0"
log = "0.4"
env_logger = "0.9.0"
clap-verbosity-flag = "2.0"
rand = "0.8"
rand_core = "0.6"
ahash = "0.8"

# For addr2line implementation
addr2line = "0.17.0"
memmap = "0.7"
object = { version = "0.27.1", default-features = false, features = ["read"], optional = true }

# For MSRs to/from primitive
num_enum = "0.5.7"

# For stats TUI
tui = "0.19.0"
crossterm = "0.22.1"
tui-logger = "0.8.1"
toml = "0.7.2"

[profile.release]
debug = true
incremental = true

[features]
default = []
redqueen = []
36 changes: 36 additions & 0 deletions HACKING.md
@@ -0,0 +1,36 @@
# Brief roadmap of the source files

Below is a brief roadmap of the source code and where to potentially start reading:

* `src/commands/[fuzz|minimize|trace|coverage|project].rs` - Contains the logic for each command
    - Each command follows roughly the same logic:

```
Initialize a KVM environment
Create a `FuzzVm` which executes and maintains the state of the Guest VM
Prepare the Guest VM by mutating input or modifying state
loop {
    Run the Guest VM (fuzzvm.run())
    Handle the exit condition of the VM (handle_vmexit)
    Save metadata
}
```
* `src/fuzzvm.rs` - Provides `FuzzVm` to handle the main state of the Guest.
    - This struct is how a researcher can read/write state into the guest
    - Examples:
        - `fuzzvm.read_bytes()`
        - `fuzzvm.read::<u32>()`
        - `fuzzvm.read::<[u8; 128]>()`
        - `fuzzvm.write_bytes()`
        - `fuzzvm.write::<u32>()`
        - `fuzzvm.write::<[u8; 128]>()`
        - `fuzzvm.rip()` | `fuzzvm.set_rip()`
        - `fuzzvm.rax()` | `fuzzvm.set_rax()`
        - `fuzzvm.hexdump()`
        - `fuzzvm.translate()`
        - `fuzzvm.print_context()`
* `src/fuzzer.rs` - Provides the `Fuzzer` trait that each target specific fuzzer will implement
* `src/stats.rs` - Aggregates and displays the statistics from each core
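
The typed accessors listed above can be pictured with a small stand-alone model. This is only a sketch and not Snapchange's implementation: `MockGuest`, its flat `Vec<u8>` memory, and the `read_u32`/`write_u32` helpers are hypothetical names, and the address translation that the real `FuzzVm` performs is skipped entirely.

```rust
/// Hypothetical stand-in for guest memory. The real `FuzzVm` translates
/// virtual addresses through the guest page tables; this sketch does not.
struct MockGuest {
    mem: Vec<u8>,
}

impl MockGuest {
    fn new(size: usize) -> Self {
        Self { mem: vec![0; size] }
    }

    /// Copy `buf.len()` bytes out of guest memory starting at `addr`.
    fn read_bytes(&self, addr: usize, buf: &mut [u8]) -> Result<(), String> {
        let end = addr.checked_add(buf.len()).ok_or("address overflow")?;
        let src = self.mem.get(addr..end).ok_or("read out of bounds")?;
        buf.copy_from_slice(src);
        Ok(())
    }

    /// Copy `data` into guest memory starting at `addr`.
    fn write_bytes(&mut self, addr: usize, data: &[u8]) -> Result<(), String> {
        let end = addr.checked_add(data.len()).ok_or("address overflow")?;
        let dst = self.mem.get_mut(addr..end).ok_or("write out of bounds")?;
        dst.copy_from_slice(data);
        Ok(())
    }

    /// Typed read built on top of `read_bytes` (little-endian, as on x86-64).
    fn read_u32(&self, addr: usize) -> Result<u32, String> {
        let mut raw = [0u8; 4];
        self.read_bytes(addr, &mut raw)?;
        Ok(u32::from_le_bytes(raw))
    }

    /// Typed write built on top of `write_bytes`.
    fn write_u32(&mut self, addr: usize, value: u32) -> Result<(), String> {
        self.write_bytes(addr, &value.to_le_bytes())
    }
}

fn main() -> Result<(), String> {
    let mut guest = MockGuest::new(0x1000);
    guest.write_u32(0x10, 0xdead_beef)?;
    println!("{:#x}", guest.read_u32(0x10)?);
    Ok(())
}
```

Layering the typed helpers on the byte-level ones keeps the bounds checking in a single place, which is one plausible reason to expose both forms.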
22 changes: 22 additions & 0 deletions Makefile
@@ -0,0 +1,22 @@
all: check docs

check:
	# Clippy checks
	RUST_BACKTRACE=full cargo clippy --color always -- \
	    --no-deps \
	    --allow clippy::verbose_bit_mask \
	    --allow clippy::print_with_newline \
	    --allow clippy::write_with_newline \
	    --allow clippy::module_name_repetitions \
	    --allow clippy::missing_errors_doc \
	    --deny missing_docs \
	    --deny clippy::missing_docs_in_private_items \
	    --deny clippy::pedantic \
	    --allow clippy::struct_excessive_bools \
	    --allow clippy::redundant_field_names \
	    --allow clippy::must_use_candidate \
	    --allow clippy::manual_flatten

docs: check
	# Documentation build regardless of arch
	cargo doc --no-deps
166 changes: 160 additions & 6 deletions README.md
@@ -1,11 +1,166 @@
## My Project
# Snapchange

TODO: Fill this README out!
Lightweight fuzzing of a memory snapshot using KVM

Be sure to:
Snapchange provides the ability to load a raw memory dump and register state into a
KVM virtual machine (VM) for execution. At any point in execution, this VM can be reset to its
initial state by restoring the dirty pages found by KVM and any pages manually dirtied by a
fuzzer.

* Change the title in this README
* Edit your repository description on GitHub
## Quick Links:

* [Cookbook](./docs/COOKBOOK.md) provides examples of the [fuzz](./docs/COOKBOOK.md#fuzz-commands), [trace](./docs/COOKBOOK.md#trace-commands), [coverage](./docs/COOKBOOK.md#coverage-commands), [minimize](./docs/COOKBOOK.md#minimize-commands), and [project](./docs/COOKBOOK.md#project-commands) command line utilities
* [Taking a snapshot with QEMU](./qemu_snapshot/README.md)
* [Architecture](./docs/ARCHITECTURE.md)
* [Fuzzer Lifecycle](./docs/FUZZ_FUNCTION_LIFECYCLE.md)

## Tutorials

* [Tutorial 1 - Basic Usage](./examples/01_getpid/README.md)
* [Tutorial 2 - `LibTIFF` with ASAN](./examples/02_libtiff/README.md)
* [Tutorial 3 - `FFmpeg` with custom mutator](./examples/03_ffmpeg_custom_mutator/README.md)
* [Tutorial 4 - Syscall fuzzer](./examples/04_syscall_fuzzer/README.md)
* [Tutorial 5 - Redqueen](./examples/05_redqueen/README.md)

# Aspirations

* Replay a physical memory and register state snapshot using KVM
* Parallel execution across multiple cores
* Provide a set of introspection features to the guest VM
* Real-time coverage state via breakpoint coverage
* Real-time performance metrics of fuzzer components
* Provide fuzzing utilities such as single-step debug tracing, testcase minimization, and testcase coverage
* Input abstraction to allow custom mutation and generation strategies

# Example:

#### Create a target fuzzer from the fuzzer template

```console
$ cp -r -L fuzzer_template your_new_fuzzer
```

#### Modify `your_new_fuzzer/create_snapshot.sh` to take a snapshot of your target

#### Update `src/fuzzer.rs` to inject mutated data into the guest VM

```rust
#[derive(Default)]
pub struct TemplateFuzzer;

impl Fuzzer for TemplateFuzzer {
    // The type of Input being fuzzed. Used to know how to generate and mutate useful inputs.
    type Input = Vec<u8>;
    // The starting address of the snapshot
    const START_ADDRESS: u64 = 0x402363;
    // The maximum length of mutated input to generate
    const MAX_INPUT_LENGTH: usize = 100;

    fn set_input(&mut self, input: &Self::Input, fuzzvm: &mut FuzzVm<Self>) -> Result<()> {
        // Write the mutated input into the data buffer in the guest VM
        fuzzvm.write_bytes_dirty(VirtAddr(0x402004), CR3, &input)?;
        Ok(())
    }

    fn reset_breakpoints(&self) -> Option<&[BreakpointLookup]> {
        Some(&[
            // Reset when the VM hits example1!main+0x123
            BreakpointLookup::SymbolOffset("example1!main", 0x123)
        ])
    }
}
```

#### Start fuzzing with 16 cores

```console
$ cargo run -r -- fuzz -c 16
```


# Implementation

Terms used in this README:

* Hypervisor: The target agnostic code executing the snapshot in KVM
* Fuzzer: The target specific code used to modify and monitor the guest for a target
specific fuzz case

The hypervisor begins by mapping the physical memory file for each core requested. In
this way, each core has its own, unique copy of memory. The hypervisor then creates the
KVM guest and gives the guest this backing memory. The guest's register state is then
initialized from the given register state and execution of the guest is launched. The
hypervisor waits until the guest exits. Each exit is handled by the hypervisor, and some
are passed to the fuzzer for target specific mutation, modification, or introspection. If
the exit handler signals that the guest should be reset, the hypervisor exits the
run loop, resets the guest back to the original snapshot state, and restarts the run
loop.
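
The run loop described above can be sketched with a mock guest. Everything here is an assumption made for illustration: `MockVm`, `VmExit`, and the rule that every third exit requests a reset are hypothetical, and resetting by copying the whole snapshot stands in for the dirty-page reset that the real hypervisor performs.

```rust
/// Hypothetical exit reasons; real KVM exits are far more varied.
#[derive(Clone, Copy, Debug, PartialEq)]
enum VmExit {
    /// An exit the handler services before resuming the guest.
    Continue,
    /// An exit that ends the current fuzz case.
    Reset,
}

/// Hypothetical guest: `memory` is this core's private copy of the snapshot.
struct MockVm {
    snapshot: Vec<u8>,
    memory: Vec<u8>,
    steps: u32,
}

impl MockVm {
    /// `snapshot` must be non-empty for this sketch.
    fn new(snapshot: Vec<u8>) -> Self {
        let memory = snapshot.clone();
        Self { snapshot, memory, steps: 0 }
    }

    /// Pretend to run the guest until its next exit, dirtying one byte.
    fn run(&mut self) -> VmExit {
        let idx = self.steps as usize % self.memory.len();
        self.memory[idx] ^= 0xff;
        self.steps += 1;
        // Assumption for the sketch: every third exit asks for a reset.
        if self.steps % 3 == 0 { VmExit::Reset } else { VmExit::Continue }
    }

    /// Restore the original snapshot state (full copy, not dirty-page based).
    fn reset(&mut self) {
        self.memory.copy_from_slice(&self.snapshot);
        self.steps = 0;
    }
}

/// Run `cases` fuzz cases and return the total number of exits handled.
fn fuzz(vm: &mut MockVm, cases: u32) -> u32 {
    let mut exits = 0;
    for _ in 0..cases {
        loop {
            exits += 1;
            if vm.run() == VmExit::Reset {
                break; // leave the run loop for this case
            }
        }
        vm.reset(); // back to the snapshot before the next case
    }
    exits
}

fn main() {
    let mut vm = MockVm::new(vec![0u8; 16]);
    let exits = fuzz(&mut vm, 4);
    println!("handled {exits} exits");
}
```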

Coverage of the guest is generated by using coverage breakpoints. A separate file with a
list of addresses to breakpoint can be given to the hypervisor. If any of these addresses
are hit, the address will be added to the coverage database and the instruction for that
address will be restored. In this way, the breakpoint will not be triggered again.
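
A minimal sketch of this one-shot breakpoint scheme follows. It operates on a plain byte slice rather than guest memory, and the `Coverage` type and its method names are hypothetical; the only fixed detail is the x86 `int3` opcode `0xcc`.

```rust
use std::collections::{BTreeSet, HashMap};

/// x86 `int3` opcode used as the breakpoint byte.
const INT3: u8 = 0xcc;

/// Hypothetical coverage tracker over a flat slice of guest code.
struct Coverage {
    /// Original byte for every breakpoint that has not fired yet.
    pending: HashMap<usize, u8>,
    /// Addresses that have been hit at least once.
    hit: BTreeSet<usize>,
}

impl Coverage {
    /// Overwrite each listed address with `int3`, remembering the original byte.
    fn install(code: &mut [u8], addrs: &[usize]) -> Self {
        let mut pending = HashMap::new();
        for &addr in addrs {
            // `or_insert` keeps the first saved byte if an address repeats.
            pending.entry(addr).or_insert(code[addr]);
            code[addr] = INT3;
        }
        Self { pending, hit: BTreeSet::new() }
    }

    /// Handle a breakpoint at `addr`. Returns true if it was a coverage
    /// breakpoint, in which case the original byte is restored so the
    /// breakpoint cannot trigger again.
    fn on_breakpoint(&mut self, code: &mut [u8], addr: usize) -> bool {
        match self.pending.remove(&addr) {
            Some(orig) => {
                code[addr] = orig;
                self.hit.insert(addr);
                true
            }
            None => false,
        }
    }
}

fn main() {
    // Two `nop` bytes followed by `ret`.
    let mut code: Vec<u8> = vec![0x90, 0x90, 0xc3];
    let mut cov = Coverage::install(&mut code, &[0, 2]);
    cov.on_breakpoint(&mut code, 2);
    println!("hit: {:?}", cov.hit);
}
```

Because each breakpoint is removed on its first hit, the cost of coverage collection shrinks as the fuzzer explores more of the target.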

# Project directory

Snapchange leverages target specific project directories for configuration. This directory
is where input and output files and directories are placed. The following file
extensions/directories are used as inputs:

* `.physmem` - The raw physical memory file
* Register file (one of the following)
- `.regs` - JSON register file containing the [register state](./docs/REGISTER.md)
- `.qemuregs` - Output from `info registers` from `qemu`

The full list of files and their uses in the project directory can be found [here](./docs/PROJECT_DIRECTORY.md)
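
As an illustration of selecting inputs by extension, the sketch below scans a directory listing for a `.physmem` file and a register file. The helper names are hypothetical, and preferring `.regs` over `.qemuregs` when both exist is an assumption of this sketch, not documented behavior.

```rust
use std::path::{Path, PathBuf};

/// Return the first path in `files` whose extension equals `ext`.
fn find_by_extension(files: &[PathBuf], ext: &str) -> Option<PathBuf> {
    files
        .iter()
        .find(|p| p.extension().and_then(|e| e.to_str()) == Some(ext))
        .cloned()
}

/// Assumption: prefer a `.regs` JSON file, falling back to `.qemuregs`.
fn find_register_file(files: &[PathBuf]) -> Option<PathBuf> {
    find_by_extension(files, "regs").or_else(|| find_by_extension(files, "qemuregs"))
}

/// List every entry of a project directory.
fn list_dir(dir: &Path) -> std::io::Result<Vec<PathBuf>> {
    let mut out = Vec::new();
    for entry in std::fs::read_dir(dir)? {
        out.push(entry?.path());
    }
    Ok(out)
}

fn main() -> std::io::Result<()> {
    let files = list_dir(Path::new("."))?;
    println!("physmem: {:?}", find_by_extension(&files, "physmem"));
    println!("regs:    {:?}", find_register_file(&files));
    Ok(())
}
```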

# Debugging Trace

A full example of the debugging single-step trace can be found [here](./docs/DEBUG_TRACE.md).

```
ITERATION 604 0x00007ffff7ecb0d5 0x11115000 | libc-2.31.so!__GI___getpid+0x5 (0x7ffff7ecb0d5)
syscall
[0f, 05]
ITERATION 605 0xffffffff83a00000 0x11115000 | entry_SYSCALL_64+0x0 (0xffffffff83a00000)
swapgs
[0f, 01, f8]
ITERATION 606 0xffffffff83a00003 0x11115000 | entry_SYSCALL_64+0x3 (0xffffffff83a00003)
mov qword ptr gs:[0xa014], rsp
[None:0x0+0xa014=0xa014]]
RSP:0x7fffffffeb78 -> example1!main+0x19 (0x55555555514e)-> 0xff8458b48f44589
[65, 48, 89, 24, 25, 14, a0, 00, 00]
ITERATION 607 0xffffffff83a0000c 0x11115000 | entry_SYSCALL_64+0xc (0xffffffff83a0000c)
nop
[66, 90]
ITERATION 608 0xffffffff83a0000e 0x11115000 | entry_SYSCALL_64+0xe (0xffffffff83a0000e)
mov rsp, cr3
RSP:0x7fffffffeb78 -> example1!main+0x19 (0x55555555514e) -> 0xff8458b48f44589
CR3:0x11115000
[0f, 20, dc]
```

# Snapshots

Information about obtaining a snapshot via `VirtualBox` or `QEMU` is below:

* [VirtualBox](./docs/VIRTUALBOX_SNAPSHOT.md)
* [QEMU](./qemu_snapshot/README.md)

The examples include a `make_example.sh` script (like [example 1](./examples/01_getpid/make_example.sh)) which builds a full snapshot from scratch. These
examples can be used as a template for creating reproducible snapshots of other targets.

# Documentation and clippy

```
make all
cargo doc --open
```

# Where to begin reading?

The [HACKING](./HACKING.md) guide lists a few higher-level locations in the code base from which to start
understanding the system.

## Security

@@ -14,4 +169,3 @@ See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more inform
## License

This project is licensed under the Apache-2.0 License.

11 changes: 11 additions & 0 deletions bench/Cargo.toml
@@ -0,0 +1,11 @@
[package]
name = "fuzzer_template"
version = "0.1.0"
edition = "2021"
exclude = ["qemu_snapshot", "snapshot"]

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
anyhow = "1.0.68"
snapchange = { version = "0.1.0", path = ".." }
95 changes: 95 additions & 0 deletions bench/README.md
@@ -0,0 +1,95 @@
# Snapchange benchmark

This is the basic snapchange benchmark for gathering performance data across several factors:

* Instructions per iteration
* Breakpoints triggered per iteration
* Dirty Pages restored per iteration
* Number of cores used

Building the benchmark is a single script:

```sh
$ ./make_bench_target.sh
```

Executing the benchmark is also a simple script:

```sh
$ ./bench.sh
```

This will generate a `data_120s` directory containing a `stats` file for each benchmark
configuration. A small utility converts this data into the form used to generate
the comparison graphs.

```sh
$ cd gather_data
$ cargo run -r -- ../data_120s
$ cd ..
```

This will create a `data_120s.dat` data file with the extracted data ready for [seaborn](https://seaborn.pydata.org/)
to generate several graphs.

```sh
$ pip3 install seaborn
$ python3 generate_graph.py ./data_120s.dat
```

## Benchmark harness

The benchmark harness is a small [assembly snippet](./bench_harness/src/main.rs):

```
// Execute the dirty pages and instructions for this benchmark
// R9 - Memory that can be dirtied (should NOT have to be set in the benchmark fuzzer)
// R10 - Number of pages to dirty (at least 1)
// RCX - Number of instructions to execute (not including dirtying pages)
unsafe {
std::arch::asm!(
r#"
4:
mov byte ptr [r9], 0x41
add r9, 0x1000
dec r10
jnz 4b
2:
dec rcx
jnz 2b
"#,
in("r9") scratch,
options(nostack)
);
}
```

This section dirties pages by writing a single byte to each page. The number of pages to dirty
is set in `r10`.

```
4:
mov byte ptr [r9], 0x41
add r9, 0x1000
dec r10
jnz 4b
```
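
For readers less familiar with assembly, the same page-dirtying idea can be written in safe Rust. This is a sketch only: `dirty_pages` is a hypothetical helper, and unlike the assembly loop (which always writes at least once) it touches nothing when `pages` is zero and stops at the end of the buffer.

```rust
/// Size of one x86-64 page, matching the `add r9, 0x1000` stride.
const PAGE_SIZE: usize = 0x1000;

/// Write one byte at the start of each of the first `pages` pages of
/// `scratch`. Returns the number of pages actually touched.
fn dirty_pages(scratch: &mut [u8], pages: usize) -> usize {
    let mut touched = 0;
    for page in scratch.chunks_mut(PAGE_SIZE).take(pages) {
        page[0] = 0x41; // same byte value the harness writes
        touched += 1;
    }
    touched
}

fn main() {
    let mut scratch = vec![0u8; 4 * PAGE_SIZE];
    let touched = dirty_pages(&mut scratch, 3);
    println!("dirtied {touched} pages");
}
```

A single byte per page is enough because dirty tracking works at page granularity, so one write marks the entire page as needing restoration.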

The instructions are executed by a tight loop. The number of instructions is stored in `rcx`.

```
2:
dec rcx
jnz 2b
```

The number of instructions and dirty pages are given to the harness via environment variables.
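
Reading such settings from the environment could look like the sketch below. The `parse_count` helper and the default of `1` are assumptions for illustration; only the variable names `PAGES` and `INSTRS` come from the benchmark commands in this README.

```rust
/// Parse an optional environment value as a count, using `default` when the
/// variable is unset. A value that is set but invalid is reported as an error.
fn parse_count(value: Option<&str>, default: u64) -> Result<u64, String> {
    match value {
        None => Ok(default),
        Some(raw) => raw
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("invalid count {raw:?}: {e}")),
    }
}

fn main() -> Result<(), String> {
    // The defaults of 1 are assumptions made for this sketch.
    let pages = parse_count(std::env::var("PAGES").ok().as_deref(), 1)?;
    let instrs = parse_count(std::env::var("INSTRS").ok().as_deref(), 1)?;
    println!("pages={pages} instrs={instrs}");
    Ok(())
}
```

Keeping the parsing separate from the `std::env::var` lookup makes the logic testable without modifying the process environment.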

To run the benchmark with `1000` pages, `1000000` instructions, and `8` cores:

```PAGES=1000 INSTRS=1000000 timeout 120s cargo run -r -- fuzz -c 8 --timeout 120s```

To run the benchmark with `1000` pages, `1000000` breakpoints, and `8` cores:

```PAGES=1000 VMEXITS=1000000 timeout 120s cargo run -r -- fuzz -c 8 --timeout 120s```