Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Blog for safety-of-methods-for-numeric-primitive-types - 2024-12-02 #51

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open
Changes from 12 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
195 changes: 195 additions & 0 deletions _posts/2024-12-02-safety-of-methods-for-numeric-primitive-types.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,195 @@
---
title: Safety of Methods for Numeric Primitive Types
layout: post
---
Authors : Rajath M Kotyal, Yen-Yun Wu, Lanfei Ma, Junfeng Jin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Authors : Rajath M Kotyal, Yen-Yun Wu, Lanfei Ma, Junfeng Jin
Authors: Rajath M Kotyal, Yen-Yun Wu, Lanfei Ma, Junfeng Jin

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


In this blog post, we discuss how we verified the absence of arithmetic overflow/underflow and undefined behavior in various unsafe methods provided by Rust's numeric primitive types, given that their safety preconditions are satisfied.

## Introduction
Ensuring the correctness and safety of numeric operations is crucial for developing reliable systems. In high-performance applications, numeric operations often require bypassing checks to maximize efficiency. However, ensuring the safety of these methods under their stated preconditions is critical to preventing undefined behavior.

In the past 3 months, we have rigorously analyzed unsafe methods provided by Rust's numeric primitive types, such as `unchecked_add` and `unchecked_sub`, which omit runtime checks for overflow and underflow, using formal verification techniques.

## Challenge Overview
The [challenge](https://model-checking.github.io/verify-rust-std/challenges/0011-floats-ints.html) is divided into three parts:
- [Part 1] Unsafe Integer Methods: Prove safety of methods like ```unchecked_add```, ```unchecked_sub```, ```unchecked_mul```, ```unchecked_shl```, ```unchecked_shr```, and ```unchecked_neg``` for various integer types.
- [Part 2] Safe API Verification: Verify safe APIs that leverage the unsafe integer methods from Part 1, such as ```wrapping_shl```, ```wrapping_shr```, ```widening_mul```, and ```carrying_mul```.
- [Part 3] Float to Integer Conversion: Verify the absence of undefined behavior in ```to_int_unchecked``` for floating-point types ```f32``` and ```f64```. **FIXME: should we include f16 and f128 as well?**

In addition to verifying the methods under their specified safety preconditions, we needed to ensure the absence of specific undefined behaviors:
- Invoking undefined behavior via compiler intrinsics.
- Reading from uninitialized memory.
- Mutating immutable bytes.
- Producing an invalid value.

## Approach and Implementation
To tackle this challenge, we utilized Kani's verification capabilities to write proof harnesses for the unsafe methods. Our strategy involved:
1. **Specifying Safety Preconditions**: To ensure the methods behaved correctly, we explicitly specified safety preconditions in the code. These preconditions were used to:
- Guide input generation.
- Define the boundaries for valid inputs.
- Avoid triggering undefined behavior in cases where the preconditions were violated.

2. **Writing Verification Harnesses**: We developed Kani proof harnesses for each unsafe method by:
- Generating inputs that satisfy the required safety preconditions.
- Invoking the methods within an `unsafe` block to verify their correctness and safety under the defined conditions.

3. **Using Macros for Code Generation**: To efficiently handle the large number of methods and types that required verification, we utilized Rust macros. These macros automated the generation of proof harnesses, reducing redundancy and improving maintainability.

4. **Handling Large Input Spaces**: For methods with large input spaces (e.g., `unchecked_mul` for 64-bit integers), we made the verification process more tractable by:
- Introducing assumptions to limit the range of inputs.
- Focusing on specific edge cases (e.g., maximum values, zero, and boundary conditions) that are most likely to cause overflow or underflow.

We will walk through each part with examples from the Unsafe Integer Methods part of the challenge.

### 1. Specifying Safety Preconditions
```rust
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make for slightly nicer reading if there were one or two sentences of text between the heading and the example. Such text should explain what the source code that follows will exemplify.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

#[requires(!self.overflowing_add(rhs).1)]
pub const unsafe fn unchecked_add(self, rhs: Self) -> Self {
assert_unsafe_precondition!(
check_language_ub,
concat!(stringify!($SelfT), "::unchecked_add cannot overflow"),
(
lhs: $SelfT = self,
rhs: $SelfT = rhs,
) => !lhs.overflowing_add(rhs).1,
);

// SAFETY: The caller guarantees that the preconditions are met.
unsafe {
intrinsics::unchecked_add(self, rhs)
}
}
```
Here, the ```#[requires]``` attribute is used to specify that `unchecked_add` requires the addition to not overflow. This ensures that the function operates within defined behavior when called in an unsafe context. While we directly leveraged the existing precondition in `unchecked_add`, not every function has an explicit precondition check. Still, we can write preconditions for unsafe functions based on the `SAFETY` section in the official documentation or the `SAFETY` comments in the source code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding a note on what already was there before you arrived and what it is that you added, i.e., really just the requires clause, but I don't think that is as obvious to other potential readers.


The same logic applies to the postconditions (`#[ensure]`) of a function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The same logic applies to the postconditions (`#[ensure]`) of a function.
The same logic applies to the postconditions (`#[ensures]`) of a function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I don't know what "the same logic" really means here?


### 2. Generating Verification Harnesses
We created macros to generate verification harnesses for each method and type. For example, to generate harnesses for ```unchecked_add```:
```rust
#[cfg(kani)]
mod verify {
use super::*;

macro_rules! generate_unchecked_math_harness {
($type:ty, $method:ident, $harness_name:ident) => {
#[kani::proof_for_contract($type::$method)]
pub fn $harness_name() {
let num1: $type = kani::any::<$type>();
let num2: $type = kani::any::<$type>();

unsafe {
num1.$method(num2);
}
}
}
}

generate_unchecked_math_harness!(i8, unchecked_add, unchecked_add_i8);
generate_unchecked_math_harness!(i16, unchecked_add, unchecked_add_i16);
// ... Repeat for other integer types
}
```
The harness contains several parts:
- Symbolic values: These values (```num1``` and ```num2```) are generated by Kani using ```kani::any```.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- Assumptions: With `#[kani::proof_for_contract]` annotation, Kani automatically inserts `kani::assume()` before the unsafe function call. It ensures that all generated values respect the preconditions of `unchecked_add`. For further details, you can refer to [Kani official RFC for function contracts]([url](https://github.com/model-checking/kani/blob/main/rfc/src/rfcs/0009-function-contracts.md#user-experience)).
- Unsafe Execution: The invocation of ```unchecked_add``` within an unsafe block for verification.

### 3. Handling Large Input Spaces
For methods like ```unchecked_mul```, verifying over the entire input space is infeasible due to the exponential number of possibilities. We addressed this by partitioning the input space into intervals:
```rust
generate_unchecked_mul_intervals!(i32, unchecked_mul,
unchecked_mul_i32_small, -10i32, 10i32,
unchecked_mul_i32_large_pos, i32::MAX - 1000i32, i32::MAX,
unchecked_mul_i32_large_neg, i32::MIN, i32::MIN + 1000i32,
unchecked_mul_i32_edge_pos, i32::MAX / 2, i32::MAX,
unchecked_mul_i32_edge_neg, i32::MIN, i32::MIN / 2
);
```
By focusing on critical ranges, we ensured that the verification process remained tractable while still covering important cases where overflows are likely to occur.

The critical ranges include :
- Small Ranges Near Zero: Includes values around 0, where behavior transitions (e.g., sign changes) are more likely to expose issues like underflows.
- Boundary Checks: Covers the maximum and minimum values `(i32::MAX, i32::MIN)`, ensuring the method handles edge cases correctly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Boundary Checks: Covers the maximum and minimum values `(i32::MAX, i32::MIN)`, ensuring the method handles edge cases correctly.
- Boundary Checks: Covers the maximum and minimum values (`i32::MAX`, `i32::MIN`), ensuring the method handles edge cases correctly.

to avoid the impression that you are talking about an (open) interval.

- Halfway Points: For instance, from `i32::MAX / 2` to `i32::MAX`. This helps validate behavior at large magnitudes. For types like i64 and larger, where state space `(2^64 x 2^64 = ~2^128 combinations)` becomes infeasible, narrower ranges or sampling are critical.

## Part 2: Verifying Safe APIs
We also verified safe APIs that internally use the previously verified unsafe methods through the above workflow.

Why Verify Safe APIs?
Safe APIs, like widening_mul, leverage internally unsafe operations (e.g., unchecked_mul). Even though marked safe, their reliability still depends on the correctness of these underlying operations. Verifying them ensures no overflow occurs in the wider type used internally. When verifying safe APIs like widening_mul, we consider the underlying unsafe functions and target specific input intervals to ensure correctness across critical ranges, balancing thoroughness with practicality.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Safe APIs, like widening_mul, leverage internally unsafe operations (e.g., unchecked_mul). Even though marked safe, their reliability still depends on the correctness of these underlying operations. Verifying them ensures no overflow occurs in the wider type used internally. When verifying safe APIs like widening_mul, we consider the underlying unsafe functions and target specific input intervals to ensure correctness across critical ranges, balancing thoroughness with practicality.
Safe APIs, like `widening_mul`, leverage internally unsafe operations (e.g., `unchecked_mul`). Even though marked safe, their reliability still depends on the correctness of these underlying operations. Verifying them ensures no overflow occurs in the wider type used internally. When verifying safe APIs like `widening_mul`, we consider the underlying unsafe functions and target specific input intervals to ensure correctness across critical ranges, balancing thoroughness with practicality.


For example, verifying `widening_mul` for `u16`:
```rust
generate_widening_mul_intervals!(u16, u32,
widening_mul_u16_small, 0u16, 10u16,
widening_mul_u16_large, u16::MAX - 10u16, u16::MAX,
widening_mul_u16_mid_edge, (u16::MAX / 2) - 10u16, (u16::MAX / 2) + 10u16
);
```

This macro generates harnesses that verify `widening_mul` over specified input intervals, ensuring that it operates safely across different ranges. By verifying these safe APIs, we validated that they uphold Rust's safety guarantees, even when internally relying on unsafe methods.

## Part 3: Verifying Float to Integer Conversion
For the `to_int_unchecked` method, we specified preconditions to ensure the float is finite and within the target integer type's representable range:

**FIXME: Mention Kani feature requests (float_to_int_unchecked and in_range support) here**

**FIXME: Mention that at the moment float_to_int_unchecked does not support f16 and f128.**

```rust
#[requires(self.is_finite() && kani::float::float_to_int_in_range::<Self, Int>(self))]
pub unsafe fn to_int_unchecked<Int>(self) -> Int where Self: FloatToInt<Int> {
// Implementation
}
```
Our harnesses then verified that, under these preconditions, the conversion does not result in undefined behavior:
```rust
macro_rules! generate_to_int_unchecked_harness {
($floatType:ty, $($intType:ty, $harness_name:ident),+) => {
$(
#[kani::proof_for_contract($floatType::to_int_unchecked)]
pub fn $harness_name() {
let num: $floatType = kani::any();
let result = unsafe { num.to_int_unchecked::<$intType>() };

assert_eq!(result, num as $intType);
}
)+
}
}

generate_to_int_unchecked_harness!(f32,
i8, to_int_unchecked_f32_i8,
i16, to_int_unchecked_f32_i16,
// ... Repeat for other integer types
);
```

## Challenges Encountered and Lessons Learned

### Handling Exponential Input Spaces
Verifying methods with large integer types was challenging due to the vast number of possible input combinations. To manage this, we implemented:
- Partitioning Input Ranges: We targeted critical intervals where overflows are most likely, such as boundary values and mid-range points.
- Using Assumptions: Leveraging `kani::assume`, we constrained inputs to these manageable ranges, ensuring comprehensive coverage of important cases without overwhelming the verifier.

### Ensuring Assumptions Reflect Safety Preconditions
Aligning our assumptions with the actual safety preconditions of each method was essential. Any mismatch could result in unsound verification, either by overlooking edge cases or by falsely validating incorrect behaviors.

### Writing Efficient Macros
Creating macros to generate numerous harnesses required meticulous design. We focused on maintaining readability and reusability, which minimized code duplication and enhanced the scalability and maintainability of our verification process.

## Conclusion
This challenge provided valuable insights into the process of formally verifying unsafe methods in Rust's standard library. By leveraging Kani and carefully designing our verification harnesses, we successfully demonstrated the safety of numerous methods across various numeric primitive types.

Our work contributes to the robustness of Rust's standard library and highlights the importance of formal verification tools in modern software development.

We hope this post provides valuable insights into the verification process of unsafe numeric methods in Rust. If you're interested in formal verification or Rust programming, we encourage you to explore Kani and contribute to its ongoing development.

## References
[Kani Documentation](https://model-checking.github.io/kani/)
- Rust Numeric Types
- Rust Intrinsics

Link to the challenge: [Safety Verification: Floats and Integers](https://model-checking.github.io/verify-rust-std/challenges/0011-floats-ints.html)