timing_shield/
barriers.rs

1/// Identity function accepting a `u8` as input and outputting that same `u8` as
2/// output, while blocking the compiler from applying optimizations across this node in the data
3/// dependence graph.
4///
5/// This is an **internal utility** of rust-timing-shield that users of the framework will not
6/// normally need to use.
7///
8/// # Background
9///
10/// rust-timing-shield has essentially two goals: (1) prevent the programmer from writing code that
11/// they shouldn't, and (2) prevent the compiler from transforming a programmer's secure code
12/// into vulnerable code. This function is a small primitive that assists in accomplishing the 2nd
13/// goal.
14///
15/// An important rule of protecting code from timing attacks is to ensure that a program never
16/// branches on a secret value. In rust-timing-shield terminology, we should never emit a
17/// branch conditional on a `TpBool`. The Rust compiler uses many LLVM optimization passes that may
18/// produce a conditional branch where none was written by the programmer, so we must remove the
19/// compiler's ability to identify situations where a conditional branch could reasonably be
20/// introduced. The most problematic optimizations identified so far output [an LLVM IR `select`
21/// instruction](https://llvm.org/docs/LangRef.html#select-instruction).
22///
23/// Of course, it is legal for the compiler to emit `select` instructions in any situation as long
24/// as the output and side effects of the code remain the same. For example, it could expand this
25/// code:
26///
27/// ```
28/// fn add(a: u8, b: u8) -> u8 {
29///     a + b
30/// }
31/// ```
32///
33/// to:
34///
35/// ```
36/// fn add(a: u8, b: u8) -> u8 {
37///     if a == 5 && b == 5 {
38///         return 10;
39///     }
40///     if a == 7 && b == 5 {
41///         return 12;
42///     }
43///     a + b
44/// }
45/// ```
46///
47/// The current implementation of rust-timing-shield assumes that this is ridiculous and wouldn't
48/// happen. In particular, an assumption is made that the compiler will only insert a `select` for
49/// "values of boolean origin".  For example, this branchless code: 
50///
51/// ```
52/// fn add_bool(number: u8, b: bool) -> u8 {
53///     number + (b as u8)
54/// }
55/// ```
56///
57/// could reasonably, under this assumption, be rewritten with a branch:
58///
59/// ```
60/// fn add_bool(number: u8, b: bool) -> u8 {
61///     if b {
62///         return number + 1;
63///     }
64///
65///     number
66/// }
67/// ```
68///
69/// since `b as u8` is a "value of boolean origin".  Based on the empirical tests of the Rust
70/// compiler and a review of how LLVM optimization passes are implemented, this assumption appears
71/// to be reasonable.
72///
73/// Now, with this assumption in place, rust-timing-shield only needs to ensure that all
74/// timing-protected values either are not of boolean origin or hide their origin from the
75/// compiler.  This is accomplished in two steps:
76///
77/// 1. All timing-protected boolean values (`TpBool`) are stored as `u8` values and never as
78///    `bool`, even in intermediate computations.
79/// 2. `TpBool::protect(bool)` converts the `bool` to a `u8` and uses an optimization barrier (this
80///    function) to hide the origin of the `u8`.
81///
82/// # Usage
83///
84/// `optimization_barrier_u8` is a low-cost (but not zero-cost) way to hide a value's origin from the compiler. If a
85/// value passes through an optimization barrier, the compiler will be able to perform
86/// optimizations on computations prior to the barrier and optimizations on computations after the
87/// barrier, but will be unable to perform optimizations *across* the barrier.
88///
89/// It is important that the barrier interrupt the flow of a value from one part of the program to
90/// the next. As an example of incorrect usage:
91///
92/// ```
93/// # use timing_shield::barriers::optimization_barrier_u8;
94/// #
95/// fn add_bool(number: u8, secret_condition: bool) -> u8 {
96///     let secret_condition_u8 = secret_condition as u8;
97///
98///     // WRONG: barrier does not interrupt data flow
99///     optimization_barrier_u8(secret_condition_u8);
100///
101///     number + secret_condition_u8
102/// }
103/// ```
104///
105/// Here, the return value from `optimization_barrier_u8` is unused, so the optimization barrier
106/// has no effect. An optimization barrier must be the single connection between two parts of the
107/// program's data dependence graph:
108///
109/// ```
110/// # use timing_shield::barriers::optimization_barrier_u8;
111/// #
112/// fn add_bool(number: u8, secret_condition: bool) -> u8 {
113///     let secret_condition_u8 = secret_condition as u8;
114///
115///     // Override the previous definition to avoid accidentally using the pre-barrier value
116///     let secret_condition_u8 = optimization_barrier_u8(secret_condition_u8);
117///
118///     number + secret_condition_u8
119/// }
120/// ```
121///
122/// In rust-timing-shield, optimization barriers are used to hide when a `u8` is of boolean origin.
123/// The use of a barrier prevents the compiler from identifying that the `u8` value after the
124/// barrier is the same value that was cast from a `bool` before the barrier. This suppresses the
125/// many optimizations that would transform the timing-leak-proof branchless computations with `u8`
126/// values that rust-timing-shield produces into branching computations that leak boolean values.
127///
128/// # Performance considerations
129///
130/// `optimization_barrier_u8` is currently implemented as an empty inline assembly block. The `u8`
131/// input value is provided as an input/output register to the assembly block and then immediately
132/// returned as is. Although the assembly block is a no-op, LLVM is forced to assume that the value
133/// may have changed and can make no assumptions about what the output may be. The Rust Unstable
134/// reference [describes the assembly block as a black
135/// box](https://doc.rust-lang.org/unstable-book/library-features/asm.html):
136///
137/// > “The compiler cannot assume that the instructions in the asm are the ones that will
138/// actually end up executed.  This effectively means that the compiler must treat the `asm!` as a
139/// black box and only take the interface specification into account, not the instructions
140/// themselves.”
141///
142/// Since no actual assembler instructions are provided, it might seem that this function call
143/// would have zero overhead after inlining. However, there are other considerations that may
144/// affect performance:
145///
146/// - The barrier will (obviously) prevent matching on code patterns that span across the barrier.
147/// This is intended.
148/// - The barrier will force the compiler to schedule an actual register to hold the value
149/// temporarily.
150/// - The barrier will force the value into a single register, which may impair the compiler's
151/// ability to perform optimizations such as auto-vectorization.
152/// - Constant folding cannot proceed past a barrier.
153///
154/// For these reasons, optimization barriers are only used where necessary to minimize any
155/// potential impact on performance, keeping rust-timing-shield zero-cost for as many applications
156/// as possible.
157#[inline(always)]
158pub fn optimization_barrier_u8(mut value: u8) -> u8 {
159    unsafe {
160        std::arch::asm!(
161            // Rust requires us to use every register defined, so we use it inside of a comment.
162            "/* optimization_barrier_u8 {unused} */",
163
164            // Define a single input/output register called "unused".
165            // The Rust compiler will perceive this as a mutation of `value`.
166            unused = inout(reg_byte) value,
167
168            // By guaranteeing more invariants we improve the compiler's ability to optimize.
169            // Since the assembly block is a no-op, we easily uphold all of these invariants.
170            options(pure, nomem, nostack, preserves_flags)
171        );
172    }
173
174    value
175}
176
177#[cfg(test)]
178mod tests {
179    use super::*;
180    use quickcheck::quickcheck;
181
182    quickcheck! {
183        fn optimization_barrier_is_identity(value: u8) -> bool {
184            optimization_barrier_u8(value) == value
185        }
186    }
187
188    // TODO I would like to add a test that checks the compiled LLVM IR for `select` instructions.
189    // This would help confirm that the optimization barrier is working correctly.
190    //
191    // According to my tests, the following function compiles to a single `select`:
192    //
193    //  pub fn select(cond: bool, a: u32, b: u32) -> u32 {
194    //      let mask = (cond as u32) - 1;
195    //
196    //      (a & (!mask)) | (b & mask)
197    //  }
198    //
199    // while this function is branchless in LLVM IR and x86_64:
200    //
201    //  pub fn select(cond: bool, a: u32, b: u32) -> u32 {
202    //      let mask = (optimization_barrier_u8(cond as u8) as u32) - 1;
203    //
204    //      (a & (!mask)) | (b & mask)
205    //  }
206    //
207    // Tested with `cargo llvm-ir` from the `cargo-asm` tool.
208}