Crate cee_scape


The cee-scape crate provides access to setjmp and sigsetjmp functionality, via an interface that ensures LLVM won’t miscompile things.

Example usage

The main intention is for this interface to be used with C code that expects to longjmp via jump buffers established at Rust-to-C FFI boundaries.

Here is an example, where we are using extern "C" functions as stand-ins for the code you would normally expect to find in an external C library.

mod pretend_this_comes_from_c {
    use cee_scape::JmpBuf;

    // Returns sum of a and b, but longjmps through `env` if either argument
    // is negative (passing -1) or if the sum overflows (passing -2).
    pub extern "C" fn careful_sum(env: JmpBuf, a: i32, b: i32) -> i32 {
        check_values(env, a, b);
        return a + b;
    }

    extern "C" fn check_values(env: JmpBuf, a: i32, b: i32) {
        use cee_scape::longjmp;
        if a < 0 || b < 0 { unsafe { longjmp(env, -1); } }
        if (i32::MAX - a) < b { unsafe { longjmp(env, -2); } }
    }
}

use pretend_this_comes_from_c::careful_sum as sum;
use cee_scape::call_with_setjmp;

assert_eq!(call_with_setjmp(|env| { sum(env, 10, 20) + 1000 }), 1030);
assert_eq!(call_with_setjmp(|env| { sum(env, -10, 20) + 1000 }), -1);
assert_eq!(call_with_setjmp(|env| { sum(env, 10, -20) + 1000 }), -1);
assert_eq!(call_with_setjmp(|env| { sum(env, i32::MAX, 1) + 1000 }), -2);
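
In the assertions above, the same integer return value carries either the normal result or the code passed to longjmp, so a caller usually needs to tell the two apart. Here is a minimal sketch of one way to do that, assuming the -1/-2 codes used by check_values and assuming that legitimate results are never negative. The SumError enum and the decode function are hypothetical names chosen for illustration (they are not part of cee-scape), and the use of i32 for the code is likewise an assumption.

```rust
// Hypothetical error type for the codes that `check_values` passes to longjmp.
#[derive(Debug, PartialEq, Eq)]
pub enum SumError {
    NegativeArgument, // code -1
    Overflow,         // code -2
}

// Hypothetical helper: interprets the integer that came back from the
// setjmp-guarded call. Assumes that normal results are never negative.
pub fn decode(code: i32) -> Result<i32, SumError> {
    match code {
        -1 => Err(SumError::NegativeArgument),
        -2 => Err(SumError::Overflow),
        value => Ok(value),
    }
}
```

With this helper, decode(1030) yields Ok(1030) and decode(-2) yields Err(SumError::Overflow). The scheme only works when the normal result range and the error codes cannot collide.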

Background on setjmp and longjmp

The setjmp and longjmp functions in C are the basis for “non-local jumps”, also known as “escape continuations”. Given a chain of calls “entry calls middle_1 calls middle_2 calls innermost”, the body of middle_1, middle_2, or innermost can decide at some point to jump all the way back to entry, without passing through the remaining code that would normally execute while returning through each of the intervening callers.

In C, this is done by having entry first call setjmp to initialize a jump environment (which holds, for example, the current stack pointer and, if present, the current frame pointer), and then passing a pointer to that jump environment along to each of the child subroutines of entry. If at any point a child subroutine wants to jump back to the point where setjmp first returned, that child subroutine invokes longjmp, which restores the stack to the position it had when setjmp originally returned.
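
For contrast, here is a sketch of the same call chain written with ordinary returns only, which is the pattern a non-local jump avoids. It uses no setjmp or longjmp at all. The function names mirror the ones in the text, but the bodies and the error code are made up for illustration.

```rust
// All bodies here are hypothetical; they only mirror the call chain
// "entry calls middle_1 calls middle_2 calls innermost" from the text.
fn innermost(x: i32) -> Result<i32, i32> {
    if x < 0 {
        Err(-1) // wants to get back to `entry` right away
    } else {
        Ok(x * 2)
    }
}

fn middle_2(x: i32) -> Result<i32, i32> {
    let v = innermost(x)?; // the early exit is forwarded by hand
    Ok(v + 1)
}

fn middle_1(x: i32) -> Result<i32, i32> {
    let v = middle_2(x)?; // and forwarded again, one frame further up
    Ok(v + 10)
}

pub fn entry(x: i32) -> i32 {
    match middle_1(x) {
        Ok(v) => v,
        Err(code) => code,
    }
}
```

Every intermediate frame takes part in the early exit here. With setjmp and longjmp, innermost would transfer control straight to entry and the two middle functions would never resume.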

Safety (or lack thereof)

This crate cannot ensure that the usual Rust control-flow rules are upheld, which means that the act of actually doing a longjmp/siglongjmp to a non-local jump environment (aka continuation) is unsafe.

For example, several Rust APIs rely on an assumption that they will always run some specific cleanup code after a callback is done. Such cleanup is sometimes encoded as a Rust destructor, but it can also just be directly encoded as straight-line code waiting to be run.
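
The two encodings of cleanup can be sketched side by side as follows. This uses only the standard library, and every name in it (Depth, ExitGuard, enter_straight_line, enter_with_guard) is hypothetical and not part of cee-scape.

```rust
use std::cell::Cell;

// Hypothetical counter, loosely modeled on a nesting-depth invariant.
pub struct Depth {
    value: Cell<usize>,
}

impl Depth {
    pub fn new() -> Self {
        Depth { value: Cell::new(0) }
    }
    pub fn get(&self) -> usize {
        self.value.get()
    }
}

// Encoding 1: cleanup as straight-line code after the callback.
pub fn enter_straight_line<X>(d: &Depth, callback: impl FnOnce() -> X) -> X {
    d.value.set(d.value.get() + 1);
    let ret = callback();
    d.value.set(d.value.get() - 1); // reached only if `callback` returns
    ret
}

// Encoding 2: cleanup as a destructor on a guard value.
struct ExitGuard<'a>(&'a Depth);

impl Drop for ExitGuard<'_> {
    fn drop(&mut self) {
        self.0.value.set(self.0.value.get() - 1);
    }
}

pub fn enter_with_guard<X>(d: &Depth, callback: impl FnOnce() -> X) -> X {
    d.value.set(d.value.get() + 1);
    let _guard = ExitGuard(d); // dropped on normal return and on panic unwinding
    callback()
}
```

Both encodings restore the counter when the callback returns normally. Only the guard version also restores it when the callback panics and the stack unwinds.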

Calls to longjmp blatantly break these assumptions. A longjmp invocation does not invoke any Rust destructors, and it does not “unwind the stack”. All pending cleanup code between the longjmp invocation and the target jump environment (i.e. the place where the relevant setjmp first returned) is skipped.

use std::cell::Cell;
// This emulates a data structure that has an ongoing invariant:
// the `depth` is incremented/decremented according to entry/exit
// to a given callback (see `DepthTracker::enter` below).
pub struct DepthTracker { depth: Cell<usize>, }

let track = DepthTracker::new();
cee_scape::call_with_setjmp(|env| {
    track.enter(|| {
        // This is what we expect: depth is larger in context of
        // DepthTracker::enter callback
        assert_eq!(track.depth(), 1);
        "normal case"
    });
    0
});

// Normal case: the tracked depth has returned to zero.
assert_eq!(track.depth(), 0);

assert_eq!(cee_scape::call_with_setjmp(|env| {
    track.enter(|| {
        // This is what we expect: depth is larger in context of
        // DepthTracker::enter callback
        assert_eq!(track.depth(), 1);
        // DIFFERENT: Now we bypass the DepthTracker's cleanup code.
        unsafe { cee_scape::longjmp(env, 4) }
        "abnormal case"
    });
    0
}), 4);

// This is the "surprise" due to the DIFFERENT line: longjmp skipped
// over the decrement from returning from the callback, and so the count
// is not consistent with what the data structure expects.
assert_eq!(track.depth(), 1 /* not 0 */);

// (These are just support routines for the `DepthTracker` above.)
impl DepthTracker {
    pub fn depth(&self) -> usize {
        self.depth.get()
    }
    pub fn enter<X>(&self, callback: impl FnOnce() -> X) -> X {
        self.update(|x|x+1);
        let ret = callback();
        self.update(|x|x-1);
        ret
    }
    fn update(&self, effect: impl Fn(usize) -> usize) {
        self.depth.set(effect(self.depth.get()));
    }
    pub fn new() -> Self {
        DepthTracker { depth: Cell::new(0) }
    }
}

In short, the longjmp routine is a blunt instrument. When a longjmp invocation skips some cleanup code, the compiler cannot know whether skipping that cleanup code was exactly what the program author intended, or if it represents a programming error.

Furthermore, much cleanup code of this form is enforcing Rust safety invariants. This is why longjmp is provided here as an unsafe method; that is a reminder that while one can invoke call_with_setjmp safely, the obligation remains to audit whether any invocations of longjmp on the provided jump environment are breaking those safety invariants by skipping over such cleanup code.

Some static checking

Not all of Rust’s safety rules can be statically enforced here, but one important rule is: when invoking call_with_setjmp, the saved jump environment is not allowed to escape the scope of the callback that is fed to call_with_setjmp. The compiler rejects the following:

let mut escaped = None;
cee_scape::call_with_setjmp(|env| {
    // If `env` were allowed to escape...
    escaped = Some(env);
    0
});
// ... it would be bad if we could then do this with it.
unsafe { cee_scape::longjmp(escaped.unwrap(), 1); }

We also cannot share jump environments across threads, because it is undefined behavior to longjmp via a jump environment that was initialized by a call to setjmp in a different thread. The compiler rejects this as well:

cee_scape::call_with_setjmp(move |env| {
    std::thread::scope(|s| {
        s.spawn(move || {
            unsafe { cee_scape::longjmp(env, 1); }
        });
        0
    })
});
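
One common way to get both of these checks from the Rust type system is to combine a higher-ranked lifetime bound on the callback with a marker field that opts the environment type out of Send and Sync. The sketch below illustrates that general technique using only the standard library. Env and with_env are hypothetical names, and whether cee-scape is implemented exactly this way is an assumption.

```rust
use std::marker::PhantomData;

// Hypothetical stand-in for a jump environment; not the cee-scape type.
pub struct Env {
    // A raw-pointer marker makes `Env` neither `Send` nor `Sync`, so `&Env`
    // cannot be moved into a closure that is spawned on another thread.
    _not_send_or_sync: PhantomData<*const u8>,
}

// The callback must work for *every* lifetime `'a`, so it cannot assume the
// reference outlives the call; storing it in an outer variable is rejected.
pub fn with_env<F>(callback: F) -> i32
where
    F: for<'a> FnOnce(&'a Env) -> i32,
{
    let env = Env { _not_send_or_sync: PhantomData };
    callback(&env)
}
```

With this shape, with_env(|_env| 7) compiles and returns 7. Assigning the reference to a variable declared outside the closure is a borrow-check error, and so is moving it into a closure passed to spawn inside std::thread::scope.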

Structs

  • JmpBufFields are the accessible fields when viewed via a JmpBuf pointer. But also: You shouldn’t be poking at these!
  • SigJmpBufFields are the accessible fields when viewed via a SigJmpBuf pointer. But also: You shouldn’t be poking at these!

Functions

  • call_with_setjmp covers the usual use case for setjmp: it invokes the callback, and the code of the callback can use longjmp to exit early from the call_with_setjmp.
  • call_with_sigsetjmp covers the usual use case for sigsetjmp: it invokes the callback, and the code of the callback can use siglongjmp to exit early from the call_with_sigsetjmp.
  • longjmp: given a calling environment jbuf (which one can acquire via call_with_setjmp) and a non-zero value val, resets the stack pointer and program counter to match the return position where jbuf was established via a call to setjmp, and then returns val from that spot.
  • siglongjmp: given a calling environment jbuf (which one can acquire via call_with_sigsetjmp) and a non-zero value val, resets the stack pointer and program counter to match the return position where jbuf was established via a call to sigsetjmp, and then returns val from that spot.

Type Aliases

  • This is the type of the first argument that is fed to longjmp.
  • This is the type you use to allocate a JmpBuf on the stack. (Glibc puns the two.)
  • This is the type of the first argument that is fed to siglongjmp.
  • This is the type you use to allocate a SigJmpBuf on the stack.