Crate ffizz_string

This crate provides a string abstraction that is convenient to use from both Rust and C. It provides a way to pass strings into Rust functions and to return strings to C, with clear rules for ownership.

Usage

The types in this crate are specializations of ffizz_passby::OpaqueStruct. See the documentation for the ffizz-passby crate for more general guidance on creating effective C APIs.

String Type

Expose the C type fz_string_t in your C header as a struct with the same structure as that in the fz_string_t docstring. This is large enough to hold the FzString type, and ensures the C compiler will properly align the value.

You may call the type whatever you like. Type names are erased in the C ABI, so it’s fine to write a Rust declaration using fz_string_t and an equivalent C declaration using mystrtype_t. You may also rename the Rust type with use ffizz_string::fz_string_t as .., if you prefer.
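As a rough illustration of the size and alignment requirement, the sketch below checks at compile time that an opaque C-visible struct can hold a Rust value. The names MyString and my_string_t, and the four-word reserved array, are assumptions made for this example; they are not the crate's actual definitions.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for a Rust-side string enum (not the real FzString).
#[allow(dead_code)]
pub enum MyString {
    Null,
    Owned(String),
    Bytes(Vec<u8>),
}

/// Hypothetical opaque struct as C would see it: nothing but reserved storage.
#[allow(non_camel_case_types, dead_code)]
#[repr(C)]
pub struct my_string_t {
    _reserved: [u64; 4],
}

// Compile-time checks: the opaque struct must be at least as large and at
// least as strictly aligned as the Rust type stored inside it.
const _: () = assert!(size_of::<MyString>() <= size_of::<my_string_t>());
const _: () = assert!(align_of::<MyString>() <= align_of::<my_string_t>());

/// Returns (size, alignment) of the opaque struct, for inspection.
pub fn opaque_layout() -> (usize, usize) {
    (size_of::<my_string_t>(), align_of::<my_string_t>())
}
```

If either assertion fails, compilation stops, so a mismatch between the two layouts cannot go unnoticed.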

String Utility Functions

This crate includes a number of utility functions, named fz_string_... These can be re-exported to C using whatever names you prefer, and with docstrings based on those in this crate, including C declarations:

ffizz_snippet!{
#[ffizz(name="mystrtype_free")]
/// Free a mystrtype_t.
///
/// # Safety
///
/// The string must not be used after this function returns, and must not be freed more than once.
/// It is safe to free Null-variant strings.
///
/// ```c
/// EXTERN_C void mystrtype_free(mystrtype_t *);
/// ```
}
ffizz_string::reexport!(fz_string_free as mystrtype_free);

Strings as Function Arguments

There are two design decisions to make when accepting strings as function arguments. First, does ownership of the string transfer from the caller to the callee? Or in Rust terms, is the value moved? This is largely a matter of convenience for the callers, but it’s best to be consistent throughout an API.

Second, do you want to pass strings by value or pointer? Passing by pointer is recommended as it is typically more efficient and allows invalidating moved values in a way that prevents use-after-free errors.
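To see why passing by pointer can prevent use-after-free, consider the minimal sketch below. It uses a hypothetical stand-in type Str and a hypothetical take_ptr helper (neither is this crate's implementation) to show the general idea: moving the value out through the pointer leaves a harmless Null value in the caller's storage.

```rust
/// Hypothetical stand-in for a string value with a Null variant.
#[derive(Debug, PartialEq)]
pub enum Str {
    Null,
    Owned(String),
}

/// Move the value out through the pointer, leaving Null behind.
///
/// # Safety
/// `ptr` must be non-NULL, aligned, and point to a valid `Str`.
pub unsafe fn take_ptr(ptr: *mut Str) -> Str {
    // SAFETY: the caller guarantees ptr is valid for reads and writes.
    unsafe { std::mem::replace(&mut *ptr, Str::Null) }
}

/// Takes a value through a pointer and returns (taken value, what the caller has left).
pub fn demo_take() -> (Str, Str) {
    let mut callers_value = Str::Owned(String::from("teal"));
    // SAFETY: the pointer comes from a live, exclusively borrowed local.
    let taken = unsafe { take_ptr(&mut callers_value) };
    (taken, callers_value)
}
```

After the call, the caller's copy holds Null rather than a dangling reference to freed memory, so an accidental later use or second free is benign. A by-value move cannot do this, because the callee never sees the caller's storage.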

By Pointer

Define your extern "C" function to take a pointer to an fz_string_t (*const when only borrowing, as here; *mut when taking ownership or mutating):

pub unsafe extern "C" fn is_a_color_name(name: *const fz_string_t) -> bool { .. }

If taking ownership of the value, use FzString::take_ptr. Otherwise, use FzString::with_ref or FzString::with_ref_mut to borrow a reference from the pointer.

All of these methods are unsafe. As standard practice, address each of the items listed in the “Safety” section of each unsafe method you call. For example:

// SAFETY:
//  - name is not NULL (see docstring)
//  - no other thread will mutate name (type is documented as not threadsafe)
unsafe {
    FzString::with_ref(name, |name| {
        if let Some(name) = name.as_str() {
            return Colors::from_str(name).is_some();
        }
        false // invalid UTF-8 is _not_ a color name
    })
}

By Value

Pass strings by value of type fz_string_t:

pub unsafe extern "C" fn is_a_color_name(name: fz_string_t) -> bool { .. }

Then, use FzString::take to take ownership of the string as a Rust value. There is no option for the caller to retain ownership when passing by value.

Always Take Everything

If your C API definition indicates that a function takes ownership of values in its function arguments, take ownership of all arguments before any early returns can occur. For example:

pub unsafe extern "C" fn convolve_strings(a: *mut fz_string_t, b: *mut fz_string_t) -> bool {
    // SAFETY: ...
    let a = unsafe { FzString::take_ptr(a) };
    if a.len() == 0 {
        return false;
    }
    // SAFETY: ...
    let b = unsafe { FzString::take_ptr(b) }; // BAD!
    // ...
}

Here, if a is empty, the function returns early without ever taking b, so b is never freed despite the API contract promising to do so. To fix this, move the let b statement before the early return.
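The corrected ordering can be sketched with self-contained stand-ins. Str, its take_ptr, and its len are hypothetical and only mimic the behavior described above. The function is written as a plain unsafe fn rather than extern "C" because the stand-in type is not FFI-safe.

```rust
/// Hypothetical stand-in for a string value with a Null variant.
#[derive(Debug, PartialEq)]
pub enum Str {
    Null,
    Owned(String),
}

impl Str {
    /// Move the value out through the pointer, leaving Null behind.
    ///
    /// # Safety
    /// `ptr` must be non-NULL, aligned, and point to a valid `Str`.
    pub unsafe fn take_ptr(ptr: *mut Str) -> Str {
        // SAFETY: the caller guarantees ptr is valid for reads and writes.
        unsafe { std::mem::replace(&mut *ptr, Str::Null) }
    }

    /// Length in bytes; the Null variant counts as empty.
    pub fn len(&self) -> usize {
        match self {
            Str::Null => 0,
            Str::Owned(s) => s.len(),
        }
    }
}

/// # Safety
/// `a` and `b` must be non-NULL and point to valid values; both are taken.
pub unsafe fn convolve_strings(a: *mut Str, b: *mut Str) -> bool {
    // Take ownership of every argument first...
    // SAFETY: the caller guarantees both pointers are valid.
    let a = unsafe { Str::take_ptr(a) };
    let b = unsafe { Str::take_ptr(b) };
    // ...and only then allow early returns. Both values are dropped on every path.
    if a.len() == 0 {
        return false;
    }
    b.len() > 0
}
```

Because both values are owned by locals before the first return, Rust's normal drop rules free them on every exit path, including the early one.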

Strings as Return Values

To return a string, define your extern "C" function to return an fz_string_t:

pub unsafe extern "C" fn favorite_color() -> fz_string_t { .. }

Then use FzString::return_val to return the value:

// SAFETY:
//  - caller will free the returned string (see docstring)
unsafe {
    return FzString::return_val(color);
}

Strings as Out Parameters

An “out parameter” is a common idiom in C and C++. To return a string into an out parameter, use FzString::to_out_param or FzString::to_out_param_nonnull:

let result = FzString::from("the result");
unsafe {
    FzString::to_out_param(result_out, result);
}
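The out-parameter idiom itself can be sketched generically. The helper below is a hypothetical illustration, not this crate's code, and it assumes that a NULL out pointer means "the caller does not want the value", in which case the value is dropped.

```rust
use std::mem::MaybeUninit;

/// Write `value` into `*out`, or drop it if `out` is NULL.
///
/// # Safety
/// If non-NULL, `out` must be aligned and valid for writes. Any previous
/// content is overwritten without being dropped.
pub unsafe fn to_out_param<T>(out: *mut T, value: T) {
    if !out.is_null() {
        // SAFETY: the caller guarantees a non-NULL `out` is valid for writes.
        unsafe { out.write(value) };
    }
    // If `out` was NULL, `value` was never moved and is dropped here.
}

/// Plays the role of a C caller providing uninitialized storage for the result.
pub fn demo_out() -> String {
    let mut slot = MaybeUninit::<String>::uninit();
    // SAFETY: `slot` is valid for writes and is initialized by the first call.
    unsafe {
        to_out_param(slot.as_mut_ptr(), String::from("the result"));
        to_out_param(std::ptr::null_mut(), String::from("discarded"));
        slot.assume_init()
    }
}
```

Writing with ptr::write rather than assignment matters here: the destination is typically uninitialized memory supplied by C, so there is no old value to drop.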

Example

See the kv example in this crate for a worked example of a simple library using ffizz_string.

Performance

The implementation is general-purpose, and may result in more allocations or string copies than strictly necessary. This is particularly true if the Rust implementation immediately converts FzString into std::string::String. This conversion brings great simplicity, but involves an allocation and a copy of the string.

In situations where API performance is critical, it may be preferable to use FzString throughout the implementation.
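The cost difference can be illustrated using only the standard library. The two hypothetical functions below answer the same question about a C string: the first borrows the bytes in place, while the second first copies them into an owned String, which is the kind of extra allocation and copy described above.

```rust
use std::ffi::CStr;

/// Borrowing: inspects the bytes in place, with no allocation or copy.
pub fn is_short_borrowed(s: &CStr) -> bool {
    match s.to_str() {
        Ok(text) => text.len() < 8,
        Err(_) => false, // invalid UTF-8 is treated as "not short"
    }
}

/// Owning: allocates a new String and copies the bytes before inspecting them.
pub fn is_short_owned(s: &CStr) -> bool {
    match s.to_str() {
        Ok(text) => {
            let owned: String = text.to_owned();
            owned.len() < 8
        }
        Err(_) => false,
    }
}
```

Both return the same answer; the owned version is simpler to pass around but pays for one allocation and one copy per call.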

Macros

reexport: Re-export a fz_string_t utility function in your own crate.

Structs

EmbeddedNulError indicates that the string contains embedded NUL bytes and could not be represented as a C string.
InvalidUTF8Error indicates that the string contains invalid UTF-8 and could not be represented as a Rust string.
fz_string_t represents a string suitable for use with this crate, as an opaque stack-allocated value.

Enums

A FzString carries a single string between Rust and C code, represented from the C side as an opaque struct.

Functions

Create a new fz_string_t containing a pointer to the given C string.
Create a new fz_string_t by cloning the content of the given C string. The resulting fz_string_t is independent of the given string.
Create a new fz_string_t containing the given string with the given length. This allows creation of strings containing embedded NUL characters. As with fz_string_clone, the resulting fz_string_t is independent of the passed buffer.
Get the content of the string as a regular C string.
Get the content of the string as a pointer and length.
Free a fz_string_t.
Determine whether the given fz_string_t is a Null variant.
Create a new, null fz_string_t. Note that this is not the zero value of fz_string_t.