§facet-solver
Helps facet deserializers implement #[facet(flatten)] and #[facet(untagged)]
correctly, efficiently, and with useful diagnostics.
§The Problem
When deserializing a type with a flattened enum:
use facet::Facet;
#[derive(Facet)]
struct TextMessage { content: String }
#[derive(Facet)]
struct BinaryMessage { data: Vec<u8>, encoding: String }
#[derive(Facet)]
#[repr(u8)]
enum MessagePayload {
    Text(TextMessage),
    Binary(BinaryMessage),
}
#[derive(Facet)]
struct Message {
    id: String,
    #[facet(flatten)]
    payload: MessagePayload,
}
…we don’t know which variant to use until we’ve seen the fields:
{"id": "msg-1", "content": "hello"} // Text
{"id": "msg-2", "data": [1,2,3], "encoding": "raw"} // Binary
The solver answers: “which variant has a content field?” or “which variant has both data and encoding?”
§How It Works
The solver pre-computes all valid field combinations (“configurations”) for a type, then uses an inverted index to quickly find which configuration(s) match the fields you’ve seen.
use facet_solver::{KeyResult, Schema, Solver};
// Build schema once (can be cached)
let schema = Schema::build(Message::SHAPE).unwrap();
// Create a solver for this deserialization
let mut solver = Solver::new(&schema);
// As you see fields, report them:
match solver.see_key("id") {
    KeyResult::Unambiguous { .. } => { /* both configs have "id" */ }
    _ => {}
}
match solver.see_key("content") {
    KeyResult::Solved(config) => {
        // Only Text has "content" - we now know the variant!
        assert!(config.has_key_path(&["content"]));
    }
    _ => {}
}
§Nested Disambiguation
When top-level keys don’t distinguish variants, the solver can look deeper:
use facet::Facet;
#[derive(Facet)]
struct TextPayload { content: String }
#[derive(Facet)]
struct BinaryPayload { bytes: Vec<u8> }
#[derive(Facet)]
#[repr(u8)]
enum Payload {
    Text { inner: TextPayload },
    Binary { inner: BinaryPayload },
}
#[derive(Facet)]
struct Wrapper {
    #[facet(flatten)]
    payload: Payload,
}
Both variants have an inner field. But inner.content only exists in Text, and inner.bytes only exists in Binary. The ProbingSolver handles this:
use facet_solver::{ProbingSolver, ProbeResult, Schema};
let schema = Schema::build(Wrapper::SHAPE).unwrap();
let mut solver = ProbingSolver::new(&schema);
// Top-level "inner" doesn't disambiguate
assert!(matches!(solver.probe_key(&[], "inner"), ProbeResult::KeepGoing));
// But "inner.content" does!
match solver.probe_key(&["inner"], "content") {
    ProbeResult::Solved(config) => {
        assert!(config.has_key_path(&["inner", "content"]));
    }
    _ => panic!("should have solved"),
}
§Lazy Type Disambiguation
Sometimes variants have identical keys but different value types. The solver handles this without buffering: it lets you lazily probe “can this value fit type X?”
use facet::Facet;
#[derive(Facet)]
struct SmallPayload { value: u8 }
#[derive(Facet)]
struct LargePayload { value: u16 }
#[derive(Facet)]
#[repr(u8)]
enum Payload {
    Small { payload: SmallPayload },
    Large { payload: LargePayload },
}
#[derive(Facet)]
struct Container {
    #[facet(flatten)]
    inner: Payload,
}
Both variants have payload.value, but one is u8 (max 255) and one is u16 (max 65535). When the deserializer sees value 1000, it can rule out Small without ever parsing into the wrong type:
use facet_solver::{Solver, KeyResult, Schema};
let schema = Schema::build(Container::SHAPE).unwrap();
let mut solver = Solver::new(&schema);
// "payload" exists in both - ambiguous by key alone
solver.probe_key(&[], "payload");
// "value" also exists in both, but with different types!
match solver.probe_key(&["payload"], "value") {
    KeyResult::Ambiguous { fields } => {
        // fields contains (FieldInfo, score) pairs for u8 and u16
        // Lower score = more specific type
        assert_eq!(fields.len(), 2);
    }
    _ => {}
}
// Deserializer sees value 1000 - ask which types fit
let shapes = solver.get_shapes_at_path(&["payload", "value"]);
let fits: Vec<_> = shapes.iter()
    .filter(|s| match s.type_identifier {
        "u8" => "1000".parse::<u8>().is_ok(), // false!
        "u16" => "1000".parse::<u16>().is_ok(), // true
        _ => false,
    })
    .copied()
    .collect();
// Narrow to types the value actually fits
solver.satisfy_at_path(&["payload", "value"], &fits);
assert_eq!(solver.candidates().len(), 1); // Solved: Large
This enables true streaming deserialization: you never buffer values, never parse speculatively, and never lose precision. The solver tells you what types are possible, you check which ones the raw input satisfies, and disambiguation happens lazily.
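The "check which types the raw input satisfies" step can be illustrated with the standard library alone. The sketch below is not the crate's implementation: `IntKind`, `fits`, and `narrow_by_value` are hypothetical names, and the two candidate types are hard-coded to mirror the Small/Large example.

```rust
/// Hypothetical candidate value types; the real solver works with facet shapes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IntKind {
    U8,
    U16,
}

/// Returns true if the raw token parses as the given type without overflow.
fn fits(kind: IntKind, raw: &str) -> bool {
    match kind {
        IntKind::U8 => raw.parse::<u8>().is_ok(),
        IntKind::U16 => raw.parse::<u16>().is_ok(),
    }
}

/// Keeps only the variant names whose value type accepts the raw token.
fn narrow_by_value(candidates: &[(&'static str, IntKind)], raw: &str) -> Vec<&'static str> {
    candidates
        .iter()
        .filter(|(_, kind)| fits(*kind, raw))
        .map(|(name, _)| *name)
        .collect()
}

fn main() {
    let candidates = [("Small", IntKind::U8), ("Large", IntKind::U16)];
    // 1000 overflows u8, so only Large survives.
    assert_eq!(narrow_by_value(&candidates, "1000"), vec!["Large"]);
    // 42 fits both types, so the value alone does not disambiguate.
    assert_eq!(narrow_by_value(&candidates, "42"), vec!["Small", "Large"]);
    // 70000 fits neither type.
    assert!(narrow_by_value(&candidates, "70000").is_empty());
}
```

The raw token is only inspected, never converted into a value of the wrong type, which is the property the paragraph above describes.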
§Performance
- O(1) field lookup: Inverted index maps field names to bitmasks
- O(configs/64) narrowing: Bitwise AND to filter candidates
- Zero allocation during solving: Schema is built once, solving just manipulates bitmasks
- Early termination: Stops as soon as one candidate remains
Typical disambiguation: ~50ns for 4 configurations, <1µs for 64+ configurations.
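To make the bitmask idea concrete, here is a minimal standard-library sketch of an inverted index over at most 64 configurations. It is an illustration under stated assumptions, not the crate's code: `ToyIndex` and `narrow` are hypothetical names, and unlike the real solver this toy uses a `HashMap` and a single `u64` mask.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a pre-computed schema; not the crate's `Schema`.
struct ToyIndex {
    /// Field name -> bitmask of the configurations that contain that field.
    by_key: HashMap<&'static str, u64>,
    /// Bitmask with one bit set per configuration.
    all: u64,
}

impl ToyIndex {
    /// Builds the inverted index from a list of configurations (at most 64).
    fn build(configs: &[Vec<&'static str>]) -> Self {
        assert!(configs.len() <= 64, "this sketch uses a single u64 mask");
        let mut by_key: HashMap<&'static str, u64> = HashMap::new();
        for (i, fields) in configs.iter().enumerate() {
            for &field in fields {
                *by_key.entry(field).or_insert(0) |= 1u64 << i;
            }
        }
        let all = if configs.len() == 64 {
            u64::MAX
        } else {
            (1u64 << configs.len()) - 1
        };
        ToyIndex { by_key, all }
    }
}

/// Narrows the candidate mask with one observed key via a bitwise AND.
/// In this sketch, a key that no configuration knows eliminates every candidate.
fn narrow(index: &ToyIndex, candidates: u64, key: &str) -> u64 {
    candidates & index.by_key.get(key).copied().unwrap_or(0)
}

fn main() {
    // Configuration 0 = Text, configuration 1 = Binary (the Message example).
    let index = ToyIndex::build(&[
        vec!["id", "content"],
        vec!["id", "data", "encoding"],
    ]);
    let mut candidates = index.all;
    candidates = narrow(&index, candidates, "id");
    assert_eq!(candidates, 0b11); // both configurations have "id"
    candidates = narrow(&index, candidates, "content");
    assert_eq!(candidates, 0b01); // only Text has "content"
    assert_eq!(candidates.count_ones(), 1); // one candidate left: stop early
}
```

Once the index is built, each observed key costs one lookup and one AND, and no allocation happens while narrowing.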
§Why This Exists
Serde’s #[serde(flatten)] and #[serde(untagged)] have fundamental limitations
because they buffer values into an intermediate Content enum, then re-deserialize.
This loses type information and breaks many use cases.
Facet takes a different approach: determine the type first, then deserialize directly. No buffering, no loss of fidelity.
§Serde Issues This Resolves
| Issue | Problem | Facet’s Solution |
|---|---|---|
| serde#2186 | Flatten buffers into Content, losing type distinctions (e.g., 1 vs "1") | Scan keys only, deserialize values directly into the resolved type |
| serde#1600 | flatten + deny_unknown_fields doesn’t work | Schema knows all valid fields per configuration |
| serde#1626 | flatten + default on enums | Solver tracks required vs optional per-field |
| serde#1560 | Empty variant ambiguity with “first match wins” | Explicit configuration enumeration, no guessing |
| serde_json#721 | arbitrary_precision + flatten loses precision | No buffering through serde_json::Value |
| serde_json#1155 | u128 in flattened struct fails | Direct deserialization, no Value intermediary |
§Sponsors
Thanks to all individual sponsors:
…along with corporate sponsors:
…without whom this work could not exist.
§Special thanks
The facet logo was drawn by Misiasart.
§License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Structs§
- CandidateFailure - Information about why a specific candidate (resolution) failed to match.
- DuplicateFieldError - Error when building a resolution.
- FieldInfo - Information about a single field in a resolution.
- FieldPath - A path through the type tree to a field.
- FieldSuggestion - Suggestion for a field that might have been misspelled.
- MissingFieldInfo - Information about a missing required field for error reporting.
- ProbingSolver - Depth-aware probing solver for streaming deserialization.
- Resolution - One possible “shape” the flattened type could take.
- ResolutionSet - A set of configuration indices, stored as a bitmask for O(1) intersection.
- Schema - Cached schema for a type that may contain flattened fields.
- Solver - State machine solver for lazy value-based disambiguation.
- VariantSelection - Records that a specific enum field has a specific variant selected.
- VariantsByFormat - Information about variants grouped by their expected format.
Enums§
- EnumRepr - How enum variants are represented in the serialized format.
- KeyResult - Result of reporting a key to the solver.
- MatchResult - Result of matching input fields against a resolution.
- PathSegment - A segment in a field path.
- ProbeResult - Result of reporting a key to the probing solver.
- SatisfyResult - Result of reporting which fields the value can satisfy.
- SchemaError - Errors that can occur when building a schema.
- SolverError - Errors that can occur during flatten resolution.
- VariantFormat - Classification of an enum variant’s expected serialized format.
Functions§
- specificity_score - Compute a specificity score for a shape. Lower score = more specific.
Type Aliases§
- KeyPath - A path of serialized key names for probing. Unlike FieldPath, which tracks the internal type structure (including variant selections), KeyPath only tracks the keys as they appear in the serialized format.