//! # Outlines_core
//!
//! The `outlines_core` crate provides a convenient way to:
//!
//! - build regular expressions from JSON schemas
//!
//! - construct an [`index::Index`] object by combining a [`vocabulary::Vocabulary`] and a regular
//!   expression, to efficiently map tokens from the given `Vocabulary` to state transitions in a
//!   finite-state automaton
//!
//! ## `json_schema`
//!
//! The [`json_schema`] module provides interfaces to generate a regular expression from a given JSON schema,
//! depending on the type of the input (a string or an already-parsed JSON value):
//! - [`json_schema::regex_from_str`]
//! - [`json_schema::regex_from_value`]
//!
//! The whitespace pattern can be customized; otherwise the default [`json_schema::WHITESPACE`] pattern is used.
//!
//! Note that not all JSON schema features are supported for regex generation; see [Supported Features](json_schema#supported-features).
//!
//! ## `Index`
//!
//! Once an [`index::Index`] is built, it can be used to evaluate or validate token sequences.
//!
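/*!
The idea of mapping tokens to state transitions can be illustrated with a minimal, standard-library-only
sketch. It is a toy model and not the crate's implementation: the hand-written automaton for the pattern `ab*c`,
the helpers `step`, `build_toy_index` and `accepts`, and the five-token vocabulary are all assumptions made up
for this illustration.

```rust
use std::collections::HashMap;

type StateId = u32;
type TokenId = u32;

const INITIAL: StateId = 0;
const FINAL: StateId = 2;

/// Character-level automaton for the pattern `ab*c` (hand-written for this sketch).
fn step(state: StateId, ch: char) -> Option<StateId> {
    match (state, ch) {
        (0, 'a') => Some(1),
        (1, 'b') => Some(1),
        (1, 'c') => Some(2),
        _ => None,
    }
}

/// Toy token-level index: state -> (token id -> next state).
fn build_toy_index(vocab: &[(&str, TokenId)]) -> HashMap<StateId, HashMap<TokenId, StateId>> {
    let mut transitions: HashMap<StateId, HashMap<TokenId, StateId>> = HashMap::new();
    for state in 0..=2 {
        for &(token, id) in vocab {
            // Walk the token's characters through the automaton;
            // the token is kept only if every character has a transition.
            let end = token.chars().try_fold(state, |s, ch| step(s, ch));
            if let Some(next) = end {
                transitions.entry(state).or_default().insert(id, next);
            }
        }
    }
    transitions
}

/// A token sequence is accepted if every token is allowed in the state where
/// it is consumed and the walk ends in the final state.
fn accepts(index: &HashMap<StateId, HashMap<TokenId, StateId>>, tokens: &[TokenId]) -> bool {
    let mut state = INITIAL;
    for id in tokens {
        match index.get(&state).and_then(|row| row.get(id)) {
            Some(&next) => state = next,
            None => return false,
        }
    }
    state == FINAL
}

fn main() {
    let vocab: [(&str, TokenId); 5] = [("a", 0), ("b", 1), ("c", 2), ("bc", 3), ("x", 4)];
    let index = build_toy_index(&vocab);

    let mut allowed: Vec<TokenId> = index[&INITIAL].keys().copied().collect();
    allowed.sort();
    println!("Allowed token ids at the initial state: {:?}", allowed);
    println!("[a, b, bc] accepted: {}", accepts(&index, &[0, 1, 3]));
    println!("[a, x] accepted: {}", accepts(&index, &[0, 4]));
}
```

In this toy model a multi-character token such as `bc` becomes a single transition, so validating a
sequence costs one lookup per token rather than one per character.
*/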
//! ### Complexity and construction cost
//!
//! `Index` can accommodate large vocabularies and complex regular expressions. However, its size, as well as
//! the time and computational resources needed to build it, **may** grow significantly with the complexity of the input.
//!
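/*!
For a rough feel of this growth, a worst-case bound can be estimated under the assumption that an index
stores at most one transition per (state, token) pair. The helper `max_transitions` is hypothetical and not
part of the crate, and the state counts are arbitrary; only the GPT-2 vocabulary size of 50257 tokens is
a real figure.

```rust
/// Hypothetical helper: worst-case number of stored transitions,
/// assuming at most one entry per (state, token) pair.
fn max_transitions(num_states: u64, vocab_size: u64) -> u64 {
    num_states.saturating_mul(vocab_size)
}

fn main() {
    let gpt2_vocab_size: u64 = 50_257;
    for num_states in [10u64, 100, 1_000] {
        println!(
            "{} states -> at most {} transitions",
            num_states,
            max_transitions(num_states, gpt2_vocab_size)
        );
    }
}
```

This is only an upper bound: in a given state most tokens are typically rejected, so the number of stored
transitions is usually far smaller.
*/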
//! ## Python bindings
//!
//! The crate also provides Python bindings, enabled with the `python-bindings` feature, to integrate its functionality with Python.
//!
//! ## Support
//!
//! `Outlines_core` is primarily used in the structured text generation project [`outlines`](https://github.com/dottxt-ai/outlines).
//! If you need support, consider reaching out to its maintainers. You can also open an issue or start a discussion
//! on [GitHub](https://github.com/dottxt-ai/outlines-core).
//!
//! ## Example
//!
//! A basic example of how it all fits together.
//!
//! ```rust
//! # use outlines_core::Error;
//! use outlines_core::prelude::*;
//!
//! # fn main() -> Result<(), Error> {
//! // Define a JSON schema
//! let schema = r#"{
//!     "type": "object",
//!     "properties": {
//!         "name": { "type": "string" },
//!         "age": { "type": "integer" }
//!     },
//!     "required": ["name", "age"]
//! }"#;
//!
//! // Generate a regular expression from it
//! let regex = json_schema::regex_from_str(schema, None, None)?;
//! println!("Generated regex: {}", regex);
//!
//! // Create a `Vocabulary` from a pretrained large language model (building it manually is also possible)
//! let vocabulary = Vocabulary::from_pretrained("openai-community/gpt2", None)?;
//!
//! // Create a new `Index` from the regex and the given `Vocabulary`
//! let index = Index::new(&regex, &vocabulary)?;
//!
//! let initial_state = index.initial_state();
//! println!("Is initial state {} a final state? {}", initial_state, index.is_final_state(&initial_state));
//!
//! let allowed_tokens = index.allowed_tokens(&initial_state).expect("Some allowed tokens");
//! println!("Allowed tokens at initial state are {:?}", allowed_tokens);
//!
//! let token_id = allowed_tokens.first().expect("First token");
//! println!("Next state for the token_id {} is {:?}", token_id, index.next_state(&initial_state, token_id));
//! println!("Final states are {:?}", index.final_states());
//! println!("Index has exactly {} transitions", index.transitions().len());
//! # Ok(())
//! # }
//! ```

pub mod error;
pub mod index;
pub mod json_schema;
pub mod prelude;
pub mod primitives;
pub mod vocabulary;

pub use error::{Error, Result};

#[cfg(feature = "python-bindings")]
mod python_bindings;