Crate xgrammar

Modules§

testing

Structs§

BatchGrammarMatcher
A batch version of GrammarMatcher that can fill the next token bitmask for multiple matchers in parallel. It utilizes multiple threads to speed up the computation. It is especially useful when the batch size is large.
CompiledGrammar
The primary object for storing a compiled grammar.
CxxUniquePtr
Binding to C++ std::unique_ptr<T, std::default_delete<T>>.
DLDataType
The data type the tensor can hold. The data type is assumed to follow the native endianness. An explicit error message should be raised when attempting to export an array with non-native endianness.
DLDevice
A device for tensors and operators.
DLManagedTensor
C tensor object that manages the memory of a DLTensor. This data structure is intended to facilitate the borrowing of a DLTensor by another framework. It is not meant to transfer ownership of the tensor. When the borrowing framework no longer needs the tensor, it should call the deleter to notify the host that the resource is no longer needed.
DLTensor
Plain C tensor object; does not manage memory.
Grammar
Represents a grammar object in XGrammar that can later be used in grammar-guided generation.
GrammarCompiler
The compiler for grammars. It is associated with a certain tokenizer info, and compiles grammars into CompiledGrammar with the tokenizer info. It allows parallel compilation with multiple threads, and has a cache to store the compilation result, avoiding compiling the same grammar multiple times.
GrammarMatcher
Match tokens/strings to a compiled grammar and generate next-token masks.
StructuralTagItem
A structural tag item. See crate::Grammar::from_structural_tag for more details.
TokenizerInfo
TokenizerInfo contains the vocabulary, its type, and metadata used by grammar-guided generation.
cxx_int
Newtype wrapper for an int
cxx_longlong
Newtype wrapper for a long long
cxx_ulong
Newtype wrapper for an unsigned long
cxx_ulonglong
Newtype wrapper for an unsigned long long

Enums§

DLDataTypeCode
The type code options for DLDataType.
DLDeviceType
VocabType

Functions§

allocate_token_bitmask
Allocate the bitmask for the next token prediction. The bitmask is an int32 tensor on CPU with shape (batch_size, ceil(vocab_size / 32)).
get_bitmask_shape
Get the shape of the bitmask for next token prediction.
get_max_recursion_depth
Get the maximum allowed recursion depth. The depth is shared per process.
get_serialization_version
Get the serialization version number. The current version is “v5”.
reset_token_bitmask
Reset the bitmask to the full mask (all bits set to 1, meaning no tokens are masked).
set_max_recursion_depth
Set the maximum allowed recursion depth. The depth is shared per process. This method is thread-safe.
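
The bitmask shape and the "full mask" described for allocate_token_bitmask and reset_token_bitmask can be illustrated with a small standalone sketch. It does not call into this crate: the helper names (bitmask_shape, full_bitmask, is_token_allowed, mask_token) are hypothetical, and the bit layout (token t stored as bit t % 32 of word t / 32) is an assumption for illustration rather than something this page states.

```rust
/// Number of token bits packed into one `i32` word of the bitmask.
const BITS_PER_WORD: usize = 32;

/// Shape `(batch_size, ceil(vocab_size / 32))`, matching the description
/// of `allocate_token_bitmask`. Hypothetical helper, not the crate's API.
pub fn bitmask_shape(batch_size: usize, vocab_size: usize) -> (usize, usize) {
    (batch_size, (vocab_size + BITS_PER_WORD - 1) / BITS_PER_WORD)
}

/// Flat row-major buffer in the "full mask" state: every bit is 1,
/// so no token is masked. An `i32` with all bits set equals -1.
pub fn full_bitmask(batch_size: usize, vocab_size: usize) -> Vec<i32> {
    let (rows, words) = bitmask_shape(batch_size, vocab_size);
    vec![-1i32; rows * words]
}

/// Returns true if `token_id` is allowed in this row.
/// Assumed layout: token `t` is bit `t % 32` of word `t / 32`.
pub fn is_token_allowed(row: &[i32], token_id: usize) -> bool {
    let word = row[token_id / BITS_PER_WORD] as u32;
    (word >> (token_id % BITS_PER_WORD)) & 1 == 1
}

/// Clears the bit for `token_id`, marking the token as masked (same assumed layout).
pub fn mask_token(row: &mut [i32], token_id: usize) {
    row[token_id / BITS_PER_WORD] &= !(1i32 << (token_id % BITS_PER_WORD));
}
```

Under these assumptions, a vocabulary of 32,000 tokens and a batch of 4 gives a shape of (4, 1000), and a vocabulary of 33 tokens needs 2 words per row because the last word is only partially used.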