A batch version of GrammarMatcher that fills the next-token bitmask for multiple
matchers in parallel. It uses multiple threads to speed up the computation, which is
especially useful when the batch size is large.
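
The idea of filling one bitmask row per matcher across several threads can be sketched as follows. This is only an illustration under stated assumptions: `ToyMatcher`, `FillRow`, and `BatchFillBitmask` are hypothetical names, not the library's API, and the 32-bit-word bitmask layout is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a grammar matcher (NOT the real GrammarMatcher).
// It allows every token whose id is a multiple of `stride` (stride > 0).
struct ToyMatcher {
  int stride;
  // Fill one bitmask row: bit t is set when token t is allowed.
  void FillRow(uint32_t* row, int vocab_size) const {
    for (int t = 0; t < vocab_size; ++t) {
      if (t % stride == 0) row[t / 32] |= (uint32_t{1} << (t % 32));
    }
  }
};

// Fill one bitmask row per matcher, splitting the rows across `num_threads`
// workers. Each row is written by exactly one worker, so no locking is needed.
std::vector<uint32_t> BatchFillBitmask(const std::vector<ToyMatcher>& matchers,
                                       int vocab_size, int num_threads) {
  assert(vocab_size > 0 && num_threads > 0);
  const std::size_t words_per_row = (static_cast<std::size_t>(vocab_size) + 31) / 32;
  std::vector<uint32_t> bitmask(matchers.size() * words_per_row, 0);
  std::vector<std::thread> workers;
  for (int w = 0; w < num_threads; ++w) {
    workers.emplace_back([&, w] {
      // Worker w handles rows w, w + num_threads, w + 2 * num_threads, ...
      for (std::size_t i = static_cast<std::size_t>(w); i < matchers.size();
           i += static_cast<std::size_t>(num_threads)) {
        matchers[i].FillRow(bitmask.data() + i * words_per_row, vocab_size);
      }
    });
  }
  // Wait for all workers before the bitmask is returned to the caller.
  for (auto& worker : workers) worker.join();
  return bitmask;
}
```

Because the rows are disjoint, the work parallelizes without synchronization, which is why the benefit grows with the batch size.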
\brief The data type the tensor can hold. The data type is assumed to follow the
native endianness. An explicit error message should be raised when attempting to
export an array with non-native endianness.
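
One way an exporter might enforce the native-endianness rule is sketched below. The names `ByteOrder`, `NativeByteOrder`, and `CheckExportable` are hypothetical helpers introduced for illustration; they are not part of the header being documented.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Hypothetical tag describing how an array's elements are laid out in memory.
enum class ByteOrder { kLittle, kBig };

// Detect the byte order of the machine this code runs on by inspecting the
// first byte of a known 16-bit value.
inline ByteOrder NativeByteOrder() {
  const uint16_t probe = 0x0102;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 0x02 ? ByteOrder::kLittle : ByteOrder::kBig;
}

// Hypothetical export guard: raises an explicit error when the array is not
// stored in the native byte order, instead of exporting it silently.
inline void CheckExportable(ByteOrder array_order) {
  if (array_order != NativeByteOrder()) {
    throw std::invalid_argument(
        "cannot export an array with non-native endianness");
  }
}
```

An exporter would call such a guard before handing the array to a consumer, so a byte-order mismatch surfaces as an error rather than as corrupted values.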
\brief C Tensor object that manages the memory of a DLTensor. This data structure is
intended to facilitate the borrowing of a DLTensor by another framework. It is
not meant to transfer the tensor. When the borrowing framework is done with
the tensor, it should call the deleter to notify the host that the resource
is no longer needed.
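
The borrow-then-release protocol described above can be sketched with simplified stand-in types. `ToyTensor`, `ToyManagedTensor`, `LendTensor`, `SumAndRelease`, and `g_live_tensors` are hypothetical and exist only for this example; real code would include the actual header and use its structs, which carry more fields than shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-ins that mirror the shape of the managed-tensor pattern.
struct ToyTensor {
  void* data;
  int64_t num_elements;
};
struct ToyManagedTensor {
  ToyTensor dl_tensor;
  void* manager_ctx;                        // opaque handle owned by the host
  void (*deleter)(ToyManagedTensor* self);  // called by the borrower when done
};

static int g_live_tensors = 0;  // hypothetical bookkeeping, for illustration

// Host side: allocate storage and lend it out. The host keeps ownership and
// the deleter is how it learns the storage can be freed.
ToyManagedTensor* LendTensor(int64_t n) {
  assert(n >= 0);
  auto* storage = new std::vector<float>(static_cast<std::size_t>(n), 0.0f);
  auto* managed = new ToyManagedTensor;
  managed->dl_tensor = {storage->data(), n};
  managed->manager_ctx = storage;
  managed->deleter = [](ToyManagedTensor* self) {
    delete static_cast<std::vector<float>*>(self->manager_ctx);
    delete self;
    --g_live_tensors;
  };
  ++g_live_tensors;
  return managed;
}

// Borrower side: read the data, then call the deleter exactly once. The
// borrower must not touch the tensor after this point.
float SumAndRelease(ToyManagedTensor* managed) {
  const float* values = static_cast<const float*>(managed->dl_tensor.data);
  float total = 0.0f;
  for (int64_t i = 0; i < managed->dl_tensor.num_elements; ++i) total += values[i];
  if (managed->deleter != nullptr) managed->deleter(managed);
  return total;
}
```

The borrower never frees the data pointer itself; it only signals through the deleter, which keeps allocation and deallocation on the host's side.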
The compiler for grammars. It is associated with a specific tokenizer info and uses it
to compile grammars into CompiledGrammar objects. It supports parallel compilation with
multiple threads and caches compilation results, so the same grammar is not compiled
multiple times.
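
A compilation cache of the kind described above might look like the following minimal sketch. `ToyCompiled` and `ToyCompilerCache` are hypothetical names, the "compilation" is a placeholder that merely counts lines, and holding the lock for the whole compile is a simplification that the real compiler need not share.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a compiled grammar.
struct ToyCompiled {
  std::size_t rule_count;
};

// Thread-safe cache keyed by the grammar text. A repeated request for the
// same grammar returns the stored result instead of compiling again.
class ToyCompilerCache {
 public:
  std::shared_ptr<const ToyCompiled> Compile(const std::string& grammar) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = cache_.find(grammar);
    if (it != cache_.end()) return it->second;  // cache hit
    ++compile_count_;
    // Placeholder "compilation": number of newline characters plus one.
    const std::size_t lines =
        1 + static_cast<std::size_t>(std::count(grammar.begin(), grammar.end(), '\n'));
    auto compiled = std::make_shared<const ToyCompiled>(ToyCompiled{lines});
    cache_.emplace(grammar, compiled);
    return compiled;
  }

  // Number of times the placeholder compilation actually ran.
  int compile_count() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return compile_count_;
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, std::shared_ptr<const ToyCompiled>> cache_;
  int compile_count_ = 0;
};
```

Returning a shared pointer to an immutable result lets several callers reuse one compiled grammar without copying it.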