pub struct TokenClassificationConfig {
pub model_type: ModelType,
pub model_resource: ModelResource,
pub config_resource: Box<dyn ResourceProvider + Send>,
pub vocab_resource: Box<dyn ResourceProvider + Send>,
pub merges_resource: Option<Box<dyn ResourceProvider + Send>>,
pub lower_case: bool,
pub strip_accents: Option<bool>,
pub add_prefix_space: Option<bool>,
pub device: Device,
pub kind: Option<Kind>,
pub label_aggregation_function: LabelAggregationOption,
pub batch_size: usize,
}
Configuration for TokenClassificationModel
Contains information regarding the model to load and device to place the model on.
Fields
model_type: ModelType
Model type
model_resource: ModelResource
Model weights resource (default: pretrained BERT model on CoNLL)
config_resource: Box<dyn ResourceProvider + Send>
Config resource (default: pretrained BERT model on CoNLL)
vocab_resource: Box<dyn ResourceProvider + Send>
Vocab resource (default: pretrained BERT model on CoNLL)
merges_resource: Option<Box<dyn ResourceProvider + Send>>
Merges resource (default: pretrained BERT model on CoNLL)
lower_case: bool
Automatically lower case all input upon tokenization (assumes a lower-cased model)
strip_accents: Option<bool>
Flag indicating if the tokenizer should strip accents (normalization). Only used for BERT / ALBERT models
add_prefix_space: Option<bool>
Flag indicating if the tokenizer should add a white space before each tokenized input (needed for some Roberta models)
device: Device
Device to place the model on (default: CUDA/GPU when available)
kind: Option<Kind>
Model weights precision. If not provided, will default to full precision on CPU, or the loaded weights precision otherwise
label_aggregation_function: LabelAggregationOption
Sub-tokens aggregation method (default: LabelAggregationOption::First)
batch_size: usize
Batch size for predictions
Implementations
impl TokenClassificationConfig
pub fn new<RC, RV>(
    model_type: ModelType,
    model_resource: ModelResource,
    config_resource: RC,
    vocab_resource: RV,
    merges_resource: Option<RV>,
    lower_case: bool,
    strip_accents: impl Into<Option<bool>>,
    add_prefix_space: impl Into<Option<bool>>,
    label_aggregation_function: LabelAggregationOption,
) -> TokenClassificationConfig
Instantiate a new token classification configuration of the supplied type.
Arguments

- model_type - ModelType indicating the model type to load (must match with the actual data to be loaded!)
- model_resource - The ModelResource pointing to the model weights to load (e.g. model.ot)
- config_resource - The ResourceProvider pointing to the model configuration to load (e.g. config.json)
- vocab_resource - The ResourceProvider pointing to the tokenizer's vocabulary to load (e.g. vocab.txt/vocab.json)
- merges_resource - An optional ResourceProvider pointing to the tokenizer's merges file to load (e.g. merges.txt), needed only for Roberta
- lower_case - A bool indicating whether the tokenizer should lower case all input (in case of a lower-cased model)
- strip_accents - An optional bool indicating whether the tokenizer should strip accents (normalization; only used for BERT / ALBERT models)
- add_prefix_space - An optional bool indicating whether the tokenizer should add a white space before each tokenized input (needed for some Roberta models)
- label_aggregation_function - The LabelAggregationOption defining the sub-token aggregation method
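A minimal sketch of building a configuration with new for a BERT NER model. The BERT_NER resource constants, RemoteResource::from_pretrained helper, and ModelResource::Torch variant are assumed from the wider rust-bert crate; substitute your own resources as needed.

```rust
use rust_bert::bert::{BertConfigResources, BertModelResources, BertVocabResources};
use rust_bert::pipelines::common::{ModelResource, ModelType};
use rust_bert::pipelines::token_classification::{
    LabelAggregationOption, TokenClassificationConfig,
};
use rust_bert::resources::RemoteResource;

fn main() {
    // Assumed pretrained CoNLL-2003 NER resources from the rust-bert model hub.
    let config = TokenClassificationConfig::new(
        ModelType::Bert,
        ModelResource::Torch(Box::new(RemoteResource::from_pretrained(
            BertModelResources::BERT_NER,
        ))),
        RemoteResource::from_pretrained(BertConfigResources::BERT_NER),
        RemoteResource::from_pretrained(BertVocabResources::BERT_NER),
        None,  // merges_resource: only needed for BPE tokenizers such as Roberta
        false, // lower_case: the NER model is cased
        None,  // strip_accents: use the tokenizer default
        None,  // add_prefix_space: use the tokenizer default
        LabelAggregationOption::First,
    );
    let _ = config;
}
```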
Trait Implementations
impl Default for TokenClassificationConfig
fn default() -> TokenClassificationConfig
Provides a default CoNLL-2003 NER model (English)
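A hedged sketch of using the default configuration end to end. Constructing the model downloads pretrained weights on first use; the predict call and its (input, consolidate sub-tokens, return special tokens) argument order are assumed from the rust-bert pipeline API.

```rust
use rust_bert::pipelines::token_classification::{
    TokenClassificationConfig, TokenClassificationModel,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Default loads the English CoNLL-2003 BERT NER model.
    let config = TokenClassificationConfig::default();
    let model = TokenClassificationModel::new(config)?;

    let input = ["My name is Amy. I live in Paris."];
    // Consolidate sub-tokens into full words; skip special tokens in the output.
    let tokens = model.predict(&input, true, false);
    for token in tokens.iter().flatten() {
        println!("{:?}", token);
    }
    Ok(())
}
```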