pub struct TokenClassificationConfig {
    pub model_type: ModelType,
    pub model_resource: Box<dyn ResourceProvider + Send>,
    pub config_resource: Box<dyn ResourceProvider + Send>,
    pub vocab_resource: Box<dyn ResourceProvider + Send>,
    pub merges_resource: Option<Box<dyn ResourceProvider + Send>>,
    pub lower_case: bool,
    pub strip_accents: Option<bool>,
    pub add_prefix_space: Option<bool>,
    pub device: Device,
    pub label_aggregation_function: LabelAggregationOption,
    pub batch_size: usize,
}
Configuration for TokenClassificationModel
Contains information regarding the model to load and the device to place the model on.
Fields
model_type: ModelType
    Model type
model_resource: Box<dyn ResourceProvider + Send>
    Model weights resource (default: pretrained BERT model on CoNLL)
config_resource: Box<dyn ResourceProvider + Send>
    Config resource (default: pretrained BERT model on CoNLL)
vocab_resource: Box<dyn ResourceProvider + Send>
    Vocab resource (default: pretrained BERT model on CoNLL)
merges_resource: Option<Box<dyn ResourceProvider + Send>>
    Merges resource (default: pretrained BERT model on CoNLL)
lower_case: bool
    Automatically lower case all input upon tokenization (assumes a lower-cased model)
strip_accents: Option<bool>
    Flag indicating if the tokenizer should strip accents (normalization). Only used for BERT / ALBERT models
add_prefix_space: Option<bool>
    Flag indicating if the tokenizer should add a white space before each tokenized input (needed for some Roberta models)
device: Device
    Device to place the model on (default: CUDA/GPU when available)
label_aggregation_function: LabelAggregationOption
    Sub-tokens aggregation method (default: LabelAggregationOption::First)
batch_size: usize
    Batch size for predictions
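Because all fields are public, a configuration can also be adjusted after construction. The following is a minimal sketch, assuming tch::Device is in scope for device selection and that LabelAggregationOption exposes a Mode variant in this version of the crate:

use rust_bert::pipelines::token_classification::{LabelAggregationOption, TokenClassificationConfig};
use tch::Device;

fn main() {
    // Start from the default configuration (pretrained BERT fine-tuned on CoNLL-2003)
    // and override individual public fields.
    let mut config = TokenClassificationConfig::default();
    config.device = Device::Cpu;                                       // force CPU execution
    config.batch_size = 32;                                            // larger prediction batches
    config.label_aggregation_function = LabelAggregationOption::Mode;  // assumed variant; First is the default
}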
Implementations

impl TokenClassificationConfig
pub fn new<RM, RC, RV>(
    model_type: ModelType,
    model_resource: RM,
    config_resource: RC,
    vocab_resource: RV,
    merges_resource: Option<RV>,
    lower_case: bool,
    strip_accents: impl Into<Option<bool>>,
    add_prefix_space: impl Into<Option<bool>>,
    label_aggregation_function: LabelAggregationOption,
) -> TokenClassificationConfig
where
    RM: ResourceProvider + Send + 'static,
    RC: ResourceProvider + Send + 'static,
    RV: ResourceProvider + Send + 'static,
Instantiate a new token classification configuration of the supplied type.

Arguments

- model_type - ModelType indicating the model type to load (must match the actual data to be loaded!)
- model_resource - The ResourceProvider pointing to the model weights to load (e.g. model.ot)
- config_resource - The ResourceProvider pointing to the model configuration to load (e.g. config.json)
- vocab_resource - The ResourceProvider pointing to the tokenizer's vocabulary to load (e.g. vocab.txt/vocab.json)
- merges_resource - An optional ResourceProvider pointing to the tokenizer's merges file to load (e.g. merges.txt), needed only for Roberta-based models
- lower_case - A bool indicating whether the tokenizer should lower case all input (in case of a lower-cased model)
- strip_accents - An optional flag indicating if the tokenizer should strip accents (normalization). Only used for BERT / ALBERT models
- add_prefix_space - An optional flag indicating if the tokenizer should add a white space before each tokenized input (needed for some Roberta models)
- label_aggregation_function - LabelAggregationOption specifying how predictions for sub-tokens are aggregated
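As an illustration, a configuration similar to the default can be built explicitly. The sketch below is not an authoritative recipe: it assumes the RemoteResource provider and the BERT_NER pretrained resource constants (BertModelResources, BertConfigResources, BertVocabResources) exported by the crate's bert module.

use rust_bert::bert::{BertConfigResources, BertModelResources, BertVocabResources};
use rust_bert::pipelines::common::ModelType;
use rust_bert::pipelines::token_classification::{LabelAggregationOption, TokenClassificationConfig};
use rust_bert::resources::RemoteResource;

fn main() {
    // BERT NER model fine-tuned on CoNLL-2003 (assumed pretrained resource constants).
    let _config = TokenClassificationConfig::new(
        ModelType::Bert,
        RemoteResource::from_pretrained(BertModelResources::BERT_NER),
        RemoteResource::from_pretrained(BertConfigResources::BERT_NER),
        RemoteResource::from_pretrained(BertVocabResources::BERT_NER),
        None,  // merges resource, only needed for BPE-based tokenizers such as Roberta
        false, // lower_case
        None,  // strip_accents
        None,  // add_prefix_space
        LabelAggregationOption::First,
    );
}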
Trait Implementations
impl Default for TokenClassificationConfig

fn default() -> TokenClassificationConfig
Provides a default CoNLL-2003 NER model (English)
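A short end-to-end sketch using this default, assuming the anyhow crate for error handling; the two boolean flags passed to predict are assumed to control sub-token consolidation and the inclusion of special tokens in this version of the crate.

use rust_bert::pipelines::token_classification::{TokenClassificationConfig, TokenClassificationModel};

fn main() -> anyhow::Result<()> {
    // Default configuration: pretrained BERT NER model for CoNLL-2003 (English),
    // placed on CUDA when available.
    let config = TokenClassificationConfig::default();
    let model = TokenClassificationModel::new(config)?;

    let input = ["My name is Amy. I live in Paris.", "Paris is a city in France."];
    // Consolidate sub-tokens according to label_aggregation_function, skip special tokens.
    let output = model.predict(&input, true, false);
    println!("{:?}", output);
    Ok(())
}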