pub struct QuestionAnsweringConfig {
pub model_resource: Box<dyn ResourceProvider + Send>,
pub config_resource: Box<dyn ResourceProvider + Send>,
pub vocab_resource: Box<dyn ResourceProvider + Send>,
pub merges_resource: Option<Box<dyn ResourceProvider + Send>>,
pub device: Device,
pub model_type: ModelType,
pub lower_case: bool,
pub strip_accents: Option<bool>,
pub add_prefix_space: Option<bool>,
pub max_seq_length: usize,
pub doc_stride: usize,
pub max_query_length: usize,
pub max_answer_length: usize,
}
Configuration for question answering
Contains information regarding the model to load and device to place the model on.
Fields

model_resource: Box<dyn ResourceProvider + Send>
Model weights resource (default: pretrained DistilBERT model on SQuAD)

config_resource: Box<dyn ResourceProvider + Send>
Config resource (default: pretrained DistilBERT model on SQuAD)

vocab_resource: Box<dyn ResourceProvider + Send>
Vocab resource (default: pretrained DistilBERT model on SQuAD)

merges_resource: Option<Box<dyn ResourceProvider + Send>>
Merges resource (default: None)

device: Device
Device to place the model on (default: CUDA/GPU when available)

model_type: ModelType
Model type

lower_case: bool
Flag indicating if the model expects a lower casing of the input

strip_accents: Option<bool>
Flag indicating if the tokenizer should strip accents (normalization). Only used for BERT / ALBERT models

add_prefix_space: Option<bool>
Flag indicating if the tokenizer should add a white space before each tokenized input (needed for some Roberta models)

max_seq_length: usize
Maximum sequence length for the combined query and context

doc_stride: usize
Stride to apply if the context needs to be broken down due to a large length. Represents the number of overlapping tokens between sliding windows.

max_query_length: usize
Maximum length for the query

max_answer_length: usize
Maximum length for the answer
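Given the defaults listed above, the quickest way to obtain a configuration is through `Default` and then override individual fields. The sketch below assumes rust_bert's `pipelines::question_answering` module and the `tch::Device` type; it is illustrative and not verified here.

```rust
use rust_bert::pipelines::question_answering::QuestionAnsweringConfig;
use tch::Device;

fn main() {
    // Default points at the pretrained DistilBERT SQuAD resources
    // described in the field documentation above.
    let mut config = QuestionAnsweringConfig::default();
    config.device = Device::Cpu;   // override the CUDA-when-available default
    config.max_answer_length = 30; // allow longer extracted answer spans
}
```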
Implementations
impl QuestionAnsweringConfig
pub fn new<RM, RC, RV>(
    model_type: ModelType,
    model_resource: RM,
    config_resource: RC,
    vocab_resource: RV,
    merges_resource: Option<RV>,
    lower_case: bool,
    strip_accents: impl Into<Option<bool>>,
    add_prefix_space: impl Into<Option<bool>>,
) -> QuestionAnsweringConfig
where
    RM: ResourceProvider + Send + 'static,
    RC: ResourceProvider + Send + 'static,
    RV: ResourceProvider + Send + 'static,
Instantiate a new question answering configuration of the supplied type.
Arguments
- model_type - ModelType indicating the model type to load (must match the actual data to be loaded!)
- model_resource - The ResourceProvider pointing to the model to load (e.g. model.ot)
- config_resource - The ResourceProvider pointing to the model configuration to load (e.g. config.json)
- vocab_resource - The ResourceProvider pointing to the tokenizer's vocabulary to load (e.g. vocab.txt/vocab.json)
- merges_resource - An optional ResourceProvider pointing to the tokenizer's merges file to load (e.g. merges.txt), needed only for Roberta.
- lower_case - A bool indicating whether the tokenizer should lower case all input (in case of a lower-cased model)
- strip_accents - Optional flag indicating whether the tokenizer should strip accents (normalization). Only used for BERT / ALBERT models
- add_prefix_space - Optional flag indicating whether the tokenizer should add a white space before each tokenized input (needed for some Roberta models)
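A construction sketch using this method, assuming rust_bert's bundled DistilBERT SQuAD resource constants and RemoteResource::from_pretrained (names mirror the rust-bert documentation and are not verified here):

```rust
use rust_bert::distilbert::{
    DistilBertConfigResources, DistilBertModelResources, DistilBertVocabResources,
};
use rust_bert::pipelines::common::ModelType;
use rust_bert::pipelines::question_answering::QuestionAnsweringConfig;
use rust_bert::resources::RemoteResource;

fn main() {
    let config = QuestionAnsweringConfig::new(
        ModelType::DistilBert,
        RemoteResource::from_pretrained(DistilBertModelResources::DISTIL_BERT_SQUAD),
        RemoteResource::from_pretrained(DistilBertConfigResources::DISTIL_BERT_SQUAD),
        RemoteResource::from_pretrained(DistilBertVocabResources::DISTIL_BERT_SQUAD),
        None::<RemoteResource>, // merges_resource: only needed for Roberta
        false,                  // lower_case
        None,                   // strip_accents
        None,                   // add_prefix_space
    );
}
```

The sequence-length parameters are not exposed here; use custom_new below when the defaults (384/128/64/15) need changing.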
pub fn custom_new<RM, RC, RV>(
    model_type: ModelType,
    model_resource: RM,
    config_resource: RC,
    vocab_resource: RV,
    merges_resource: Option<RV>,
    lower_case: bool,
    strip_accents: impl Into<Option<bool>>,
    add_prefix_space: impl Into<Option<bool>>,
    max_seq_length: impl Into<Option<usize>>,
    doc_stride: impl Into<Option<usize>>,
    max_query_length: impl Into<Option<usize>>,
    max_answer_length: impl Into<Option<usize>>,
) -> QuestionAnsweringConfig
where
    RM: ResourceProvider + Send + 'static,
    RC: ResourceProvider + Send + 'static,
    RV: ResourceProvider + Send + 'static,
Instantiate a new question answering configuration of the supplied type.
Arguments
- model_type - ModelType indicating the model type to load (must match the actual data to be loaded!)
- model_resource - The ResourceProvider pointing to the model to load (e.g. model.ot)
- config_resource - The ResourceProvider pointing to the model configuration to load (e.g. config.json)
- vocab_resource - The ResourceProvider pointing to the tokenizer's vocabulary to load (e.g. vocab.txt/vocab.json)
- merges_resource - An optional ResourceProvider pointing to the tokenizer's merges file to load (e.g. merges.txt), needed only for Roberta.
- lower_case - A bool indicating whether the tokenizer should lower case all input (in case of a lower-cased model)
- strip_accents - Optional flag indicating whether the tokenizer should strip accents (normalization). Only used for BERT / ALBERT models
- add_prefix_space - Optional flag indicating whether the tokenizer should add a white space before each tokenized input (needed for some Roberta models)
- max_seq_length - Optional maximum sequence token length to limit the memory footprint. If the context is too long, it will be processed with sliding windows. Defaults to 384.
- doc_stride - Optional stride to apply if a sliding window is required to process the input context. Represents the number of overlapping tokens between sliding windows. This should be lower than max_seq_length minus max_query_length (otherwise there is a risk of the sliding window not progressing). Defaults to 128.
- max_query_length - Optional maximum question token length. Defaults to 64.
- max_answer_length - Optional maximum token length for the extracted answer. Defaults to 15.
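The doc_stride constraint above can be made concrete with a small self-contained sketch. The helper below is hypothetical (not part of the rust-bert API): each window holds up to max_seq_length - max_query_length context tokens, consecutive windows overlap by doc_stride tokens, and the assertion shows why the stride must stay below that capacity.

```rust
/// Hypothetical helper illustrating how max_seq_length, max_query_length
/// and doc_stride interact when a long context is split into windows.
/// Returns (start, end) token offsets of each sliding window.
fn sliding_windows(
    context_len: usize,
    max_seq_length: usize,
    max_query_length: usize,
    doc_stride: usize,
) -> Vec<(usize, usize)> {
    // Context capacity of one window, query tokens excluded.
    let capacity = max_seq_length - max_query_length;
    // If doc_stride >= capacity, the start index would never advance.
    assert!(
        doc_stride < capacity,
        "doc_stride must be below max_seq_length - max_query_length"
    );
    let step = capacity - doc_stride;
    let mut windows = Vec::new();
    let mut start = 0;
    loop {
        let end = usize::min(start + capacity, context_len);
        windows.push((start, end));
        if end == context_len {
            break;
        }
        start += step;
    }
    windows
}

fn main() {
    // With the documented defaults (384 / 64 / 128), a 500-token context
    // yields two windows of up to 320 context tokens, overlapping by 128.
    let windows = sliding_windows(500, 384, 64, 128);
    println!("{:?}", windows); // [(0, 320), (192, 500)]
}
```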