# trustformers
Main integration crate providing high-level APIs, pipelines, and Hugging Face Hub integration for the TrustformeRS ecosystem.
## Current State
This crate is the **primary entry point** for users, offering Hugging Face-compatible APIs for common NLP tasks. It includes pipeline implementations, auto model classes, and integration with the Hugging Face Model Hub.
- **SLoC:** ~59,862
- **Tests:** ~1,740
- **Public API exports (prelude):** 50+
- **Pipeline types:** 23+
- **Stubs remaining:** 11 (minor, in complex pipeline composition code)
## Features
### Pipeline API
Implementations of 23+ pipeline types, including:
- **Text Generation**: Language modeling, text completion, causal LM
- **Text Classification**: Sentiment analysis, topic classification
- **Token Classification**: Named Entity Recognition (NER), POS tagging
- **Question Answering**: Extractive QA from context
- **Fill-Mask**: Masked language modeling
- **Summarization**: Abstractive text summarization
- **Translation**: Language translation (seq2seq)
- **Zero-Shot Classification**: Classification without training examples
- **ConversationalPipeline**: Multi-turn dialogue
- **MultiModal**: Vision-language pipelines
- **DocumentUnderstanding**: Document analysis and extraction
### Pipeline Composition
Advanced pipeline orchestration:
- **ComposedPipeline**: Sequential multi-stage pipelines
- **EnsemblePipeline**: Aggregated predictions from multiple models
- **PipelineChain**: Chained pipeline execution
- **PipelineComposer**: Dynamic pipeline construction
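
To make "aggregated predictions from multiple models" concrete, here is a minimal, std-only sketch of one common aggregation rule: averaging per-label scores across models. The `Prediction` type and `average_ensemble` function are hypothetical names used for illustration; they are not taken from this crate, and `EnsemblePipeline` may aggregate differently.

```rust
use std::collections::BTreeMap;

/// One model's prediction: a label plus a confidence score.
/// Hypothetical type for illustration, not the crate's output type.
#[derive(Debug, Clone, PartialEq)]
pub struct Prediction {
    pub label: String,
    pub score: f32,
}

/// Average per-label scores across all models and return the labels
/// sorted by descending mean score. A label missing from one model's
/// output contributes 0 for that model.
pub fn average_ensemble(model_outputs: &[Vec<Prediction>]) -> Vec<Prediction> {
    if model_outputs.is_empty() {
        return Vec::new();
    }
    // Sum the scores for each label over every model's output.
    let mut sums: BTreeMap<&str, f32> = BTreeMap::new();
    for output in model_outputs {
        for p in output {
            *sums.entry(p.label.as_str()).or_insert(0.0) += p.score;
        }
    }
    // Divide by the number of models to get the mean score per label.
    let n = model_outputs.len() as f32;
    let mut merged: Vec<Prediction> = sums
        .into_iter()
        .map(|(label, sum)| Prediction {
            label: label.to_string(),
            score: sum / n,
        })
        .collect();
    merged.sort_by(|a, b| b.score.total_cmp(&a.score));
    merged
}
```

Averaging is only one option; majority voting or weighted averaging would follow the same shape with a different reduction step.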
All pipelines support:
- Batched inference for efficiency
- Async execution for concurrent requests
- Async streaming for real-time applications
- Device placement (CPU/GPU)
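
As a rough illustration of batched inference, the sketch below splits inputs into fixed-size batches and calls an inference function once per batch. `run_batched` and its `infer` closure are assumptions made for this example (the closure stands in for a model forward pass); the crate's own batching API is not shown here.

```rust
/// Run `infer` over `inputs` in batches of at most `batch_size`,
/// preserving input order. `infer` is a stand-in for a model forward
/// pass and must return exactly one output per input in the batch.
pub fn run_batched<T, R, F>(inputs: &[T], batch_size: usize, mut infer: F) -> Vec<R>
where
    F: FnMut(&[T]) -> Vec<R>,
{
    assert!(batch_size > 0, "batch_size must be at least 1");
    let mut results = Vec::with_capacity(inputs.len());
    // `chunks` yields slices of `batch_size` items; the last may be shorter.
    for batch in inputs.chunks(batch_size) {
        let outputs = infer(batch);
        assert_eq!(outputs.len(), batch.len(), "expected one output per input");
        results.extend(outputs);
    }
    results
}
```

With five inputs and `batch_size = 2`, `infer` is called three times (batches of 2, 2, and 1), which is where the efficiency gain comes from when each call has a fixed overhead.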
### Safety Filtering
- **SafetyFilter** with `ExtendedSafetyConfig` (boxed to prevent stack overflow)
- **EnhancedSafetyFilter** with multi-risk assessment:
  - Toxicity detection
  - Hate speech classification
  - Personal information detection
  - Violence content filtering
  - Adult content filtering
  - Harassment detection
  - Bias assessment
### Auto Classes
Automatic model selection based on task:
- **AutoModel**: Base model auto-selection
- **AutoModelForSequenceClassification**: Text classification models
- **AutoModelForTokenClassification**: Token-level classification
- **AutoModelForQuestionAnswering**: QA models
- **AutoModelForCausalLM**: Text generation models
- **AutoModelForMaskedLM**: Masked language models
- **AutoModelForSeq2SeqLM**: Translation and summarization models
- **AutoTokenizer**: Automatic tokenizer selection
- **AutoConfig**: Configuration auto-detection
### Infrastructure
- **MemoryPool**: Efficient tensor memory management
- **ConfigurationManager**: Centralized configuration handling
- **EnhancedProfiler**: Performance profiling and tracing
- **HubMirror**: Mirror support for model hub access
- **ValidationManager**: Input/output validation
- **BenchmarkSuite**: Built-in benchmarking utilities
### Hugging Face Hub Integration
- **Model downloading** with progress tracking
- **Caching system** for offline use
- **Authentication** for private models
- **Revision/branch** selection
- **Model card** parsing
- **SafeTensors** format support
## Usage Examples
### Pipeline Usage
```rust
use trustformers::pipeline;
// Text classification
let classifier = pipeline("sentiment-analysis")?;
let results = classifier("I love using Rust for ML!")?;
println!("Label: {}, Score: {}", results[0].label, results[0].score);
// Text generation
let generator = pipeline("text-generation")?;
let output = generator("Once upon a time")?;
println!("Generated: {}", output[0].generated_text);
// Question answering
let qa = pipeline("question-answering")?;
let answer = qa(
    "What is Rust?",
    "Rust is a systems programming language focused on safety.",
)?;
println!("Answer: {}", answer.answer);
```
### Auto Classes Usage
```rust
use trustformers::{
    AutoModel, AutoTokenizer,
    AutoModelForSequenceClassification,
};
// Load model and tokenizer automatically
let model_name = "bert-base-uncased";
let tokenizer = AutoTokenizer::from_pretrained(model_name)?;
let model = AutoModelForSequenceClassification::from_pretrained(model_name)?;
// Use for inference
let inputs = tokenizer.encode("Hello, world!", None)?;
let outputs = model.forward(&inputs)?;
```
### Pipeline Composition
```rust
use trustformers::pipelines::PipelineChain;
// Chain pipelines sequentially. `summarization_pipeline` and
// `classification_pipeline` are assumed to have been built beforehand.
let chain = PipelineChain::new()
    .add(summarization_pipeline)
    .add(classification_pipeline)
    .build()?;
let result = chain.run("Very long document text...")?;
```
### Hub Integration
```rust
use trustformers::hub::{Hub, HubConfig};
// Configure hub access
let config = HubConfig {
    token: Some("your_token".to_string()),
    cache_dir: Some("/path/to/cache".to_string()),
    ..Default::default()
};
let hub = Hub::new(config)?;
// Download model with progress
let model_path = hub.download_model(
    "meta-llama/Llama-2-7b-hf",
    Some("main"), // revision
)?;
```
## Architecture
```
trustformers/
├── src/
│   ├── pipelines/          # 23+ pipeline implementations
│   │   ├── text_classification.rs
│   │   ├── text_generation.rs
│   │   ├── token_classification.rs
│   │   ├── conversational.rs
│   │   ├── multimodal.rs
│   │   ├── document_understanding.rs
│   │   ├── composed.rs
│   │   ├── ensemble.rs
│   │   └── ...
│   ├── auto/               # Auto classes
│   │   ├── model.rs
│   │   ├── tokenizer.rs
│   │   └── config.rs
│   ├── hub/                # Hub integration
│   │   ├── download.rs
│   │   ├── cache.rs
│   │   ├── mirror.rs
│   │   └── auth.rs
│   ├── safety/             # Safety filtering
│   │   ├── filter.rs
│   │   └── enhanced.rs
│   ├── generation/         # Generation strategies
│   │   ├── sampling.rs
│   │   ├── beam_search.rs
│   │   └── streaming.rs
│   └── utils/              # Infrastructure utilities
│       ├── memory_pool.rs
│       ├── profiler.rs
│       ├── benchmark.rs
│       └── validation.rs
```
## Pipeline Features
### Advanced Generation
- **Sampling strategies**: Top-k, top-p, temperature
- **Beam search**: With length penalty and early stopping
- **Streaming generation**: Token-by-token async output
- **Constrained generation**: With logit processors
- **Batch generation**: Efficient multi-prompt processing
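
The sampling strategies listed above can be sketched in a few lines of std-only Rust. This is an illustrative version of temperature scaling followed by top-k and top-p (nucleus) filtering, not the code in `generation/sampling.rs`; the function names are hypothetical. Drawing a token from the filtered distribution would additionally need a random number generator, which is omitted to keep the example dependency-free.

```rust
/// Softmax over `logits` after dividing by `temperature`.
/// Lower temperatures sharpen the distribution; higher ones flatten it.
pub fn softmax_with_temperature(logits: &[f32], temperature: f32) -> Vec<f32> {
    assert!(temperature > 0.0, "temperature must be positive");
    // Subtract the maximum logit for numerical stability.
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits
        .iter()
        .map(|&l| ((l - max) / temperature).exp())
        .collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Keep the `top_k` most probable tokens, then the smallest prefix of
/// those whose renormalized cumulative probability reaches `top_p`.
/// Returns `(token_id, probability)` pairs, renormalized to sum to 1
/// and sorted by descending probability.
pub fn filter_top_k_top_p(probs: &[f32], top_k: usize, top_p: f32) -> Vec<(usize, f32)> {
    let mut ranked: Vec<(usize, f32)> = probs.iter().copied().enumerate().collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    // Top-k: always keep at least one candidate.
    ranked.truncate(top_k.max(1));
    let top_k_mass: f32 = ranked.iter().map(|&(_, p)| p).sum();
    // Top-p: accumulate renormalized probability until the threshold is met.
    let mut cumulative = 0.0_f32;
    let mut keep = 0;
    for &(_, p) in &ranked {
        cumulative += p / top_k_mass;
        keep += 1;
        if cumulative >= top_p {
            break;
        }
    }
    ranked.truncate(keep);
    let kept_mass: f32 = ranked.iter().map(|&(_, p)| p).sum();
    ranked.into_iter().map(|(id, p)| (id, p / kept_mass)).collect()
}
```

Applying top-k before top-p mirrors the order many generation libraries use, but the order is a design choice and may differ in this crate.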
### Pipeline Options
```rust
use trustformers::{pipeline_with_config, PipelineConfig};
let config = PipelineConfig {
    device: "cuda:0".to_string(),
    batch_size: 32,
    max_length: 512,
    num_threads: 4,
    ..Default::default()
};
let generator = pipeline_with_config("text-generation", config)?;
```
## Performance
### Benchmarks
| Pipeline | Model | Batch Size | Throughput |
|----------|-------|------------|------------|
| Text Classification | BERT-base | 32 | 850 samples/s |
| Text Generation | GPT-2 | 1 | 45 tokens/s |
| Question Answering | BERT-base | 16 | 320 QA pairs/s |
| Token Classification | BERT-base | 32 | 750 samples/s |
*Benchmarks on NVIDIA RTX 4090*
### Optimization Features
- **Dynamic batching**: Automatic batch optimization
- **MemoryPool**: Efficient tensor allocation and reuse
- **Lazy loading**: On-demand weight loading
- **Memory mapping**: Efficient large model loading
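
To show the idea behind pooled allocation and reuse, here is a minimal sketch of a buffer pool that recycles `Vec<f32>` buffers by length. `BufferPool` and its methods are hypothetical and deliberately simple; the crate's `MemoryPool` is not shown here and likely differs in API and strategy.

```rust
use std::collections::HashMap;

/// Minimal buffer pool keyed by buffer length. Released buffers are
/// kept and handed out again instead of allocating fresh memory.
/// Hypothetical type for illustration only.
#[derive(Default)]
pub struct BufferPool {
    free: HashMap<usize, Vec<Vec<f32>>>,
    allocations: usize,
}

impl BufferPool {
    pub fn new() -> Self {
        Self::default()
    }

    /// Return a zeroed buffer of `len` elements, reusing a previously
    /// released buffer of the same length when one is available.
    pub fn acquire(&mut self, len: usize) -> Vec<f32> {
        let reused = self.free.get_mut(&len).and_then(|bucket| bucket.pop());
        match reused {
            Some(mut buf) => {
                // Clear stale contents before handing the buffer out again.
                buf.fill(0.0);
                buf
            }
            None => {
                self.allocations += 1;
                vec![0.0; len]
            }
        }
    }

    /// Hand a buffer back to the pool for later reuse.
    pub fn release(&mut self, buf: Vec<f32>) {
        self.free.entry(buf.len()).or_default().push(buf);
    }

    /// Number of fresh allocations made so far (reuses are not counted).
    pub fn allocations(&self) -> usize {
        self.allocations
    }
}
```

Acquiring, releasing, and re-acquiring a buffer of the same length results in a single allocation, which is the saving a pool provides when tensors of recurring shapes are created on every inference call.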
## Supported Models
The library supports all models implemented in `trustformers-models`:
- BERT, RoBERTa, ALBERT, DistilBERT
- GPT-2, GPT-Neo, GPT-J
- T5 (encoder-decoder)
- LLaMA, Mistral, Gemma, Qwen
- CLIP (multimodal)
- And more...
## Testing
- ~1,740 tests covering pipeline correctness and edge cases
- Auto class functionality tests
- Hub integration tests
- Generation strategy tests
- Safety filter tests
- Performance benchmarks via BenchmarkSuite
## Known Limitations (Alpha)
- 11 stub implementations remain in complex pipeline composition code
- Some pipelines require specific model types
- Hub downloads require an internet connection
- Large models require significant disk space
## License
Apache-2.0