trustformers
Version: 0.1.1 | Status: Alpha | Updated: 2026-04-25
Main integration crate providing high-level APIs, pipelines, and Hugging Face Hub integration for the TrustformeRS ecosystem.
Current State
This crate serves as the primary entry point for users, offering HuggingFace-compatible APIs for common NLP tasks. It includes comprehensive pipeline implementations, auto model classes, and seamless integration with the Hugging Face Model Hub.
- SLoC: ~59,862
- Tests: ~1,740
- Public API exports (prelude): 50+
- Pipeline types: 23+
- Stubs remaining: 11 (minor, in complex pipeline composition code)
Features
Pipeline API
Complete implementations of 23+ NLP pipeline types:
- Text Generation: Language modeling, text completion, causal LM
- Text Classification: Sentiment analysis, topic classification
- Token Classification: Named Entity Recognition (NER), POS tagging
- Question Answering: Extractive QA from context
- Fill-Mask: Masked language modeling
- Summarization: Abstractive text summarization
- Translation: Language translation (seq2seq)
- Zero-Shot Classification: Classification without training examples
- ConversationalPipeline: Multi-turn dialogue
- MultiModal: Vision-language pipelines
- DocumentUnderstanding: Document analysis and extraction
Pipeline Composition
Advanced pipeline orchestration:
- ComposedPipeline: Sequential multi-stage pipelines
- EnsemblePipeline: Aggregated predictions from multiple models
- PipelineChain: Chained pipeline execution
- PipelineComposer: Dynamic pipeline construction
All pipelines support:
- Batched inference for efficiency
- Async execution for concurrent requests
- Async streaming for real-time applications
- Device placement (CPU/GPU)
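As a rough sketch of how batching and async streaming might be used together (the `run_batch` and `stream` method names, and the task/model identifiers, are assumptions rather than confirmed API):

```rust
use futures::StreamExt;
use trustformers::pipeline;

// Hypothetical sketch: batched inference plus async token streaming.
async fn demo() -> anyhow::Result<()> {
    // Batched inference: submit several inputs in one call.
    let classifier = pipeline("text-classification", "distilbert-base-uncased")?;
    let labels = classifier.run_batch(&["great crate!", "this is broken", "works as expected"])?;
    println!("{:?}", labels);

    // Async streaming: consume generated tokens as they are produced.
    let generator = pipeline("text-generation", "gpt2")?;
    let mut stream = generator.stream("Rust is")?;
    while let Some(token) = stream.next().await {
        print!("{}", token?);
    }
    Ok(())
}
```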
Safety Filtering
- SafetyFilter with ExtendedSafetyConfig (boxed to prevent stack overflow)
- EnhancedSafetyFilter with multi-risk assessment:
- Toxicity detection
- Hate speech classification
- Personal information detection
- Violence content filtering
- Adult content filtering
- Harassment detection
- Bias assessment
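A minimal sketch of how the multi-risk assessment could be wired up; the config fields and the `new`/`assess`/`flagged` methods are assumptions based on the types listed above, not confirmed API:

```rust
use trustformers::safety::{EnhancedSafetyFilter, ExtendedSafetyConfig};

// Hypothetical sketch: enable a subset of risk categories and screen a string.
// The config is boxed, matching the note above about avoiding stack overflow.
let config = Box::new(ExtendedSafetyConfig {
    toxicity: true,
    hate_speech: true,
    personal_information: true,
    ..Default::default()
});
let filter = EnhancedSafetyFilter::new(config)?;
let report = filter.assess("candidate model output to screen")?;
if report.flagged() {
    println!("blocked: {:?}", report);
}
```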
Auto Classes
Automatic model selection based on task:
- AutoModel: Base model auto-selection
- AutoModelForSequenceClassification: Text classification models
- AutoModelForTokenClassification: Token-level classification
- AutoModelForQuestionAnswering: QA models
- AutoModelForCausalLM: Text generation models
- AutoModelForMaskedLM: Masked language models
- AutoModelForSeq2SeqLM: Translation and summarization models
- AutoTokenizer: Automatic tokenizer selection
- AutoConfig: Configuration auto-detection
Infrastructure
- MemoryPool: Efficient tensor memory management
- ConfigurationManager: Centralized configuration handling
- EnhancedProfiler: Performance profiling and tracing
- HubMirror: Mirror support for model hub access
- ValidationManager: Input/output validation
- BenchmarkSuite: Built-in benchmarking utilities
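A heavily hedged sketch of how two of these utilities might compose; the module path, constructors, and method names below are assumptions drawn from the type names above:

```rust
use trustformers::utils::{EnhancedProfiler, MemoryPool};

// Hypothetical sketch: profile a forward pass while reusing pooled tensor memory.
let pool = MemoryPool::new(512 * 1024 * 1024)?; // 512 MiB budget (illustrative)
let profiler = EnhancedProfiler::new();

let _span = profiler.span("forward_pass");      // timing scope recorded by the profiler
let buffer = pool.allocate(4 * 1024 * 1024)?;   // reuse pooled memory instead of a fresh allocation
// ... run inference using `buffer` ...
drop(buffer);                                    // allocation returns to the pool for reuse
```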
Hugging Face Hub Integration
- Model downloading with progress tracking
- Caching system for offline use
- Authentication for private models
- Revision/branch selection
- Model card parsing
- SafeTensors format support
Usage Examples
Pipeline Usage
```rust
use trustformers::pipeline;

// Task names, model identifiers, and the `run` method shown here are illustrative.
// Text classification
let classifier = pipeline("text-classification", "distilbert-base-uncased-finetuned-sst-2-english")?;
let results = classifier.run("I love this library!")?;
println!("{:?}", results);

// Text generation
let generator = pipeline("text-generation", "gpt2")?;
let output = generator.run("Once upon a time")?;
println!("{}", output);

// Question answering
let qa = pipeline("question-answering", "bert-base-uncased")?;
let answer = qa.run(("What is Rust?", "Rust is a systems programming language."))?;
println!("{:?}", answer);
```
Auto Classes Usage
```rust
use trustformers::{AutoModel, AutoTokenizer};

// Load model and tokenizer automatically
let model_name = "bert-base-uncased";
let tokenizer = AutoTokenizer::from_pretrained(model_name)?;
let model = AutoModel::from_pretrained(model_name)?;

// Use for inference (exact encode/forward signatures are illustrative)
let inputs = tokenizer.encode("Hello, world!")?;
let outputs = model.forward(&inputs)?;
```
Pipeline Composition
```rust
use trustformers::{pipeline, PipelineChain};

// Chain pipelines sequentially (the stage construction shown is illustrative)
let chain = PipelineChain::new()
    .add(pipeline("translation", "t5-base")?)
    .add(pipeline("summarization", "t5-base")?)
    .build()?;
let result = chain.run("Long source-language document to translate and then summarize...")?;
```
Hub Integration
```rust
use trustformers::hub::{Hub, HubConfig};

// Configure hub access (the `Hub` type and config fields are illustrative)
let config = HubConfig { token: std::env::var("HF_TOKEN").ok(), ..Default::default() };
let hub = Hub::new(config)?;

// Download model with progress tracking
let model_path = hub.download_model("bert-base-uncased")?;
```
Architecture
```
trustformers/
├── src/
│   ├── pipelines/              # 23+ pipeline implementations
│   │   ├── text_classification.rs
│   │   ├── text_generation.rs
│   │   ├── token_classification.rs
│   │   ├── conversational.rs
│   │   ├── multimodal.rs
│   │   ├── document_understanding.rs
│   │   ├── composed.rs
│   │   ├── ensemble.rs
│   │   └── ...
│   ├── auto/                   # Auto classes
│   │   ├── model.rs
│   │   ├── tokenizer.rs
│   │   └── config.rs
│   ├── hub/                    # Hub integration
│   │   ├── download.rs
│   │   ├── cache.rs
│   │   ├── mirror.rs
│   │   └── auth.rs
│   ├── safety/                 # Safety filtering
│   │   ├── filter.rs
│   │   └── enhanced.rs
│   ├── generation/             # Generation strategies
│   │   ├── sampling.rs
│   │   ├── beam_search.rs
│   │   └── streaming.rs
│   └── utils/                  # Infrastructure utilities
│       ├── memory_pool.rs
│       ├── profiler.rs
│       ├── benchmark.rs
│       └── validation.rs
```
Pipeline Features
Advanced Generation
- Sampling strategies: Top-k, top-p, temperature
- Beam search: With length penalty and early stopping
- Streaming generation: Token-by-token async output
- Constrained generation: With logit processors
- Batch generation: Efficient multi-prompt processing
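A sketch of how these strategies might be selected through a single generation config; the `GenerationConfig` fields and the `generate` call are assumptions based on the strategy list above:

```rust
use trustformers::generation::GenerationConfig;
use trustformers::pipeline;

// Hypothetical sketch: sampling parameters, beam search, and length control in one config.
let gen_config = GenerationConfig {
    max_new_tokens: 128,
    temperature: 0.7,
    top_k: Some(50),
    top_p: Some(0.95),
    num_beams: 1,            // > 1 switches from sampling to beam search
    length_penalty: 1.0,
    early_stopping: false,
    ..Default::default()
};
let generator = pipeline("text-generation", "gpt2")?;
let output = generator.generate("The quick brown fox", &gen_config)?;
```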
Pipeline Options
```rust
use trustformers::{pipeline_with_config, PipelineConfig};

// Field names are illustrative; see PipelineConfig for the full option set.
let config = PipelineConfig { batch_size: 32, max_length: 512, ..Default::default() };
let pipeline = pipeline_with_config("text-classification", "bert-base-uncased", config)?;
```
Performance
Benchmarks
| Pipeline | Model | Batch Size | Throughput |
|---|---|---|---|
| Text Classification | BERT-base | 32 | 850 samples/s |
| Text Generation | GPT-2 | 1 | 45 tokens/s |
| Question Answering | BERT-base | 16 | 320 QA pairs/s |
| Token Classification | BERT-base | 32 | 750 samples/s |
Benchmarks measured on an NVIDIA RTX 4090.
Optimization Features
- Dynamic batching: Automatic batch optimization
- MemoryPool: Efficient tensor allocation and reuse
- Lazy loading: On-demand weight loading
- Memory mapping: Efficient large model loading
Supported Models
The library supports all models implemented in trustformers-models:
- BERT, RoBERTa, ALBERT, DistilBERT
- GPT-2, GPT-Neo, GPT-J
- T5 (encoder-decoder)
- LLaMA, Mistral, Gemma, Qwen
- CLIP (multimodal)
- And more...
Testing
- ~1,740 tests covering pipeline correctness and edge cases
- Auto class functionality tests
- Hub integration tests
- Generation strategy tests
- Safety filter tests
- Performance benchmarks via BenchmarkSuite
Known Limitations (Alpha)
- 11 stub implementations remain in complex pipeline composition code
- Some pipelines require specific model types
- Hub download requires an internet connection
- Large models require significant disk space
License
Apache-2.0