# Tllama

**Lightweight Local LLM Inference Engine**
Tllama is an open-source LLM inference engine written in Rust and designed for efficient local execution. It provides a command-line interface and an OpenAI-compatible API for seamless model interaction.
## Key Features
- Smart model detection
- Full OpenAI API compatibility
- Blazing-fast startup (<0.5s)
- Ultra-compact binary (<20MB)
## Installation
### Script install
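The install command would typically pipe a script to the shell. The URL below is a placeholder, not the project's documented script location:

```shell
# Hypothetical install-script URL -- check the repository for the real one.
curl -fsSL https://example.com/tllama/install.sh | sh
```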
### Cargo install
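Assuming the crate is published on crates.io under the project name (not confirmed by this README), installation via Cargo would be:

```shell
# Assumes the crate is published as "tllama" on crates.io.
cargo install tllama
```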
### Pre-built binaries

Download from the Releases page.
## Usage Guide
### Discover models
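A sketch of what listing locally detected models might look like; the subcommand name is an assumption, since this README does not document it:

```shell
# Hypothetical subcommand -- lists models tllama has detected locally.
tllama list
```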
### Text generation
Example:
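A one-shot generation call might look like the following; both the `run` subcommand and the model name are placeholders, not documented here:

```shell
# Hypothetical invocation: one-shot generation with a locally available model.
tllama run llama3.2 "Explain borrowing in Rust in one sentence."
```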
### Interactive chat
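An interactive session would presumably be started with a dedicated subcommand; `chat` and the model name are assumptions for illustration:

```shell
# Hypothetical: start an interactive chat session with a model.
tllama chat llama3.2
```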
### Start API server
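Serving the OpenAI-compatible API might look like this; the `serve` subcommand, flag, and port are placeholders:

```shell
# Hypothetical: serve the OpenAI-compatible API on a local port.
tllama serve --port 8000
```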
### Chat API example
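Since the server exposes an OpenAI-compatible API, a client can talk to it with a standard chat-completions request. This minimal Python sketch uses only the standard library; the base URL, port, and model name are assumptions, not values documented by this README:

```python
import json
from urllib import request

# OpenAI-compatible chat-completions payload; the model name is a placeholder.
payload = {
    "model": "llama3.2",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}

def chat(base_url: str = "http://localhost:8000/v1") -> dict:
    """POST the payload to a local tllama server and return the parsed JSON reply.

    The base URL assumes a server started locally; adjust host/port as needed.
    """
    req = request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Because the API follows the OpenAI schema, the official OpenAI SDKs can also be pointed at the local server by overriding their base URL.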
## Development Roadmap
- Core CLI implementation
- GGUF quantized model support
- Model auto-download & caching
- Web UI integration
- Comprehensive test suite
## Contributing
PRs welcome! See CONTRIBUTING.md for guidelines.
## License
MIT License
## Design Philosophy
- **Terminal-first:** optimized for CLI workflows, with startup roughly 10x faster than Ollama
- **Minimal footprint:** a single binary under 20MB, with zero external dependencies
- **Seamless integration:** compatible with OpenAI SDKs and LangChain
## Contact
- GitHub: moyanj/tllama
- Issues: Report bugs
- Feature requests: Open discussion issue
Star us on GitHub to show your support!