CLI Commands - Ollama-style interface
Modules
- bench: Bench command - Throughput and latency benchmarking
- embed: Embed command - Generate embeddings using BERT models
- list: List command - Show downloaded models
- pull: Pull command - Download a model from HuggingFace Hub
- run: Run command - Interactive chat with a model (ollama-style)
- serve: Serve command - Start the HTTP inference server
- stop: Stop command - Stop the running server
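
The module list above suggests a one-subcommand-per-module layout. As a rough illustration only, the sketch below shows how such subcommand names could be mapped to an enum and dispatched using nothing but the standard library. The `Command` enum and its `parse` and `summary` methods are hypothetical names invented for this example and are not taken from the crate; the actual crate may well use an argument-parsing library instead.

```rust
// Hypothetical sketch: `Command`, `parse`, and `summary` are illustrative
// names, not the crate's real API.

/// One variant per CLI module in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Command {
    Bench,
    Embed,
    List,
    Pull,
    Run,
    Serve,
    Stop,
}

impl Command {
    /// Map a subcommand name to a variant; unknown names yield `None`.
    fn parse(name: &str) -> Option<Command> {
        match name {
            "bench" => Some(Command::Bench),
            "embed" => Some(Command::Embed),
            "list" => Some(Command::List),
            "pull" => Some(Command::Pull),
            "run" => Some(Command::Run),
            "serve" => Some(Command::Serve),
            "stop" => Some(Command::Stop),
            _ => None,
        }
    }

    /// One-line summary, copied from the module descriptions.
    fn summary(self) -> &'static str {
        match self {
            Command::Bench => "Throughput and latency benchmarking",
            Command::Embed => "Generate embeddings using BERT models",
            Command::List => "Show downloaded models",
            Command::Pull => "Download a model from HuggingFace Hub",
            Command::Run => "Interactive chat with a model (ollama-style)",
            Command::Serve => "Start the HTTP inference server",
            Command::Stop => "Stop the running server",
        }
    }
}

fn main() {
    // Read the first CLI argument; fall back to "list" when none is given.
    let name = std::env::args().nth(1).unwrap_or_else(|| "list".to_string());
    match Command::parse(&name) {
        Some(cmd) => println!("{}: {}", name, cmd.summary()),
        None => eprintln!("unknown command: {}", name),
    }
}
```

Using an enum here means the compiler checks that every subcommand is handled in each `match`, so adding a new module without updating the dispatch is a compile error rather than a silent gap.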