minillm 0.1.0


MiniLLM

A mini inference engine for running language models.

🚧 Work in Progress

This project is currently under active development. Features may change as the project evolves.

Description

MiniLLM is a lightweight inference engine for language models. The goal is to provide a simple, fast, and memory-efficient solution for LLM inference.

Features (Planned)

  • Model loading and inference
  • Support for multiple model formats
  • Memory-efficient execution
  • CPU and GPU acceleration
  • Simple API for integration

Installation

This crate is not yet published to crates.io. Once available, you can install it with:

cargo add minillm
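
Equivalently, once the crate is published, the dependency can be declared by hand in Cargo.toml (the version shown assumes the 0.1.x release line):

[dependencies]
minillm = "0.1"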

Usage

Documentation and usage examples will be provided as the project develops.

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

BM Monjur Morshed