

valentinus
next generation vector db built with lmdb bindings
dependencies
- bincode/serde - serialize/deserialize
- lmdb-rs - database bindings
- ndarray - numpy equivalent
- ort/onnx - embeddings
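For orientation, here is a minimal sketch of the kind of comparison a vector db performs on embeddings: ranking stored vectors by cosine similarity to a query. It uses only the Rust standard library, not ndarray or the valentinus API. The helper names `cosine_similarity` and `nearest` are hypothetical, and cosine similarity is assumed here as a common choice of metric rather than taken from this project.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns None if the lengths differ or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Index of the stored vector most similar to `query` (hypothetical helper).
/// Vectors that cannot be compared with the query are skipped.
fn nearest(query: &[f32], stored: &[Vec<f32>]) -> Option<usize> {
    stored
        .iter()
        .enumerate()
        .filter_map(|(i, v)| cosine_similarity(query, v).map(|s| (i, s)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}

fn main() {
    // Toy 3-dimensional "embeddings"; real model output has many more dimensions.
    let stored = vec![
        vec![1.0, 0.0, 0.0],
        vec![0.0, 1.0, 0.0],
        vec![0.7, 0.7, 0.0],
    ];
    let query = [0.9, 0.1, 0.0];
    match nearest(&query, &stored) {
        Some(i) => println!("nearest stored vector: index {i}"),
        None => println!("no comparable vector found"),
    }
}
```

A linear scan like this is only practical for small collections, but it shows the operation that the embedding and storage dependencies exist to support.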
getting started
git clone https://github.com/kn0sys/valentinus && cd valentinus
tests
- Note: all tests currently require the all-Mini-LM-L6-v2_onnx directory
- Get model.onnx and tokenizer.json from Hugging Face, or build them yourself
mkdir all-Mini-LM-L6-v2_onnx
cd all-Mini-LM-L6-v2_onnx && wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/config.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/special_tokens_map.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer_config.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/vocab.txt
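Before running the tests, it can help to confirm that every download succeeded. The following is a small standard-library sketch, not part of valentinus, that reports which of the six files fetched above are missing. The helper name `missing_model_files` is hypothetical, and the check assumes it is run from the repository root, next to the all-Mini-LM-L6-v2_onnx directory.

```rust
use std::path::{Path, PathBuf};

/// The six files fetched by the wget commands above.
const EXPECTED_FILES: [&str; 6] = [
    "config.json",
    "model.onnx",
    "special_tokens_map.json",
    "tokenizer_config.json",
    "tokenizer.json",
    "vocab.txt",
];

/// Returns the expected files that are not present in `dir` (hypothetical helper).
fn missing_model_files(dir: &Path) -> Vec<PathBuf> {
    EXPECTED_FILES
        .iter()
        .map(|name| dir.join(name))
        .filter(|path| !path.is_file())
        .collect()
}

fn main() {
    // Assumes the current working directory is the repository root.
    let dir = Path::new("all-Mini-LM-L6-v2_onnx");
    let missing = missing_model_files(dir);
    if missing.is_empty() {
        println!("all expected model files are present in {}", dir.display());
    } else {
        for path in &missing {
            eprintln!("missing: {}", path.display());
        }
        std::process::exit(1);
    }
}
```

This only checks that the files exist. It does not validate their contents, so a truncated download would still pass.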
examples
see examples
reference
inspired by this chromadb python tutorial