valentinus 0.2.3

Next generation vector database built with LMDB bindings


dependencies

  • bincode/serde - serialize/deserialize
  • lmdb-rs - database bindings
  • ndarray - numpy equivalent
  • ort/onnx - embeddings

getting started

git clone https://github.com/kn0sys/valentinus && cd valentinus

tests

  • Note: all tests currently require the all-Mini-LM-L6-v2_onnx directory.
  • Download model.onnx and tokenizer.json from Hugging Face, or build them yourself. The commands below fetch them together with the other model files:
mkdir all-Mini-LM-L6-v2_onnx
cd all-Mini-LM-L6-v2_onnx && wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/config.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/special_tokens_map.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer_config.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/vocab.txt
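
Because the tests depend on this directory, it can help to confirm the downloads succeeded before running them. The following is a minimal sketch using only the Rust standard library; it is not part of valentinus. The helper name `missing_model_files` is hypothetical, and treating all six downloaded files as required is an assumption (the note above names only model.onnx and tokenizer.json).

```rust
use std::path::{Path, PathBuf};

/// Files fetched by the wget commands above.
/// Assumption: all six are treated as required here.
const REQUIRED_FILES: [&str; 6] = [
    "config.json",
    "model.onnx",
    "special_tokens_map.json",
    "tokenizer_config.json",
    "tokenizer.json",
    "vocab.txt",
];

/// Hypothetical helper: returns the expected files that are
/// not present as regular files inside `model_dir`.
fn missing_model_files(model_dir: &Path) -> Vec<PathBuf> {
    REQUIRED_FILES
        .iter()
        .map(|name| model_dir.join(name))
        .filter(|path| !path.is_file())
        .collect()
}

fn main() {
    // Directory name taken from the mkdir command above,
    // resolved relative to the current working directory.
    let model_dir = Path::new("all-Mini-LM-L6-v2_onnx");
    let missing = missing_model_files(model_dir);
    if missing.is_empty() {
        println!("all model files present in {}", model_dir.display());
    } else {
        for path in &missing {
            eprintln!("missing: {}", path.display());
        }
    }
}
```

Run it from the directory that contains all-Mini-LM-L6-v2_onnx; any file it reports as missing can be fetched again with the matching wget command.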

examples

see examples

reference

inspired by this chromadb python tutorial