# ragit 0.3.0 release

2025-02-24

## Dependencies

- bytes: new
- json: removed
- rand: 0.8.5 -> 0.9.0

## `rag clone`, `rag push`

Finally, ragit supports the `push` operation. It is still a minimum viable product, but it works!

In 0.2.1, `clone` naively fetched every chunk and image with a separate HTTP request, which was very inefficient. Now it uses archive files, which are significantly more efficient.

## `rag archive-create`, `rag archive-extract`

0.3.0 supports creating and extracting archive files. You can create an archive of your knowledge-base with `rag archive-create` and unpack one with `rag archive-extract`, which makes sharing knowledge-bases easier. The clone and push operations now use archive files as well.

## `rag add`

Fixed some quirks. Previously, you could add files in `.ragit/` and `.git/`; that is no longer possible.

## `rag build`

Now `rag build` uses multiple processes to call LLMs and build chunks. This makes LLM calls and incremental ii-build much more efficient.

## `rag retrieve-chunks`

Added a new command that only retrieves chunks and does not ask a question. It is similar to `rag tfidf`, with a few differences:

|                    | `rag tfidf`                                   | `rag retrieve-chunks` |
|--------------------|-----------------------------------------------|-----------------------|
| Default input      | Keywords (a query if you pass `--query`)      | Query                 |
| Extracts keywords  | No                                            | Yes                   |
| Reranks            | No                                            | Yes                   |

Also, when ragit implements vector searches (someday), `rag retrieve-chunks` may use vector searches, but `rag tfidf` will still use tfidf.

## Chat models

Previously, all the models were hard-coded, and there was no way to add or remove your own models. Now, ragit uses `models.json` to manage models. If you're using an OpenAI-compatible, Anthropic, or Cohere model, you can easily add it to `models.json`.

CLI commands for adding/removing models are coming soon!

## CLI

Finally, ragit supports short flags and the `--` flag. Instead of `rag rm --recursive`, you can use `rag rm -r`. In ragit 0.2.1, there was no way to add a file whose name starts with `-`. Now you can use `rag add -- --file-name` to do so.
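To make the `--` convention concrete, here is a minimal sketch of how a parser can treat it. This is not ragit's actual argument parser: `ParsedArgs`, `parse_args`, and the short-flag table are hypothetical names used only for illustration, and the sketch simplifies by treating any unrecognized argument as positional.

```rust
/// Hypothetical result type: flags and positional arguments, kept separate.
#[derive(Debug, PartialEq)]
struct ParsedArgs {
    flags: Vec<String>,
    positional: Vec<String>,
}

/// Splits raw arguments into flags and positional arguments.
/// `short_flags` maps a short flag to its long form, e.g. ("-r", "--recursive").
/// This is an illustrative sketch, not ragit's implementation.
fn parse_args(raw: &[&str], short_flags: &[(&str, &str)]) -> ParsedArgs {
    let mut flags = Vec::new();
    let mut positional = Vec::new();
    let mut flags_ended = false;

    for &arg in raw {
        if flags_ended {
            // Everything after `--` is positional, even if it starts with `-`.
            positional.push(arg.to_string());
        } else if arg == "--" {
            flags_ended = true;
        } else if arg.starts_with("--") {
            flags.push(arg.to_string());
        } else if let Some(&(_, long)) = short_flags.iter().find(|(short, _)| *short == arg) {
            // Normalize a known short flag to its long form.
            flags.push(long.to_string());
        } else {
            // Simplification: anything else is treated as a positional argument.
            positional.push(arg.to_string());
        }
    }

    ParsedArgs { flags, positional }
}
```

With this sketch, `parse_args(&["-r", "docs"], &[("-r", "--recursive")])` yields the flag `--recursive` and the positional argument `docs`, while `parse_args(&["--", "--file-name"], ...)` yields no flags and the positional argument `--file-name`.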

## Safer file operations

0.3.0 implements `WriteMode::Atomic`, which tries its best to write a file atomically. It first writes to a temporary file and then renames that file to the target path. On most file systems, a rename operation is atomic.

This significantly reduces the error rate of `tests/many_chunks.py` and `tests/many_jobs.py`.
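The write-then-rename pattern described above can be sketched with the standard library alone. This is not the code behind `WriteMode::Atomic`: the function name `write_atomic` and the temporary-file naming scheme are assumptions made for illustration.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `bytes` to `path` by writing a sibling temporary file first and
/// then renaming it over the destination. Illustrative sketch only.
fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Assumed naming scheme: "<file name>.tmp-<pid>" in the same directory,
    // so the temporary file and the target share a file system.
    let mut tmp_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(format!(".tmp-{}", std::process::id()));
    let tmp_path = path.with_file_name(tmp_name);

    let result = (|| -> io::Result<()> {
        let mut file = File::create(&tmp_path)?;
        file.write_all(bytes)?;
        // Flush the contents to disk before the rename makes them visible.
        file.sync_all()?;
        // Readers see either the old file or the complete new one.
        fs::rename(&tmp_path, path)
    })();

    if result.is_err() {
        // Best-effort cleanup; the original target is left untouched.
        let _ = fs::remove_file(&tmp_path);
    }
    result
}
```

Keeping the temporary file in the target's directory matters because a rename across file systems is generally not atomic and may fail outright.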

## Ignore

Fixed a few bugs in the ignore-parser.

## Ragit-server

Implemented 5 new endpoints:

- GET `/{user-name}/{repo-name}/archive-list`
- GET `/{user-name}/{repo-name}/archive/{archive-key}`
- POST `/{user-name}/{repo-name}/begin-push`
- POST `/{user-name}/{repo-name}/archive`
- POST `/{user-name}/{repo-name}/finalize-push`

These are for the new push/clone operations.
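As a rough illustration of how a client might address these routes, the sketch below builds the method and path for each endpoint. The `Endpoint` enum and its variant names are hypothetical and do not come from ragit or ragit-server; the sketch also assumes that user, repository, and archive-key values need no URL escaping.

```rust
/// Hypothetical enum naming the five endpoints listed above.
/// The variant names are illustrative, not ragit-server's own types.
#[derive(Debug, Clone, Copy)]
enum Endpoint<'a> {
    ArchiveList,
    GetArchive { archive_key: &'a str },
    BeginPush,
    PostArchive,
    FinalizePush,
}

impl<'a> Endpoint<'a> {
    /// HTTP method, as given in the endpoint list.
    fn method(&self) -> &'static str {
        match self {
            Endpoint::ArchiveList | Endpoint::GetArchive { .. } => "GET",
            Endpoint::BeginPush | Endpoint::PostArchive | Endpoint::FinalizePush => "POST",
        }
    }

    /// Request path for a given user and repository.
    /// Assumes the inputs contain no characters that need escaping.
    fn path(&self, user_name: &str, repo_name: &str) -> String {
        let suffix = match self {
            Endpoint::ArchiveList => "archive-list".to_string(),
            Endpoint::GetArchive { archive_key } => format!("archive/{}", archive_key),
            Endpoint::BeginPush => "begin-push".to_string(),
            Endpoint::PostArchive => "archive".to_string(),
            Endpoint::FinalizePush => "finalize-push".to_string(),
        };
        format!("/{}/{}/{}", user_name, repo_name, suffix)
    }
}
```

For example, `Endpoint::BeginPush.path("alice", "docs")` produces `/alice/docs/begin-push`, where `alice` and `docs` are placeholder names.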

## Tests

Added 7 tests: add_and_rm2, ignore, archive, many_jobs, symlink, extract_keywords and migrate2.