iocaine
=======
[![Build status][ci:badge]][ci:url]
[![Container image][oci:badge]][oci:url]
[![Demo][demo:badge]][demo:url]
[![Documentation][docs:badge]][docs:url]
[ci:badge]: https://git.madhouse-project.org/iocaine/iocaine/actions/workflows/build.yaml/badge.svg?style=for-the-badge&label=CI
[ci:url]: https://git.madhouse-project.org/iocaine/iocaine/actions/workflows/build.yaml/runs/latest
[oci:badge]: https://img.shields.io/badge/container-latest-blue?style=for-the-badge
[oci:url]: https://git.madhouse-project.org/iocaine/-/packages/container/iocaine/latest
[demo:badge]: https://img.shields.io/badge/demo-iocaine-seagreen?style=for-the-badge
[demo:url]: https://poison.madhouse-project.org/
[docs:badge]: https://img.shields.io/badge/docs-online-orange?style=for-the-badge
[docs:url]: https://iocaine.madhouse-project.org/
> The deadliest poison known to AI.
This is a tarpit, modeled after [Nepenthes](https://zadzmo.org/code/nepenthes/), intended to catch unwelcome web crawlers, but with a slightly different, more aggressive usage scenario. The core idea is to configure a reverse proxy to serve content generated by `iocaine` to AI crawlers, and normal content to every other visitor. Nepenthes works differently: the idea there is to link to it, and trap crawlers that way. With `iocaine`, the trap is laid by the reverse proxy.
`iocaine` does not try to slow crawlers. It does not try to waste their time that way - that is left up to the reverse proxy. `iocaine` is *purely* about generating garbage.
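The routing decision made by the reverse proxy can be sketched in a few lines of shell. This is only an illustration of the idea, not how `iocaine` or any particular proxy implements it: the `route_for` function name and the short list of user-agent substrings are assumptions made for this example, and a real deployment would express the same rule in the proxy's own configuration.

```bash
# Illustrative only: decide where a request should go based on its User-Agent.
# The function name and the pattern list are assumptions for this sketch.
route_for() {
    case "$1" in
        *GPTBot*|*ClaudeBot*|*CCBot*|*Bytespider*)
            echo "iocaine"   # matches a crawler pattern: serve generated garbage
            ;;
        *)
            echo "upstream"  # everyone else: serve the real site
            ;;
    esac
}

route_for "Mozilla/5.0 (compatible; GPTBot/1.0)"
route_for "Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"
```

The first call prints `iocaine`, the second prints `upstream`.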
For more information about what this is, how it works, and how to deploy it, have a look at the [dedicated website][docs:url].
Let's make AI poisoning the norm. If we all do it, they won't have anything to crawl.
Development
-----------
Assuming you have Rust and Cargo installed, you can build `iocaine` with:
```bash
cargo build
```
You can use the `maze-data/` directory to store your textual sources and configuration:
```bash
mkdir maze-data
# For convenience, we use the README as a markov source.
# A longer document would be better.
cp README.md maze-data/markov.txt
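# Build a word list from the markov source.
# (A suggested pipeline, assuming one word per line; any word list will do.)
tr -cs '[:alpha:]' '\n' < maze-data/markov.txt | sort -u | grep . > maze-data/words.txt
# Copy the bundled templates
cp -r templates/* maze-data/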
# Create a config.toml file
cat > maze-data/config.toml <<EOF
[server]
bind = "0.0.0.0:42069"
[templates]
directory = "maze-data/"
[sources]
words = "maze-data/words.txt"
markov = ["maze-data/markov.txt"]
[generator]
initial_seed = ""
EOF
```
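Before starting the server, it can be handy to confirm that the files the example configuration points at actually exist. The following is a minimal sketch: `check_maze_data` is a hypothetical helper, not part of `iocaine`, and the file names are simply the ones used in the `config.toml` above.

```bash
# Hypothetical helper, not part of iocaine: report files from the example
# configuration that are missing or empty.
check_maze_data() {
    local dir="${1:-maze-data}" status=0 f
    for f in config.toml markov.txt words.txt; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}

if check_maze_data maze-data; then
    echo "maze-data looks complete"
fi
```

The function returns a non-zero status when any file is missing or empty, so it can also be chained in front of the run command with `&&`.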
To build and run iocaine with your local changes, run:
```bash
cargo run -- --config-file maze-data/config.toml
```
If you have EPUB files around and `pandoc` installed, you can easily build a word list and a Markov source like this:
```bash
pandoc maze-data/*.epub -t plain > maze-data/markov.txt