English
Table of Contents
- What is AIRust?
- Key Features at a Glance
- Installation & Setup
- Architecture Overview
- Agent Types — The Brain of AIRust
- Knowledge Base — The Memory
- PDF Processing — Learn from Documents
- Web Dashboard — The Control Center
- Bot Ecosystem — Automated Data Collection
- CLI — Command Line Interface
- Using AIRust as a Library
- Text Processing Utilities
- Docker Deployment
- API Reference
- Configuration & Feature Flags
- Training Data Format
- Project Structure
- Use Cases & Ideas
- Version History
- License
1. What is AIRust?
AIRust is a self-contained AI engine written entirely in Rust. Unlike cloud-based AI solutions, AIRust runs 100% locally — no OpenAI, no API keys, no internet required. You train it with your own data, and it answers questions using pattern matching, fuzzy search, and semantic similarity algorithms.
Think of it as: Your own private AI assistant that you teach yourself.
It comes with:
- Multiple intelligent agent types (exact matching, fuzzy matching, semantic search)
- A built-in web dashboard with chat interface
- PDF document processing for automatic knowledge extraction
- Web scraping bots for automated data collection
- A SQLite database for persistent storage
- Full REST API for integration with other systems
Summary: AIRust is a local, trainable AI engine in Rust. You feed it knowledge (text, PDFs, web scraping), and it answers questions intelligently — no cloud, no API keys, fully private.
2. Key Features at a Glance
| Feature | Description |
|---|---|
| 4 Agent Types | Exact Match, Fuzzy Match, TF-IDF/BM25 Semantic, Context-Aware |
| Knowledge Base | JSON-based, compile-time embedded, runtime expandable |
| PDF Processing | Convert PDFs to structured training data with smart chunking |
| Web Dashboard | Full UI with chat, training manager, bot control, file browser |
| Bot Ecosystem | Automated web scraping with review workflow |
| Vector Database | Embedding storage and similarity search |
| Chat History | Persistent conversations with archiving |
| Multi-Language UI | English, German, Turkish |
| REST API | 50+ endpoints for full programmatic control |
| WebSocket Console | Live terminal with server logs, shell access, built-in commands |
| Docker Support | One-command deployment |
| CLI Tools | Interactive mode, query tools, PDF conversion |
Summary: AIRust provides everything you need to build, train, and deploy an AI system: from agents and knowledge management to a full web interface and automated data collection.
3. Installation & Setup
As a Rust Library
Add to your Cargo.toml:
[dependencies]
airust = "0.1.7"
Build from Source
Run the Web Server
# Start on default port 7070
# Custom port
# Run in background (detached mode)
# Show landing page + dashboard (default: dashboard only)
# Stop background server
Then open http://localhost:7070 in your browser.
With Docker
Summary: You can use AIRust as a library in your own Rust projects, run it as a standalone web server, or deploy it in Docker. The web dashboard is available at port 7070 by default.
4. Architecture Overview
┌──────────────────────────────────────┐
│ AIRust Engine │
├──────────┬───────────┬───────────────┤
│MatchAgent│TfidfAgent │ ContextAgent │
│(exact/ │(BM25 │ (wraps any │
│ fuzzy) │ semantic) │ agent + mem) │
├──────────┴───────────┴───────────────┤
│ Knowledge Base │
│ (JSON / Embedded / Runtime) │
├──────────────────────────────────────┤
│ Text Processing │
│ (tokenize, stopwords, similarity) │
└────────┬──────────┬──────────────────┘
│ │
┌────────▼──┐ ┌────▼──────────────┐
│ CLI Tool │ │ Web Server (Axum) │
│ (airust) │ │ + REST API │
└───────────┘ │ + WebSocket │
│ + SQLite DB │
│ + Bot Scheduler │
└───────────────────┘
Core Traits — Every agent implements these interfaces:
| Trait | Purpose |
|---|---|
| Agent | Base trait: predict(), confidence(), can_answer() |
| TrainableAgent | Adds train(), add_example() |
| ContextualAgent | Adds add_context(), clear_context() for conversation memory |
| ConfidenceAgent | Adds calculate_confidence(), predict_top_n() |
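The layering of these traits can be sketched in plain Rust. The method names below come from the table above; the signatures, the default body of can_answer(), and the EchoAgent type are assumptions made for illustration and may differ from the crate's actual definitions.

```rust
/// Base behaviour: answer a question and report how sure the agent is.
/// Signatures are assumed; only the method names come from the README.
trait Agent {
    fn predict(&self, input: &str) -> Option<String>;
    fn confidence(&self, input: &str) -> f32;
    /// Assumed default: the agent can answer when its confidence is positive.
    fn can_answer(&self, input: &str) -> bool {
        self.confidence(input) > 0.0
    }
}

/// Training extends the base trait instead of replacing it.
trait TrainableAgent: Agent {
    fn add_example(&mut self, input: &str, output: &str);
    fn train(&mut self, data: &[(String, String)]);
}

/// Hypothetical toy agent: case-insensitive exact lookup.
struct EchoAgent {
    examples: Vec<(String, String)>,
}

impl Agent for EchoAgent {
    fn predict(&self, input: &str) -> Option<String> {
        let key = input.to_lowercase();
        self.examples
            .iter()
            .find(|(q, _)| *q == key)
            .map(|(_, a)| a.clone())
    }

    fn confidence(&self, input: &str) -> f32 {
        if self.predict(input).is_some() { 1.0 } else { 0.0 }
    }
}

impl TrainableAgent for EchoAgent {
    fn add_example(&mut self, input: &str, output: &str) {
        self.examples.push((input.to_lowercase(), output.to_string()));
    }

    /// Replaces all existing examples with `data`.
    fn train(&mut self, data: &[(String, String)]) {
        self.examples.clear();
        for (q, a) in data {
            self.add_example(q, a);
        }
    }
}
```

Because TrainableAgent has Agent as a supertrait, any code that accepts a trainable agent can also query it without an extra bound.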
Summary: AIRust has a layered architecture: agents do the thinking, the knowledge base stores the data, and the web server provides the interface. Everything communicates through clean Rust traits.
5. Agent Types — The Brain of AIRust
5.1 MatchAgent (Exact & Fuzzy)
The simplest and fastest agent. It compares your question directly against its training data.
Exact Mode — finds answers only when the question matches exactly (case-insensitive):
let agent = new_exact;
Fuzzy Mode — tolerates typos using Levenshtein distance:
let agent = new_fuzzy;
// With custom tolerance
let agent = new;
When to use: FAQ bots, command recognition, structured Q&A where questions are predictable.
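To make the fuzzy mode concrete, here is a minimal standalone sketch of Levenshtein-based lookup. It is not the crate's implementation: the function names, the tuple-based example list, and the max_distance cutoff are assumptions chosen for illustration.

```rust
/// Levenshtein distance computed with the classic two-row dynamic programme.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let v = (prev[j] + cost) // substitution (or match)
                .min(prev[j + 1] + 1) // deletion
                .min(curr[j] + 1); // insertion
            curr.push(v);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Return the answer whose question is closest to `query`,
/// provided the distance does not exceed `max_distance`.
fn fuzzy_lookup<'a>(
    examples: &'a [(&'a str, &'a str)],
    query: &str,
    max_distance: usize,
) -> Option<&'a str> {
    let q = query.to_lowercase();
    examples
        .iter()
        .map(|(question, answer)| (levenshtein(&question.to_lowercase(), &q), *answer))
        .filter(|(d, _)| *d <= max_distance)
        .min_by_key(|(d, _)| *d)
        .map(|(_, answer)| answer)
}
```

With a cutoff of 2, a query such as "What is Rsut" still resolves to the entry for "what is rust", while an unrelated query returns None instead of a poor match.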
5.2 TfidfAgent (Semantic Search with BM25)
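Uses the BM25 ranking function (the default relevance scoring in search engines such as Elasticsearch) to find the most relevant answer. Each candidate is scored by how often the query terms occur in it, how rare those terms are across the whole knowledge base, and how long the candidate is relative to the average.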
let agent = new;
// Fine-tune the algorithm
let agent = new
.with_bm25_params; // k1 = term scaling, b = length norm
When to use: Document search, knowledge bases with natural language questions, when exact matching is too strict.
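The scoring idea can be sketched without the crate. The following is a minimal BM25 scorer, assuming a simple alphanumeric tokenizer and the non-negative idf variant ln(1 + (N - n + 0.5) / (n + 0.5)); the function names are hypothetical and TfidfAgent may tokenize and weight terms differently.

```rust
use std::collections::HashMap;

/// Lowercase and split on anything that is not a letter or digit.
fn tokenize(text: &str) -> Vec<String> {
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(String::from)
        .collect()
}

/// BM25 score of every document for `query`.
/// `k1` scales term frequency, `b` controls length normalisation.
fn bm25_scores(docs: &[&str], query: &str, k1: f64, b: f64) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    let n = tokenized.len() as f64;
    let avg_len = tokenized.iter().map(|d| d.len()).sum::<usize>() as f64 / n.max(1.0);

    // Document frequency: in how many documents each term occurs.
    let mut df: HashMap<&str, f64> = HashMap::new();
    for doc in &tokenized {
        let mut seen: Vec<&str> = doc.iter().map(|s| s.as_str()).collect();
        seen.sort_unstable();
        seen.dedup();
        for term in seen {
            *df.entry(term).or_insert(0.0) += 1.0;
        }
    }

    let query_terms = tokenize(query);
    tokenized
        .iter()
        .map(|doc| {
            let len = doc.len() as f64;
            // Guard against an all-empty corpus.
            let norm = if avg_len > 0.0 { len / avg_len } else { 0.0 };
            query_terms
                .iter()
                .map(|term| {
                    let tf = doc.iter().filter(|t| *t == term).count() as f64;
                    let n_t = df.get(term.as_str()).copied().unwrap_or(0.0);
                    let idf = ((n - n_t + 0.5) / (n_t + 0.5) + 1.0).ln();
                    idf * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * norm))
                })
                .sum::<f64>()
        })
        .collect()
}
```

Commonly used defaults are k1 around 1.2 and b around 0.75. Raising k1 lets repeated terms count for more, and lowering b reduces the penalty on long documents.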
5.3 ContextAgent (Conversational Memory)
Wraps any other agent and adds conversation memory. It remembers the last N exchanges so follow-up questions work naturally.
let base = new;
let agent = new // Remember 5 turns
.with_context_format;
Context Formats:
| Format | Example Output |
|---|---|
| QAPairs | Q: What is Rust? A: A programming language. Q: ... |
| List | [What is Rust? -> A programming language, ...] |
| Sentence | Previous questions: What is Rust? - A programming language; ... |
| Custom | Your own formatting function |
When to use: Chatbots, interactive assistants, any scenario where users ask follow-up questions.
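A sliding window over the last N exchanges can be sketched as follows. The ContextMemory type and its with_history method are hypothetical; the output mimics the QAPairs format from the table above, and the real ContextAgent may combine history and question differently.

```rust
use std::collections::VecDeque;

/// Keeps the last `max_turns` question/answer pairs.
struct ContextMemory {
    max_turns: usize,
    turns: VecDeque<(String, String)>,
}

impl ContextMemory {
    fn new(max_turns: usize) -> Self {
        Self { max_turns, turns: VecDeque::new() }
    }

    /// Record an exchange, dropping the oldest one when the window is full.
    fn add_context(&mut self, question: &str, answer: &str) {
        if self.max_turns == 0 {
            return;
        }
        if self.turns.len() == self.max_turns {
            self.turns.pop_front();
        }
        self.turns.push_back((question.to_string(), answer.to_string()));
    }

    /// Forget the whole conversation.
    fn clear_context(&mut self) {
        self.turns.clear();
    }

    /// Render the history as "Q: ... A: ..." pairs followed by the new question.
    fn with_history(&self, question: &str) -> String {
        let mut out = String::new();
        for (q, a) in &self.turns {
            out.push_str(&format!("Q: {} A: {} ", q, a));
        }
        out.push_str(&format!("Q: {}", question));
        out
    }
}
```

The enriched string returned by with_history would then be handed to the wrapped agent, which is how a follow-up question can pick up terms from earlier turns.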
Summary: AIRust offers three agent implementations covering four matching modes: MatchAgent for fast exact or fuzzy matching, TfidfAgent for BM25-based semantic search, and ContextAgent for conversational memory. Choose based on your use case, or combine them.
6. Knowledge Base — The Memory
The Knowledge Base is where all training data lives. It supports two modes:
Compile-Time Embedding
Data from knowledge/train.json is baked into the binary at build time:
let kb = from_embedded;
Runtime Management
let mut kb = new;
// Add entries
kb.add_example;
// Save to disk
kb.save?;
// Load from file
let kb = load?;
// Merge multiple knowledge bases
kb.merge;
Data Format
Response Formats: Text, Markdown, or Json — the agent automatically handles the right format.
Legacy Support: Old-style {"input": "...", "output": "..."} files (where output is a plain string) are still fully supported.
Summary: The Knowledge Base stores everything the AI knows. Data can be embedded at compile time for zero-cost access or managed dynamically at runtime. Entries have weights (importance) and optional metadata.
7. PDF Processing — Learn from Documents
AIRust can extract knowledge from PDF documents automatically. It splits text into intelligent chunks and creates training examples.
Command-Line Tool
# Basic conversion
# Custom output path
# Full configuration
In Code
use ;
let config = PdfLoaderConfig ;
let loader = with_config;
let kb = loader.pdf_to_knowledge_base?;
println!;
Merging Multiple Sources
# Place multiple JSON files in knowledge/
# Then merge them all into train.json
How Chunking Works
PDF Document
│
▼
┌─────────────────────────────────┐
│ Full extracted text │
└──────────┬──────────────────────┘
│ Split by sentences
▼
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│Chunk1│ │Chunk2│ │Chunk3│ │Chunk4│ (with overlap)
└──────┘ └──────┘ └──────┘ └──────┘
│
▼
TrainingExample per chunk
(with page number metadata)
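The chunking step in the diagram can be approximated with a short standalone sketch. It assumes that chunk size and overlap are counted in sentences and that sentences end at '.', '!' or '?'; both function names are hypothetical, and the actual loader may measure chunks in characters and use a more careful sentence splitter.

```rust
/// Split text into sentences on '.', '!' and '?' (a deliberately naive rule).
fn split_sentences(text: &str) -> Vec<String> {
    let mut sentences = Vec::new();
    let mut current = String::new();
    for c in text.chars() {
        current.push(c);
        if matches!(c, '.' | '!' | '?') {
            let s = current.trim().to_string();
            if !s.is_empty() {
                sentences.push(s);
            }
            current.clear();
        }
    }
    // Keep trailing text that lacks a terminator.
    let rest = current.trim();
    if !rest.is_empty() {
        sentences.push(rest.to_string());
    }
    sentences
}

/// Group sentences into chunks of `size` sentences;
/// consecutive chunks share `overlap` sentences.
fn chunk_sentences(sentences: &[String], size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0 && overlap < size, "overlap must be smaller than chunk size");
    let step = size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < sentences.len() {
        let end = (start + size).min(sentences.len());
        chunks.push(sentences[start..end].join(" "));
        if end == sentences.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

With a size of 3 and an overlap of 1, the last sentence of each chunk reappears as the first sentence of the next, so a fact that spans a chunk boundary is still retrievable from one of the two chunks.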
Summary: Feed PDFs into AIRust and it automatically creates structured training data. Intelligent chunking respects sentence boundaries and maintains context through overlapping segments. Merge multiple PDFs into one unified knowledge base.
8. Web Dashboard — The Control Center
Start the server with cargo run and open http://localhost:7070.
Dashboard Tabs
| Tab | What it does |
|---|---|
| Chat | Talk to your AI agent, see confidence scores, switch agents |
| Training | Manage training data with categories, import/export JSON |
| Knowledge | Browse, search, add, delete knowledge base entries |
| Bots | Create and manage web scraping bots |
| Data Review | Approve/reject data collected by bots |
| Vectors | Manage vector collections and embeddings |
| Files | Browse project files and SQLite database |
| Console | Real-time WebSocket log viewer with shell access |
| Settings | Theme (dark/light), language (EN/DE/TR), accent colors |
Smart Settings via Chat
You can change settings by chatting naturally:
- "Make the page dark" — switches to dark theme
- "Change to green background" — updates accent color
- "Use German language" — switches UI language
- "Mach die Seite dunkel" — also works in German
- "Turkce yap" — switches to Turkish
Agent Switching
Switch between agent types at any time via the API or UI:
- Exact Match
- Fuzzy Match
- TF-IDF (BM25)
- Context Agent (with conversation memory)
Console — Real-Time Server Terminal
The console panel sits at the bottom of the dashboard and acts as a live terminal connected to your AIRust server via WebSocket (/ws/console).
What it does:
- Streams every server log (requests, errors, agent activity) to your browser in real time
- Lets you type commands directly — both built-in commands and arbitrary shell commands
- Shows color-coded output: info (blue), warn (yellow), error (red), cmd (green), stdout/stderr
Built-in Commands:
| Command | What it does |
|---|---|
| help or ? | Show list of available commands |
| clear | Clear all console output |
| status | Show AIRust version, server state, and working directory |
| stop | Gracefully shut down the server |
| restart | Restart the server process |
| anything else | Executed as a shell command (e.g. ls, df -h, cat knowledge/train.json) |
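The dispatch rule in this table (built-in commands first, everything else goes to the shell) can be sketched as a small parser. The ConsoleCommand enum and parse_command function are hypothetical names, not taken from the AIRust source.

```rust
/// One variant per built-in command, plus a catch-all for shell input.
#[derive(Debug, PartialEq)]
enum ConsoleCommand {
    Help,
    Clear,
    Status,
    Stop,
    Restart,
    Shell(String),
}

/// Map a raw console line to a command; blank input yields `None`.
fn parse_command(line: &str) -> Option<ConsoleCommand> {
    let trimmed = line.trim();
    if trimmed.is_empty() {
        return None;
    }
    Some(match trimmed {
        "help" | "?" => ConsoleCommand::Help,
        "clear" => ConsoleCommand::Clear,
        "status" => ConsoleCommand::Status,
        "stop" => ConsoleCommand::Stop,
        "restart" => ConsoleCommand::Restart,
        // Anything that is not a built-in is treated as a shell command.
        other => ConsoleCommand::Shell(other.to_string()),
    })
}
```

Keeping the shell fallback as an explicit variant makes it easy to gate or log that path separately from the built-ins.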
UI Features:
| Feature | How it works |
|---|---|
| Drag to resize | Grab the console header bar and drag up/down to resize the panel |
| Minimize/Expand | Click the toggle button in the header to collapse or expand |
| Command history | Press arrow keys (up/down) to cycle through previous commands |
| Auto-reconnect | If the WebSocket disconnects, it automatically reconnects every 2 seconds |
| Connection indicator | Green dot = connected, red dot = disconnected |
Technical Details:
- Server keeps a ring buffer of 500 log entries — when you connect, you receive the full history
- Client caps at 1000 DOM nodes to keep the browser fast
- WebSocket broadcast with fan-out: multiple browser tabs all receive logs simultaneously
- Fuzzy command matching: if you mistype a built-in command (e.g. staus instead of status), it suggests the correct one
- Shell commands run asynchronously via tokio::spawn, so long-running commands don't block the server
┌────────────────────────────────────────────────────┐
│ Console [─] [drag] │
├────────────────────────────────────────────────────┤
│ 12:34:01 [info] Server started on port 7070 │
│ 12:34:05 [info] POST /api/query → 200 (12ms) │
│ 12:35:10 [cmd] $ status │
│ 12:35:10 [info] AIRust v0.1.7 │
│ 12:35:10 [info] Server: running │
│ 12:35:10 [info] CWD: /app │
│ 12:36:00 [cmd] $ ls knowledge/ │
│ 12:36:00 [stdout] train.json │
├────────────────────────────────────────────────────┤
│ $ _ │
└────────────────────────────────────────────────────┘
Summary: The web dashboard is a complete control center for your AI: chat with it, manage training data in categories, control bots, review collected data, browse files, and run server commands through the built-in live console — all in the browser.
9. Bot Ecosystem — Automated Data Collection
AIRust includes a built-in web scraping system to automatically collect training data from websites.
Workflow
1. Create Bot → Define URL, crawl config
2. Start Bot Run → Scraper collects content
3. Review Raw Data → Approve or reject entries
4. Convert to KB → Add approved data to training
5. Retrain Agent → Agent learns new knowledge
Features
- Web Crawling: Configurable depth, URL patterns
- Deduplication: Content hashing prevents duplicate entries
- Manual Review: Approve/reject workflow ensures data quality
- Run History: Track every bot execution with stats
- Scheduling: Automated periodic execution
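Deduplication by content hashing can be sketched with the standard library alone. The normalization step (collapsing whitespace and lowercasing) and both function names are assumptions; the crawler may hash the raw text or use a different hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hash of the text after collapsing whitespace and lowercasing, so trivial
/// formatting differences do not defeat deduplication.
fn content_hash(text: &str) -> u64 {
    let normalized = text
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_lowercase();
    let mut hasher = DefaultHasher::new();
    normalized.hash(&mut hasher);
    hasher.finish()
}

/// Keep only the first occurrence of each distinct piece of content.
fn dedup_pages(pages: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    pages
        .into_iter()
        // `insert` returns false when the hash was already present.
        .filter(|p| seen.insert(content_hash(p)))
        .collect()
}
```

DefaultHasher is fine for deduplicating within one run, but its output is not guaranteed to stay the same across Rust releases. Hashes that are stored in a database and compared later would need a stable algorithm such as SHA-256.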
Data Flow
Website → Crawler → Raw Data (pending)
│
Manual Review
┌────┴────┐
Approved Rejected
│
Add to Knowledge Base
│
Retrain Agent
Summary: Bots crawl websites and collect text data automatically. A manual review step ensures quality before the data enters your knowledge base. Together, these steps form a repeatable pipeline for growing what the agent knows.
10. CLI — Command Line Interface
Query Modes
# Exact matching
# Fuzzy matching (tolerates typos)
# Semantic search (best for natural language)
Interactive Mode
Opens an interactive REPL where you can:
- Choose your agent type
- Ask questions in real-time
- See confidence scores
- Maintain conversation context
Knowledge Base Management
Opens a menu for:
- Viewing all entries
- Adding new entries
- Deleting entries
- Saving/loading the knowledge base
PDF Import
# Convert PDF to knowledge base
# Merge all knowledge files
Summary: The CLI gives you quick access to all agent types, an interactive chat mode, knowledge management, and PDF conversion — perfect for testing and quick queries without starting the web server.
11. Using AIRust as a Library
Basic Example
use ;
With TF-IDF and Context
use *;
PDF to Agent Pipeline
use ;
Incremental Training with append()
train() replaces all data. Use append() or train_single() to add examples without losing existing data:
use ;
Confidence Scores with ConfidenceAgent
TfidfAgent and MatchAgent implement the ConfidenceAgent trait for ranked predictions:
use ;
use ConfidenceAgent;
Summary: As a library, AIRust gives you full programmatic control. Create agents, load knowledge, train, and query — all in a few lines of Rust code. Combine agents, process PDFs, and build custom AI applications.
12. Text Processing Utilities
AIRust provides built-in text processing tools in the text_utils module:
use text_utils;
// Tokenization
let tokens = tokenize;
// → ["hello", "world"]
// Unique terms
let terms = unique_terms;
// → {"the", "cat", "and", "dog"}
// Stopword removal (supports English and German)
let filtered = remove_stopwords;
// Removes: the, and, is, in, of, to, a, with, for, ...
let filtered_de = remove_stopwords;
// Removes: der, die, das, und, in, ist, von, mit, zu, ...
// String similarity
let dist = levenshtein_distance; // → 3
let sim = jaccard_similarity;
// N-grams
let bigrams = create_ngrams;
// Unicode normalization
let normalized = normalize_text; // → "café"
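For readers who want to see what such utilities compute, here is a standalone sketch of tokenization, Jaccard similarity, and word n-grams. These are illustrative re-implementations, not the crate's code: the tokenizer rule and the choice of word-level (rather than character-level) n-grams are assumptions.

```rust
use std::collections::HashSet;

/// Lowercase and split on anything that is not a letter or digit.
fn tokenize(text: &str) -> Vec<String> {
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(String::from)
        .collect()
}

/// Jaccard similarity: size of the intersection divided by the size of the
/// union of the two sets of unique tokens.
fn jaccard_similarity(a: &str, b: &str) -> f64 {
    let sa: HashSet<String> = tokenize(a).into_iter().collect();
    let sb: HashSet<String> = tokenize(b).into_iter().collect();
    let union = sa.union(&sb).count();
    if union == 0 {
        return 0.0;
    }
    sa.intersection(&sb).count() as f64 / union as f64
}

/// Word n-grams: every run of `n` consecutive tokens, joined by spaces.
fn word_ngrams(text: &str, n: usize) -> Vec<String> {
    if n == 0 {
        return Vec::new();
    }
    tokenize(text).windows(n).map(|w| w.join(" ")).collect()
}
```

Jaccard similarity ignores word order and repetition, which makes it a cheap first filter. N-grams recover some of the ordering information that a pure set comparison discards.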
Summary: Built-in text utilities handle tokenization, stopword removal (EN/DE), similarity metrics, n-grams, and Unicode normalization — no external NLP libraries needed.
13. Docker Deployment
Dockerfile (Multi-Stage Build)
FROM rust:1.85-bookworm AS builder
WORKDIR /app
COPY . .
RUN cargo build --release
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY --from=builder /app/target/release/airust /usr/local/bin/airust
COPY --from=builder /app/knowledge/ ./knowledge/
EXPOSE 7070
CMD ["airust"]
Build & Run
# With persistent database
Summary: Docker provides a clean deployment path: multi-stage build keeps the image small, only the binary and knowledge files are included. Mount a volume for database persistence.
14. API Reference
Core Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Web Dashboard (HTML) |
| GET | /api/status | Server status, agent type, KB size |
| POST | /api/query | Query the AI agent |
| POST | /api/agent/switch | Switch agent type |
Knowledge Base
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/knowledge | List entries (paginated, searchable) |
| POST | /api/knowledge/add | Add new entry |
| DELETE | /api/knowledge/:index | Delete entry |
| POST | /api/knowledge/save | Save KB to file |
| POST | /api/knowledge/load | Load KB from file |
| POST | /api/pdf/upload | Upload & process PDF |
| POST | /api/upload/json | Upload JSON knowledge |
Training Data
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/training/categories | List categories |
| POST | /api/training/categories | Create category |
| DELETE | /api/training/categories/:id | Delete category |
| GET | /api/training/data | List training data |
| POST | /api/training/data | Add training entry |
| DELETE | /api/training/data/:id | Delete entry |
| POST | /api/training/import | Import from JSON |
| GET | /api/training/export | Export to JSON |
Chat System
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/chats | List conversations |
| POST | /api/chats | Create new chat |
| GET | /api/chats/:id/messages | Get chat messages |
| DELETE | /api/chats/:id | Delete chat |
| POST | /api/chats/:id/archive | Archive chat |
Bot Management
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/bots | List all bots |
| POST | /api/bots | Create bot |
| GET | /api/bots/:id | Get bot details |
| PUT | /api/bots/:id | Update bot |
| DELETE | /api/bots/:id | Delete bot |
| POST | /api/bots/:id/start | Start bot run |
| POST | /api/bots/:id/stop | Stop bot run |
| GET | /api/bots/:id/runs | Run history |
Data Review
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/data/pending | Pending data from bots |
| POST | /api/data/:id/approve | Approve entry |
| POST | /api/data/:id/reject | Reject entry |
| POST | /api/data/approve-all | Batch approve |
| GET | /api/data/approved | View approved data |
| POST | /api/data/add-to-kb | Add to knowledge base |
Vector Database
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/vectors/stats | Vector DB statistics |
| POST | /api/vectors/rebuild | Rebuild index |
| GET | /api/vectors/collections | List collections |
| POST | /api/vectors/collections | Create collection |
| DELETE | /api/vectors/collections/:id | Delete collection |
| GET | /api/vectors/entries | List entries |
| POST | /api/vectors/entries | Add entry |
| DELETE | /api/vectors/entries/:id | Delete entry |
| POST | /api/vectors/search | Similarity search |
Settings & System
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/settings | Get settings |
| POST | /api/settings | Update settings |
| GET | /api/translations/:lang | Get UI translations |
| GET | /api/files | List files |
| GET | /api/files/read | Read file |
| POST | /api/files/write | Write file |
| GET | /api/files/db/tables | List SQLite tables |
| GET | /api/files/db/query | Execute SQL query |
| WS | /ws/console | Real-time console log |
Summary: Over 50 REST API endpoints give you full control over agents, knowledge, training data, bots, chats, vectors, files, and settings. Plus a WebSocket endpoint for real-time logging.
15. Configuration & Feature Flags
Cargo Feature Flags
| Flag | Default | Description |
|---|---|---|
| colors | Yes | Colored terminal output |
| web | Yes | Web server + SQLite + Bot ecosystem |
| bots | Yes (via web) | Web scraping (reqwest, scraper) |
| async | Yes (via web) | Async runtime (tokio) |
| plotting | No | Data visualization (plotly, plotters) |
# Minimal (library only, no web server)
airust = { version = "0.1.7", default-features = false }
# Library + colors
airust = { version = "0.1.7", default-features = false, features = ["colors"] }
# Everything including plotting
airust = { version = "0.1.7", features = ["plotting"] }
Runtime Settings (via Web UI or API)
| Setting | Values | Description |
|---|---|---|
| theme | dark, light | UI theme |
| language | en, de, tr | Interface language |
| accent_color | hex color | Primary accent color |
| bg_color | hex color | Background color |
Summary: Feature flags let you control what gets compiled — from a minimal library to a full web platform. Runtime settings control the UI appearance and language.
16. Training Data Format
Modern Format (Recommended)
Legacy Format (Still Supported)
Fields:
- input: The question or trigger text
- output: The answer, as Text, Markdown, or Json
- weight: Importance factor (higher = preferred in ranking, default: 1.0)
- metadata: Optional JSON object for source tracking, page numbers, etc.
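One way a weight can influence ranking is to multiply it into the similarity score. The sketch below assumes exactly that; the Example and Output types, the best_match function, and the multiplicative rule are hypothetical and only illustrate the "higher = preferred" behaviour described above.

```rust
/// Assumed shape of an answer: plain text, Markdown, or a JSON string.
#[derive(Debug, Clone, PartialEq)]
enum Output {
    Text(String),
    Markdown(String),
    Json(String),
}

/// Assumed shape of one training entry (metadata omitted for brevity).
#[derive(Debug, Clone)]
struct Example {
    input: String,
    output: Output,
    weight: f32,
}

/// Pick the example with the highest `similarity * weight`.
/// Entries with a score of zero or less are never returned.
fn best_match<'a>(
    examples: &'a [Example],
    similarity: impl Fn(&str) -> f32,
) -> Option<&'a Example> {
    examples
        .iter()
        .map(|e| (similarity(&e.input) * e.weight, e))
        .filter(|(score, _)| *score > 0.0)
        .max_by(|a, b| a.0.total_cmp(&b.0))
        .map(|(_, e)| e)
}
```

Under this rule, two entries that match a query equally well are separated by weight alone, so a weight of 2.0 lets a curated answer win over a scraped one with the default 1.0.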
Summary: Training data is stored as JSON arrays. Each entry has an input (question), output (answer in Text/Markdown/JSON format), an importance weight, and optional metadata. Legacy formats are auto-converted.
17. Project Structure
airust/
├── Cargo.toml # Package manifest & dependencies
├── Cargo.lock # Dependency lock file
├── build.rs # Build script (embeds train.json)
├── Dockerfile # Multi-stage Docker build
├── .dockerignore # Docker excludes
├── README.md # This file
├── knowledge/
│ └── train.json # Embedded training data
├── src/
│ ├── lib.rs # Library exports & public API
│ ├── agent.rs # Core traits & text utilities
│ ├── match_agent.rs # Exact & fuzzy matching agent
│ ├── tfidf_agent.rs # BM25 semantic search agent
│ ├── context_agent.rs # Conversational memory wrapper
│ ├── knowledge.rs # Knowledge base management
│ ├── pdf_loader.rs # PDF → training data conversion
│ ├── bin/
│ │ ├── airust.rs # Main CLI & web server binary
│ │ ├── pdf2kb.rs # PDF converter CLI tool
│ │ └── merge_kb.rs # Knowledge base merger tool
│ └── web/
│ ├── mod.rs # Server initialization & routing
│ ├── state.rs # Application state & agent wrapper
│ ├── routes.rs # API endpoint handlers
│ ├── db.rs # SQLite database layer
│ ├── console.rs # WebSocket console logging
│ ├── vectordb.rs # Vector database operations
│ ├── static/
│ │ └── index.html # Web dashboard (single-page app)
│ └── bots/
│ ├── mod.rs # Bot module exports
│ ├── models.rs # Bot data structures
│ ├── db.rs # Bot database operations
│ ├── crawler.rs # Web scraping engine
│ ├── processor.rs # Data processing pipeline
│ ├── scheduler.rs # Automated execution
│ ├── vectordb.rs # Vector operations
│ └── routes.rs # Bot API endpoints
└── airust.db # SQLite database (auto-created)
Summary: The project is cleanly organized: core AI logic in src/, web server in src/web/, CLI tools in src/bin/, and training data in knowledge/. The web dashboard is a single HTML file served directly from memory.
18. Use Cases & Ideas
- FAQ Bot — Train with frequently asked questions, deploy as a web widget
- Document Search — Load PDFs, build a searchable knowledge base
- Customer Support — Context-aware agent remembers the conversation
- Internal Wiki Bot — Scrape your company wiki, auto-build knowledge
- Developer Documentation Assistant — Load API docs as PDFs
- Educational Tool — Students ask questions about course material
- IoT Device Assistant — Minimal binary, runs on embedded systems
- Privacy-First AI — No cloud, no data leaving your network
- Competitive Intelligence — Bots scrape public sources, review & learn
Summary: AIRust is flexible enough for FAQ bots, document search, customer support, education, IoT, and any scenario where you need a private, trainable AI without cloud dependencies.
19. Version History
| Version | Highlights |
|---|---|
| 0.1.7 | Real-time WebSocket console, shell command execution, drag-to-resize UI |
| 0.1.6 | PDF processing improvements, web dashboard |
| 0.1.5 | ContextAgent, ResponseFormat, advanced matching, TF-IDF |
| 0.1.4 | TF-IDF/BM25 agent |
| 0.1.3 | English language support |
| 0.1.2 | Initial release |
20. License
MIT — Free for personal and commercial use.
Author: LEVOGNE
Repository: github.com/LEVOGNE/airust
Documentation: docs.rs/airust
Deutsch
Inhaltsverzeichnis
- Was ist AIRust?
- Funktionen im Ueberblick
- Installation & Einrichtung
- Architektur-Uebersicht
- Agenten-Typen — Das Gehirn von AIRust
- Wissensdatenbank — Das Gedaechtnis
- PDF-Verarbeitung — Aus Dokumenten lernen
- Web-Dashboard — Die Steuerzentrale
- Bot-System — Automatische Datensammlung
- CLI — Kommandozeile
- AIRust als Bibliothek nutzen
- Textverarbeitung
- Docker-Deployment
- API-Referenz
- Konfiguration & Feature-Flags
- Trainingsdaten-Format
- Projektstruktur
- Anwendungsbeispiele
- Versionshistorie
- Lizenz
1. Was ist AIRust?
AIRust ist eine eigenstaendige KI-Engine, komplett in Rust geschrieben. Im Gegensatz zu Cloud-basierten KI-Loesungen laeuft AIRust 100% lokal — kein OpenAI, keine API-Schluessel, kein Internet noetig. Du trainierst es mit deinen eigenen Daten, und es beantwortet Fragen mit Musterabgleich, unscharfer Suche und semantischen Aehnlichkeitsalgorithmen.
Stell es dir so vor: Dein eigener privater KI-Assistent, den du selbst unterrichtest.
Es bringt mit:
- Mehrere intelligente Agenten-Typen (exakter Abgleich, unscharfer Abgleich, semantische Suche)
- Ein eingebautes Web-Dashboard mit Chat-Oberflaeche
- PDF-Dokumentenverarbeitung fuer automatische Wissensextraktion
- Web-Scraping-Bots fuer automatisierte Datensammlung
- Eine SQLite-Datenbank fuer persistente Speicherung
- Vollstaendige REST-API fuer Integration mit anderen Systemen
Zusammenfassung: AIRust ist eine lokale, trainierbare KI-Engine in Rust. Du fuetterst sie mit Wissen (Texte, PDFs, Web-Scraping) und sie beantwortet Fragen intelligent — ohne Cloud, ohne API-Schluessel, vollstaendig privat.
2. Funktionen im Ueberblick
| Funktion | Beschreibung |
|---|---|
| 4 Agenten-Typen | Exakt, Unscharf (Fuzzy), TF-IDF/BM25 Semantisch, Kontextbewusst |
| Wissensdatenbank | JSON-basiert, zur Kompilierzeit eingebettet, zur Laufzeit erweiterbar |
| PDF-Verarbeitung | PDFs in strukturierte Trainingsdaten umwandeln |
| Web-Dashboard | Vollstaendige UI mit Chat, Training, Bot-Steuerung, Dateibrowser |
| Bot-System | Automatisches Web-Scraping mit Pruefungs-Workflow |
| Vektor-Datenbank | Embedding-Speicher und Aehnlichkeitssuche |
| Chat-Verlauf | Persistente Gespraeche mit Archivierung |
| Mehrsprachige UI | Englisch, Deutsch, Tuerkisch |
| REST-API | 50+ Endpunkte fuer volle programmatische Kontrolle |
| WebSocket-Konsole | Live-Terminal mit Server-Logs, Shell-Zugriff, eingebauten Befehlen |
| Docker-Support | Deployment mit einem Befehl |
| CLI-Tools | Interaktiver Modus, Abfrage-Tools, PDF-Konvertierung |
Zusammenfassung: AIRust bietet alles, was du brauchst, um ein KI-System zu bauen, zu trainieren und bereitzustellen: von Agenten und Wissensverwaltung bis hin zur vollstaendigen Web-Oberflaeche und automatischer Datensammlung.
3. Installation & Einrichtung
Als Rust-Bibliothek
In deiner Cargo.toml:
[]
= "0.1.7"
Aus dem Quellcode bauen
Web-Server starten
# Standard-Port 7070
# Eigener Port
# Im Hintergrund starten
# Landing Page + Dashboard anzeigen (Standard: nur Dashboard)
# Hintergrund-Server stoppen
Dann oeffne http://localhost:7070 im Browser.
Mit Docker
Zusammenfassung: Du kannst AIRust als Bibliothek in eigenen Rust-Projekten nutzen, als eigenstaendigen Web-Server starten oder in Docker deployen. Das Web-Dashboard ist standardmaessig auf Port 7070 erreichbar.
4. Architektur-Uebersicht
┌──────────────────────────────────────┐
│ AIRust Engine │
├──────────┬───────────┬───────────────┤
│MatchAgent│TfidfAgent │ ContextAgent │
│(exakt/ │(BM25 │ (wickelt │
│ unscharf)│ semantisch│ jeden Agent) │
├──────────┴───────────┴───────────────┤
│ Wissensdatenbank │
│ (JSON / Eingebettet / Laufzeit) │
├──────────────────────────────────────┤
│ Textverarbeitung │
│ (Tokenisierung, Stoppwoerter, etc.) │
└────────┬──────────┬──────────────────┘
│ │
┌────────▼──┐ ┌────▼──────────────┐
│ CLI-Tool │ │ Web-Server (Axum) │
│ (airust) │ │ + REST-API │
└───────────┘ │ + WebSocket │
│ + SQLite-DB │
│ + Bot-Scheduler │
└───────────────────┘
Kern-Traits — Jeder Agent implementiert diese Schnittstellen:
| Trait | Zweck |
|---|---|
Agent |
Basis: predict(), confidence(), can_answer() |
TrainableAgent |
Fuegt train(), add_example() hinzu |
ContextualAgent |
Fuegt add_context(), clear_context() fuer Gespraechsspeicher hinzu |
ConfidenceAgent |
Fuegt calculate_confidence(), predict_top_n() hinzu |
Zusammenfassung: AIRust hat eine geschichtete Architektur: Agenten denken, die Wissensdatenbank speichert, und der Web-Server stellt die Oberflaeche bereit. Alles kommuniziert ueber saubere Rust-Traits.
5. Agenten-Typen — Das Gehirn von AIRust
5.1 MatchAgent (Exakt & Unscharf)
Der einfachste und schnellste Agent. Vergleicht deine Frage direkt mit den Trainingsdaten.
Exakt-Modus — findet Antworten nur bei genauer Uebereinstimmung (Gross-/Kleinschreibung egal):
let agent = new_exact;
Unscharf-Modus — toleriert Tippfehler mittels Levenshtein-Distanz:
let agent = new_fuzzy;
// Mit eigener Toleranz
let agent = new;
Wann nutzen: FAQ-Bots, Befehlserkennung, strukturierte Frage-Antwort-Systeme.
5.2 TfidfAgent (Semantische Suche mit BM25)
Nutzt den BM25-Algorithmus (derselbe wie in Suchmaschinen wie Elasticsearch), um die relevanteste Antwort anhand von Termhaeufigkeit und Dokumentenwichtigkeit zu finden.
let agent = new;
// Algorithmus feintunen
let agent = new
.with_bm25_params; // k1 = Term-Skalierung, b = Laengennorm
Wann nutzen: Dokumentensuche, Wissensdatenbanken mit natuerlichsprachlichen Fragen.
5.3 ContextAgent (Gespraechsspeicher)
Wickelt jeden anderen Agenten und fuegt Gespraechsspeicher hinzu. Erinnert sich an die letzten N Austausche, sodass Folgefragen natuerlich funktionieren.
let base = new;
let agent = new // 5 Runden merken
.with_context_format;
Kontext-Formate:
| Format | Beispiel |
|---|---|
QAPairs |
Q: Was ist Rust? A: Eine Programmiersprache. Q: ... |
List |
[Was ist Rust? -> Eine Programmiersprache, ...] |
Sentence |
Vorherige Fragen: Was ist Rust? - Eine Programmiersprache; ... |
Custom |
Eigene Formatierungsfunktion |
Wann nutzen: Chatbots, interaktive Assistenten, Folgefragen.
Zusammenfassung: AIRust bietet drei Agenten-Typen: MatchAgent fuer schnellen exakten/unscharfen Abgleich, TfidfAgent fuer intelligente semantische Suche, und ContextAgent fuer Gespraechsspeicher. Waehle je nach Anwendungsfall oder kombiniere sie.
6. Wissensdatenbank — Das Gedaechtnis
Die Wissensdatenbank ist der Ort, an dem alle Trainingsdaten liegen. Sie unterstuetzt zwei Modi:
Kompilierzeit-Einbettung
Daten aus knowledge/train.json werden beim Build in die Binaerdatei eingebettet:
let kb = from_embedded;
Laufzeit-Verwaltung
let mut kb = new;
kb.add_example;
kb.save?;
let kb = load?;
kb.merge;
Datenformat
Zusammenfassung: Die Wissensdatenbank speichert alles, was die KI weiss. Daten koennen zur Kompilierzeit eingebettet oder dynamisch zur Laufzeit verwaltet werden. Eintraege haben Gewichte (Wichtigkeit) und optionale Metadaten.
7. PDF-Verarbeitung — Aus Dokumenten lernen
AIRust kann automatisch Wissen aus PDF-Dokumenten extrahieren. Es teilt Text in intelligente Abschnitte und erstellt Trainingsbeispiele.
Kommandozeilen-Tool
# Einfache Konvertierung
# Eigener Ausgabepfad
# Volle Konfiguration
Im Code
let config = PdfLoaderConfig ;
let loader = with_config;
let kb = loader.pdf_to_knowledge_base?;
Mehrere Quellen zusammenfuehren
Zusammenfassung: Fuettere PDFs in AIRust und es erstellt automatisch strukturierte Trainingsdaten. Intelligentes Chunking beachtet Satzgrenzen und erhaelt Kontext durch ueberlappende Segmente.
8. Web-Dashboard — Die Steuerzentrale
Starte den Server mit cargo run und oeffne http://localhost:7070.
Dashboard-Tabs
| Tab | Was es tut |
|---|---|
| Chat | Mit dem KI-Agenten sprechen, Konfidenz-Scores sehen |
| Training | Trainingsdaten mit Kategorien verwalten, JSON importieren/exportieren |
| Knowledge | Wissensdatenbank durchsuchen, hinzufuegen, loeschen |
| Bots | Web-Scraping-Bots erstellen und verwalten |
| Data Review | Vom Bot gesammelte Daten genehmigen/ablehnen |
| Vectors | Vektor-Sammlungen und Embeddings verwalten |
| Files | Projektdateien und SQLite-Datenbank durchsuchen |
| Console | Echtzeit-WebSocket-Log-Viewer mit Shell-Zugriff |
| Settings | Theme (dunkel/hell), Sprache (EN/DE/TR), Akzentfarben |
Smart Settings via Chat
You can change settings by simply typing a request into the chat:
- "Make the page dark": switches to the dark theme
- "Change to a green background": changes the accent color
- "Switch to German": changes the UI language
Console: Real-Time Server Terminal
The console panel sits at the bottom of the dashboard and works as a live terminal connected to your AIRust server over WebSocket (/ws/console).
What it can do:
- Streams every server log entry (requests, errors, agent activity) to the browser in real time
- Accepts commands typed directly into it, both built-in commands and arbitrary shell commands
- Color-coded output: info (blue), warn (yellow), error (red), cmd (green), stdout/stderr
Built-in commands:
| Command | What it does |
|---|---|
| help or ? | Shows the list of available commands |
| clear | Clears all console output |
| status | Shows the AIRust version, server status, and working directory |
| stop | Shuts the server down gracefully |
| restart | Restarts the server process |
| anything else | Executed as a shell command (e.g. ls, df -h, cat knowledge/train.json) |
UI features:
| Feature | How it works |
|---|---|
| Resize | Drag the console header up or down |
| Minimize/Expand | Click the toggle button in the header |
| Command history | Use the arrow keys (up/down) to step through previous commands |
| Auto-reconnect | After a WebSocket disconnect, the client reconnects automatically every 2 seconds |
| Connection indicator | Green dot = connected, red dot = disconnected |
Technical details:
- The server keeps a ring buffer of 500 log entries, so a newly connected client receives the full buffered history
- The client caps the output at 1000 DOM nodes to keep the browser fast
- WebSocket broadcast with fan-out: multiple browser tabs receive logs simultaneously
- Fuzzy command matching: on a typo (e.g. staus instead of status), the correct command is suggested
- Shell commands run asynchronously via tokio::spawn, so long-running commands do not block the server
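The ring-buffer behavior from the first bullet can be sketched with the standard library's VecDeque. This is an illustration under stated assumptions, not the server's actual code: `LogRing` and its methods are hypothetical names, and log entries are modeled as plain strings.

```rust
use std::collections::VecDeque;

/// Fixed-capacity log history: once full, the oldest entry is dropped.
struct LogRing {
    capacity: usize,
    entries: VecDeque<String>,
}

impl LogRing {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            entries: VecDeque::with_capacity(capacity),
        }
    }

    /// Appends a log line, evicting the oldest one when the buffer is full.
    fn push(&mut self, line: String) {
        if self.capacity == 0 {
            return; // a zero-sized buffer stores nothing
        }
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back(line);
    }

    /// Snapshot that could be replayed to a newly connected client, oldest first.
    fn history(&self) -> Vec<String> {
        self.entries.iter().cloned().collect()
    }
}
```

With a capacity of 500, memory use stays bounded no matter how long the server runs, while a browser tab that connects late still sees recent activity.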
┌────────────────────────────────────────────────────┐
│ Console [─] [drag] │
├────────────────────────────────────────────────────┤
│ 12:34:01 [info] Server started on port 7070 │
│ 12:34:05 [info] POST /api/query → 200 (12ms) │
│ 12:35:10 [cmd] $ status │
│ 12:35:10 [info] AIRust v0.1.7 │
│ 12:35:10 [info] Server: running │
│ 12:36:00 [cmd] $ ls knowledge/ │
│ 12:36:00 [stdout] train.json │
├────────────────────────────────────────────────────┤
│ $ _ │
└────────────────────────────────────────────────────┘
Summary: The web dashboard is a complete control center for your AI: chat, manage training data, control bots, review collected data, browse files, and run server commands through the built-in live console, all in the browser.
9. Bot Ecosystem: Automated Data Collection
AIRust includes a built-in web scraping system for collecting training data automatically.
Workflow
1. Create a bot → define the URL and configuration
2. Start a bot run → the scraper collects content
3. Review raw data → approve or reject entries
4. Convert to KB → add approved data to the training set
5. Retrain the agent → the agent learns the new knowledge
Features
- Web crawling: configurable depth and URL patterns
- Deduplication: content hashing prevents duplicates
- Manual review: an approval workflow safeguards data quality
- Run history: every bot execution is tracked with statistics
- Scheduling: automatic periodic execution
Summary: Bots crawl websites and collect text data automatically. A manual review step safeguards quality before the data enters the knowledge base.
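Deduplication by content hashing can be sketched as follows. This is a hedged illustration, not the bot system's implementation: `content_hash` and `Deduplicator` are hypothetical names, and the normalization step (lowercasing, collapsing whitespace) is an assumption about what should count as "the same content".

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hashes the text after collapsing whitespace and lowercasing, so trivially
/// different copies of the same page produce the same hash on purpose.
fn content_hash(text: &str) -> u64 {
    let normalized = text
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_lowercase();
    let mut hasher = DefaultHasher::new();
    normalized.hash(&mut hasher);
    hasher.finish()
}

/// Remembers the hashes of all content seen so far.
struct Deduplicator {
    seen: HashSet<u64>,
}

impl Deduplicator {
    fn new() -> Self {
        Self { seen: HashSet::new() }
    }

    /// Returns true if the content is new, false if it was already seen.
    fn insert(&mut self, text: &str) -> bool {
        self.seen.insert(content_hash(text))
    }
}
```

DefaultHasher is convenient for an in-memory sketch, but its output is not guaranteed to be stable across Rust releases. Hashes that are persisted in a database would need a hash function with a fixed specification.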
10. CLI: Command Line Interface
Query Modes
# Exact matching
# Fuzzy matching (tolerates typos)
# Semantic search (best for natural language)
Interactive Mode
Opens an interactive session with agent selection and real-time responses.
Knowledge Base Management
Summary: The CLI gives quick access to all agent types, an interactive chat mode, knowledge management, and PDF conversion, which makes it well suited to testing without the web server.
11. Using AIRust as a Library
Basic Example
use ;
With TF-IDF and Context
use *;
Summary: As a library, AIRust gives you full programmatic control. Create agents, load knowledge, train, and query, all in a few lines of Rust code.
12. Text Processing Utilities
use text_utils;
// Tokenization
let tokens = tokenize;
// Stop word removal (German supported)
let gefiltert = remove_stopwords;
// Removes: der, die, das, und, in, ist, von, mit, zu, ...
// String similarity
let dist = levenshtein_distance;
let sim = jaccard_similarity;
// N-grams
let bigramme = create_ngrams;
// Unicode normalization
let normalisiert = normalize_text;
Summary: The built-in text tools handle tokenization, stop word removal (EN/DE), similarity metrics, n-grams, and Unicode normalization without external NLP libraries.
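To make the similarity metrics concrete, here is a minimal sketch of Jaccard similarity over word sets. It is not the crate's jaccard_similarity: the function name `jaccard_sketch`, the whitespace tokenization, and the lowercasing are assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Jaccard similarity of the word sets of `a` and `b`:
/// |intersection| / |union|, a value between 0.0 and 1.0.
fn jaccard_sketch(a: &str, b: &str) -> f64 {
    let set_a: HashSet<String> = a.split_whitespace().map(|w| w.to_lowercase()).collect();
    let set_b: HashSet<String> = b.split_whitespace().map(|w| w.to_lowercase()).collect();
    // Two empty inputs are treated as identical to avoid dividing by zero.
    if set_a.is_empty() && set_b.is_empty() {
        return 1.0;
    }
    let intersection = set_a.intersection(&set_b).count();
    let union = set_a.union(&set_b).count();
    intersection as f64 / union as f64
}
```

For "rust is fast" and "rust is safe", two of the four distinct words are shared, so the similarity is 0.5. Unlike Levenshtein distance, the measure ignores word order and character-level typos.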
13. Docker Deployment
# With a persistent database
Summary: Docker provides a clean deployment path: a multi-stage build keeps the image small. Mount a volume to persist the database.
14. API Reference
See the English section for the complete API reference. All endpoints are identical: more than 50 REST endpoints for agents, knowledge, training, bots, chats, vectors, files, and settings.
Summary: More than 50 REST API endpoints give you full control over the entire system, plus a WebSocket endpoint for real-time logging.
15. Configuration & Feature Flags
| Flag | Default | Description |
|---|---|---|
| colors | Yes | Colored terminal output |
| web | Yes | Web server + SQLite + bot system |
| bots | Yes (via web) | Web scraping |
| async | Yes (via web) | Async runtime (tokio) |
| plotting | No | Data visualization |
# Minimal (library only)
airust = { version = "0.1.7", default-features = false }
# Everything plus plotting
airust = { version = "0.1.7", features = ["plotting"] }
Summary: Feature flags control what gets compiled, from a minimal library to the full web platform.
16. Training Data Format
Identical to the English section. Supports Text, Markdown, and Json as output formats. Weights and metadata are optional. Legacy formats are converted automatically.
Summary: Training data is stored as JSON arrays. Each entry has an input (question), an output (answer), a weight, and optional metadata.
17. Project Structure
See the English section for the full directory tree.
Summary: The project is cleanly organized: core AI logic in src/, the web server in src/web/, CLI tools in src/bin/, and training data in knowledge/.
18. Use Cases & Ideas
- FAQ bot: train with frequently asked questions, deploy as a web widget
- Document search: load PDFs and build a searchable knowledge base
- Customer support: a context-aware agent remembers the conversation
- Internal wiki: scrape the company wiki automatically and build knowledge from it
- Developer documentation: load API docs as PDFs
- Learning tool: students ask questions about course material
- IoT assistant: minimal binary that runs on embedded systems
- Privacy-first AI: no cloud, no data leaves your network
Summary: AIRust is flexible enough for FAQ bots, document search, customer support, education, IoT, and any scenario where you need a private, trainable AI without cloud dependencies.
19. Version History
| Version | What's new |
|---|---|
| 0.1.7 | Real-time WebSocket console, shell command execution, drag-to-resize UI |
| 0.1.6 | Improved PDF processing, web dashboard |
| 0.1.5 | ContextAgent, ResponseFormat, extended matching, TF-IDF |
| 0.1.4 | TF-IDF/BM25 agent |
| 0.1.3 | English language support |
| 0.1.2 | Initial release |
20. License
MIT: free for private and commercial use.
Author: LEVOGNE
Turkce
Table of Contents
- What is AIRust?
- Features
- Installation
- Architecture Overview
- Agent Types: The Brain of AIRust
- Knowledge Base: The Memory
- PDF Processing: Learn from Documents
- Web Dashboard: The Control Center
- Bot System: Automated Data Collection
- CLI: Command Line
- Using AIRust as a Library
- Text Processing
- Deployment with Docker
- API Reference
- Configuration & Feature Flags
- Training Data Format
- Project Structure
- Use Cases
- Version History
- License
1. What is AIRust?
AIRust is a self-contained AI engine written entirely in Rust. Unlike cloud-based AI solutions, AIRust runs 100% locally: no OpenAI, no API keys, no internet connection required. You train it with your own data, and it answers questions using pattern matching, fuzzy search, and semantic similarity algorithms.
Think of it as your own private AI assistant that you train yourself.
It comes with:
- Multiple intelligent agent types (exact matching, fuzzy matching, semantic search)
- A built-in web dashboard with a chat interface
- PDF document processing for automatic knowledge extraction
- Web scraping bots for automated data collection
- A SQLite database for persistent storage
- A full REST API for integration with other systems
Summary: AIRust is a local, trainable AI engine written in Rust. You feed it knowledge (text, PDFs, web scraping) and it answers questions: no cloud, no API keys, fully private.
2. Features
| Feature | Description |
|---|---|
| 4 Agent Types | Exact Match, Fuzzy Match, TF-IDF/BM25 Semantic, Context-Aware |
| Knowledge Base | JSON-based, compile-time embedded, runtime expandable |
| PDF Processing | Convert PDFs into structured training data |
| Web Dashboard | Chat, training manager, bot control, file browser |
| Bot System | Automated web scraping with a review workflow |
| Vector Database | Embedding storage and similarity search |
| Chat History | Persistent conversations with archiving |
| Multilingual UI | English, German, Turkish |
| REST API | More than 50 endpoints for full programmatic control |
| WebSocket Console | Live terminal with server logs, shell access, and built-in commands |
| Docker Support | One-command deployment |
| CLI Tools | Interactive mode, querying, PDF conversion |
Summary: AIRust provides everything you need to build, train, and deploy an AI system, from agents and knowledge management to a full web interface and automated data collection.
3. Installation
As a Rust Library
Add to your Cargo.toml:
[dependencies]
airust = "0.1.7"
Building from Source
Starting the Web Server
# Default port 7070
# Custom port
# Run in the background
# Show landing page + dashboard (default: dashboard only)
# Stop the background server
Then open http://localhost:7070 in your browser.
With Docker
Summary: You can use AIRust as a library in your own Rust projects, run it as a standalone web server, or deploy it in Docker. The web dashboard is available on port 7070 by default.
4. Architecture Overview
┌──────────────────────────────────────┐
│ AIRust Engine │
├──────────┬───────────┬───────────────┤
│MatchAgent│TfidfAgent │ ContextAgent │
│(exact/ │(BM25 │ (wraps any │
│ fuzzy) │ semantic) │ agent) │
├──────────┴───────────┴───────────────┤
│ Knowledge Base │
│ (JSON / Embedded / Runtime) │
├──────────────────────────────────────┤
│ Text Processing │
│ (tokenization, stop words, etc.) │
└────────┬──────────┬──────────────────┘
│ │
┌────────▼──┐ ┌────▼──────────────┐
│ CLI Tool │ │ Web Server (Axum) │
│ (airust) │ │ + REST API │
└───────────┘ │ + WebSocket │
│ + SQLite DB │
│ + Bot Scheduler │
└───────────────────┘
Core traits: every agent implements these interfaces:
| Trait | Purpose |
|---|---|
| Agent | Base trait: predict(), confidence(), can_answer() |
| TrainableAgent | Adds train(), add_example() |
| ContextualAgent | Adds add_context(), clear_context() for conversation memory |
| ConfidenceAgent | Adds calculate_confidence(), predict_top_n() |
Summary: AIRust has a layered architecture: agents do the reasoning, the knowledge base stores the data, and the web server provides the interface. Everything communicates through clean Rust traits.
5. Agent Types: The Brain of AIRust
5.1 MatchAgent (Exact & Fuzzy)
The simplest and fastest agent. It compares your question directly against the training data.
Exact mode: answers only when the question matches exactly (case-insensitive):
let agent = new_exact;
Fuzzy mode: tolerates typos using Levenshtein distance:
let agent = new_fuzzy;
// With a custom tolerance
let agent = new;
When to use: FAQ bots, command recognition, structured Q&A systems.
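The fuzzy mode rests on Levenshtein distance, the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below shows the standard dynamic-programming formulation. It is not MatchAgent's code: `levenshtein` and `fuzzy_match` are hypothetical names, and `max_distance` stands in for whatever tolerance setting the agent actually exposes.

```rust
/// Levenshtein distance between `a` and `b`, computed row by row.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let value = (prev[j] + cost) // substitution (or match)
                .min(prev[j + 1] + 1)    // deletion from `a`
                .min(curr[j] + 1);       // insertion into `a`
            curr.push(value);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Case-insensitive match that tolerates up to `max_distance` edits.
fn fuzzy_match(query: &str, candidate: &str, max_distance: usize) -> bool {
    levenshtein(&query.to_lowercase(), &candidate.to_lowercase()) <= max_distance
}
```

A small tolerance such as 1 or 2 catches common typos while keeping unrelated questions apart. Larger values trade precision for recall.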
5.2 TfidfAgent (Semantic Search with BM25)
Uses the BM25 algorithm (the same ranking function used by search engines such as Elasticsearch) to find the most relevant answers based on term frequency and document importance.
let agent = new;
// Fine-tuning the algorithm
let agent = new
.with_bm25_params; // k1 = term scaling, b = length normalization
When to use: document search, knowledge bases with natural-language questions.
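To show what k1 and b control, here is a compact sketch of BM25 scoring. It follows the widely used formulation with a non-negative IDF, which may differ in detail from what TfidfAgent implements. The `tokenize` helper and `bm25_scores` function are hypothetical names introduced for this example.

```rust
use std::collections::HashMap;

/// Lowercases the text and splits it on non-alphanumeric characters.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Returns one BM25 score per document for the given query.
/// `k1` controls how quickly repeated terms saturate;
/// `b` controls how strongly long documents are penalized.
fn bm25_scores(query: &str, docs: &[&str], k1: f64, b: f64) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    if tokenized.is_empty() {
        return Vec::new();
    }
    let n_docs = tokenized.len() as f64;
    let avg_len = tokenized.iter().map(|d| d.len()).sum::<usize>() as f64 / n_docs;

    // Document frequency: in how many documents does each term occur?
    let mut df: HashMap<&str, usize> = HashMap::new();
    for doc in &tokenized {
        let mut unique: Vec<&str> = doc.iter().map(String::as_str).collect();
        unique.sort_unstable();
        unique.dedup();
        for term in unique {
            *df.entry(term).or_insert(0) += 1;
        }
    }

    let query_terms = tokenize(query);
    tokenized
        .iter()
        .map(|doc| {
            let len_norm = 1.0 - b + b * doc.len() as f64 / avg_len;
            query_terms
                .iter()
                .map(|term| {
                    let tf = doc.iter().filter(|t| *t == term).count() as f64;
                    if tf == 0.0 {
                        return 0.0;
                    }
                    let n_t = *df.get(term.as_str()).unwrap_or(&0) as f64;
                    let idf = (1.0 + (n_docs - n_t + 0.5) / (n_t + 0.5)).ln();
                    idf * tf * (k1 + 1.0) / (tf + k1 * len_norm)
                })
                .sum::<f64>()
        })
        .collect()
}
```

Values around k1 = 1.2 and b = 0.75 are common starting points in search engines; whether AIRust uses the same defaults is not stated here. Rare query terms contribute more than frequent ones because their IDF is higher.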
5.3 ContextAgent (Conversation Memory)
Wraps any agent and adds conversation memory. It remembers the last N exchanges so that follow-up questions work naturally.
let base = new;
let agent = new // remember 5 turns
.with_context_format;
Context formats:
| Format | Example |
|---|---|
| QAPairs | Q: What is Rust? A: A programming language. Q: ... |
| List | [What is Rust? -> A programming language, ...] |
| Sentence | Previous questions: What is Rust? - A programming language; ... |
| Custom | Your own formatting function |
When to use: chatbots, interactive assistants, follow-up questions.
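The memory mechanism can be pictured as a bounded queue of question/answer pairs that is rendered in front of each new question. The sketch below illustrates the QAPairs style only. `ContextMemory`, `add_turn`, and `build_prompt` are hypothetical names and do not reflect ContextAgent's real interface.

```rust
use std::collections::VecDeque;

/// Keeps the most recent `max_turns` question/answer pairs.
struct ContextMemory {
    max_turns: usize,
    turns: VecDeque<(String, String)>,
}

impl ContextMemory {
    fn new(max_turns: usize) -> Self {
        Self {
            max_turns,
            turns: VecDeque::new(),
        }
    }

    /// Records one exchange, dropping the oldest when the limit is reached.
    fn add_turn(&mut self, question: &str, answer: &str) {
        if self.max_turns == 0 {
            return; // memory disabled
        }
        if self.turns.len() == self.max_turns {
            self.turns.pop_front();
        }
        self.turns.push_back((question.to_string(), answer.to_string()));
    }

    /// Renders the remembered turns as "Q: ... A: ..." pairs,
    /// followed by the new question.
    fn build_prompt(&self, question: &str) -> String {
        let mut parts: Vec<String> = self
            .turns
            .iter()
            .map(|(q, a)| format!("Q: {} A: {}", q, a))
            .collect();
        parts.push(format!("Q: {}", question));
        parts.join(" ")
    }
}
```

The wrapped agent then sees the follow-up question together with its recent history, which is what lets a query like "Why?" resolve against the previous answer.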
Summary: AIRust offers three agent types: MatchAgent for fast exact or fuzzy matching, TfidfAgent for semantic search, and ContextAgent for conversation memory. Pick the one that fits your use case, or combine them.
6. Knowledge Base: The Memory
The knowledge base is where all training data lives. It supports two modes:
Compile-Time Embedding
Data from knowledge/train.json is embedded into the binary at build time:
let kb = from_embedded;
Runtime Management
let mut kb = new;
kb.add_example;
kb.save?;
let kb = load?;
kb.merge;
Summary: The knowledge base stores everything the AI knows. Data can be embedded at compile time or managed dynamically at runtime. Entries carry a weight and optional metadata.
7. PDF Processing: Learn from Documents
AIRust can automatically extract knowledge from PDF documents.
Command-Line Tool
# Basic conversion
# Custom output path
# Full configuration
In Code
let config = PdfLoaderConfig ;
let loader = with_config;
let kb = loader.pdf_to_knowledge_base?;
Merging Multiple Sources
Summary: Feed PDFs into AIRust and it automatically creates structured training data. Smart chunking respects sentence boundaries and preserves context through overlapping segments.
8. Web Dashboard: The Control Center
Start the server with cargo run and open http://localhost:7070.
Dashboard Tabs
| Tab | What it does |
|---|---|
| Chat | Talk to your AI agent and see confidence scores |
| Training | Manage training data by category, import/export JSON |
| Knowledge | Search, add, and delete knowledge base entries |
| Bots | Create and manage web scraping bots |
| Data Review | Approve or reject data collected by bots |
| Vectors | Manage vector collections and embeddings |
| Files | Browse project files and the SQLite database |
| Console | Real-time WebSocket log viewer with shell access |
| Settings | Theme (dark/light), language (EN/DE/TR), accent colors |
Smart Settings via Chat
You can change settings by typing a request in natural language:
- "Make the page dark": switches to the dark theme
- "Switch to Turkish": changes the UI language
Console: Real-Time Server Terminal
The console panel sits at the bottom of the dashboard and works as a live terminal connected to your AIRust server over WebSocket (/ws/console).
What it does:
- Streams every server log entry (requests, errors, agent activity) to your browser in real time
- Accepts commands typed directly into it, both built-in commands and arbitrary shell commands
- Color-coded output: info (blue), warn (yellow), error (red), cmd (green), stdout/stderr
Built-in commands:
| Command | What it does |
|---|---|
| help or ? | Shows the list of available commands |
| clear | Clears all console output |
| status | Shows the AIRust version, server status, and working directory |
| stop | Shuts the server down gracefully |
| restart | Restarts the server process |
| anything else | Executed as a shell command (e.g. ls, df -h, cat knowledge/train.json) |
UI features:
| Feature | How it works |
|---|---|
| Resize | Drag the console header bar up or down |
| Minimize/Expand | Click the toggle button in the header |
| Command history | Use the arrow keys (up/down) to step through previous commands |
| Auto-reconnect | If the WebSocket connection drops, the client reconnects automatically every 2 seconds |
| Connection indicator | Green dot = connected, red dot = disconnected |
Technical details:
- The server keeps a ring buffer of 500 log entries, so you receive the full buffered history when you connect
- The client is capped at 1000 DOM nodes to keep the browser fast
- WebSocket broadcast: multiple browser tabs receive logs simultaneously
- Fuzzy command matching: if you mistype a built-in command (e.g. staus instead of status), the correct command is suggested
- Shell commands run asynchronously via tokio::spawn, so long-running commands do not block the server
┌────────────────────────────────────────────────────┐
│ Console [─] [drag] │
├────────────────────────────────────────────────────┤
│ 12:34:01 [info] Server started on port 7070 │
│ 12:34:05 [info] POST /api/query → 200 (12ms) │
│ 12:35:10 [cmd] $ status │
│ 12:35:10 [info] AIRust v0.1.7 │
│ 12:35:10 [info] Server: running │
│ 12:36:00 [cmd] $ ls knowledge/ │
│ 12:36:00 [stdout] train.json │
├────────────────────────────────────────────────────┤
│ $ _ │
└────────────────────────────────────────────────────┘
Summary: The web dashboard is a complete control center for your AI: chat, manage training data, control bots, review collected data, browse files, and run server commands through the built-in live console, all in the browser.
9. Bot System: Automated Data Collection
AIRust includes a built-in web scraping system for collecting training data from websites automatically.
Workflow
1. Create a bot → define the URL and configuration
2. Run the bot → the scraper collects content
3. Review raw data → approve or reject entries
4. Convert to KB → add approved data to the training set
5. Retrain the agent → the agent learns the new knowledge
Features
- Web crawling: configurable depth and URL patterns
- Deduplication: content hashing prevents duplicates
- Manual review: an approval workflow safeguards data quality
- Run history: every bot run is tracked with statistics
- Scheduling: automatic periodic execution
Summary: Bots crawl websites and collect text data automatically. The manual review step safeguards quality before the data enters your knowledge base.
10. CLI: Command Line
Query Modes
# Exact matching
# Fuzzy matching (tolerates typos)
# Semantic search (best for natural language)
Interactive Mode
Opens an interactive session with agent selection and real-time responses.
Knowledge Base Management
Summary: The CLI gives quick access to all agent types, the interactive chat mode, knowledge management, and PDF conversion without starting the web server.
11. Using AIRust as a Library
Basic Example
use ;
With TF-IDF and Context
use *;
Summary: As a library, AIRust gives you full programmatic control. Create agents, load knowledge, train, and query, all in a few lines of Rust code.
12. Text Processing
use text_utils;
// Tokenization
let tokenlar = tokenize;
// Stop word removal (English and German supported)
let filtrelenmis = remove_stopwords;
// String similarity
let mesafe = levenshtein_distance;
let benzerlik = jaccard_similarity;
// N-grams
let bigramlar = create_ngrams;
// Unicode normalization
let normallesmis = normalize_text;
Summary: The built-in text tools handle tokenization, stop word removal (EN/DE), similarity metrics, n-grams, and Unicode normalization without external NLP libraries.
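As an illustration of the n-gram utility, the sketch below builds word-level n-grams with a sliding window. Whether the crate's create_ngrams works on words or characters is not stated here, so the word-level behavior and the name `word_ngrams` are assumptions.

```rust
/// Returns all runs of `n` consecutive words in `text`, joined by single spaces.
/// Yields an empty vector when `n` is zero or the text has fewer than `n` words.
fn word_ngrams(text: &str, n: usize) -> Vec<String> {
    let words: Vec<&str> = text.split_whitespace().collect();
    if n == 0 || words.len() < n {
        return Vec::new();
    }
    words.windows(n).map(|window| window.join(" ")).collect()
}
```

Bigrams (n = 2) capture short phrases such as "knowledge base" that single tokens would split apart, which can make matching less sensitive to word order.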
13. Deployment with Docker
# With a persistent database
Summary: Docker provides a clean deployment path: a multi-stage build keeps the image small. Mount a volume to persist the database.
14. API Reference
See the English section for the complete API reference. All endpoints are identical: more than 50 REST endpoints for agents, knowledge, training, bots, chats, vectors, files, and settings.
Summary: More than 50 REST API endpoints give you full control over the entire system, plus a WebSocket endpoint for real-time logging.
15. Configuration & Feature Flags
| Flag | Default | Description |
|---|---|---|
| colors | Yes | Colored terminal output |
| web | Yes | Web server + SQLite + bot system |
| bots | Yes (via web) | Web scraping |
| async | Yes (via web) | Async runtime (tokio) |
| plotting | No | Data visualization |
# Minimal (library only)
airust = { version = "0.1.7", default-features = false }
# Everything plus plotting
airust = { version = "0.1.7", features = ["plotting"] }
Summary: Feature flags control what gets compiled, from a minimal library to the full web platform.
16. Training Data Format
Identical to the English section. Supports Text, Markdown, and Json as output formats. Weights and metadata are optional. Legacy formats are converted automatically.
Summary: Training data is stored as JSON arrays. Each entry has a question, an answer, a weight, and optional metadata.
17. Project Structure
See the English section for the full directory tree.
Summary: The project is cleanly organized: core AI logic in src/, the web server in src/web/, CLI tools in src/bin/, and training data in knowledge/.
18. Use Cases
- FAQ bot: train with frequently asked questions, deploy as a web widget
- Document search: load PDFs and build a searchable knowledge base
- Customer support: a context-aware agent remembers the conversation
- Internal wiki bot: scrape the company wiki automatically and build knowledge from it
- Developer documentation assistant: load API docs as PDFs
- Education tool: students ask questions about course material
- IoT device assistant: minimal binary that runs on embedded systems
- Privacy-first AI: no cloud, no data leaves your network
Summary: AIRust is flexible enough for FAQ bots, document search, customer support, education, IoT, and any scenario where you need a private, trainable AI without cloud dependencies.
19. Version History
| Version | What's new |
|---|---|
| 0.1.7 | Real-time WebSocket console, shell command execution, drag-to-resize UI |
| 0.1.6 | PDF processing improvements, web dashboard |
| 0.1.5 | ContextAgent, ResponseFormat, advanced matching, TF-IDF |
| 0.1.4 | TF-IDF/BM25 agent |
| 0.1.3 | English language support |
| 0.1.2 | Initial release |
20. License
MIT: free for personal and commercial use.
Author: LEVOGNE
Repository: github.com/LEVOGNE/airust
Documentation: docs.rs/airust