
Offline Intelligence Library

A high-performance library for offline AI inference with context management, memory optimization, and multi-format model support.

Overview

Offline Intelligence Library gives developers the tools to run large language models locally, with no internet connection required. It provides intelligent context management, memory optimization, and hybrid semantic-plus-keyword search over stored conversations, with native bindings for multiple programming languages.

Features

  • Offline AI Inference: Run LLMs locally without internet connection
  • Context Management: Intelligent conversation context optimization
  • Memory Search: Hybrid semantic and keyword search across conversations (see the sketch after this list)
  • Multi-format Support: Support for GGUF, GGML, ONNX, TensorRT, and Safetensors models
  • Cross-platform: Works on Windows, macOS, and Linux
  • Multi-language: Native bindings for Python, Java, JavaScript/Node.js, and C++
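
The memory search API itself is not documented in this README, so the following sketch is illustrative only: the search_memory method, its signature, and the shape of its results are hypothetical, inferred from the feature description above.

// Hypothetical sketch of hybrid memory search; not a documented API.
use offline_intelligence::OfflineIntelligence; // crate path assumed

fn main() {
    let oi = OfflineIntelligence::new();
    // Hypothetical call: blend semantic similarity with keyword matching
    // over stored conversation sessions and return the top five matches.
    let hits = oi.search_memory("session123", "what did we decide about pricing?", 5);
    for hit in hits {
        println!("{hit}");
    }
}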

Supported Languages

  • Python (PyO3 bindings)
  • Java (JNI bindings)
  • JavaScript/Node.js (N-API bindings)
  • C++ (C FFI bindings)
  • Rust (Native library)

Quick Start

Prerequisites

  • Rust toolchain (latest stable)
  • For language-specific bindings, see the usage sections below

Building

Using Build Scripts

# Linux/macOS
./build.sh

# Windows
build.bat

# Using Make (Linux/macOS)
make all

Manual Build

# Build core library
cargo build --release

# Build specific language bindings (each in a subshell, so the
# working directory resets between commands)
(cd crates/python-bindings && cargo build --release)
(cd crates/java-bindings && cargo build --release)
(cd crates/js-bindings && cargo build --release)
(cd crates/cpp-bindings && cargo build --release)

If the crates share a single Cargo workspace, cargo build --release --workspace builds them all in one pass.

Language-Specific Usage

Python

from offline_intelligence_py import OfflineIntelligence, Message

oi = OfflineIntelligence()
messages = [Message("user", "Hello!"), Message("assistant", "Hi there!")]
result = oi.optimize_context("session123", messages, "Hello")

Java

import com.offlineintelligence.*;

public class QuickStart {
    public static void main(String[] args) {
        OfflineIntelligence oi = new OfflineIntelligence();
        Message[] messages = {
            new Message("user", "Hello!"),
            new Message("assistant", "Hi there!")
        };
        OptimizationResult result = oi.optimizeContext("session123", messages, "Hello");
    }
}

JavaScript/Node.js

const { OfflineIntelligence, Message } = require('offline-intelligence');

const oi = new OfflineIntelligence();
const messages = [
    new Message('user', 'Hello!'),
    new Message('assistant', 'Hi there!')
];

// optimizeContext is async, so await it inside an async context
// (top-level await is not available in CommonJS modules)
(async () => {
    const result = await oi.optimizeContext('session123', messages, 'Hello');
    console.log(result);
})();

C++

#include <vector>

#include "offline_intelligence_cpp.h"

using namespace offline_intelligence;

int main() {
    OfflineIntelligence oi;
    std::vector<Message> messages = {
        Message("user", "Hello!"),
        Message("assistant", "Hi there!")
    };
    auto result = oi.optimize_context("session123", messages, "Hello");
    return 0;
}
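
Rust

Rust is listed under Supported Languages but has no example of its own. This is a minimal sketch assuming the core crate mirrors the bindings' API; the crate path, constructor, and method signatures are assumptions carried over from the examples above.

use offline_intelligence::{Message, OfflineIntelligence};

fn main() {
    let oi = OfflineIntelligence::new();
    let messages = vec![
        Message::new("user", "Hello!"),
        Message::new("assistant", "Hi there!"),
    ];
    // Same (session_id, messages, query) argument order as in the bindings above.
    let _result = oi.optimize_context("session123", &messages, "Hello");
}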

Configuration

Set environment variables before using the library:

# Path to the llama inference binary
export LLAMA_BIN="/path/to/llama-binary"

# Model file to load (GGUF shown; see supported formats above)
export MODEL_PATH="/path/to/model.gguf"

# Context window size, in tokens
export CTX_SIZE="8192"

# Batch size for prompt processing
export BATCH_SIZE="256"

# Number of CPU threads to use
export THREADS="6"

# Number of model layers to offload to the GPU
export GPU_LAYERS="20"
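
Because the configuration is read from the environment, a missing variable may not surface until inference time. A small preflight check can catch it earlier; a minimal sketch in Rust using only the standard library:

use std::env;

fn main() {
    // The variables the library reads, per the Configuration section above.
    let required = ["LLAMA_BIN", "MODEL_PATH", "CTX_SIZE", "BATCH_SIZE", "THREADS", "GPU_LAYERS"];
    for key in required {
        match env::var(key) {
            Ok(value) => println!("{key}={value}"),
            Err(_) => eprintln!("warning: {key} is not set"),
        }
    }
}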


Development

Running Tests

# Run all tests
cargo test --release

# Run tests for a specific crate
cd crates/offline-intelligence && cargo test

Code Formatting

# Format all code
cargo fmt --all

# Check formatting
cargo fmt --all -- --check

Linting

# Run clippy
cargo clippy --all-targets --all-features -- -D warnings

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowledgments

  • Built with Rust for performance and reliability
  • Uses various ML frameworks for model support
  • Inspired by the need for offline AI capabilities