
Offline Intelligence Library

A high-performance library for offline AI inference with context management, memory optimization, and multi-format model support.

Overview

Offline Intelligence Library gives developers the tools to run large language models locally, with no internet connection required. It provides intelligent context management, memory optimization, and hybrid semantic-plus-keyword search over stored conversations, with native bindings for multiple programming languages.

Features

  • Offline AI Inference: Run LLMs locally without internet connection
  • Context Management: Intelligent conversation context optimization
  • Memory Search: Hybrid semantic and keyword search across conversations (see the sketch after this list)
  • Multi-format Support: Support for GGUF, GGML, ONNX, TensorRT, and Safetensors models
  • Cross-platform: Works on Windows, macOS, and Linux
  • Multi-language: Native bindings for Python, Java, JavaScript/Node.js, and C++
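
The memory search API itself is not documented in this README, so the following sketch is illustrative only: the search_memory method, its signature, and the shape of its results are hypothetical, inferred from the feature description above.

// Hypothetical sketch of hybrid memory search; not a documented API.
use offline_intelligence::OfflineIntelligence; // crate path assumed

fn main() {
    let oi = OfflineIntelligence::new();
    // Hypothetical call: blend semantic similarity with keyword matching
    // over stored conversation sessions and return the top five matches.
    let hits = oi.search_memory("session123", "what did we decide about pricing?", 5);
    for hit in hits {
        println!("{hit}");
    }
}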

Supported Languages

  • Python (PyO3 bindings)
  • Java (JNI bindings)
  • JavaScript/Node.js (N-API bindings)
  • C++ (C FFI bindings)
  • Rust (Native library)

Quick Start

Prerequisites

  • Rust toolchain (latest stable)
  • For language-specific bindings, see the usage sections below

Building

Using Build Scripts

# Linux/macOS
./build.sh

# Windows
build.bat

# Using Make (Linux/macOS)
make all

Manual Build

# Build core library
cargo build --release

# Build specific language bindings (each in a subshell, so the
# working directory resets between commands)
(cd crates/python-bindings && cargo build --release)
(cd crates/java-bindings && cargo build --release)
(cd crates/js-bindings && cargo build --release)
(cd crates/cpp-bindings && cargo build --release)

If the crates share a single Cargo workspace, cargo build --release --workspace builds them all in one pass.

Language-Specific Usage

Python

from offline_intelligence_py import OfflineIntelligence, Message

oi = OfflineIntelligence()
messages = [Message("user", "Hello!"), Message("assistant", "Hi there!")]
result = oi.optimize_context("session123", messages, "Hello")

Java

import com.offlineintelligence.*;

public class QuickStart {
    public static void main(String[] args) {
        OfflineIntelligence oi = new OfflineIntelligence();
        Message[] messages = {
            new Message("user", "Hello!"),
            new Message("assistant", "Hi there!")
        };
        OptimizationResult result = oi.optimizeContext("session123", messages, "Hello");
    }
}

JavaScript/Node.js

const { OfflineIntelligence, Message } = require('offline-intelligence');

const oi = new OfflineIntelligence();
const messages = [
    new Message('user', 'Hello!'),
    new Message('assistant', 'Hi there!')
];

// optimizeContext is async, so await it inside an async context
// (top-level await is not available in CommonJS modules)
(async () => {
    const result = await oi.optimizeContext('session123', messages, 'Hello');
    console.log(result);
})();

C++

#include <vector>

#include "offline_intelligence_cpp.h"

using namespace offline_intelligence;

int main() {
    OfflineIntelligence oi;
    std::vector<Message> messages = {
        Message("user", "Hello!"),
        Message("assistant", "Hi there!")
    };
    auto result = oi.optimize_context("session123", messages, "Hello");
    return 0;
}
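
Rust

Rust is listed under Supported Languages but has no example of its own. This is a minimal sketch assuming the core crate mirrors the bindings' API; the crate path, constructor, and method signatures are assumptions carried over from the examples above.

use offline_intelligence::{Message, OfflineIntelligence};

fn main() {
    let oi = OfflineIntelligence::new();
    let messages = vec![
        Message::new("user", "Hello!"),
        Message::new("assistant", "Hi there!"),
    ];
    // Same (session_id, messages, query) argument order as in the bindings above.
    let _result = oi.optimize_context("session123", &messages, "Hello");
}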

Configuration

Set environment variables before using the library:

# Path to the llama inference binary
export LLAMA_BIN="/path/to/llama-binary"

# Model file to load (GGUF shown; see supported formats above)
export MODEL_PATH="/path/to/model.gguf"

# Context window size, in tokens
export CTX_SIZE="8192"

# Batch size for prompt processing
export BATCH_SIZE="256"

# Number of CPU threads to use
export THREADS="6"

# Number of model layers to offload to the GPU
export GPU_LAYERS="20"
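
Because the configuration is read from the environment, a missing variable may not surface until inference time. A small preflight check can catch it earlier; a minimal sketch in Rust using only the standard library:

use std::env;

fn main() {
    // The variables the library reads, per the Configuration section above.
    let required = ["LLAMA_BIN", "MODEL_PATH", "CTX_SIZE", "BATCH_SIZE", "THREADS", "GPU_LAYERS"];
    for key in required {
        match env::var(key) {
            Ok(value) => println!("{key}={value}"),
            Err(_) => eprintln!("warning: {key} is not set"),
        }
    }
}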


Development

Running Tests

# Run all tests
cargo test --release

# Run tests for a specific crate
cd crates/offline-intelligence && cargo test

Code Formatting

# Format all code
cargo fmt --all

# Check formatting
cargo fmt --all -- --check

Linting

# Run clippy
cargo clippy --all-targets --all-features -- -D warnings

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowledgments

  • Built with Rust for performance and reliability
  • Uses various ML frameworks for model support
  • Inspired by the need for offline AI capabilities