sipp-sys 0.1.0

Native llama.cpp FFI layer for Sipp

[!WARNING] Sipp is under active development. Breaking changes are expected as we optimize the runtime layers. It might not be suitable for mission-critical production environments yet. If you find issues, bugs, or missing features, please open a GitHub issue.

Read the documentation →

Chinese documentation (中文文档) →

What is Sipp?

Sipp is an all-in-one, high-performance AI framework for building web, desktop, and edge applications. It ships as a cohesive SDK with a unified, symmetric API for local, provider, and cloud gateway inference.

At its core is Sipp Engine, a fast runtime built to run anywhere: in the browser, on the desktop, or on bare-metal cloud infrastructure. It is designed for low startup times and a minimal memory footprint.

import { SippClient } from '@sipp/sipp';
const blender = new SippClient();

// 1. Initialize high-speed, local WebGPU or CUDA inference
const juice = await blender.add('edge', { kind: 'local', source: '/models/llama3.gguf' });

// 2. Or connect to a secure cloud proxy using the exact same interface
const ice = await blender.add('cloud', { kind: 'gateway', baseUrl: 'https://gateway.example.com/v1/' });

// Run inference on either endpoint seamlessly with a symmetric API
const [smoothie, snowcone] = await Promise.all([
  blender.chat([{ role: 'user', content: 'Explain Sipp.' }], { endpoint: juice }),
  blender.chat([{ role: 'user', content: 'Create a Sipp app.' }], { endpoint: ice })
]);

The unified SDK lets you dynamically partition and optimize complex application logic between local and cloud compute. Instead of wrestling with fragmented web runtimes, disconnected native wrappers for desktop, or custom middleware to protect API keys, you only need Sipp.
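
One simple way to split work between local and cloud compute is to decide at startup which endpoint configuration to register. The sketch below is only an illustration and not part of the Sipp API: `pickEndpointConfig` and its option names are hypothetical, and only the config shapes (`kind: 'local'` with `source`, `kind: 'gateway'` with `baseUrl`) are taken from the example above.

```typescript
// Hypothetical helper, not part of the Sipp SDK. It returns an endpoint
// config in the shapes used by the example above.
type LocalConfig = { kind: 'local'; source: string };
type GatewayConfig = { kind: 'gateway'; baseUrl: string };
type EndpointConfig = LocalConfig | GatewayConfig;

interface PickOptions {
  hasWebGpu: boolean; // e.g. typeof navigator !== 'undefined' && 'gpu' in navigator
  localModel: string; // path or URL of a GGUF model
  gatewayUrl: string; // base URL of the gateway
}

function pickEndpointConfig(opts: PickOptions): EndpointConfig {
  // Prefer on-device inference when the runtime exposes WebGPU.
  if (opts.hasWebGpu) {
    return { kind: 'local', source: opts.localModel };
  }
  // Otherwise fall back to the gateway.
  return { kind: 'gateway', baseUrl: opts.gatewayUrl };
}
```

The returned object could then be passed as the second argument to `add`, in the same way as the literal configs in the example above.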

It packages a high-performance WebGPU engine and a secure, containerized gateway proxy into a single toolkit. Future releases will focus on embedded vector memory, on-device PII masking, and automated smart routing. See Roadmap.

sipp build wasm                # Compile high-performance WebGPU assets
sipp run demos serve chat      # Launch a local, hardware-accelerated test canvas

Performance Benchmarks

Run them yourself here: benchmark.sipp.sh/benchmark

| Runtime / Framework | TTFT (ms) ↓ | Decode (tok/s) ↑ | E2E Latency (ms) ↓ |
| --- | --- | --- | --- |
| Sipp | 24.3 (Best) | 77.07 (Best) | 6,655 (Best) |
| WebLLM | 160.0 (6.55x) | 25.80 (2.99x) | 19,930 (2.99x) |
| Transformers.js | 301.0 (12.38x) | 33.25 (2.32x) | 15,670 (2.35x) |

Disclaimer & Metric Notes:

  • TTFT (Time to First Token): Measured in milliseconds (ms). Lower is better.
  • Decode: Measured in tokens per second (tok/s). Higher is better.
  • E2E Latency (End-to-End Latency): Measured in milliseconds (ms). Lower is better.
  • Measured on an NVIDIA GeForce RTX 3080 with 1 warm-up run and 3 measured runs. Results are the average of the measured runs.
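
The three metrics above can be derived from per-token timestamps. The sketch below shows one plausible way to compute them and is not taken from the Sipp benchmark harness: the `RunTiming` shape, the function names, and the choice to measure decode throughput over the tokens after the first are all assumptions.

```typescript
// Assumed timing record for one generation run (all times in milliseconds).
interface RunTiming {
  requestStart: number; // when the request was issued
  tokenTimes: number[]; // arrival time of each generated token, ascending
}

interface RunMetrics {
  ttftMs: number;
  decodeTokPerSec: number;
  e2eMs: number;
}

function measure(run: RunTiming): RunMetrics {
  if (run.tokenTimes.length === 0) throw new Error('run produced no tokens');
  const first = run.tokenTimes[0];
  const last = run.tokenTimes[run.tokenTimes.length - 1];
  const decodeMs = last - first;
  return {
    // Time from request to first token.
    ttftMs: first - run.requestStart,
    // Tokens after the first, divided by the time spent producing them.
    decodeTokPerSec: decodeMs > 0 ? ((run.tokenTimes.length - 1) / decodeMs) * 1000 : 0,
    // Time from request to last token.
    e2eMs: last - run.requestStart,
  };
}

// Drop the warm-up runs, then average each metric over the measured runs.
function averageMeasured(runs: RunTiming[], warmUps: number): RunMetrics {
  const measured = runs.slice(warmUps).map(measure);
  if (measured.length === 0) throw new Error('no measured runs');
  const mean = (pick: (m: RunMetrics) => number): number =>
    measured.reduce((sum, m) => sum + pick(m), 0) / measured.length;
  return {
    ttftMs: mean((m) => m.ttftMs),
    decodeTokPerSec: mean((m) => m.decodeTokPerSec),
    e2eMs: mean((m) => m.e2eMs),
  };
}
```

The factors in parentheses in the table appear to be simple ratios against the best result. For example, 19,930 / 6,655 ≈ 2.99, which is consistent with the E2E factor shown for WebLLM.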

Install

Sipp supports web browsers, desktop application wrappers, server environments, and native runtimes. Install the package that matches your target environment:

# For Web Browsers, Next.js, and TanStack applications
npm install @sipp/sipp

# For Node.js backend deployments (with native CUDA/Metal compilation)
npm install @sipp/sipp-server

# For native systems development and application embedding
cargo add sipp-rs

# For Python automation and data engineering pipelines
# (sippy wheels ship from GitHub Releases today; full PyPI build matrix in progress)
# pip install sipppy

# Deploy the secure cloud gateway server instance via Docker
# (the gateway image is not published yet; for now, build the gateway from source)
# docker pull noumena/sipp-gateway


Runtimes & Flavors

Most developers should start with our pre-built, published packages rather than compiling directly from the monorepo source.

| Surface | Module | Install | Docs |
| --- | --- | --- | --- |
| Browser | Sipp Edge | npm install @sipp/sipp | Browser package |
| Node.js | Sipp Core | npm install @sipp/sipp-server | Node.js package |
| Rust | Sipp Core | cargo add sipp-rs | Rust package |
| Python | Sipp Core | Wheels available on release page | Python package |
| Gateway Server | Sipp Cloud | Source-built | Gateway Server |
| Gateway Toolkit | Sipp Cloud | Source-built | Gateway toolkit |

Quick Starts

1. Edge Quick Start (Hardware-Accelerated Client Inference)

Initialize the local engine client to run the model directly on the client machine's GPU using WebGPU.

npm install @sipp/sipp

import { Client } from '@sipp/sipp';

const messages = [
  { role: 'system', content: 'Answer concisely.' },
  { role: 'user', content: 'Explain Sipp in one sentence.' },
];

const client = new Client();
const endpoint = await client.add('default', {
  kind: 'local',
  source: '/models/model.gguf',
});

const run = client.chat(messages, {
  endpoint,
  maxTokens: 64,
});

console.log((await run.response).text);
await client.close();

2. Cloud Gateway Quick Start (Preemptive Cloud Proxying)

Cloud gateway clients use the same Client API. The gateway owns model paths, provider credentials, access policies, and centralized metrics tracking; your application code only needs the gateway URL, a routing target, and a token.

import { Client } from '@sipp/sipp';

const client = new Client();
const endpoint = await client.add('gateway', {
  kind: 'gateway',
  target: 'upstream-cluster',
  baseUrl: 'https://gateway.example.com/v1/',
  authentication: { kind: 'bearer', value: await getGatewayToken() },
});

const run = client.query('Explain gateway inference.', {
  endpoint,
  maxTokens: 64,
});

console.log((await run.response).text);
await client.close();
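
The example above calls `getGatewayToken()` without defining it. One possible way to provide it is a small caching wrapper around a request to your own backend, so the client never holds long-lived provider credentials. Everything in this sketch is an assumption for illustration: `createTokenProvider`, the `TokenResponse` shape, and the `/api/gateway-token` route mentioned in the comment are hypothetical and not part of Sipp.

```typescript
// Hypothetical token response returned by your own backend.
interface TokenResponse {
  token: string;
  expiresAt: number; // epoch milliseconds
}

type FetchToken = () => Promise<TokenResponse>;

// Returns a function that reuses a token until shortly before it expires.
function createTokenProvider(
  fetchToken: FetchToken,
  now: () => number = Date.now,
  refreshMarginMs: number = 30_000,
): () => Promise<string> {
  let cached: TokenResponse | null = null;
  return async (): Promise<string> => {
    let current = cached;
    // Fetch a new token when none is cached or the cached one is about to expire.
    if (current === null || now() >= current.expiresAt - refreshMarginMs) {
      current = await fetchToken();
      cached = current;
    }
    return current.token;
  };
}

// Possible wiring in a browser app (the route name is an assumption):
//
//   const getGatewayToken = createTokenProvider(async () => {
//     const res = await fetch('/api/gateway-token', { method: 'POST' });
//     if (!res.ok) throw new Error(`token request failed: ${res.status}`);
//     return (await res.json()) as TokenResponse;
//   });
```

Injecting `fetchToken` and `now` keeps the caching logic independent of any particular backend and makes it easy to test with a fake clock.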


Native Web Framework Blueprints

Sipp includes native integration blueprints to handle Server-Sent Events (SSE) streaming, serverless route orchestration, and client hydration patterns out of the box.

  • Next.js: App Router route handlers, Client Components, gateway proxies, and streaming.
  • TanStack: TanStack Start server functions and TanStack Query patterns.
  • React and Vite: Browser package setup, WASM assets, OPFS model loading, and gateway examples.
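
Server-Sent Events streaming, mentioned above, uses a simple line-based wire format: events are separated by a blank line, and payload lines start with `data:`. The sketch below is a generic incremental parser for that format, not Sipp's implementation; `parseSseChunk` and the `SseState` shape are names chosen for illustration, and the `event`, `id`, and `retry` fields are ignored for brevity.

```typescript
// Parser state carried between chunks of a text/event-stream response.
interface SseState {
  buffer: string; // unprocessed tail of the stream
}

// Feeds one decoded text chunk to the parser and returns the data payloads
// of every event completed by that chunk.
function parseSseChunk(state: SseState, chunk: string): string[] {
  let text = state.buffer + chunk;
  // Hold back a trailing "\r" so a "\r\n" pair split across chunks is not misread.
  let held = '';
  if (text.endsWith('\r')) {
    held = '\r';
    text = text.slice(0, -1);
  }
  // Normalize CRLF and bare CR line endings to LF.
  text = text.replace(/\r\n?/g, '\n');

  // A blank line terminates an event; the last piece may be incomplete.
  const blocks = text.split('\n\n');
  const tail = blocks.pop() ?? '';
  state.buffer = tail + held;

  const events: string[] = [];
  for (const block of blocks) {
    const dataLines: string[] = [];
    for (const line of block.split('\n')) {
      if (line.startsWith(':')) continue; // comment line, often used as a keep-alive
      if (line === 'data') {
        dataLines.push('');
      } else if (line.startsWith('data:')) {
        const value = line.slice(5);
        // A single leading space after the colon is not part of the value.
        dataLines.push(value.startsWith(' ') ? value.slice(1) : value);
      }
    }
    // Multiple data lines in one event are joined with newlines.
    if (dataLines.length > 0) events.push(dataLines.join('\n'));
  }
  return events;
}
```

In a browser, the chunks would typically come from reading the response body through a `TextDecoder` in streaming mode, with each decoded string passed to `parseSseChunk` in order.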

Documentation

The full documentation lives in docs/en. From a source checkout, use the sipp docs command to build or serve the book:

sipp docs build
sipp docs serve

sipp docs checks for the required mdBook tooling, installs it when missing, and configures the Mermaid assets used by the book.


Technical Roadmap

Our roadmap centers on expanding the edge-cloud infrastructure for hybrid systems, where local and cloud resources are orchestrated together.

For a detailed structural breakdown of milestones, memory architectures, and long-term research initiatives, see the full Sipp Technical Roadmap.


Maintainers & Contributors

To bootstrap the workspace environment, initialize cross-platform profiles, and run the unit tests, use the integrated CLI scripts:

source ./setup.sh
sipp doctor
sipp test list

(On Windows, run .\setup.ps1 in PowerShell or setup.cmd in classic CMD if you are not using Git Bash or WSL.)

Common Architecture Compilation Tasks:

sipp build wasm && sipp run examples serve browser
sipp build node --backend cpu && node examples/node/query.mjs <model.gguf> "Explain Sipp."
sipp build python --backend cpu && python examples/python/query.py <model.gguf> "Explain Sipp."
sipp run demos serve chat

For thorough verification steps, consult the Source Builds Documentation and our full Testing Framework Suite.


Repository Layout

  • crates: The published core sipp-rs and low-level backend sipp-sys Rust crates.
  • lib: High-level language package surfaces and gateway proxy toolkit.
  • bindings: Native Node.js bindings, Python extensions, and browser-compiled WASM targets.
  • apps: First-party user interfaces and monitoring implementations.
  • examples: Small, runnable framework integration blueprints.
  • demos: Advanced browser sandboxes running on public package surfaces.
  • tools/playground: Live browser-runtime profiling and hardware execution diagnostics.
  • xtask/: Internal cargo automation engine driving build, test, and package deployment pipelines.

License

Sipp is licensed under the Apache-2.0 License. Vendored third-party dependencies retain their upstream licenses and notice requirements; see the third-party notices.