[!WARNING] Sipp is under active development. Breaking changes are expected as we optimize the runtime layers. It might not be suitable for mission-critical production environments yet. If you find issues, bugs, or missing features, please open a GitHub issue.
Read the documentation →
Chinese documentation →
What is Sipp?
Sipp is an all-in-one, high-performance AI framework for building web, desktop, and edge applications. It ships as a cohesive SDK with a unified, symmetric API for local, provider, and cloud gateway inference.
At its core is Sipp Engine, a blazing-fast runtime built to run anywhere (in the browser, on the desktop, or on bare-metal cloud infrastructure) with low startup times and a minimal memory footprint.
import from '@sipp/sipp';
const blender = ;
// 1. Initialize high-speed, local WebGPU or CUDA inference
const juice = await blender.;
// 2. Or connect to a secure cloud proxy using the exact same interface
const ice = await blender.;
// Run inference on either endpoint seamlessly with a symmetric API
const = await Promise.;
The unified SDK lets you dynamically partition and optimize complex application logic between local and cloud compute. Instead of wrestling with fragmented web runtimes, disconnected native wrappers for desktop, or custom middleware to protect API keys, you only need Sipp.
It packages a high-performance WebGPU engine and a secure container gateway proxy into a single toolkit. Future releases will focus on embedded vector memory, on-device PII masking, and automated smart routing. See Roadmap.
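To make the idea of a symmetric API concrete, here is a minimal TypeScript sketch. Every name in it (`Endpoint`, `LocalEndpoint`, `GatewayEndpoint`, `runAll`) is hypothetical and chosen for illustration; none of it is taken from Sipp's actual API. It only shows the general shape: two backends behind one interface, so calling code does not care where inference happens.

```typescript
// Hypothetical sketch: these types are NOT part of @sipp/sipp.
interface Endpoint {
  readonly kind: "local" | "gateway";
  generate(prompt: string): Promise<string>;
}

// Stand-in for an on-device backend (a real one would run the model locally).
class LocalEndpoint implements Endpoint {
  readonly kind = "local" as const;

  async generate(prompt: string): Promise<string> {
    return `[local] ${prompt}`;
  }
}

// Stand-in for a remote backend (a real one would call the gateway over HTTP).
class GatewayEndpoint implements Endpoint {
  readonly kind = "gateway" as const;
  private readonly url: string;

  constructor(url: string) {
    this.url = url;
  }

  async generate(prompt: string): Promise<string> {
    return `[gateway ${this.url}] ${prompt}`;
  }
}

// Calling code depends only on the shared interface, so endpoints are interchangeable.
async function runAll(endpoints: Endpoint[], prompt: string): Promise<string[]> {
  return Promise.all(endpoints.map((endpoint) => endpoint.generate(prompt)));
}
```

Because `runAll` accepts any `Endpoint`, moving a workload between local and cloud compute in this sketch means swapping the object passed in, not rewriting the call site.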
Performance Benchmarks
Run them yourself here: benchmark.sipp.sh/benchmark
| Runtime / Framework | TTFT (ms) ↓ | Decode (tok/s) ↑ | E2E Latency (ms) ↓ |
|---|---|---|---|
| Sipp | 24.3 (Best) | 77.07 (Best) | 6,655 (Best) |
| WebLLM | 160.0 (6.58x) | 25.80 (2.99x) | 19,930 (2.99x) |
| Transformers.js | 301.0 (12.38x) | 33.25 (2.32x) | 15,670 (2.35x) |
Disclaimer & Metric Notes:
- TTFT (Time to First Token): Measured in milliseconds (ms). Lower is better.
- Decode: Measured in tokens per second (tok/s). Higher is better.
- E2E Latency (End-to-End Latency): Measured in milliseconds (ms). Lower is better.
- Hardware and method: benchmarks ran on an NVIDIA RTX 3080 with 1 warm-up run and 3 measured runs. Results are the average of the measured runs.
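The measurement procedure described in these notes can be sketched roughly as follows. This is an illustrative harness, not the code behind the published numbers: the names `measureRun` and `benchmark` are hypothetical, and it assumes that TTFT is timed from the call to the first non-empty token and that the decode rate excludes the first token. The real benchmark may define either metric differently.

```typescript
// Hypothetical harness: names and metric definitions are assumptions.
interface RunMetrics {
  ttftMs: number;          // time from start to first token
  decodeTokPerSec: number; // tokens per second after the first token
  e2eMs: number;           // total time for the whole generation
}

type Generate = () => AsyncIterable<string>;

async function measureRun(generate: Generate, now: () => number): Promise<RunMetrics> {
  const start = now();
  let firstTokenAt = start;
  let count = 0;
  for await (const token of generate()) {
    if (token.length === 0) continue; // ignore empty chunks
    if (count === 0) firstTokenAt = now();
    count += 1;
  }
  const end = now();
  const decodeMs = end - firstTokenAt;
  return {
    ttftMs: firstTokenAt - start,
    decodeTokPerSec: count > 1 && decodeMs > 0 ? ((count - 1) * 1000) / decodeMs : 0,
    e2eMs: end - start,
  };
}

async function benchmark(
  generate: Generate,
  warmups: number,
  runs: number,
  now: () => number = () => Date.now(), // millisecond clock; inject a finer one if available
): Promise<RunMetrics> {
  for (let i = 0; i < warmups; i++) {
    await measureRun(generate, now); // warm-up results are discarded
  }
  const measured: RunMetrics[] = [];
  for (let i = 0; i < runs; i++) {
    measured.push(await measureRun(generate, now));
  }
  // Average each metric over the measured runs only.
  const mean = (pick: (m: RunMetrics) => number): number =>
    measured.reduce((sum, m) => sum + pick(m), 0) / measured.length;
  return {
    ttftMs: mean((m) => m.ttftMs),
    decodeTokPerSec: mean((m) => m.decodeTokPerSec),
    e2eMs: mean((m) => m.e2eMs),
  };
}
```

Calling `benchmark(generate, 1, 3)` mirrors the 1 warm-up and 3 measured runs stated above. Passing the clock as a parameter keeps the harness testable with a fake timer.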
Install
Sipp supports web browsers, desktop application wrappers, server environments, and native runtimes. Install the package that matches your target surface:
# For Web Browsers, Next.js, and TanStack applications
npm install @sipp/sipp
# For Node.js backend deployments (with native CUDA/Metal compilation)
npm install @sipp/sipp-server
# For native systems development and application embedding
cargo add sipp-rs
# For Python automation and data engineering pipelines
# (sippy wheels ship from GitHub Releases today; full PyPI build matrix in progress)
# pip install sipppy
# Deploy the secure cloud gateway server instance via Docker
# (a published gateway image is planned; for now the gateway is built from source)
# docker pull noumena/sipp-gateway
Runtimes & Flavors
Most developers should start with our pre-built, published packages rather than compiling directly from the monorepo source.
| Surface | Module | Install | Docs |
|---|---|---|---|
| Browser | Sipp Edge | npm install @sipp/sipp | Browser package |
| Node.js | Sipp Core | npm install @sipp/sipp-server | Node.js package |
| Rust | Sipp Core | cargo add sipp-rs | Rust package |
| Python | Sipp Core | Wheels available on release page | Python package |
| Gateway Server | Sipp Cloud | Source-built | Gateway Server |
| Gateway Toolkit | Sipp Cloud | Source-built | Gateway toolkit |
Quick Starts
1. Edge Quick Start (Hardware-Accelerated Client Inference)
Initialize the local engine client to run the model directly on the client machine's GPU using WebGPU.
import from '@sipp/sipp';
const messages = ;
const client = ;
const endpoint = await client.;
const run = client.;
console.log;
await client.;
2. Cloud Gateway Quick Start (Preemptive Cloud Proxying)
Cloud gateway clients use the exact same Client API layout. The gateway owns model paths, provider credentials, access policies, and centralized metrics tracking; your client application code only needs the gateway routing target URL.
import from '@sipp/sipp';
const client = ;
const endpoint = await client.;
const run = client.;
console.log;
await client.;
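The division of responsibility described in this section (the client knows only a gateway URL, while the gateway holds model paths and provider credentials) can be illustrated with a small sketch. All types and the `buildUpstreamRequest` function are hypothetical; they do not reflect the Sipp gateway's real schema or wire format.

```typescript
// Hypothetical types for illustration only; not the Sipp gateway's real schema.
interface ClientRequest {
  model: string;  // alias known to the gateway, not a provider model path
  prompt: string;
}

interface Route {
  upstreamUrl: string;
  upstreamModel: string;
  apiKey: string; // lives only in gateway configuration, never in client code
}

interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Gateway-side step: resolve the alias and attach the credential server-side.
function buildUpstreamRequest(req: ClientRequest, routes: Map<string, Route>): UpstreamRequest {
  const route = routes.get(req.model);
  if (route === undefined) {
    throw new Error(`unknown model alias: ${req.model}`);
  }
  return {
    url: route.upstreamUrl,
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${route.apiKey}`,
    },
    body: JSON.stringify({ model: route.upstreamModel, prompt: req.prompt }),
  };
}
```

In this sketch the `ClientRequest` carries no secret at all, which is the property that lets browser code talk to a gateway without embedding provider API keys.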
Native Web Framework Blueprints
Sipp includes native integration blueprints to handle Server-Sent Events (SSE) streaming, serverless route orchestration, and client hydration patterns out of the box.
- Next.js: App Router route handlers, Client Components, gateway proxies, and streaming.
- TanStack: TanStack Start server functions and TanStack Query patterns.
- React And Vite: Browser package setup, WASM assets, OPFS model loading, and gateway examples.
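For readers unfamiliar with the wire format behind SSE streaming, here is a minimal, framework-independent parser sketch. It is not taken from the Sipp blueprints, and `parseSse` is a hypothetical name. It handles only `data:` fields and comment lines and assumes `\n` or `\r\n` line endings, which covers the common case of token streaming but not the full event-stream specification (no `event:`, `id:`, or `retry:` handling).

```typescript
// Minimal Server-Sent Events parser: `data:` fields and comment lines only.
// Splits a text buffer into complete events and returns the unfinished tail.
function parseSse(buffer: string): { events: string[]; rest: string } {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n"); // a blank line terminates an event
  const rest = blocks.pop() ?? "";         // the last block may still be incomplete
  const events: string[] = [];
  for (const block of blocks) {
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue;  // comment or keep-alive line
      if (line.startsWith("data:")) {
        // A single leading space after the colon is not part of the value.
        const value = line.slice(5);
        dataLines.push(value.startsWith(" ") ? value.slice(1) : value);
      }
    }
    // Multiple data lines in one event are joined with newlines.
    if (dataLines.length > 0) events.push(dataLines.join("\n"));
  }
  return { events, rest };
}
```

A caller would append each network chunk to a buffer, pass the buffer to `parseSse`, handle the returned `events`, and keep `rest` as the new buffer until more bytes arrive.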
Documentation
The full documentation lives in docs/en. From a source checkout, use the sipp docs CLI to build or serve the book.
sipp docs installs the required mdBook tooling if it is missing and configures the Mermaid assets the book uses.
Technical Roadmap
The roadmap centers on expanding the edge-cloud infrastructure for hybrid systems, in which local and cloud resources are orchestrated together.
For a detailed structural breakdown of milestones, memory architectures, and long-term research initiatives, see the full Sipp Technical Roadmap.
Maintainers & Contributors
To bootstrap the workspace environment, initialize cross-platform profiles, and run the unit tests, use the integrated setup scripts.
(On Windows, run .\setup.ps1 in PowerShell or setup.cmd in classic CMD if you are not using Git Bash or WSL.)
Common Architecture Compilation Tasks:
For thorough verification steps, consult the Source Builds Documentation and our full Testing Framework Suite.
Repository Layout
- crates: The published core sipp-rs and low-level backend sipp-sys Rust crates.
- lib: High-level language package surfaces and gateway proxy toolkit.
- bindings: Native Node.js bindings, Python extensions, and browser-compiled WASM targets.
- apps: First-party user interfaces and monitoring implementations.
- examples: Small, runnable framework integration blueprints.
- demos: Advanced browser sandboxes running on public package surfaces.
- tools/playground: Live browser-runtime profiling and hardware execution diagnostics.
- xtask: Internal cargo automation driving the build, test, and package deployment pipelines.
License
Sipp is licensed under the Apache-2.0 License. Vendored third-party dependencies retain their upstream licenses and notice requirements; see the third-party notices.