# Aurora DB Schema Design Guide
Aurora DB uses a strictly enforced but flexible schema system. This guide provides an in-depth look at how to design robust schemas that ensure data integrity while maintaining high performance.
## Collections & The Type System
In Aurora, every collection must have a schema. This schema acts as a contract for all documents stored within that collection.
### Fundamental Scalar Types
| Aurora Type | Rust Type | Validation |
| --- | --- | --- |
| `String` | `String` | Valid UTF-8 |
| `Int` | `i64` | 64-bit integer range |
| `Float` | `f64` | 64-bit floating point |
| `Boolean` | `bool` | `true` or `false` |
| `Uuid` | `uuid::Uuid` | Strict UUID format (v4 or v7) |
| `DateTime` | `chrono::DateTime<Utc>` | ISO 8601 UTC format |
| `Date` | `chrono::NaiveDate` | `YYYY-MM-DD` |
| `Time` | `chrono::NaiveTime` | `HH:MM:SS` |
### Specialized & Advanced Types
* **`ID`**: A specialized string type optimized for identifiers.
* **`Email`**: Automatically validates that the value is a correctly formatted email address.
* **`URL`**: Validates the value against standard URL specifications.
* **`JSON`**: Stores a structured object but requires it to be a valid map or array.
* **`Any`**: The "escape hatch." Any data type can be stored here. Useful for truly dynamic fields or during rapid prototyping.
* **`[Type]`**: Defines an array of a specific type (e.g., `[String]`, `[Int]`).
## Defining a Collection
Use the `define collection` statement to initialize a new collection.
```graphql
mutation {
  schema {
    define collection users {
      id: Uuid! @primary
      username: String! @unique
      email: Email! @unique
      age: Int @validate(min: 13, max: 120)
      bio: String
      created_at: DateTime = "now"
      tags: [String]
      metadata: JSON
    }
  }
}
```
### Type Modifiers
* **`!` (Required)**: By default, all fields are optional (nullable). Appending `!` makes the field mandatory.
* **`= value` (Default)**: Provides a default value if the field is omitted during insertion. Use `"now"` for current timestamps.
## Directives & Constraints
Directives are decorators that modify field behavior or add validation logic.
### Structural Directives
* **`@primary`**: Defines the application-level primary key. Only one field can be primary.
* **`@unique`**: Ensures no two documents have the same value for this field. Aurora uses optimized O(1) indices for unique checks.
* **`@indexed`**: Explicitly tells Aurora to create a secondary index (Roaring Bitmap) for this field to speed up queries.
### Validation Directives
* **`@validate(min: N, max: M)`**: Enforces range constraints on `Int` and `Float` fields.
* **`@validate(minLength: N, maxLength: M)`**: Enforces length constraints on `String` fields.
* **`@validate(pattern: "regex")`**: Validates `String` fields against a Regular Expression.
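The range and length rules above can be illustrated with a small standalone sketch. It does not use Aurora's API: `Constraints`, `check_int`, and `check_len` are hypothetical names, bounds are assumed to be inclusive, and length is assumed to be counted in characters rather than bytes.

```rust
/// Hypothetical stand-in for the arguments accepted by `@validate`.
#[derive(Debug, Default, Clone, Copy)]
struct Constraints {
    min: Option<i64>,
    max: Option<i64>,
    min_length: Option<usize>,
    max_length: Option<usize>,
}

/// Checks an integer against the optional bounds (assumed inclusive).
fn check_int(value: i64, c: &Constraints) -> Result<(), String> {
    if let Some(min) = c.min {
        if value < min {
            return Err(format!("{value} is below the minimum {min}"));
        }
    }
    if let Some(max) = c.max {
        if value > max {
            return Err(format!("{value} is above the maximum {max}"));
        }
    }
    Ok(())
}

/// Checks a string's length, counted in Unicode scalar values (an assumption).
fn check_len(value: &str, c: &Constraints) -> Result<(), String> {
    let len = value.chars().count();
    if let Some(min) = c.min_length {
        if len < min {
            return Err(format!("length {len} is below minLength {min}"));
        }
    }
    if let Some(max) = c.max_length {
        if len > max {
            return Err(format!("length {len} is above maxLength {max}"));
        }
    }
    Ok(())
}
```

With `min: Some(13), max: Some(120)`, as on the `age` field earlier in this guide, `check_int(12, ..)` is rejected while `check_int(13, ..)` passes.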
## Relationships & Foreign Keys
Aurora supports typed relations between collections using the `@relation` directive.
```graphql
mutation {
  schema {
    define collection posts {
      id: Uuid! @primary
      title: String!
      content: String!
      # Link to the users collection
      author_id: Uuid! @relation(to: "users", key: "id")
    }
  }
}
```
**Why use `@relation`?**
1. **Referential Integrity**: Aurora can optionally check if the referenced ID exists in the target collection.
2. **Automatic Lookups**: Enables the `lookup` operation in AQL to perform manual joins efficiently.
3. **Graph Capabilities**: Forms the basis for future graph traversal features.
## The Data Purity Philosophy
Aurora follows a **"Pure Data"** philosophy.
### Internal `_sid` vs. App `id`
Every document has a private internal system ID called `_sid` (a time-ordered UUIDv7).
* **`_sid`**: Used by the engine for storage ordering, indexing, and efficient pagination. It is hidden from your query results by default.
* **`id`**: Your application's primary key. Aurora treats your `id` as just another data field. It never overwrites your `id` with the internal `_sid`.
This ensures that your application logic is never coupled to the database's internal tracking mechanisms.
## Modifying Schemas (Alteration)
Aurora supports non-destructive schema changes via `alter collection`.
```graphql
mutation {
  schema {
    alter collection users {
      add status: String = "active"
      drop bio
      rename username to handle
    }
  }
}
```
### Best Practices for Schema Design
1. **Index Strategic Fields**: Only index fields used frequently in `where` or `order_by`. Every index adds a small write overhead.
2. **Use Specialized Types**: Prefer `Email` or `Uuid` over `String` for better validation and storage optimization.
3. **Required Fields**: Use `!` liberally for critical data to catch application bugs early.
4. **Relationships**: Always define `@relation` for foreign keys to enable optimized join performance.
# Aurora DB CRUD Operations Guide
This guide details how to perform Create, Read, Update, and Delete operations using the Aurora Query Language (AQL) and the Rust API.
## 1. Creating Documents
### Single Insert
Use `insertInto` to add a single document. Aurora automatically validates the data against the collection schema.
```graphql
mutation {
  insertInto(
    collection: "users",
    data: {
      id: "u1",
      name: "Alice",
      email: "alice@example.com",
      tags: ["new-user", "beta-tester"]
    }
  ) {
    id
  }
}
```
### Batch Insert (`insertMany`)
For high-volume ingestion, use `insertMany`. This is significantly faster than multiple single inserts as it minimizes write-ahead log (WAL) overhead.
```graphql
mutation {
  insertMany(
    collection: "products",
    data: [
      { id: "p1", name: "Laptop", price: 1200.0 },
      { id: "p2", name: "Mouse", price: 25.0 }
    ]
  ) {
    affected # Number of documents inserted
  }
}
```
## 2. Reading Documents
Querying is performed via the `query` operation.
### Basic Retrieval
```graphql
query {
  users {
    name
    email
  }
}
```
### Filtering & Sorting
```graphql
query {
  products(
    where: { price: { lt: 50.0 } },
    orderBy: { field: "price", direction: ASC },
    limit: 10
  ) {
    name
    price
  }
}
```
## 3. Updating Documents
### Update by ID or Filter
The `update` mutation modifies every existing document that matches the `where` filter.
```graphql
mutation {
  update(
    collection: "users",
    where: { id: { eq: "u1" } },
    data: {
      active: true,
      last_login: "now"
    }
  ) {
    affected
  }
}
```
### Atomic Modifiers
Aurora supports atomic field operations, preventing race conditions when multiple clients update the same document.
| Modifier | Description |
| --- | --- |
| `increment: N` | Adds N to a numeric field. |
| `decrement: N` | Subtracts N from a numeric field. |
| `push: val` | Appends a value to an array. |
| `pull: val` | Removes all instances of a value from an array. |
```graphql
mutation {
  update(
    collection: "posts",
    where: { id: { eq: "post-123" } },
    data: {
      views: { increment: 1 },
      tags: { push: "trending" }
    }
  )
}
```
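To see why atomic modifiers matter, here is a minimal in-memory sketch rather than Aurora's implementation. It assumes a modifier behaves like a read-modify-write performed under a single lock; `Post`, `increment_views`, `push_tag`, `pull_tag`, and `concurrent_increments` are hypothetical names.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical document with one numeric field and one array field.
#[derive(Debug, Default)]
struct Post {
    views: i64,
    tags: Vec<String>,
}

/// `increment: n`: the read and the write happen while the lock is held.
fn increment_views(post: &Mutex<Post>, n: i64) {
    let mut guard = post.lock().unwrap();
    guard.views += n;
}

/// `push: val`: appends one value to the array.
fn push_tag(post: &Mutex<Post>, tag: &str) {
    post.lock().unwrap().tags.push(tag.to_string());
}

/// `pull: val`: removes every occurrence of the value.
fn pull_tag(post: &Mutex<Post>, tag: &str) {
    post.lock().unwrap().tags.retain(|t| t.as_str() != tag);
}

/// Runs `threads` writers that each increment `per_thread` times and
/// returns the final counter value.
fn concurrent_increments(threads: usize, per_thread: usize) -> i64 {
    let post = Arc::new(Mutex::new(Post::default()));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let post = Arc::clone(&post);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    increment_views(&post, 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let views = post.lock().unwrap().views;
    views
}
```

Because each increment is a single locked step, no update is lost: four writers doing 1,000 increments each always end at 4,000. A client that reads `views`, adds one locally, and writes the result back could not guarantee that.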
## 4. Upsert (Update or Insert)
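Upsert combines update and insert in a single operation. If a document matching the `where` filter exists, it is updated. If not, a new document is created using the provided `data`. An upsert is idempotent only when `data` contains plain values; an atomic modifier such as `increment` changes the document on every call.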
```graphql
mutation {
  upsert(
    collection: "user_stats",
    where: { user_id: { eq: "u1" } },
    data: {
      user_id: "u1",
      login_count: { increment: 1 }
    }
  )
}
```
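The update-or-insert decision can be sketched with a plain `HashMap`. This is an illustration only: `UserStats` and `upsert_login` are hypothetical, and it assumes that applying `increment` to a missing field starts from zero, which the text above does not state.

```rust
use std::collections::HashMap;

/// Hypothetical shape of a `user_stats` document.
#[derive(Debug, Clone, PartialEq)]
struct UserStats {
    user_id: String,
    login_count: i64,
}

/// Updates the matching document or inserts a new one.
/// Returns `true` if a document already existed, `false` if it was created.
fn upsert_login(store: &mut HashMap<String, UserStats>, user_id: &str) -> bool {
    let existed = store.contains_key(user_id);
    let stats = store
        .entry(user_id.to_string())
        // Assumption: a missing counter starts at 0 before the increment.
        .or_insert_with(|| UserStats {
            user_id: user_id.to_string(),
            login_count: 0,
        });
    stats.login_count += 1;
    existed
}
```

Calling `upsert_login` twice for the same user leaves one document with `login_count` equal to 2, which shows that an upsert carrying an `increment` produces a different result on each call.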
## 5. Deleting Documents
Use `deleteFrom` to permanently remove documents.
```graphql
mutation {
  deleteFrom(
    collection: "sessions",
    where: { expires_at: { lt: "now" } }
  ) {
    affected
  }
}
```
## 6. ACID Transactions
Transactions ensure that a group of operations is treated as a single atomic unit. If any operation fails, the entire transaction is rolled back.
### The `transaction` Block
```graphql
mutation {
  transaction {
    # Operation 1: Deduct from sender
    debit: update(
      collection: "accounts",
      where: { id: { eq: "acc-1" } },
      data: { balance: { decrement: 100.0 } }
    )
    # Operation 2: Add to receiver
    credit: update(
      collection: "accounts",
      where: { id: { eq: "acc-2" } },
      data: { balance: { increment: 100.0 } }
    )
  }
}
```
### Why use transactions?
- **Consistency**: Prevents partial updates (e.g., money leaving one account but not arriving in the other).
- **Isolation**: Changes made within a transaction are invisible to other clients until the transaction is committed.
- **Durability**: Aurora's WAL ensures that committed transactions survive system crashes.
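The all-or-nothing behaviour described above can be sketched by staging changes on a copy and committing only when every step succeeds. This is a conceptual model, not how Aurora's WAL works; `transfer` is a hypothetical function and the insufficient-funds check is an added assumption.

```rust
use std::collections::HashMap;

/// Moves `amount` between two accounts, or leaves `accounts` untouched on error.
fn transfer(
    accounts: &mut HashMap<String, f64>,
    from: &str,
    to: &str,
    amount: f64,
) -> Result<(), String> {
    // Stage every change on a scratch copy so a failure cannot leak out.
    let mut staged = accounts.clone();

    // Operation 1: debit the sender.
    let src = staged
        .get_mut(from)
        .ok_or_else(|| format!("unknown account {from}"))?;
    if *src < amount {
        return Err(format!("insufficient funds in {from}"));
    }
    *src -= amount;

    // Operation 2: credit the receiver.
    let dst = staged
        .get_mut(to)
        .ok_or_else(|| format!("unknown account {to}"))?;
    *dst += amount;

    // Commit: both operations succeeded, so publish the staged state.
    *accounts = staged;
    Ok(())
}
```

If the credit step fails, for example because the receiver does not exist, the debit applied to the staged copy is discarded and the sender's balance is unchanged.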
## 7. Using the Rust API
The Rust API provides macros, such as `doc!`, for building parametrized CRUD operations.
### Efficient Parametrized Mutation
```rust
use aurora_db::doc;

// Assumes `db` is an open Aurora handle and that this code runs inside
// an async function that returns a `Result`.
let user_id = "u1";
let new_name = "Alice Updated";
db.execute(doc!(
    "mutation($id: String, $name: String) {
        update(collection: \"users\", where: { id: { eq: $id } }, data: { name: $name }) {
            affected
        }
    }",
    { "id": user_id, "name": new_name }
)).await?;
```
For more details on macros, see [Variables and Macros](./querying.md#variables-and-macros).
# Aurora DB Querying Guide
This guide provides a comprehensive deep dive into the Aurora Query Language (AQL), exploring everything from basic retrieval to advanced aggregations and full-text search.
## 1. AQL Query Structure
AQL is inspired by GraphQL but optimized for document-database performance. A query consists of an operation name (`query`), a target collection, optional arguments (`where`, `limit`), and a **selection set** (the fields you want back).
```graphql
query {
  users(limit: 5) {
    name
    email
  }
}
```
## 2. Advanced Filtering (`where`)
Aurora supports rich, nested filtering logic using operator objects.
### Comparison Operators
| Operator | Description |
| --- | --- |
| `eq` | Equal to |
| `ne` | Not equal to |
| `gt` / `gte` | Greater than / Greater than or equal to |
| `lt` / `lte` | Less than / Less than or equal to |
| `in` | Value exists in a provided list |
| `contains` | Substring match (strings) or element exists (arrays) |
| `startsWith` | Prefix match |
### Logical Operators (`and`, `or`, `not`)
```graphql
query {
  users(where: {
    or: [
      { and: [ { age: { gte: 18 } }, { status: { eq: "active" } } ] },
      { role: { eq: "admin" } }
    ]
  }) {
    name
  }
}
```
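One way to picture how nested `and`/`or`/`not` filters are evaluated is a small recursive tree walk. The sketch below is not Aurora's evaluator: `FieldValue`, `Filter`, `eval`, and `example_filter` are hypothetical, and only `eq` and `gte` are modelled.

```rust
use std::collections::HashMap;

/// Hypothetical field value; only two types are modelled here.
#[derive(Debug, Clone, PartialEq)]
enum FieldValue {
    Int(i64),
    Str(String),
}

type Doc = HashMap<String, FieldValue>;

/// Hypothetical filter tree mirroring the operator objects in `where`.
enum Filter {
    Eq(String, FieldValue),
    Gte(String, i64),
    And(Vec<Filter>),
    Or(Vec<Filter>),
    Not(Box<Filter>),
}

/// Recursively evaluates a filter against one document.
fn eval(filter: &Filter, doc: &Doc) -> bool {
    match filter {
        Filter::Eq(field, expected) => doc.get(field) == Some(expected),
        Filter::Gte(field, min) => {
            matches!(doc.get(field), Some(FieldValue::Int(n)) if n >= min)
        }
        Filter::And(parts) => parts.iter().all(|f| eval(f, doc)),
        Filter::Or(parts) => parts.iter().any(|f| eval(f, doc)),
        Filter::Not(inner) => !eval(inner, doc),
    }
}

/// The filter from the query above: (age >= 18 AND status == "active") OR role == "admin".
fn example_filter() -> Filter {
    Filter::Or(vec![
        Filter::And(vec![
            Filter::Gte("age".into(), 18),
            Filter::Eq("status".into(), FieldValue::Str("active".into())),
        ]),
        Filter::Eq("role".into(), FieldValue::Str("admin".into())),
    ])
}
```

A 16-year-old admin matches through the second `or` branch, while a 16-year-old active user with another role matches neither branch.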
## 3. Projections & Aliases
You can rename fields in your result set using aliases.
```graphql
query {
  products {
    displayName: name
    current_price: price
  }
}
```
### The `@defer` Directive
If a field is computationally expensive or large (like a long `bio`), you can mark it with `@defer`. Aurora will exclude it from the primary document result and list it in the `deferred_fields` metadata.
```graphql
query {
  users {
    name
    bio @defer
  }
}
```
## 4. Sorting & Pagination
### Ordering
```graphql
query {
  posts(orderBy: { field: "created_at", direction: DESC }) {
    title
  }
}
```
### Offset Pagination
```graphql
query {
  products(limit: 10, offset: 20) {
    name
  }
}
```
## 5. Aggregation & Grouping
Aurora can perform calculations across entire collections or result sets.
### Global Aggregation
```graphql
query {
  sales {
    stats: aggregate {
      count
      total_revenue: sum(field: "amount")
      average_sale: avg(field: "amount")
      max_sale: max(field: "amount")
    }
  }
}
```
### Group By
Group documents by a field and perform aggregations per group.
```graphql
query {
  products {
    groupBy(field: "category") {
      key   # The category name
      count # Number of products in this category
      stats: aggregate {
        avg_price: avg(field: "price")
      }
    }
  }
}
```
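As a rough model of what the grouped query computes, the following sketch groups in-memory products by category and derives `count` and `avg_price` per group. `Product`, `Group`, and `group_by_category` are hypothetical; the sorted output order comes from the `BTreeMap` used here and says nothing about Aurora's ordering.

```rust
use std::collections::BTreeMap;

/// Hypothetical product document.
struct Product {
    category: String,
    price: f64,
}

/// One row of the grouped result: `key`, `count`, and the aggregated average.
#[derive(Debug, PartialEq)]
struct Group {
    key: String,
    count: usize,
    avg_price: f64,
}

/// Groups products by category and computes count and average price per group.
fn group_by_category(products: &[Product]) -> Vec<Group> {
    // Accumulate (count, sum of prices) for each category.
    let mut acc: BTreeMap<&str, (usize, f64)> = BTreeMap::new();
    for p in products {
        let entry = acc.entry(p.category.as_str()).or_insert((0, 0.0));
        entry.0 += 1;
        entry.1 += p.price;
    }
    acc.into_iter()
        .map(|(key, (count, sum))| Group {
            key: key.to_string(),
            count,
            avg_price: sum / count as f64,
        })
        .collect()
}
```

Two books priced 10.0 and 20.0 plus one toy priced 30.0 yield a `books` group with `count` 2 and `avg_price` 15.0, and a `toys` group with `count` 1 and `avg_price` 30.0.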
## 6. Manual Joins (`lookup`)
While Aurora is a document store, you can perform manual joins using the `lookup` selection. This is highly efficient when the join field is indexed.
```graphql
query {
  orders {
    id
    total
    # "Join" with the users collection
    user: lookup(collection: "users", localField: "user_id", foreignField: "id") {
      name
      email
    }
  }
}
```
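The reason an index on the join field helps can be shown with an in-memory analogue: build a map keyed by the foreign field once, then resolve each local value with a single hash lookup. `User`, `Order`, and `lookup_users` are hypothetical names, and the `HashMap` merely stands in for an index.

```rust
use std::collections::HashMap;

/// Hypothetical documents for the two collections.
struct User {
    id: String,
    name: String,
}

struct Order {
    id: String,
    user_id: String,
}

/// Pairs every order with its user, or `None` when no user matches.
fn lookup_users<'a>(
    orders: &'a [Order],
    users: &'a [User],
) -> Vec<(&'a Order, Option<&'a User>)> {
    // Stand-in for an index on the foreign field (`users.id`).
    let by_id: HashMap<&str, &User> =
        users.iter().map(|u| (u.id.as_str(), u)).collect();

    // Each order resolves with one hash lookup instead of a scan of `users`.
    orders
        .iter()
        .map(|o| (o, by_id.get(o.user_id.as_str()).copied()))
        .collect()
}
```

Without the map, every order would need a linear scan over all users, so the cost would grow with the product of both collection sizes.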
## 7. Full-Text Search
Aurora features a built-in search engine. To use it, you must first create a text index on the target fields.
```graphql
query {
  articles(search: {
    query: "distributed systems",
    fields: ["title", "content"],
    fuzzy: 1 # Allows for 1-character typos
  }) {
    title
    score # Aurora returns a relevance score for search results
  }
}
```
## 8. Computed Fields
You can generate new fields on the fly using **Templates** or **Pipes**.
### Template Strings
```graphql
query {
  users {
    # String interpolation
    fullName: "${firstName} ${lastName}"
  }
}
```
### Pipe Transformations
```graphql
query {
  users {
    # Transform data using built-in functions
    upperName: name | uppercase
    preview: bio | truncate(length: 50)
  }
}
```
## 9. Variables and Rust Integration
Never concatenate strings to build queries. Use **Variables** for safety and speed.
```graphql
query($id: Uuid!) {
  users(where: { id: { eq: $id } }) {
    name
  }
}
```
In Rust, use the `doc!` macro to bind values:
```rust
// Assumes `db` is an open Aurora handle inside an async function.
let id = "550e8400-e29b-41d4-a716-446655440000";
let result = db.execute(doc!(
    "query($id: Uuid!) { users(where: { id: { eq: $id } }) { name } }",
    { "id": id }
)).await?;
```
# Aurora DB Computed Fields Guide
Computed fields allow you to derive new values from existing document data at retrieval time. This is perfect for formatting, aggregations, or business logic that doesn't need to be persisted but is frequently needed by the application.
## 1. Defining Computed Fields in AQL
The simplest way to use computed fields is directly within your query selection set.
### Template Strings
Ideal for basic string concatenation and interpolation.
```graphql
query {
  users {
    # Combine fields into a display label
    label: "${firstName} ${lastName} (${email})"
  }
}
```
### Pipe Expressions
Inspired by Unix pipes, this syntax allows you to chain transformations in a readable way.
```graphql
query {
  posts {
    title
    # Convert to uppercase and truncate for a preview
    header: title | uppercase | truncate(length: 20)
  }
}
```
### Logical & Math Functions
You can use standard functional syntax for more complex logic.
```graphql
query {
  products {
    name
    # Calculate final price after tax
    total: multiply(price, 1.15) | round(decimals: 2)
    # Conditional labels
    status: if(gt(stock, 0), "In Stock", "Out of Stock")
  }
}
```
## 2. Permanent Computed Fields (Retrieval-Time)
If you find yourself writing the same computed field in every query, you can register it permanently in the collection's schema. These fields will be automatically calculated and included in *every* query result for that collection.
### Registering via Rust API
```rust
use aurora_db::computed::ComputedExpression;

db.register_computed_field(
    "users",
    "fullName",
    ComputedExpression::TemplateString("${firstName} ${lastName}".to_string())
).await?;
```
### Benefits of Permanent Fields
- **Consistency**: The logic lives in one place (the database schema).
- **Simplicity**: Clients don't need to know the calculation logic; they just request the field name.
- **Performance**: Aurora optimizes the calculation of registered fields during the retrieval phase.
## 3. Available Built-in Functions
Aurora provides a rich library of functions for your expressions:
| Category | Functions |
| --- | --- |
| **String** | `uppercase`, `lowercase`, `trim`, `truncate`, `concat`, `replace` |
| **Math** | `add`, `subtract`, `multiply`, `divide`, `abs`, `round`, `floor`, `ceil` |
| **Logic** | `if`, `coalesce` (returns first non-null), `isNull`, `isNotNull` |
| **Date** | `now`, `formatDate`, `year`, `month`, `day` |
| **Type** | `toString`, `toInt`, `toFloat` |
## 4. Best Practices
1. **Prefer Retrieval-Time for Logic**: If the logic might change (e.g., tax rates), keep it as a computed field rather than persisting the result.
2. **Use `coalesce` for Defaults**: When interpolating optional fields, use `coalesce` to provide a fallback value.
   - Example: `display: coalesce(nickname, firstName, "User")`
3. **Watch Selection Depth**: Computed fields that depend on `lookup` results can be expensive. Use them sparingly in large result sets.
4. **Use Pipes for Readability**: `name | trim | uppercase` is much easier to read than `uppercase(trim(name))`.
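The pipe and `coalesce` practices above have close analogues in plain Rust, which may help when reasoning about what an expression returns. The functions below are illustrative stand-ins, not Aurora's built-ins; in particular, this `truncate` simply cuts at a character count and appends nothing, which is an assumption.

```rust
/// Stand-in for `value | truncate(length: n)`: keeps at most `length` characters.
fn truncate(value: &str, length: usize) -> String {
    value.chars().take(length).collect()
}

/// Stand-in for `title | trim | uppercase | truncate(length: n)`.
/// Each step feeds its output to the next, read left to right like the pipe.
fn header(title: &str, length: usize) -> String {
    truncate(&title.trim().to_uppercase(), length)
}

/// Stand-in for `coalesce(a, b, ..., fallback)`: the first non-null candidate wins.
fn coalesce<'a>(candidates: &[Option<&'a str>], fallback: &'a str) -> &'a str {
    candidates
        .iter()
        .flatten() // skip the `None` entries
        .next()
        .copied()
        .unwrap_or(fallback)
}
```

For a user with no nickname but a first name, `coalesce(&[None, Some("Ann")], "User")` returns `"Ann"`; with both missing it falls back to `"User"`.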
# Aurora DB Reactive Guide
Aurora DB features a first-class reactive system that allows your application to "watch" queries and receive updated result sets automatically whenever the underlying data changes.
## 1. The `QueryWatcher` (Rust API)
The `QueryWatcher` is the most powerful way to build reactive systems in Aurora. It automates the process of fetching initial data and subscribing to subsequent changes.
### Basic Usage
```rust
let mut watcher = db.query("users")
    .filter(|f| f.eq("active", true))
    .watch()
    .await?;
// Receive initial results
let initial_users = watcher.initial_results();
println!("Initial active users: {}", initial_users.len());
// Listen for updates
while let Some(updated_list) = watcher.next().await {
    println!("List updated! New count: {}", updated_list.len());
    // update_ui(updated_list);
}
```
## 2. How it Works Internally
1. **Baseline**: When you call `.watch()`, Aurora executes the query and captures the initial result set.
2. **Subscription**: It automatically opens a PubSub listener for the target collection.
3. **Diffing**: Every time a mutation occurs in the collection, the watcher evaluates the change against your query filters.
4. **Re-Evaluation**: If the change affects your result set (e.g., a new document matches, or an existing one is deleted), the watcher re-calculates the query and emits the entire new list.
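The four steps above can be approximated with a short in-memory model. It is a simplification: instead of first testing the individual change against the filter, this sketch re-runs the query after every mutation and emits only when the result differs. `User`, `Watcher`, and `run_query` are hypothetical names.

```rust
/// Hypothetical document type for the watched collection.
#[derive(Debug, Clone, PartialEq)]
struct User {
    id: u32,
    active: bool,
}

/// The watched query: all active users.
fn run_query(collection: &[User]) -> Vec<User> {
    collection.iter().filter(|u| u.active).cloned().collect()
}

/// Remembers the last emitted result set.
struct Watcher {
    last: Vec<User>,
}

impl Watcher {
    /// Baseline: run the query once and keep the initial result set.
    fn new(collection: &[User]) -> Self {
        Watcher { last: run_query(collection) }
    }

    /// Called after each mutation. Returns the entire new list when the
    /// result set changed, or `None` when the mutation was irrelevant.
    fn on_mutation(&mut self, collection: &[User]) -> Option<Vec<User>> {
        let current = run_query(collection);
        if current == self.last {
            return None;
        }
        self.last = current.clone();
        Some(current)
    }
}
```

Inserting an inactive user produces no emission, while flipping a user to active emits the full updated list, matching the "entire new list" behaviour described in step 4.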
## 3. Debouncing for Performance
In high-write environments, a query might change 100 times per second. To prevent overwhelming your UI, you can use `.debounce()` to group updates together.
```rust
let mut watcher = db.query("logs")
    .filter(|f| f.eq("level", "error"))
    .debounce(std::time::Duration::from_millis(500)) // Emit at most twice per second
    .watch()
    .await?;
```
## 4. AQL Subscriptions (Network/CLI)
If you are building a client that connects to Aurora over a network (e.g., a WebSocket), you can use the `subscription` operation directly.
```graphql
subscription {
  # Watch all active products
  products(where: { active: { eq: true } }) {
    mutation # "INSERT", "UPDATE", or "DELETE"
    id       # Document ID
    node {   # The full updated document
      name
      price
    }
  }
}
```
### Response Format
The subscription yields a stream of events:
```json
{
  "mutation": "UPDATE",
  "id": "p123",
  "node": { "name": "Laptop", "price": 999.0, "active": true }
}
```
## 5. Use Cases
- **Live Dashboards**: Watch "stats" or "logs" collections to update charts in real-time.
- **Collaborative Apps**: Use reactive queries to show a list of online users or active tasks.
- **Cache Invalidation**: Automatically clear local application caches when the database state changes.
- **Reactive UIs**: A good fit for frameworks like React or SwiftUI. Pipe the `QueryWatcher` stream into your state management system.
## 6. Performance Considerations
- **Filter Complexity**: Keep reactive query filters simple. The more complex the filter, the more work the watcher has to do for every database mutation.
- **Index Usage**: Ensure fields used in reactive filters are **indexed**. This allows the watcher to discard irrelevant mutations instantly.
- **Result Set Size**: `QueryWatcher` emits the **entire list** on every change. If your result set contains thousands of documents, this can consume significant memory and CPU. For large lists, consider using `limit` or subscribing to individual document IDs.
# Aurora DB PubSub Guide
Aurora DB's PubSub (Publish-Subscribe) system is the engine behind its real-time capabilities. It provides a low-latency, asynchronous way to react to data changes across the entire database.
## 1. Core Concepts
- **Topic**: In Aurora, topics are mapped to **Collection Names**.
- **Events**: Every mutation (`insertInto`, `update`, `deleteFrom`, `upsert`) generates a `ChangeEvent`.
- **Listeners**: Asynchronous consumers that receive events for specific collections or the entire database.
## 2. Using PubSub in Rust
The `Aurora` handle provides methods to create `ChangeListener` streams.
### Listening to a Specific Collection
```rust
let mut listener = db.listen("orders");
tokio::spawn(async move {
    while let Ok(event) = listener.recv().await {
        println!("Order {} was {}", event._sid, event.change_type);
    }
});
```
### Listening to All Changes
Perfect for global audit logs or system monitoring.
```rust
let mut global_listener = db.listen_all();
```
## 3. The `ChangeEvent` Structure
When a change occurs, listeners receive a `ChangeEvent` object:
| Field | Type | Description |
| --- | --- | --- |
| `collection` | `String` | The name of the affected collection. |
| `change_type` | `ChangeType` | `Insert`, `Update`, or `Delete`. |
| `_sid` | `String` | The internal system ID of the affected document. |
| `document` | `Option<Document>` | The updated document data (for Insert/Update). |
| `old_document` | `Option<Document>` | The previous version of the document (for Update). |
## 4. Advanced Filtering (`EventFilter`)
You can attach filters to a listener to reduce noise and improve performance. Filtering happens on the publisher side, so irrelevant events are never even sent to your listener's channel.
```rust
use aurora_db::pubsub::EventFilter;

let mut admin_listener = db.listen("users")
    .filter(EventFilter::FieldEquals("role".into(), Value::String("admin".into())));
```
### Available Filters
- `ChangeType(type)`: Only listen for specific operations (e.g., `Insert`).
- `FieldEquals(field, value)`: Only listen for documents where a field matches a specific value.
- `FieldChanged(field)`: Only listen for updates that modified a specific field.
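To make the three filter kinds concrete, here is a standalone model of how each could decide whether an event is delivered. It deliberately avoids Aurora's real types: `SketchFilter`, `ChangeKind`, `Event`, and `accepts` are hypothetical, and documents are simplified to string maps.

```rust
use std::collections::HashMap;

/// Hypothetical change kinds, mirroring Insert / Update / Delete.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ChangeKind {
    Insert,
    Update,
    Delete,
}

/// Simplified document: field name to string value.
type Doc = HashMap<String, String>;

/// Simplified event carrying the new and previous document versions.
struct Event {
    kind: ChangeKind,
    document: Option<Doc>,
    old_document: Option<Doc>,
}

/// Hypothetical counterpart of the filters listed above.
enum SketchFilter {
    ChangeType(ChangeKind),
    FieldEquals(String, String),
    FieldChanged(String),
}

impl SketchFilter {
    /// Returns true when the event should reach the listener.
    fn accepts(&self, event: &Event) -> bool {
        match self {
            SketchFilter::ChangeType(kind) => event.kind == *kind,
            SketchFilter::FieldEquals(field, value) => {
                event.document.as_ref().and_then(|d| d.get(field)) == Some(value)
            }
            // Only meaningful when both versions exist, i.e. for updates.
            SketchFilter::FieldChanged(field) => {
                match (&event.old_document, &event.document) {
                    (Some(old), Some(new)) => old.get(field) != new.get(field),
                    _ => false,
                }
            }
        }
    }
}
```

Running `accepts` before an event is queued is the idea behind publisher-side filtering: a rejected event never occupies space in the listener's channel.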
## 5. Subscriptions (AQL)
For network-based clients (like a WebSocket-connected frontend), use the AQL `subscription` operation.
```graphql
subscription {
  # Listen for new high-value orders
  orders(where: { total: { gt: 1000.0 } }) {
    mutation # "INSERT", "UPDATE", "DELETE"
    id
    node {
      total
      customer_name
    }
  }
}
```
## 6. Performance & Architecture
- **Asynchronous Delivery**: Change events are published to an internal broadcast channel. This ensures that write operations (mutations) are not slowed down by slow listeners.
- **Zero-Copy when possible**: Aurora uses `Arc` internally to share document data between multiple listeners without expensive cloning.
- **Channel Capacity**: Each listener has a bounded internal buffer. If a listener is too slow and its buffer fills up, it will start missing events (lag detection).
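The channel-capacity behaviour can be pictured with a bounded queue that never blocks the publisher. The sketch below assumes that a full buffer drops its oldest event and counts the loss, which is one common policy but is not confirmed by the text above; `BoundedListener` and its methods are hypothetical.

```rust
use std::collections::VecDeque;

/// Hypothetical listener buffer with a fixed capacity and a lag counter.
struct BoundedListener {
    buffer: VecDeque<u64>,
    capacity: usize,
    missed: u64,
}

impl BoundedListener {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        BoundedListener {
            buffer: VecDeque::with_capacity(capacity),
            capacity,
            missed: 0,
        }
    }

    /// Publisher side: never waits. When the buffer is full, the oldest
    /// event is dropped (assumed policy) and recorded as missed.
    fn publish(&mut self, event_id: u64) {
        if self.buffer.len() == self.capacity {
            self.buffer.pop_front();
            self.missed += 1;
        }
        self.buffer.push_back(event_id);
    }

    /// Consumer side: takes the next buffered event, if any.
    fn recv(&mut self) -> Option<u64> {
        self.buffer.pop_front()
    }

    /// Lag detection: returns how many events were missed and resets the counter.
    fn take_missed(&mut self) -> u64 {
        std::mem::take(&mut self.missed)
    }
}
```

With a capacity of 2, publishing events 1, 2, and 3 before the consumer reads anything leaves events 2 and 3 in the buffer and reports one missed event.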
## 7. Best Practices
1. **Filter Early**: Always use `EventFilter` or AQL `where` clauses to minimize the number of events your application logic has to process.
2. **Idempotent Handlers**: Design your event handlers to be idempotent. In rare crash scenarios, an event might be delivered more than once.
3. **Non-Blocking**: Never perform heavy I/O or long-running synchronous tasks directly inside an event loop. Spawn a new task instead.
4. **Use `listen_all` Sparingly**: Only use global listeners for truly cross-cutting concerns like auditing or replication.
# Aurora DB Durable Workers Guide
Aurora includes a built-in, persistent job queue system. This allows you to offload time-consuming tasks to the background while ensuring they survive system restarts and are retried automatically on failure.
## 1. Core Architecture
The worker system consists of three main components:
- **Job Queue**: A persistent collection in Aurora that stores pending, active, and failed jobs.
- **Handlers**: Logic defined in Rust that knows how to process a specific "job type."
- **Executor**: A background system that pulls jobs from the queue based on priority and schedules them for execution across a thread pool.
## 2. Enabling the System (Rust)
To use workers, you must enable them in your initial configuration.
```rust
let config = AuroraConfig {
    workers_enabled: true, // Critical: enables the system
    worker_threads: 4,     // Parallel processing threads
    ..Default::default()
};
let db = Aurora::with_config(config).await?;
```
## 3. Registering Handlers
A handler is any struct that implements the `JobHandler` trait.
```rust
use aurora_db::workers::{Job, JobHandler, JobResult};
use async_trait::async_trait;

struct ImageProcessor;

#[async_trait]
impl JobHandler for ImageProcessor {
    async fn handle(&self, job: &Job) -> JobResult {
        // Note: `unwrap()` panics if `path` is missing or not a string;
        // a production handler should return an error instead.
        let path = job.payload["path"].as_str().unwrap();
        println!("Processing image at {}...", path);
        // ... heavy processing logic ...
        Ok(())
    }
}

// Register the handler with the database's worker system
db.workers().unwrap().register_handler("process_image", Box::new(ImageProcessor));
```
## 4. Enqueuing Jobs
### Via AQL
```graphql
mutation {
  enqueueJob(
    type: "process_image",
    payload: { path: "/uploads/cat.jpg" },
    priority: NORMAL,
    retries: 5
  ) {
    jobId
  }
}
```
### Via Rust API
```rust
db.enqueue_job("process_image", object!({ "path": "/uploads/dog.png" }), 0).await?;
```
## 5. Automation Handlers (`on` events)
Aurora can automatically enqueue jobs in response to database events. This is similar to "Triggers" in traditional databases but executes asynchronously in the background.
```graphql
# When a new product is added, automatically generate its thumbnails
define handler "auto_thumbnail" {
  on: "insert:products",
  action: {
    enqueueJob(
      type: "process_image",
      payload: {
        product_id: "${id}",
        path: "${image_url}"
      }
    )
  }
}
```
## 6. Priority & Retries
### Priority Levels
Jobs are processed in order of priority:
1. **CRITICAL**: Processed immediately, ahead of all others.
2. **HIGH**: Priority tasks.
3. **NORMAL**: Default level.
4. **LOW**: Background/maintenance tasks.
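The priority ordering can be modelled with a max-heap. In this sketch, `Priority`, `QueuedJob`, and `JobQueue` are hypothetical, and first-in-first-out ordering among jobs of equal priority is an assumption that the list above does not specify.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Variants are declared lowest first so the derived ordering ranks
/// `Critical` above `High`, `Normal`, and `Low`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Low,
    Normal,
    High,
    Critical,
}

/// Field order matters: the derived `Ord` compares `priority` first, then
/// the reversed sequence number, so older jobs win ties (assumed FIFO).
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct QueuedJob {
    priority: Priority,
    seq: Reverse<u64>,
    job_type: String,
}

struct JobQueue {
    heap: BinaryHeap<QueuedJob>,
    next_seq: u64,
}

impl JobQueue {
    fn new() -> Self {
        JobQueue { heap: BinaryHeap::new(), next_seq: 0 }
    }

    /// Adds a job and stamps it with an increasing sequence number.
    fn enqueue(&mut self, job_type: &str, priority: Priority) {
        self.heap.push(QueuedJob {
            priority,
            seq: Reverse(self.next_seq),
            job_type: job_type.to_string(),
        });
        self.next_seq += 1;
    }

    /// Pops the job type that should run next, if any.
    fn next(&mut self) -> Option<String> {
        self.heap.pop().map(|job| job.job_type)
    }
}
```

Enqueuing a `Low` job, then two `Normal` jobs, then a `Critical` one yields the `Critical` job first, the two `Normal` jobs in arrival order, and the `Low` job last.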
### Error Handling & Retries
- If a handler returns `Ok(())`, the job is marked as `COMPLETED`.
- If it returns `Err(e)`, Aurora increments the retry counter.
- If `retries` < `max_retries`, the job is scheduled for a **Backoff Retry** (the delay increases after each failure).
- If `retries` reaches `max_retries`, the job is marked as `FAILED`.
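The retry rules above amount to a small state transition, sketched below. The text only says that the delay increases after each failure; doubling from a base delay is an assumption made for illustration, and `Outcome`, `backoff_delay`, and `after_attempt` are hypothetical names.

```rust
use std::time::Duration;

/// What happens to a job after one execution attempt.
#[derive(Debug, PartialEq)]
enum Outcome {
    Completed,
    RetryAfter(Duration),
    Failed,
}

/// Assumed policy: the delay doubles with each failure, starting at `base`.
/// The exponent is capped so the multiplication cannot overflow `u32`.
fn backoff_delay(failures: u32, base: Duration) -> Duration {
    let exponent = failures.saturating_sub(1).min(16);
    base * 2u32.pow(exponent)
}

/// Applies the rules: success completes the job; a failure increments the
/// retry counter and either schedules a retry or marks the job as failed.
fn after_attempt(
    succeeded: bool,
    retries_so_far: u32,
    max_retries: u32,
    base: Duration,
) -> (u32, Outcome) {
    if succeeded {
        return (retries_so_far, Outcome::Completed);
    }
    let retries = retries_so_far + 1;
    if retries < max_retries {
        (retries, Outcome::RetryAfter(backoff_delay(retries, base)))
    } else {
        (retries, Outcome::Failed)
    }
}
```

With `max_retries` set to 3 and a one-second base, the first failure schedules a retry after 1 second, the second after 2 seconds, and the third marks the job as failed.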
## 7. Monitoring Jobs
Since jobs are just documents in a special internal collection, you can query them using AQL:
```graphql
query {
  _jobs(where: { status: { eq: "FAILED" } }) {
    id
    type
    error
    last_retry
  }
}
```
## 8. Best Practices
1. **Keep Payloads Small**: Don't store large blobs in the job payload. Store a file path or document ID instead.
2. **Idempotency**: Ensure your handlers can safely run multiple times. If a job is retried after a partial failure, it should not cause inconsistent state.
3. **Graceful Shutdown**: The Aurora worker system automatically waits for active jobs to finish (up to a timeout) when the database is closed.
4. **Avoid Long-Running Sync Code**: Since the handlers are `async`, avoid `std::thread::sleep` or blocking I/O. Use `tokio::time::sleep` or `spawn_blocking` for heavy CPU tasks.
# Aurora DB Performance Optimization Guide
Aurora is designed for extreme performance in both read-heavy and write-heavy workloads. This guide provides an in-depth look at the engine's architecture and how to tune it for maximum efficiency.
## 1. Tiered Storage Architecture
Aurora uses a unique "Hot/Cold" storage system to balance speed and durability.
### The Hot Path (In-Memory)
- **DashMap/Moka**: Aurora keeps frequently accessed documents in a concurrent hash map or a high-performance LRU cache (`Moka`).
- **Performance**: Up to **1,000,000+ reads/sec**.
- **Optimization**: Ensure your `hot_cache_size_mb` is large enough to hold your most active "working set."
### The Cold Path (Persistent)
- **Sled/MMAP**: Persistent data lives in highly optimized B-Trees (`Sled`) or memory-mapped files.
- **Performance**: SSD-bound, typically **50,000+ reads/sec**.
- **Optimization**: Use high-performance NVMe drives for best results.
## 2. Advanced Indexing
Indexing is the single most important factor for query performance.
### Primary Index
- **Field**: `id` / `_sid`.
- **Performance**: Always O(1). Use direct ID lookups whenever possible.
### Secondary Indices (Roaring Bitmaps)
When you mark a field as `@indexed` or `@unique`, Aurora creates a compressed **Roaring Bitmap** index.
- **Bitwise Intersection**: When you query with multiple `where` filters (e.g., `age > 18 AND status == "active"`), Aurora performs a bitwise intersection of the two bitmaps. This is incredibly fast and completely avoids scanning the actual documents.
- **Memory Efficiency**: Roaring Bitmaps are highly compressed, allowing you to index millions of documents with minimal memory overhead.
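The bitwise-intersection idea can be demonstrated with an uncompressed bitset, where bit `n` is set when document `n` satisfies a filter. This is a teaching sketch only: real Roaring Bitmaps add compression that is omitted here, and `Bitset` with its methods is a hypothetical type.

```rust
/// Plain fixed-size bitset: one bit per document number.
#[derive(Debug, Clone, PartialEq)]
struct Bitset {
    words: Vec<u64>,
}

impl Bitset {
    /// Creates an empty bitset able to hold `bits` document numbers.
    fn with_capacity(bits: usize) -> Self {
        Bitset { words: vec![0; (bits + 63) / 64] }
    }

    /// Marks document `doc` as matching. Panics if `doc` exceeds the capacity.
    fn insert(&mut self, doc: usize) {
        self.words[doc / 64] |= 1u64 << (doc % 64);
    }

    /// Intersection: one AND per 64 documents, with no document access.
    fn and(&self, other: &Bitset) -> Bitset {
        Bitset {
            words: self
                .words
                .iter()
                .zip(&other.words)
                .map(|(a, b)| a & b)
                .collect(),
        }
    }

    /// Lists the matching document numbers in ascending order.
    fn ids(&self) -> Vec<usize> {
        let mut out = Vec::new();
        for (w, word) in self.words.iter().enumerate() {
            for bit in 0..64 {
                if (word & (1u64 << bit)) != 0 {
                    out.push(w * 64 + bit);
                }
            }
        }
        out
    }
}
```

If one bitset marks documents 1, 5, 70, and 100 and another marks 5, 64, and 100, the intersection contains only 5 and 100, and only those documents would need to be fetched.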
### Full-Text Search Indices
- **Usage**: Use for keyword searching in large text fields.
- **Ranking**: Aurora uses a built-in relevance scoring algorithm based on term frequency and distance.
## 3. Query Optimization Best Practices
### Use `explain`
Before deploying a critical query, use `db.explain()` or the `@explain` directive in AQL to see how Aurora will execute it.
- **Look for**: `index_used: true`. If `false`, Aurora is performing a full collection scan (O(n)), which is slow for large collections.
### Selective Projections
Always select only the fields you need.
- **Why?**: Bypasses the overhead of serializing large documents.
- **Example**: `query { users { name } }` is much faster than fetching full user profiles.
### Use `@defer`
For fields containing large blobs or long text, use the `@defer` directive. This allows Aurora to return the primary document immediately while delaying the retrieval of the "heavy" fields.
## 4. Tuning the Engine (Rust)
```rust
let config = AuroraConfig {
    // Increase for larger working sets
    hot_cache_size_mb: 1024,
    // Enable for 3-5x faster write throughput
    enable_write_buffering: true,
    // Adjust based on your durability requirements
    durability_mode: DurabilityMode::WAL,
    // Background worker threads for indexing/compaction
    worker_threads: 4,
    ..Default::default()
};
```
## 5. Write Performance
### Batching
The overhead of a mutation is largely in the Write-Ahead Log (WAL) and disk sync.
- Use `insertMany` for bulk data.
- Use `transaction` blocks to group multiple updates.
### Write Buffering
Enabling `enable_write_buffering` allows Aurora to collect writes in memory and flush them to disk in large, contiguous chunks. This dramatically improves throughput on slow disks.
## 6. Benchmarks (Production-Ready)
| Operation | Throughput | Complexity |
| --- | --- | --- |
| **Point Lookup (ID)** | ~1,200,000 ops/sec | O(1) |
| **Indexed Filter** | ~450,000 ops/sec | O(log N) |
| **Bitwise AND Filter** | ~300,000 ops/sec | O(log N) |
| **Full Collection Scan** | ~15,000 docs/sec | O(N) |
| **Concurrent Writes** | ~80,000 ops/sec | O(1) |
*Note: These benchmarks are achieved using the Rust API with `doc!` and `object!` macros.*