Application Bootstrap Provider
Overview
The Application Bootstrap Provider is a bootstrap plugin for Drasi that replays stored insert events during query subscription. It works in conjunction with the Application Source to provide initial data (bootstrap) to queries when they start or subscribe to a data source.
This provider implements in-memory replay of historical insert events, enabling queries to establish their initial state before processing real-time updates. It's designed primarily for testing scenarios, embedded applications, and situations where you need programmatic control over bootstrap data.
Key Capabilities
- In-Memory Event Replay: Stores and replays insert events that were previously sent through an Application Source
- Label-Based Filtering: Filters bootstrap events based on node and relation labels requested by queries
- Shared Data Storage: Can share bootstrap data with Application Source instances via Arc<RwLock<Vec<SourceChange>>>
- Isolated Storage Mode: Can operate independently with its own isolated bootstrap data storage
- Event Counting: Tracks and reports the number of bootstrap events replayed to each query
How It Works
- Event Storage: Insert events sent through an Application Source are stored in a shared vector
- Bootstrap Request: When a query subscribes to a source, it sends a bootstrap request with required labels
- Event Filtering: The provider filters stored events to match the requested node/relation labels
- Event Replay: Matching events are counted and made available to the query
- Completion: The provider reports the count of replayed events
Use Cases
Ideal for:
- Testing query behavior with pre-populated data
- Embedded Drasi applications that manage their own data
- Programmatic data generation scenarios
- Development and debugging environments
- Scenarios requiring precise control over bootstrap data
Not suitable for:
- Production systems requiring persistent bootstrap state (use PostgreSQL or Platform sources instead)
- Large datasets (all data is stored in memory)
- Distributed systems where bootstrap data must survive restarts
Architecture
The Application Bootstrap Provider uses a simple architecture:
ApplicationSource
|
| Stores Insert Events
v
bootstrap_data: Arc<RwLock<Vec<SourceChange>>>
^
| Shared Reference
|
ApplicationBootstrapProvider
|
| Replays Events on Bootstrap Request
v
Query Subscription
Components
- ApplicationBootstrapProvider: The main provider struct that implements the BootstrapProvider trait
- bootstrap_data: Shared vector of SourceChange::Insert events
- ApplicationBootstrapProviderBuilder: Fluent builder for creating provider instances
Data Sharing
The provider can operate in two modes:
- Shared Data Mode: Shares bootstrap_data with an Application Source, allowing it to replay actual insert events
- Isolated Mode: Uses its own independent storage (useful for testing the provider itself)
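The difference between the two modes comes down to who else holds a handle to the vector. The following is a minimal sketch of that idea, not the provider's actual code: `Change`, `Store`, `Provider`, and `sharing_demo` are hypothetical stand-ins, and `std::sync::RwLock` replaces the async lock so the example runs without a runtime.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for SourceChange; only an element id is modeled.
#[derive(Clone, Debug, PartialEq)]
struct Change(String);

type Store = Arc<RwLock<Vec<Change>>>;

// Hypothetical stand-in for the provider: it only holds a handle to the store.
struct Provider {
    data: Store,
}

impl Provider {
    // Isolated mode: the provider allocates storage nobody else can reach.
    fn isolated() -> Self {
        Provider { data: Arc::new(RwLock::new(Vec::new())) }
    }

    // Shared mode: the provider reuses storage that a source also writes to.
    fn shared(data: Store) -> Self {
        Provider { data }
    }

    fn stored_count(&self) -> usize {
        self.data.read().unwrap().len()
    }
}

// Returns (events visible to the shared provider, events visible to the isolated one).
fn sharing_demo() -> (usize, usize) {
    let store: Store = Arc::new(RwLock::new(Vec::new()));
    let shared = Provider::shared(Arc::clone(&store));
    let isolated = Provider::isolated();

    // Simulate the source recording an insert through its own handle.
    store.write().unwrap().push(Change("person-1".to_string()));

    (shared.stored_count(), isolated.stored_count())
}
```

In this sketch the shared provider sees the insert written through the source's handle, while the isolated provider stays empty until something writes to it directly.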
Configuration
The Application Bootstrap Provider does not use YAML configuration. It is created programmatically and configured through code.
Builder Configuration
The provider uses a builder pattern for configuration:
use ApplicationBootstrapProvider;
// Create with isolated storage (for testing)
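let provider = ApplicationBootstrapProvider::builder().build();
// Create with shared storage (normal usage with an Application Source)
use std::sync::Arc;
use tokio::sync::RwLock;
use SourceChange;
let shared_data = Arc::new(RwLock::new(Vec::<SourceChange>::new()));
let provider = ApplicationBootstrapProvider::builder()
    .with_shared_data(shared_data)
    .build();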
Constructor Options
The provider offers multiple construction methods:
| Method | Description | Use Case |
|---|---|---|
| new() | Creates provider with isolated storage | Testing provider behavior |
| with_shared_data(Arc<RwLock<Vec<SourceChange>>>) | Creates provider sharing storage with Application Source | Normal usage |
| builder() | Returns builder for fluent configuration | Flexible configuration |
| default() | Same as new() | Default construction |
Usage Examples
Basic Setup with Application Source
The most common usage is connecting the bootstrap provider to an Application Source:
use ApplicationBootstrapProvider;
use ApplicationSource;
use SourceConfig;
use HashMap;
use mpsc;
// Create Application Source
let config = SourceConfig ;
let = channel;
let = channel;
let = new;
// Create Bootstrap Provider sharing the source's bootstrap data
// Note: This requires accessing the source's internal bootstrap_data field
// In practice, ApplicationSource handles this internally
let bootstrap_data = source.get_bootstrap_data();
let provider = ApplicationBootstrapProvider::with_shared_data(bootstrap_data);
Using the Builder Pattern
use ApplicationBootstrapProvider;
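use std::sync::Arc;
use tokio::sync::RwLock;
use SourceChange;
// Create shared bootstrap storage
let bootstrap_data = Arc::new(RwLock::new(Vec::<SourceChange>::new()));
// Build provider with shared data (clone the Arc so this handle stays usable)
let provider = ApplicationBootstrapProvider::builder()
    .with_shared_data(bootstrap_data.clone())
    .build();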
// The bootstrap_data can now be populated by other components
// (typically by ApplicationSource when it receives insert events)
Testing with Isolated Storage
use ApplicationBootstrapProvider;
use ;
async
Storing Insert Events
The provider offers methods to manage stored events (typically called by Application Source):
use ApplicationBootstrapProvider;
use ;
let provider = ApplicationBootstrapProvider::new();
// Store an insert event
let element = Node ;
provider.store_insert_event(SourceChange::Insert { element }).await;
// Get all stored events (for inspection or testing)
let events = provider.get_stored_events().await;
println!("Stored {} events", events.len());
// Clear stored events (for testing or reset)
provider.clear_stored_events().await;
Label-Based Filtering
The provider automatically filters events based on query requirements:
// When a query requests bootstrap with specific labels:
// BootstrapRequest {
// query_id: "my-query",
// node_labels: vec!["Person", "Employee"],
// relation_labels: vec!["WORKS_FOR"],
// }
// The provider will only replay events that match:
// - Nodes with labels "Person" OR "Employee"
// - Relations with label "WORKS_FOR"
// - All events if both label lists are empty
API Reference
ApplicationBootstrapProvider
Constructor Methods
new() -> Self
Creates a new provider with isolated bootstrap data storage.
Returns: A provider instance with its own independent storage
Example:
let provider = ApplicationBootstrapProvider::new();
with_shared_data(bootstrap_data: Arc<RwLock<Vec<SourceChange>>>) -> Self
Creates a provider sharing bootstrap data with an Application Source.
Parameters:
bootstrap_data: Shared reference to the Application Source's bootstrap data
Returns: A provider instance connected to shared storage
Example:
let shared_data = Arc::new(RwLock::new(Vec::new()));
let provider = ApplicationBootstrapProvider::with_shared_data(shared_data);
builder() -> ApplicationBootstrapProviderBuilder
Creates a builder for fluent configuration.
Returns: Builder instance for constructing the provider
Example:
let provider = ApplicationBootstrapProvider::builder()
    .with_shared_data(shared_data)
    .build();
default() -> Self
Creates a provider using default settings (same as new()).
Returns: A provider instance with isolated storage
Event Management Methods
store_insert_event(&self, change: SourceChange) -> ()
Stores an insert event for future bootstrap replay.
Parameters:
change: A SourceChange::Insert event (other variants are ignored)
Behavior:
- Only stores Insert events; ignores Update, Delete, and Future events
- Events are appended to the bootstrap data vector
- Thread-safe (uses async write lock)
Example:
let element = Node ;
provider.store_insert_event(SourceChange::Insert { element }).await;
get_stored_events(&self) -> Vec<SourceChange>
Returns a copy of all stored insert events.
Returns: Vector of stored SourceChange::Insert events
Use Cases:
- Testing and verification
- Debugging bootstrap behavior
- Inspecting stored data
Example:
let events = provider.get_stored_events().await;
println!("Stored {} events", events.len());
clear_stored_events(&self) -> ()
Removes all stored events from bootstrap storage.
Use Cases:
- Resetting state between tests
- Clearing stale data
- Reinitializing bootstrap state
Example:
provider.clear_stored_events().await;
assert_eq!(provider.get_stored_events().await.len(), 0);
ApplicationBootstrapProviderBuilder
Methods
new() -> Self
Creates a new builder instance.
Example:
let builder = ApplicationBootstrapProviderBuilder::new();
with_shared_data(self, data: Arc<RwLock<Vec<SourceChange>>>) -> Self
Configures the builder to use shared bootstrap data.
Parameters:
data: Shared reference to bootstrap data storage
Returns: Builder instance for method chaining
Example:
let builder = ApplicationBootstrapProviderBuilder::new()
    .with_shared_data(shared_data);
build(self) -> ApplicationBootstrapProvider
Constructs the final ApplicationBootstrapProvider instance.
Returns: Configured provider instance
Example:
let provider = ApplicationBootstrapProviderBuilder::new()
    .with_shared_data(shared_data)
    .build();
default() -> Self
Creates a builder using default settings.
Example:
let provider = ApplicationBootstrapProviderBuilder::default().build();
BootstrapProvider Trait Implementation
bootstrap(&self, request: BootstrapRequest, context: &BootstrapContext, event_tx: BootstrapEventSender) -> Result<usize>
Processes a bootstrap request from a query subscription.
Parameters:
request: Bootstrap request containing query ID and required labels
context: Bootstrap context (currently unused by this provider)
event_tx: Channel for sending bootstrap events (currently unused; ApplicationSource handles event sending)
Returns:
Result<usize>: Number of matching events found, or error if bootstrap fails
Behavior:
- Logs the bootstrap request details
- Reads stored insert events from shared storage
- Filters events based on requested node and relation labels
- Counts matching events
- Returns the count (actual event transmission is handled by ApplicationSource)
Label Matching Logic:
- Node labels: Event matches if any node label matches any requested label
- Relation labels: Event matches if any relation label matches any requested label
- Empty label lists: Matches all events of that type
- Both lists empty: Matches all events
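The matching rules above can be sketched as a small predicate. This is an illustration under stated assumptions rather than the provider's implementation: the `Element` enum is a hypothetical, simplified model that carries only labels, and `labels_match`, `matches_request`, `count_matching`, and `matching_demo` are names invented for the example.

```rust
// Hypothetical, simplified element model; the real Element type carries more data.
enum Element {
    Node { labels: Vec<String> },
    Relation { labels: Vec<String> },
}

// An empty request list matches every element of that kind;
// otherwise any overlap between the two label sets is a match.
fn labels_match(element_labels: &[String], requested: &[String]) -> bool {
    requested.is_empty() || element_labels.iter().any(|l| requested.contains(l))
}

// Nodes are checked against the node labels, relations against the relation labels.
fn matches_request(element: &Element, node_labels: &[String], relation_labels: &[String]) -> bool {
    match element {
        Element::Node { labels } => labels_match(labels, node_labels),
        Element::Relation { labels } => labels_match(labels, relation_labels),
    }
}

// Mirrors the provider's reported result: the number of matching events.
fn count_matching(events: &[Element], node_labels: &[String], relation_labels: &[String]) -> usize {
    events
        .iter()
        .filter(|e| matches_request(e, node_labels, relation_labels))
        .count()
}

fn strings(items: &[&str]) -> Vec<String> {
    items.iter().map(|s| s.to_string()).collect()
}

// Returns (count with label filters, count with both lists empty).
fn matching_demo() -> (usize, usize) {
    let events = vec![
        Element::Node { labels: strings(&["Person"]) },
        Element::Node { labels: strings(&["Company"]) },
        Element::Relation { labels: strings(&["WORKS_FOR"]) },
        Element::Relation { labels: strings(&["OWNS"]) },
    ];
    let filtered = count_matching(
        &events,
        &strings(&["Person", "Employee"]),
        &strings(&["WORKS_FOR"]),
    );
    let unfiltered = count_matching(&events, &[], &[]);
    (filtered, unfiltered)
}
```

With the sample data, the filtered request matches the Person node and the WORKS_FOR relation, and the request with two empty lists matches all four events.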
Integration with Application Source
The Application Bootstrap Provider is designed to work seamlessly with the Application Source:
Event Flow
Application Code
|
| handle.send_node_insert(...)
v
ApplicationSource
|
+-- Stores Insert Event in bootstrap_data
|
+-- Forwards Event to Query Engine
v
Query Engine
Bootstrap Flow
Query Subscription Request
|
v
ApplicationSource.subscribe()
|
| Reads bootstrap_data
|
+-- Filters by Requested Labels
|
+-- Sends Bootstrap Events
v
Query Initialization
Important Notes
- Current Implementation: As of the current version, ApplicationSource handles bootstrap directly in its subscribe() method (lines 337-384 in sources/application/mod.rs). The ApplicationBootstrapProvider exists for testing and potential future integration where bootstrap logic might be delegated to the provider system.
- Data Connection: To connect the provider to an Application Source, you must share the same Arc<RwLock<Vec<SourceChange>>> reference between them.
- Event Storage: Only Insert events are stored. Update and Delete events are not included in bootstrap data, as bootstrap represents the initial creation state, not the current state.
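The insert-only rule has a visible consequence: an element that is later deleted still appears in bootstrap data. The sketch below shows this with hypothetical stand-ins (`Change`, `Store`, `insert_only_demo`); payloads are reduced to an element id and `std::sync::RwLock` replaces the async lock, so it models the behavior described here rather than reproducing the provider's code.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for SourceChange; payloads are reduced to an element id.
#[derive(Clone, Debug, PartialEq)]
enum Change {
    Insert(String),
    Update(String),
    Delete(String),
}

// Hypothetical store mirroring the documented Arc<RwLock<Vec<_>>> layout.
struct Store {
    data: Arc<RwLock<Vec<Change>>>,
}

impl Store {
    fn new() -> Self {
        Store { data: Arc::new(RwLock::new(Vec::new())) }
    }

    // Appends the change only when it is an Insert; other variants are dropped.
    fn store_insert_event(&self, change: Change) {
        if matches!(&change, Change::Insert(_)) {
            self.data.write().unwrap().push(change);
        }
    }

    // Returns a copy of everything stored so far.
    fn stored(&self) -> Vec<Change> {
        self.data.read().unwrap().clone()
    }
}

// Sends an insert, an update, and a delete for the same element id.
fn insert_only_demo() -> Vec<Change> {
    let store = Store::new();
    store.store_insert_event(Change::Insert("a".to_string()));
    store.store_insert_event(Change::Update("a".to_string()));
    store.store_insert_event(Change::Delete("a".to_string()));
    store.stored()
}
```

Only the original insert for "a" remains in storage, even though the element was subsequently updated and deleted.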
Data Types
SourceChange
Events that can be stored and replayed:
BootstrapRequest
Request sent by queries when subscribing:
Performance Considerations
Memory Usage
- All events stored in memory: The provider stores every insert event in a Vec<SourceChange>
- No size limits: The vector grows unbounded with the number of insert events
- Recommendation: For large datasets, consider using a persistent bootstrap provider (PostgreSQL, Platform)
Concurrency
- Thread-safe: Uses RwLock for concurrent access to bootstrap data
- Read-heavy optimization: Multiple readers can access bootstrap data simultaneously
- Write blocking: Storing events acquires a write lock, blocking all readers and writers
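The reader/writer behavior listed above can be observed directly. This sketch uses `std::sync::RwLock` as a stand-in for the async lock (an assumption made so the example needs no runtime); `concurrent_reads_demo` is a hypothetical name. Two threads must hold read guards at the same time to pass a barrier, and a write attempt is refused while a reader is active.

```rust
use std::sync::{Arc, Barrier, RwLock};
use std::thread;

// Returns (sum of lengths seen by two simultaneous readers, whether a write was refused).
fn concurrent_reads_demo() -> (usize, bool) {
    let data = Arc::new(RwLock::new(vec![1u32, 2, 3]));
    let barrier = Arc::new(Barrier::new(2));

    let readers: Vec<_> = (0..2)
        .map(|_| {
            let data = Arc::clone(&data);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                let guard = data.read().unwrap();
                // Both threads must hold a read guard at once to get past this point.
                barrier.wait();
                let len = guard.len();
                len
            })
        })
        .collect();

    let total: usize = readers.into_iter().map(|h| h.join().unwrap()).sum();

    // While a read guard is alive, a writer cannot acquire the lock.
    let guard = data.read().unwrap();
    let write_refused = data.try_write().is_err();
    drop(guard);

    (total, write_refused)
}
```

If read locks were exclusive, the two reader threads would deadlock at the barrier; they complete because read guards coexist.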
Scalability
| Scenario | Performance Impact |
|---|---|
| Few insert events (< 1000) | Excellent - minimal overhead |
| Medium datasets (1000-10000) | Good - vector operations efficient |
| Large datasets (> 10000) | Poor - consider persistent storage |
| High write rate | Moderate - write lock contention possible |
| Many concurrent queries | Good - read locks don't block each other |
Comparison with Other Bootstrap Providers
Application Bootstrap vs PostgreSQL Bootstrap
| Feature | Application | PostgreSQL |
|---|---|---|
| Storage | In-memory | Database snapshot |
| Persistence | Lost on restart | Survives restarts |
| Data Source | Programmatic | Database tables |
| Performance | Fastest (in-memory) | Slower (database query) |
| Use Case | Testing, embedded | Production systems |
Application Bootstrap vs Platform Bootstrap
| Feature | Application | Platform |
|---|---|---|
| Storage | In-memory vector | Remote Query API + Redis Streams |
| Data Source | Direct API calls | External Drasi instance |
| Complexity | Minimal | High (requires external services) |
| Use Case | Single-instance apps | Multi-environment integration |
Known Limitations
- No Persistence: All bootstrap data is lost when the application restarts
- Memory Bounded: Large datasets will consume significant memory
- Insert Events Only: Updates and deletes are not reflected in bootstrap data
- No Deduplication: Duplicate insert events with the same element ID are stored
- Manual Management: Application code must ensure bootstrap data is populated correctly
- No Automatic Cleanup: Old or deleted elements remain in bootstrap storage
- Testing Focused: Primarily designed for testing scenarios, not production use
Best Practices
For Testing
// Clear bootstrap data between tests
async
For Embedded Applications
// Share bootstrap data with Application Source
let bootstrap_data = Arc::new(RwLock::new(Vec::new()));
let provider = ApplicationBootstrapProvider::with_shared_data(bootstrap_data.clone());
let = with_bootstrap_data;
// Bootstrap data is automatically populated as you send events
handle.send_node_insert(...).await?;
For Development
// Inspect bootstrap data during debugging
let events = provider.get_stored_events().await;
for in events.iter.enumerate
Troubleshooting
Bootstrap Data Not Replaying
Problem: Queries don't receive expected bootstrap data
Solutions:
- Verify the provider shares the same Arc<RwLock<Vec<SourceChange>>> as the Application Source
- Check that insert events are being stored (use get_stored_events())
- Ensure query label filters match stored event labels
- Verify the provider is registered with the bootstrap system
Memory Usage Growing
Problem: Application memory usage increases over time
Solutions:
- Periodically call clear_stored_events() if bootstrap data is no longer needed
- Consider implementing a maximum event limit
- Switch to a persistent bootstrap provider for production
- Use a different source type (PostgreSQL, Platform) for large datasets
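One way to implement a maximum event limit is a fixed-capacity queue that evicts the oldest event when full. The provider described here has no such feature; `BoundedStore`, `max_events`, and `bounded_demo` are hypothetical names used only to sketch the idea.

```rust
use std::collections::VecDeque;

// Hypothetical bounded store; not part of the provider's API.
struct BoundedStore<T> {
    events: VecDeque<T>,
    max_events: usize,
}

impl<T> BoundedStore<T> {
    fn new(max_events: usize) -> Self {
        BoundedStore {
            events: VecDeque::with_capacity(max_events),
            max_events,
        }
    }

    // Appends an event, evicting the oldest one when the cap is reached.
    // Returns the evicted event, if any.
    fn push(&mut self, event: T) -> Option<T> {
        if self.max_events == 0 {
            // A zero-capacity store keeps nothing, so the event itself is discarded.
            return Some(event);
        }
        let evicted = if self.events.len() == self.max_events {
            self.events.pop_front()
        } else {
            None
        };
        self.events.push_back(event);
        evicted
    }

    fn len(&self) -> usize {
        self.events.len()
    }
}

// Pushes three events into a store capped at two.
fn bounded_demo() -> (usize, Option<u32>) {
    let mut store = BoundedStore::new(2);
    store.push(1);
    store.push(2);
    let evicted = store.push(3);
    (store.len(), evicted)
}
```

The trade-off is that evicted inserts are no longer replayed, so a query that bootstraps later will not see those elements.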
Bootstrap Events Don't Match Current State
Problem: Bootstrap data reflects old state after updates/deletes
Explanation: This is expected behavior. Bootstrap only stores insert events, not the cumulative effect of updates and deletes.
Solutions:
- Accept that bootstrap represents creation events, not current state
- Use a different source type that supports state snapshots (PostgreSQL)
- Manually rebuild bootstrap data from current state when needed
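Rebuilding bootstrap data from current state amounts to folding a change log into a map and emitting one insert per surviving element. The sketch below assumes the application keeps its own log of changes; `Change`, `rebuild_bootstrap`, and `rebuild_demo` are hypothetical, and element payloads are reduced to a single integer value.

```rust
use std::collections::BTreeMap;

// Hypothetical change model; the payload is reduced to one integer value.
#[derive(Clone, Debug, PartialEq)]
enum Change {
    Insert { id: String, value: i64 },
    Update { id: String, value: i64 },
    Delete { id: String },
}

// Folds the log into current state, then emits one Insert per surviving element.
fn rebuild_bootstrap(log: &[Change]) -> Vec<Change> {
    let mut current: BTreeMap<String, i64> = BTreeMap::new();
    for change in log {
        match change {
            Change::Insert { id, value } | Change::Update { id, value } => {
                current.insert(id.clone(), *value);
            }
            Change::Delete { id } => {
                current.remove(id);
            }
        }
    }
    current
        .into_iter()
        .map(|(id, value)| Change::Insert { id, value })
        .collect()
}

// Element "a" is inserted then updated; element "b" is inserted then deleted.
fn rebuild_demo() -> Vec<Change> {
    let log = vec![
        Change::Insert { id: "a".to_string(), value: 1 },
        Change::Insert { id: "b".to_string(), value: 2 },
        Change::Update { id: "a".to_string(), value: 10 },
        Change::Delete { id: "b".to_string() },
    ];
    rebuild_bootstrap(&log)
}
```

The rebuilt list could then replace the stored events after a call to clear_stored_events(), so that later bootstraps reflect the latest values and omit deleted elements.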
Thread Safety
The Application Bootstrap Provider is fully thread-safe:
- Concurrent Reads: Multiple threads can call get_stored_events() simultaneously
- Concurrent Writes: store_insert_event() safely serializes writes
- Async-Safe: All methods use async locks compatible with Tokio runtime
- Clone-Safe: The underlying Arc<RwLock<>> can be cloned and shared
Dependencies
The provider requires these crates:
[]
= { = "../../../lib" }
= { = "../../../core" }
= "1.0"
= "0.1"
= "0.4"
= { = "1.0", = ["sync"] }
License
Licensed under the Apache License, Version 2.0. See the LICENSE file for details.
Further Reading
- Application Source Documentation: See /drasi-core/components/sources/application/src/README.md for details on the Application Source
- Bootstrap System: See /drasi-lib/src/bootstrap/mod.rs for the bootstrap provider trait
- Source Changes: See /drasi-core/core/src/models/ for SourceChange and Element definitions