1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
//! Async wrappers for blocking Collection operations.
//!
//! EPIC-034/US-005: Provides async bulk insert API using `spawn_blocking`
//! to avoid blocking the async executor during I/O-intensive operations.
//!
//! # Why `spawn_blocking`?
//!
//! Collection operations like bulk insert involve:
//! - Memory-mapped file writes (blocking syscalls)
//! - HNSW index updates (CPU-intensive)
//! - Payload storage writes (blocking I/O)
//!
//! These operations can stall the async runtime. This module wraps them
//! to run on Tokio's blocking thread pool.
//!
//! # Usage
//!
//! ```rust,ignore
//! use velesdb_core::{VectorCollection, Point};
//! use velesdb_core::collection::async_ops;
//! use std::sync::Arc;
//!
//! async fn bulk_import(collection: Arc<VectorCollection>) -> Result<usize, Error> {
//! let points: Vec<Point> = generate_points();
//! async_ops::upsert_bulk_async(collection, points).await
//! }
//! ```
use Arc;
use crate;
use cratePoint;
use Collection;
/// Asynchronously inserts or updates multiple points in bulk.
///
/// Wraps `Collection::upsert_bulk()` in `spawn_blocking` to avoid
/// blocking the async executor during I/O and HNSW operations.
///
/// # Performance
///
/// - Uses parallel HNSW insertion internally
/// - Single batch WAL write for efficiency
/// - Benchmark: 25-30 Kvec/s on 768D vectors
///
/// # Arguments
///
/// * `collection` - Arc-wrapped collection instance
/// * `points` - Vector of points to insert/update
///
/// # Errors
///
/// Returns an error if any point has a mismatched dimension or if
/// the blocking task panics.
pub async
/// Asynchronously inserts or updates points with streaming support.
///
/// Processes points in chunks to provide progress updates and
/// avoid memory pressure for very large imports.
///
/// # Arguments
///
/// * `collection` - Arc-wrapped collection instance
/// * `points` - Vector of points to insert/update
/// * `chunk_size` - Number of points per batch (default: 10000)
/// * `on_progress` - Optional callback for progress updates
///
/// # Returns
///
/// Total number of points successfully inserted.
///
/// # Errors
///
/// Returns an error if any insert operation fails.
pub async
/// Asynchronously flushes the collection to disk.
///
/// Wraps `Collection::flush()` in `spawn_blocking` to avoid blocking
/// the async executor during disk sync operations.
///
/// # Errors
///
/// Returns an error if file operations fail or if the blocking task panics.
pub async
/// Asynchronously searches for nearest neighbors.
///
/// Wraps `Collection::search()` in `spawn_blocking` for consistency,
/// though search is typically fast enough that this may not be necessary.
///
/// # Arguments
///
/// * `collection` - Arc-wrapped collection instance
/// * `query` - Query vector
/// * `k` - Number of nearest neighbors to return
///
/// # Errors
///
/// Returns an error if the query dimension doesn't match.
pub async
// Tests moved to async_ops_tests.rs per project rules