//! Pluggable token counting for context budget management.
//!
//! Provides the [`TokenCounter`] trait for text-to-token-count conversion,
//! with [`EstimateTokenCounter`] as a simple default.

use std::sync::Arc;

/// Trait for counting tokens in text.
///
/// Implement this to plug in tiktoken, sentencepiece, or any
/// model-specific tokenizer for accurate context budget management.
///
/// # Examples
///
/// ```rust
/// use semantic_memory::TokenCounter;
///
/// struct MyTokenizer;
/// impl TokenCounter for MyTokenizer {
///     fn count_tokens(&self, text: &str) -> usize {
///         text.split_whitespace().count()
///     }
/// }
/// ```
pub trait TokenCounter {
    /// Returns the number of tokens in `text`.
    fn count_tokens(&self, text: &str) -> usize;
}

/// Default token counter: estimates tokens as `len / 4`.
///
/// Acceptable for English prose (~4 chars per token on average).
/// Inaccurate for CJK text (~1 token per char), code, or structured data.
/// Replace with a real tokenizer for accurate budget management.
pub struct EstimateTokenCounter;
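
// Illustrative sketch, not part of the original module: how a `len / 4`
// estimate behaves on different inputs. Assumption: `len` means `str::len`,
// i.e. the UTF-8 byte length rather than the number of characters.
// `estimate_tokens_sketch` is a hypothetical helper name used only here.
//
// Under that assumption, "The quick brown fox" (19 bytes) estimates to 4,
// while "你好世界" (4 characters, 12 bytes) estimates to 3, which is below the
// roughly one-token-per-character behavior noted above for CJK text.
#[allow(dead_code)]
fn estimate_tokens_sketch(text: &str) -> usize {
    // Integer division rounds down, so inputs shorter than 4 bytes estimate to 0.
    text.len() / 4
}
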
/// Create the default token counter (estimate-based).
pub