pub const EMBEDDING_LOAD_EXPECTED_RSS_MB: u64 = 1_100;
Expected RSS in MiB for a single instance with the ONNX model loaded via fastembed.
v1.0.75 (G18 fix): this constant is preserved for embedding-legacy builds.
LLM-only builds use the much lower LLM_WORKER_RSS_MB constant instead.
The previous formula,
min(cpus, available_memory_mb / EMBEDDING_LOAD_EXPECTED_RSS_MB) * 0.5,
was also reworked to drop the halving factor, because the 0.5 margin was
the root cause of G18. See crate::lock::calculate_safe_concurrency.
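
To make the effect of the halving factor concrete, here is a minimal sketch of the two formulas side by side. The function names legacy_concurrency and reworked_concurrency are hypothetical and exist only for illustration. The real logic lives in crate::lock::calculate_safe_concurrency and may round, clamp, or pick its memory constant differently. Integer division and rounding down are assumptions of this sketch.

```rust
/// Value copied from the constant documented on this page.
pub const EMBEDDING_LOAD_EXPECTED_RSS_MB: u64 = 1_100;

/// Hypothetical reconstruction of the pre-v1.0.75 formula:
/// min(cpus, available_memory_mb / EMBEDDING_LOAD_EXPECTED_RSS_MB) * 0.5.
/// Assumption: the "* 0.5" is modelled as integer halving (rounds down).
pub fn legacy_concurrency(cpus: u64, available_memory_mb: u64) -> u64 {
    // How many instances fit in memory at the expected RSS per instance.
    let by_memory = available_memory_mb / EMBEDDING_LOAD_EXPECTED_RSS_MB;
    cpus.min(by_memory) / 2
}

/// Hypothetical reconstruction of the reworked formula with the
/// halving factor dropped: min(cpus, available_memory_mb / RSS).
pub fn reworked_concurrency(cpus: u64, available_memory_mb: u64) -> u64 {
    let by_memory = available_memory_mb / EMBEDDING_LOAD_EXPECTED_RSS_MB;
    cpus.min(by_memory)
}
```

Under these assumptions, a host with 8 CPUs and 4,096 MB available fits 3 instances by memory, so legacy_concurrency(8, 4_096) yields 1 while reworked_concurrency(8, 4_096) yields 3.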