# Heap Allocation Audit Report for ADIF Crate
## Executive Summary
Total Issues Identified: 32
- High Impact: 6 issues
- Medium Impact: 13 issues
- Low Impact: 13 issues
Hot Path Allocations per Field (estimated):
- Parsing: 3-5 allocations per field
- Writing: 3 allocations per field
Quick Win Opportunities: 8 issues fixable with minimal code changes
================================================================================
HIGH IMPACT ISSUES
================================================================================
1. FIELD NAME CASE CONVERSION IN PARSER (HOT PATH) ✅ DONE
Location: src/parse/mod.rs:97
Current:
let name = String::from_utf8_lossy(name).to_lowercase();
Why it allocates:
- from_utf8_lossy() returns Cow that may allocate
- to_lowercase() always allocates new String
- Called for EVERY field parsed
Optimization:
let name = str::from_utf8(name)
    .map_err(|_| Self::invalid_tag(tag))?
    .to_ascii_lowercase();
Impact: HIGH - Called once per field in every record
Status: FIXED - Now uses str::from_utf8 and to_ascii_lowercase()
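A minimal sketch of one further step, assuming most field names arrive already lowercase: returning a Cow lets the common case borrow the input and allocate only for mixed-case names. The helper name normalize_field_name is hypothetical and not part of the crate.

```rust
use std::borrow::Cow;
use std::str::{self, Utf8Error};

/// Hypothetical helper: validate UTF-8 and lowercase ASCII letters,
/// allocating only when the input contains an uppercase ASCII byte.
fn normalize_field_name(raw: &[u8]) -> Result<Cow<'_, str>, Utf8Error> {
    let name = str::from_utf8(raw)?;
    if name.bytes().any(|b| b.is_ascii_uppercase()) {
        // At least one byte must change, so a new String is required.
        Ok(Cow::Owned(name.to_ascii_lowercase()))
    } else {
        // Already lowercase: borrow the input, no allocation.
        Ok(Cow::Borrowed(name))
    }
}
```

Whether this pays off depends on how the name is stored afterwards; if the record always needs an owned String, the single allocation in the fix above is already the minimum.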
---
2. TYPE SPECIFIER CASE CONVERSION IN PARSER ✅ DONE
Location: src/parse/mod.rs:102
Current:
let typ = typ.map(|t| String::from_utf8_lossy(t).to_lowercase());
Why it allocates:
- Same pattern as field names
- Type specifiers are typically single chars ('b', 'n', 'd', 't', 's')
Optimization:
let typ = typ.map(|t| str::from_utf8(t)).transpose().map_err(|_| Self::invalid_tag(tag))?;
Impact: HIGH - Called once per typed field
Status: FIXED - Now uses str::from_utf8 (type lowercasing handled elsewhere)
---
3. STRING VALUE ALLOCATION IN PARSER ✅ DONE
Location: src/parse/mod.rs:81
Current:
_ => Ok(Datum::String(s.to_string())),
Context: s is already Cow<str> from line 110
Why it allocates:
- When s is already owned, we allocate again with to_string()
- Should use into_owned() to avoid double allocation
Optimization:
_ => Ok(Datum::String(s.into_owned())),
Impact: HIGH - Called for every untyped string field (most common)
Status: FIXED - Changed caller to use str::from_utf8 instead of String::from_utf8_lossy,
eliminating the Cow allocation. Now single allocation from &str → String.
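For reference, a small sketch of why into_owned() avoids the second allocation on a Cow<str>: an owned Cow hands over its existing buffer, and only a borrowed Cow is copied. The helper name cow_into_string is illustrative only; the crate's actual fix removed the Cow entirely.

```rust
use std::borrow::Cow;

/// Convert a `Cow<str>` into a `String`, reusing the existing heap buffer
/// when the `Cow` is already owned. Only a borrowed `Cow` is copied.
fn cow_into_string(s: Cow<'_, str>) -> String {
    s.into_owned()
}
```

By contrast, to_string() goes through Display and always produces a fresh String, regardless of which variant the Cow holds.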
---
4. BOOLEAN STRING CONVERSION DURING PARSING ✅ DONE
Location: src/parse/mod.rs:64
Current:
let b = match s.to_uppercase().as_str() {
    "Y" => true,
    "N" => false,
    _ => return Err(Self::invalid_tag(tag)),
};
Why it allocates:
- to_uppercase() allocates for single character comparison
- Boolean values are always 'Y' or 'N' (single ASCII char)
Optimization:
let b = match s {
    "Y" | "y" => true,
    "N" | "n" => false,
    _ => return Err(Self::invalid_tag(tag)),
};
Impact: HIGH - Called for every boolean field
Status: FIXED - Now uses pattern matching instead of to_uppercase()
BONUS: Also refactored .map_err(|_| Self::invalid_tag(tag))? duplication
into a helper closure: let err = || Self::invalid_tag(tag);
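The allocation-free match can be exercised in isolation. The sketch below is a hypothetical stand-alone function (parse_adif_bool is not a crate API) that returns None where the parser would return its invalid-tag error.

```rust
/// Hypothetical stand-alone version of the boolean match: accepts a single
/// `Y` or `N` in either case and performs no heap allocation.
fn parse_adif_bool(s: &str) -> Option<bool> {
    match s {
        "Y" | "y" => Some(true),
        "N" | "n" => Some(false),
        _ => None,
    }
}
```

Matching on string literals compares bytes directly, so no temporary uppercase copy is created.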
---
5. RECORD FIELD CLONING IN WRITE PATH ✅ DONE
Location: src/write/mod.rs:221
Current:
for (name, value) in item.fields() {
    let field = Field::new(name.clone(), value.clone());
    Pin::new(&mut self.inner).start_send(Tag::Field(field))?;
}
Why it allocates:
- Clones both name (String) and value (Datum) for every field
- Field is immediately consumed, so cloning unnecessary
Optimization:
Change Field to support borrowing or encode directly without intermediate Field
Impact: HIGH - Called for every field during writing
Status: FIXED - Added Record::into_fields() consuming iterator that returns owned values,
eliminating both clone() calls in src/write/mod.rs:221
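A sketch of what a consuming iterator like into_fields() buys, using simplified stand-in types. The Vec-backed Record and the single-variant Datum below are assumptions for illustration; the crate's real types may be shaped differently.

```rust
/// Simplified stand-in for the crate's value type (assumed shape).
#[derive(Debug, PartialEq)]
enum Datum {
    String(String),
}

/// Simplified stand-in for the crate's record type (assumed shape).
struct Record {
    fields: Vec<(String, Datum)>,
}

impl Record {
    /// Consuming iterator: moves each name and value out of the record,
    /// so the caller gets owned data without any per-field clone.
    fn into_fields(self) -> impl Iterator<Item = (String, Datum)> {
        self.fields.into_iter()
    }
}
```

The trade-off is that the record is gone after the call, which fits a sink that consumes each record exactly once.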
---
6. VALUE LENGTH CONVERSION TO STRING IN WRITER ✅ DONE
Location: src/write/mod.rs:133
Current:
dst.put_slice(value.len().to_string().as_bytes());
Why it allocates:
- to_string() allocates for simple integer
- Frequent allocation in write path
Optimization:
Use itoa crate or manual buffer:
let mut buf = itoa::Buffer::new();
dst.put_slice(buf.format(value.len()).as_bytes());
Impact: MEDIUM-HIGH - Called once per field during writing
Status: FIXED - Added itoa dependency and replaced to_string() with stack-allocated
itoa::Buffer in src/write/mod.rs:133-134
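For the "manual buffer" alternative mentioned above, a minimal std-only sketch follows. The function name format_usize is hypothetical; the crate itself uses itoa::Buffer.

```rust
/// Format `n` as decimal ASCII into a caller-provided stack buffer and
/// return the slice holding the digits. 20 bytes fits any 64-bit value.
fn format_usize(mut n: usize, buf: &mut [u8; 20]) -> &[u8] {
    let mut pos = buf.len();
    loop {
        // Fill digits from the end of the buffer toward the front.
        pos -= 1;
        buf[pos] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    &buf[pos..]
}
```

The returned slice can be passed straight to a call such as put_slice, with no heap allocation involved.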
================================================================================
MEDIUM IMPACT ISSUES
================================================================================
7. DATUM::AS_STR() FORMAT ALLOCATIONS
Location: src/lib.rs:146-150
Current:
Self::Number(n) => Some(Cow::Owned(n.to_string())),
Self::Date(d) => Some(Cow::Owned(d.format("%Y%m%d").to_string())),
Self::Time(t) => Some(Cow::Owned(t.format("%H%M%S").to_string())),
Self::DateTime(dt) => Some(Cow::Owned(dt.format("%Y%m%d %H%M%S").to_string())),
Why it allocates:
- Every format call allocates
- Called during write operations and conversions
Optimization:
Pre-allocate buffer or use thread-local buffer for formatting
Impact: MEDIUM - Called during write operations
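One way to realize the "pre-allocate buffer" idea is to write formatted output into a caller-owned String via std::fmt::Write. The sketch below uses plain integers in place of the crate's date type, which is an assumption; write_date is a hypothetical name.

```rust
use std::fmt::Write;

/// Append a date as `YYYYMMDD` to `buf` without creating a temporary String.
/// Plain integers stand in for the crate's date type (an assumption).
fn write_date(buf: &mut String, year: u16, month: u8, day: u8) -> std::fmt::Result {
    write!(buf, "{:04}{:02}{:02}", year, month, day)
}
```

Clearing and reusing the same buffer between fields keeps its capacity, so steady-state formatting does not allocate. If the date type's format() returns a Display value, as the to_string() calls above suggest, it can be passed to write! in the same way.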
---
8. DATUM::AS_BOOL() STRING ALLOCATION ✅ DONE
Location: src/lib.rs:83
Current:
Self::String(s) => match s.to_uppercase().as_str() {
    "Y" => Some(true),
    "N" => Some(false),
    _ => None,
},
Why it allocates:
- to_uppercase() allocates unnecessarily
Optimization:
Self::String(s) => match s.as_str() {
    "Y" | "y" => Some(true),
    "N" | "n" => Some(false),
    _ => None,
},
Impact: MEDIUM - Called during type coercion
Status: FIXED - Uses pattern matching instead of to_uppercase()
---
9. FILTER: BAND NORMALIZATION ✅ DONE
Location: src/filter/mod.rs:224
Current:
let _ = record
    .insert(":band".to_string(), Datum::String(band.to_uppercase()));
Why it allocates:
- to_uppercase() always allocates
- ":band".to_string() allocates literal string
- Wasteful if band already uppercase
Optimization:
const BAND_FIELD: &str = ":band";
let band = if band.chars().all(|c| c.is_uppercase() || !c.is_alphabetic()) {
    band.to_string()
} else {
    band.to_uppercase()
};
let _ = record.insert(BAND_FIELD, Datum::String(band));
Impact: MEDIUM - Called per record when band normalization used
Status: FIXED - Uses const for field name, only uppercases if needed
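If the band value is available as an owned String and contains only ASCII (both assumptions, not confirmed by the code above), the uppercase step can reuse the existing buffer instead of allocating. A sketch with a hypothetical helper name:

```rust
/// Uppercase ASCII letters in place, reusing the String's existing buffer.
/// Non-ASCII characters are left unchanged, unlike `to_uppercase()`.
fn uppercase_band(mut band: String) -> String {
    band.make_ascii_uppercase();
    band
}
```

This removes the need for the "already uppercase" check, since the in-place conversion costs no allocation either way.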
---
10. FILTER: MODE NORMALIZATION ✅ DONE
Location: src/filter/mod.rs:193
Current:
let _ = record.insert(":mode".to_string(), Datum::String(mode.to_string()));
Why it allocates:
- ":mode".to_string() allocates literal
- mode.to_string() copies the value even when mode already owns its buffer (the into_owned() call below implies mode is a Cow<str>)
Optimization:
const MODE_FIELD: &str = ":mode";
let _ = record.insert(MODE_FIELD, Datum::String(mode.into_owned()));
Impact: MEDIUM - Called per record when mode normalization used
Status: FIXED - Uses const for field name and .into_owned() instead of .to_string()
---
11. FILTER: TIME FIELD NAME ALLOCATIONS ✅ DONE
Location: src/filter/mod.rs:136, 151
Current:
let _ = record.insert(":time_on".to_string(), Datum::DateTime(dt));
let _ = record.insert(":time_off".to_string(), Datum::DateTime(dt));
Why it allocates:
- Literal strings allocated every time
Optimization:
const TIME_ON_FIELD: &str = ":time_on";
const TIME_OFF_FIELD: &str = ":time_off";
let _ = record.insert(TIME_ON_FIELD, Datum::DateTime(dt));
let _ = record.insert(TIME_OFF_FIELD, Datum::DateTime(dt));
Impact: MEDIUM - Called per record when time normalization used
Status: FIXED - Uses const for both field names
---
12. FILTER: EXCLUDE CALLSIGNS HASHSET
Location: src/filter/mod.rs:237-238
Current:
let exclude: HashSet<String> =
    callsigns.iter().map(|c| c.to_uppercase()).collect();
Why it allocates:
- Creates HashSet with uppercase copies of all callsigns
- Could use case-insensitive comparison instead
Optimization:
Use case-insensitive wrapper or just iterate if list is small:
stream.filter(move |record| {
    let Some(call) = record.get("call").and_then(|c| c.as_str()) else {
        return true;
    };
    !callsigns.iter().any(|e| e.eq_ignore_ascii_case(&call))
})
Impact: MEDIUM - One-time allocation, proportional to excluded callsign count
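The comparison at the core of that filter can be checked on its own. The sketch below isolates it in a hypothetical helper (is_excluded is not a crate function) that takes the callsign as a plain &str.

```rust
/// Return true if `call` matches any excluded callsign, ignoring ASCII case.
/// Linear scan: intended for short exclusion lists.
fn is_excluded(call: &str, excluded: &[String]) -> bool {
    excluded.iter().any(|e| e.eq_ignore_ascii_case(call))
}
```

For long exclusion lists the HashSet lookup may still be faster overall, since the linear scan runs once per record.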
---
13. ERROR MESSAGE FORMATTING
Location: src/lib.rs:382
Current:
Err(Error::InvalidFormat(format!("duplicate key: {}", e.key())))
Why it allocates:
- format!() in error path
Optimization:
let mut msg = String::with_capacity(20 + e.key().len());
msg.push_str("duplicate key: ");
msg.push_str(e.key());
Err(Error::InvalidFormat(msg))
Impact: LOW-MEDIUM - Error paths only
---
14. PARSE ERROR: INVALID TAG
Location: src/parse/mod.rs:51
Current:
Error::InvalidFormat(String::from_utf8_lossy(tag).to_string())
Why it allocates:
- from_utf8_lossy returns a Cow; to_string() then copies it even when the Cow is already owned (the invalid UTF-8 case), giving a double allocation
Optimization:
Error::InvalidFormat(String::from_utf8_lossy(tag).into_owned())
Impact: LOW - Error path only
---
15. PARSE ERROR: PARTIAL DATA
Location: src/parse/mod.rs:171
Current:
Err(Error::InvalidFormat("partial data at end of stream".to_string()))
Why it allocates:
- Allocating literal string constant
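Optimization:
const PARTIAL_DATA_MSG: &str = "partial data at end of stream";
Err(Error::InvalidFormat(PARTIAL_DATA_MSG.to_owned()))
Note: to_owned() still allocates; the const only names the message. Avoiding the
allocation would require InvalidFormat to hold &'static str or Cow<'static, str>.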
Impact: LOW - Error path only
---
16. WRITE ERROR: DATETIME MESSAGE
Location: src/write/mod.rs:118-121
Current:
return Err(Error::InvalidFormat(
    "DateTime cannot be output directly; split into date and time fields"
        .to_string(),
));
Optimization:
const DATETIME_ERROR_MSG: &str =
    "DateTime cannot be output directly; split into date and time fields";
return Err(Error::InvalidFormat(DATETIME_ERROR_MSG.to_owned()));
Note: to_owned() still allocates; only a &'static str or Cow<'static, str> payload would avoid it.
Impact: LOW - Error path only
---
17. WRITE ERROR: STRING CONVERSION
Location: src/write/mod.rs:126
Current:
let e = "Cannot convert value to string".to_string();
Error::InvalidFormat(e)
Optimization:
const CONVERT_ERROR_MSG: &str = "Cannot convert value to string";
Error::InvalidFormat(CONVERT_ERROR_MSG.to_owned())
Note: to_owned() still allocates; only a &'static str or Cow<'static, str> payload would avoid it.
Impact: LOW - Error path only
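For the literal error messages in issues 15-17, one option is to let the error variant carry a Cow<'static, str>, so that static messages are borrowed and only formatted messages allocate. This is a sketch of an API change, not the crate's current Error type; the enum and helper functions below are hypothetical.

```rust
use std::borrow::Cow;

/// Hypothetical variant of the error type: the message is either a
/// borrowed static literal (no allocation) or an owned, formatted String.
#[derive(Debug, PartialEq)]
enum Error {
    InvalidFormat(Cow<'static, str>),
}

const PARTIAL_DATA_MSG: &str = "partial data at end of stream";

/// Static message: stored as a borrowed reference, no heap allocation.
fn partial_data() -> Error {
    Error::InvalidFormat(Cow::Borrowed(PARTIAL_DATA_MSG))
}

/// Dynamic message: still allocates, but only when formatting is needed.
fn duplicate_key(key: &str) -> Error {
    Error::InvalidFormat(Cow::Owned(format!("duplicate key: {}", key)))
}
```

Since these are error paths, the benefit is small; the change is mainly worthwhile if the public error type is being revised anyway.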
---
18. PARSER: TAG SPLIT INTO VEC ✅ DONE
Location: src/parse/mod.rs:88
Current:
let parts: Vec<&[u8]> = tag.split(|&b| b == b':').collect();
Why it allocates:
- Collects into Vec for simple 2-3 part split
- Could use iterator directly
Optimization:
let mut parts = tag.split(|&b| b == b':');
let (name, len, typ) = match (parts.next(), parts.next(), parts.next(), parts.next()) {
    (Some(name), Some(len), None, None) => (name, len, None),
    (Some(name), Some(len), Some(typ), None) => (name, len, Some(typ)),
    _ => return Err(Self::invalid_tag(tag)),
};
Impact: MEDIUM - Called once per field during parsing
Status: FIXED - Replaced Vec collection with direct iterator pattern matching in src/parse/mod.rs:90-95
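A self-contained version of the iterator-based split, suitable for a unit test. The function name split_tag and the Option return type are illustrative; the parser itself returns its invalid-tag error in the fallback arm.

```rust
/// Split a tag body such as `call:4` or `qso_date:8:d` into name, length,
/// and optional type, without collecting the parts into a Vec.
fn split_tag(tag: &[u8]) -> Option<(&[u8], &[u8], Option<&[u8]>)> {
    let mut parts = tag.split(|&b| b == b':');
    // Pull at most four parts; a fourth part means the tag is malformed.
    match (parts.next(), parts.next(), parts.next(), parts.next()) {
        (Some(name), Some(len), None, None) => Some((name, len, None)),
        (Some(name), Some(len), Some(typ), None) => Some((name, len, Some(typ))),
        _ => None,
    }
}
```

The slice split iterator is fused, so calling next() again after it has returned None is safe.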
---
19. FIELD::NAME STORED AS STRING
Location: src/lib.rs:201
Current:
pub struct Field {
    name: String,
    value: Datum,
}
Why it allocates:
- Field names always allocated
- Could use Box<str> for immutable strings (saves capacity overhead)
Optimization:
pub struct Field {
    name: Box<str>, // or Arc<str> for sharing
    value: Datum,
}
Impact: LOW-MEDIUM - Saves 8 bytes per field (the capacity word, on 64-bit targets), adds complexity
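The size difference can be confirmed with std::mem::size_of. The helper below is a hypothetical measurement aid, not crate code.

```rust
use std::mem::size_of;

/// Bytes saved per field name by storing `Box<str>` instead of `String`.
/// `String` carries pointer, length, and capacity; `Box<str>` drops capacity.
fn bytes_saved_per_name() -> usize {
    size_of::<String>() - size_of::<Box<str>>()
}
```

Note that converting with String::into_boxed_str() shrinks the buffer to fit and may reallocate when the capacity exceeds the length, so the conversion itself is not free.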
================================================================================
LOW IMPACT ISSUES (20-32)
================================================================================
Issues 20-32: Various allocations in test code and error paths
- Test code allocations don't affect production performance
- Error message literals in test assertions
- format!() calls in test code
Impact: NONE on production hot paths - test code and error paths only
================================================================================
RECOMMENDATIONS BY PRIORITY
================================================================================
PRIORITY 1: IMMEDIATE WINS (High Impact, Low Risk)
--------------------------------------------------
1. Fix parser case conversions (#1, #2, #4)
- Use ASCII operations instead of Unicode-aware uppercase/lowercase
2. Fix Datum::String double allocation (#3)
- Use into_owned() instead of to_string()
3. Fix filter field name literals (#9, #10, #11)
- Use const &str instead of .to_string()
4. Fix as_bool allocation (#8)
- Pattern match instead of .to_uppercase()
Estimated Impact: 4-6 allocations eliminated per field during parsing
PRIORITY 2: MEDIUM EFFORT, HIGH REWARD
---------------------------------------
1. Avoid cloning in RecordSink (#5)
- Requires API redesign but eliminates 2 allocations per field during write
2. Optimize tag split (#18)
- Use iterator instead of collecting Vec
3. Optimize length formatting (#6)
- Use itoa for fast, allocation-free integer formatting
Estimated Impact: 3-4 allocations eliminated per field during writing
PRIORITY 3: ADVANCED OPTIMIZATIONS
-----------------------------------
1. Datum formatting optimizations (#7)
- Use buffers or thread-locals
2. Case-insensitive comparisons (#12)
- Avoid HashSet with uppercase strings
3. Field name storage (#19)
- Use Arc<str> or Box<str> for field names
Estimated Impact: Reduces memory footprint in specialized use cases
================================================================================
TESTING RECOMMENDATIONS
================================================================================
After implementing optimizations:
1. Run cargo test to ensure correctness
2. Run cargo bench if benchmarks exist
3. Profile with real ADIF files using:
- valgrind --tool=massif for heap profiling
- cargo flamegraph for CPU flamegraphs (allocator time shows up as malloc/free frames)
- dhat for detailed allocation tracking
4. Measure before/after:
- Total allocations per record
- Peak memory usage
- Parsing/writing throughput
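For the "total allocations per record" measurement, a counting wrapper around the system allocator is one lightweight option that needs no external tooling. The names CountingAllocator and count_allocations are hypothetical; this is a sketch for test or benchmark binaries, not something to ship in the library.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocation calls observed since program start.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Wrapper around the system allocator that counts every allocation.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Run `f` and return how many allocations happened while it ran.
/// Counts are process-wide, so other threads can inflate the result.
fn count_allocations<R>(f: impl FnOnce() -> R) -> (R, usize) {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let result = f();
    let after = ALLOCATIONS.load(Ordering::Relaxed);
    (result, after - before)
}
```

Wrapping a single-record parse or write in count_allocations before and after a change gives a direct per-record comparison, provided the measurement runs on a single thread.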