## 1. Library Purpose & Core Functionality
Timon is a specialized library built on top of Apache DataFusion that provides:
1. Efficient local file storage for time-series data using Parquet format
2. S3-compatible cloud storage synchronization capabilities
3. SQL query capabilities powered by DataFusion
4. Cross-platform support via JNI (Android) and C-compatible interfaces (iOS)
The library serves as middleware that simplifies working with time-series data by handling:
- Database and table management with schema validation
- Data partitioning by time intervals
- Local storage with efficient querying
- Cloud synchronization capabilities
- Time-based filtering of data
## 2. Core Library Design & API Review
### API Ergonomics
The public API is well-structured and divided into two clear categories:
1. __File Storage Functions__:
- `init_timon`, `create_database`, `create_table`, `list_databases`, etc.
- Focused on local data management
2. __S3-Compatible Storage Functions__:
- `init_bucket`, `cloud_sink_parquet`, `cloud_fetch_parquet`, etc.
- Focused on cloud synchronization
The API design follows a logical flow, making it intuitive for common operations:
1. Initialize storage
2. Create databases and tables
3. Insert data
4. Query data
5. Synchronize with cloud storage
### Modularity
The codebase is well-organized into modules:
- `timon_engine`: Core functionality
- `mod.rs`: Public API implementation
- `db_manager.rs`: Database management
- `cloud_sync.rs`: S3 synchronization
- `helpers.rs`: Utility functions
There's clear separation of concerns:
- Database operations are isolated from cloud operations
- Error handling is consistently implemented
- Manager classes have well-defined responsibilities
### Error Handling
Error handling is robust, with:
- Consistent use of `Result` types throughout the codebase
- Proper error propagation
- User-friendly error messages
- Error wrapping to maintain context
The `TimonResult` struct provides a standardized return format for all API functions:
```rust
struct TimonResult {
    pub status: u16,
    pub message: String,
    pub json_value: Option<Value>,
}
```
### Configuration
Configuration is straightforward through the initializers:
- `init_timon` for local storage with configurable bucket interval
- `init_bucket` for S3-compatible storage with endpoint, credentials, etc.
## 3. DataFusion Integration
### Efficient API Usage
The library generally uses DataFusion's APIs effectively:
__Strengths:__
- Uses lazy execution through the DataFrame API
- Creates appropriate DataFusion contexts for operations
- Registers Parquet files properly with DataFusion
- Transforms SQL queries appropriately
- Handles schema extraction and conversion well
__Areas for Improvement:__
- ✅ There are some cases where data is collected early, which could be optimized *(Fixed: Eliminated intermediate data collection using CTE)*
- ✅ The `query` method could potentially be more efficient with how it handles partitions *(Fixed: Optimized partition handling with single query execution)*
### Custom Logic
The library extends DataFusion with:
1. __Partition Management__:
- Time-based partitioning for efficient querying
- Merges data from multiple partitions when needed
2. __Query Transformation__:
- Modifies queries to work with the partitioned structure
- Handles special cases like JOINs across partitions
3. __Schema Management__:
- Custom schema validation and enforcement
- Type validation against defined schemas
### State Management
The `SessionContext` is handled with per-username isolation:
- Each `DatabaseManager` instance maintains its own `SessionContext` (one per username)
- When cloning `DatabaseManager`, a fresh `SessionContext` is created for isolation (see `Clone` implementation in `db_manager.rs:96-113`)
- Tables are registered per-username context, allowing concurrent queries for different users
- This design ensures thread-safety and isolation between different user contexts
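The clone-gets-a-fresh-context design can be illustrated with a minimal, self-contained sketch (the types here are stand-ins, not the library's actual `DatabaseManager` or DataFusion's `SessionContext`):

```rust
use std::collections::HashMap;

// Stand-in for a session context: holds per-user table registrations.
struct Ctx {
    registered: HashMap<String, String>, // table name -> parquet path
}

struct Manager {
    username: Option<String>,
    ctx: Ctx,
}

impl Clone for Manager {
    fn clone(&self) -> Self {
        // Deliberately start from an empty context: registrations are
        // per-instance, so clones are isolated from each other.
        Manager {
            username: self.username.clone(),
            ctx: Ctx { registered: HashMap::new() },
        }
    }
}
```

The trade-off is exactly the one noted in the recommendations below: isolation is guaranteed, but table registrations must be repeated per clone.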
### Data Source Interaction
The library creates Parquet files for DataFusion to query:
- Files are properly partitioned for time-based queries
- File management is handled logically
- Predicate pushdown is leveraged through date filtering
### Query Path Resolution
The library supports two data path structures:
- **Default Path**: `{storage_path}/data/{db_name}/{table_name}/` - for local user data
- **Group Path**: `{storage_path}/group/{username}/{db_name}/{table_name}/` - for group/shared user data
**Current Implementation** (`resolve_table_dir` in `db_manager.rs:674-738`):
- For default user (username=None): Only queries default path
- For group users (username provided): Only queries group path
- ⚠️ **Known Limitation**: See [Issue #64](https://github.com/mongrov/timon/issues/64) - queries do NOT merge data from both paths
- If same file exists in both locations, only one path is queried
- This can lead to missing data if files exist in both default and group paths
- Potential solutions: merge both files or query both paths with duplicate removal at query time
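One shape of the "query both paths" fix can be sketched with plain `std::fs` (a hypothetical helper, not the library's `resolve_table_dir`): gather parquet files from both the default and group directories, de-duplicating by file name so the same partition is not registered twice. A real fix would additionally need row-level duplicate removal at query time when the same file exists in both locations with different contents.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical sketch: collect parquet files from both path roots,
/// keyed by file name so a partition present in both is listed once.
fn merge_candidate_paths(default_dir: &Path, group_dir: &Path) -> Vec<PathBuf> {
    let mut by_name: BTreeMap<String, PathBuf> = BTreeMap::new();
    for dir in [default_dir, group_dir] {
        if let Ok(entries) = fs::read_dir(dir) {
            for entry in entries.flatten() {
                let path = entry.path();
                if path.extension().map_or(false, |e| e == "parquet") {
                    let name = entry.file_name().to_string_lossy().into_owned();
                    // Later directories win on name collisions; merging row
                    // contents would require de-duplication at query time.
                    by_name.insert(name, path);
                }
            }
        }
    }
    by_name.into_values().collect()
}
```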
## 4. Rust Code Quality & Best Practices
### Ownership and Borrowing
The code demonstrates good understanding of Rust's ownership model:
- Appropriate use of borrowing where possible
- Strategic cloning when necessary
- Proper use of `Arc` for shared ownership
✅ **Fixed: Ownership and borrowing optimizations implemented**
- Removed unnecessary clones in hot paths (create_database, create_table, delete_database, delete_table, insert, list_databases, list_tables)
- Optimized function calls to pass mutable references instead of cloning DatabaseManager instances
- Improved performance by eliminating redundant memory allocations in frequently called functions
### Use of `unsafe`
The library uses `unsafe` code primarily in the JNI and iOS interfaces, which is unavoidable when working with foreign function interfaces. The unsafe code is:
- Well-encapsulated
- Used only where necessary
- Properly documented
### Dependencies
The dependencies are appropriate for the library's needs:
- `datafusion` for query processing
- `object_store` for S3 integration
- `tokio` for async operations
- `serde` for serialization
### Performance Considerations
Several performance optimizations are evident:
- Time-based partitioning for efficient querying
- Proper file management to avoid unnecessary operations
- File locking for concurrent access safety
- Query filtering to minimize data processing
**Performance Benchmarks** (see `INSERT_PERFORMANCE_REPORT.md` and `TIMON_QUERY_REPORT.md`):
- **Insert Operations**:
- Fresh inserts: ~15K records/sec (consistent across 1K-1M records)
- Updates: Performance degrades with existing data size (594-11,713 records/sec)
- Table loading: 29K-42K records/sec (2-3x faster than inserts)
- **Query Operations** (tested with 1M records per table):
- Simple queries: 17-23ms (excellent)
- COUNT aggregations: 17.7ms (excellent)
- JOIN queries: 435-510ms (good, scales linearly)
- Complex analytics: 87-490ms (good)
- Partition filtering: 20.7ms (excellent, found 112K records efficiently)
- **Realistic Use Case** (20K records/month, typical for ZivaApp-Ring):
- Insert: ~1.4s
- Load: ~0.5s
- Query: <25ms for simple queries
### Concurrency & Race Condition Handling
The library implements robust concurrency controls:
- ✅ **Atomic File Writes**: Implemented using temp file strategy (`parquet_file_writer_locked` in `db_manager.rs:853-904`)
- Writes to temporary file first, then atomically renames to final location
- Prevents corruption if process crashes mid-write
- Uses unique timestamp-based temp file names to avoid conflicts
- ✅ **File-Level Locking**: Process-wide mutex map for file-level locking (`atomic_file_insert` in `db_manager.rs:742-825`)
- Each file path has its own mutex to prevent concurrent writes
- Lock is held during read-merge-write cycle to ensure atomicity
- Prevents race conditions when multiple threads insert to same partition
- ⚠️ **Known Limitation**: See [Issue #135](https://github.com/mongrov/timon/issues/135) - Insert-Query Race Condition
- **Problem**: Queries may attempt to read parquet files while inserts are writing them, even with atomic writes
  - **Impact**: Can cause "Corrupt Footer", "Protocol Error", and "Out of Range of File" errors when DataFusion reads incomplete files
- **Status**: Marked as "wontfix" - atomic write strategy is implemented but insert-query race condition remains
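The temp-file-then-rename strategy described above can be sketched in a simplified, self-contained form (this is an illustration of the technique, not the library's `parquet_file_writer_locked`):

```rust
use std::fs;
use std::io::Write;
use std::path::Path;
use std::time::{SystemTime, UNIX_EPOCH};

/// Write to a uniquely named temp file, then rename over the final
/// path. On the same filesystem the rename is atomic on Unix, so a
/// reader never observes a partially written file; a crash mid-write
/// leaves only a stray temp file, never a corrupt final file.
fn atomic_write(final_path: &Path, bytes: &[u8]) -> std::io::Result<()> {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before epoch")
        .as_nanos();
    let tmp_path = final_path.with_extension(format!("tmp.{nanos}"));
    {
        let mut f = fs::File::create(&tmp_path)?;
        f.write_all(bytes)?;
        f.sync_all()?; // flush to disk before the rename
    }
    fs::rename(&tmp_path, final_path)
}
```

Note this protects write-write and crash scenarios; as Issue #135 records, it does not by itself serialize a reader that resolved the file path before the rename landed.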
## 5. Testing & Documentation
### Test Coverage
The test suite is comprehensive:
- Tests for normal operations
- Tests for error handling
- Tests for edge cases
- Tests for concurrent operations
### Documentation
Documentation could be improved:
- More in-code documentation would be beneficial
- Function-level documentation is sparse in some areas
- Example usage could be expanded beyond the README
## 6. Recommendations for Improvement
1. __DataFusion Integration:__
- ✅ Optimize the partition selection logic *(Fixed: Implemented efficient CTE-based partition handling)*
- ⚠️ **SessionContext Management**: Each `DatabaseManager` clone creates a fresh `SessionContext` for isolation
- This ensures thread-safety but means table registrations are not shared across clones
- Current design prioritizes isolation over reuse, which is appropriate for multi-user scenarios
- Evaluate using DataFusion's more advanced features like caching
2. __Error Handling:__
- ✅ Standardize error types across the codebase *(Completed: Implemented comprehensive TimonError system with standardized error kinds)*
- ✅ Consider more granular error types for better error handling *(Completed: Added 30+ granular error types covering all operation categories)*
**Implementation Details:**
- Created comprehensive `TimonErrorKind` enum with granular error categories for initialization, database operations, table operations, data operations, query operations, file system operations, cloud storage operations, synchronization, concurrency, metadata, validation, and internal errors
- Implemented structured `TimonError` with kind, message, details, source, and context fields
- Created conversion implementations from common error types (DataFusionError, std::io::Error, serde_json::Error)
- Added helper macros and functions for creating common error types
- Comprehensive test coverage with 21+ error handling tests all passing
3. __API Ergonomics:__
- ✅ Add higher-level functions for common operations *(Not needed - server code only)*
- ✅ Consider builder patterns for complex configurations *(Not needed - server code only)*
**Note:** These improvements are server-specific and not needed for library usage. Higher-level functions and builder patterns are beneficial for HTTP API consumers and complex server configurations, but the core library API is already ergonomic for direct programmatic usage.
4. __Performance:__
- ✅ Benchmark and optimize the query path *(Not needed - server code only)*
- ✅ Look for opportunities to reduce cloning *(Already optimized - see section 4)*
- ✅ Consider more aggressive caching strategies *(Not needed - server code only)*
**Note:** These performance optimizations are server-specific and not needed for library usage. Benchmarking and aggressive caching are beneficial for multi-user server workloads with concurrent requests, but the library's performance is already optimized for embedded usage through efficient ownership patterns and minimal allocations.
5. __Documentation:__
- Add more comprehensive function-level documentation
- Include examples for complex operations
- ✅ **Performance characteristics documented** - See `INSERT_PERFORMANCE_REPORT.md` and `QUERY_PERFORMANCE_REPORT.md`
**Performance Summary:**
- **Insert Performance**: ~15K records/sec for fresh inserts, consistent across sizes (1K-1M)
- **Update Performance**: 594-11,713 records/sec depending on existing data size (degrades with larger datasets)
- **Query Performance**: Excellent for simple queries (<25ms with 1M records), good for JOINs (<520ms)
- **Table Loading**: 29K-42K records/sec (2-3x faster than inserts)
- **Realistic Use Case** (20K records/month): Insert ~1.4s, Load ~0.5s
6. __Known Issues & Limitations:__
- ⚠️ **Issue #64**: Query default user missing group files
- **Problem**: Queries only check one path (default OR group), not both
- **Impact**: Can miss data if same file exists in both locations
- **Status**: Open issue - see [GitHub Issue #64](https://github.com/mongrov/timon/issues/64)
- **Potential Solutions**: Merge both files or query both paths with duplicate removal
- ⚠️ **Issue #135**: Insert-Query Race Condition
- **Problem**: Queries may read parquet files while inserts are writing them, causing "Invalid partitioning found on disk" errors
- **Impact**: DataFusion can encounter incomplete or corrupted files during query execution
- **Status**: Marked as "wontfix" - see [GitHub Issue #135](https://github.com/mongrov/timon/issues/135)
- **Note**: Atomic temp file strategy IS implemented (`parquet_file_writer_locked`) for write-write safety, but insert-query race condition remains
## 7. Additional Code Review Findings
### Potential Issues Identified
1. **Mutex Poisoning Risk** (`db_manager.rs:753, 758`):
- ⚠️ **Issue**: `unwrap()` on mutex locks can panic if mutex is poisoned (thread panicked while holding lock)
- **Location**: `atomic_file_insert()` function
- **Impact**: Could cause entire insert operation to fail if a previous thread panicked
- **Recommendation**: Use `lock().map_err()` or `lock().unwrap_or_else()` to handle poisoned mutexes gracefully
- **Code**:
```rust
let mut locks = get_file_locks().lock().unwrap(); // Line 753
let _guard = file_mutex.lock().unwrap(); // Line 758
```
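The recommended `unwrap_or_else` approach can be sketched as follows (a generic helper, not the library's code). A poisoned mutex still holds valid data for this use case, since the lock map itself cannot be left in a half-updated state by a panicking insert:

```rust
use std::sync::{Mutex, MutexGuard};

/// Acquire the lock even if a previous holder panicked: recover the
/// guard from the PoisonError instead of propagating the panic.
fn lock_ignoring_poison<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}
```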
2. **Incomplete Edge Case Handling** (`helpers.rs:525`):
- ⚠️ **Issue**: `todo!()` macro in `filter_files_by_date_range()` for date pattern `(None, Some(day))`
- **Location**: Date parsing logic when month is missing but day is present
- **Impact**: Will panic if this edge case is encountered
- **Recommendation**: Implement proper handling or return an error for invalid date patterns
- **Code**: `(None, Some(_)) => todo!(),`
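A minimal sketch of the recommended fix, assuming the surrounding match can return a `Result` (the function name and error type here are hypothetical, not the library's actual signature):

```rust
/// Hypothetical replacement for the todo!() arm: a day without a
/// month is an invalid date pattern, so return an error rather
/// than panicking.
fn validate_date_parts(month: Option<u32>, day: Option<u32>) -> Result<(), String> {
    match (month, day) {
        (None, Some(d)) => Err(format!(
            "invalid date pattern: day {d} given without a month"
        )),
        _ => Ok(()),
    }
}
```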
3. **Empty Partition Directory Cleanup** (`db_manager.rs:437`):
- ⚠️ **Issue**: Partition directories are created but not cleaned up if insert fails
   - **Location**: `fs::create_dir_all(&partition_dir).ok()` creates the directory before the write, so it remains behind if the subsequent write fails
- **Impact**: Empty partition directories can cause "Invalid partitioning found on disk" errors (see Issue #135)
- **Recommendation**: Clean up empty partition directories on insert failure, or validate directory has files before querying
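The cleanup side of that recommendation can be sketched with `std::fs` (a hypothetical helper, not existing library code):

```rust
use std::fs;
use std::path::Path;

/// Hypothetical cleanup: remove a partition directory if a failed
/// insert left it empty, so later queries do not trip over an
/// "Invalid partitioning found on disk" error. Returns true if
/// the directory was removed.
fn remove_if_empty(dir: &Path) -> std::io::Result<bool> {
    if fs::read_dir(dir)?.next().is_none() {
        fs::remove_dir(dir)?;
        return Ok(true);
    }
    Ok(false)
}
```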
4. **Path Validation**:
- ⚠️ **Issue**: No explicit validation of database/table names for path traversal attacks
- **Location**: Database and table names are used directly in file paths
- **Impact**: Potential security risk if user input contains `../` or other path components
- **Recommendation**: Add validation to reject names containing path separators, `..`, or other dangerous characters
   - **Note**: The current implementation interpolates these values into paths via `format!()`, which performs no sanitization, so explicit validation is recommended
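The recommended validation could take the shape of an allowlist check (a hypothetical function, not part of the library): rejecting everything outside a safe character set rules out separators, `..`, NUL bytes, and hidden-file prefixes by construction, which is more robust than blocklisting known-bad patterns.

```rust
/// Hypothetical validator for database/table names: allowlist the
/// safe characters so path traversal sequences cannot appear at all.
fn is_safe_name(name: &str) -> bool {
    !name.is_empty()
        && !name.starts_with('.')
        && name.len() <= 255
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}
```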
5. **Error Message Consistency**:
- ⚠️ **Issue**: Some functions return `Result<Value, String>` while others use `TimonError`
- **Location**: Public API functions in `mod.rs` convert errors to strings
- **Impact**: Loss of structured error information at API boundary
- **Recommendation**: Consider standardizing error types across all public APIs
6. **Metadata Cache Invalidation**:
- ✅ **Good**: Metadata cache is properly invalidated after writes (`save_metadata()` calls `invalidate_cache()`)
- **Note**: Cache uses infinite TTL with manual invalidation, which is appropriate for this use case
7. **File Lock Map Growth**:
- ⚠️ **Issue**: `FILE_LOCKS` HashMap grows indefinitely as new file paths are encountered
- **Location**: `get_file_locks()` static mutex map
- **Impact**: Memory usage grows over time, though likely minimal for typical use cases
- **Recommendation**: Consider implementing LRU cache or periodic cleanup for unused locks (low priority)
8. **Timestamp Calculation Edge Cases**:
- ✅ **Good**: `rounded_timestamp()` handles various interval sizes correctly
- **Note**: Uses `expect()` for timestamp conversion which could panic on invalid timestamps, but this is acceptable given the context
### Code Quality Observations
- **Positive**: Comprehensive error handling with `TimonError` system
- **Positive**: Good use of atomic operations for file writes
- **Positive**: Proper use of `Arc` and `Mutex` for thread safety
- **Positive**: Metadata caching with proper invalidation
- **Positive**: Schema validation and type coercion handling
### Recommendations Summary
**High Priority:**
1. Handle mutex poisoning gracefully in `atomic_file_insert()`
2. Implement or document the `todo!()` case in date filtering
3. Add path validation for database/table names
**Medium Priority:**
4. Clean up empty partition directories on insert failure
5. Consider standardizing error types in public API
**Low Priority:**
6. Monitor file lock map growth (likely not an issue in practice)
7. Add more comprehensive logging for debugging race conditions
Overall, Timon is a well-designed library that effectively leverages DataFusion for time-series data management, with a clean API and good error handling. It implements robust concurrency controls through atomic file writes and file-level locking. The main areas for improvement are query path resolution (merging default and group paths), documentation, edge cases around data synchronization between paths, and the mutex-poisoning and incomplete edge-case handling issues identified above.