# FeatherDB Core
Core types, traits, and utilities shared across all FeatherDB crates.
## Overview
This crate provides the foundational types that all other FeatherDB components depend on. It has minimal dependencies to avoid circular dependencies in the workspace.
## Components
### 1. Value Types (`value.rs`)
The fundamental data type for all database values.
```rust
pub enum Value {
Null,
Boolean(bool),
Integer(i64),
Real(f64),
Text(String),
Blob(Vec<u8>),
Timestamp(i64), // Unix milliseconds
}
```
**Capabilities:**
- Type conversions (`as_bool()`, `as_i64()`, `as_f64()`, `as_str()`, `as_bytes()`)
- Serialization/deserialization to bytes
- Ordering and comparison (implements `Ord`)
- Hashing (implements `Hash`)
- Display formatting
- Memory size estimation for GC tracking
**Column Types:**
```rust
pub enum ColumnType {
Boolean,
Integer,
Real,
Text { max_len: Option<usize> },
Blob { max_len: Option<usize> },
Timestamp,
}
```
**Type Compatibility and Coercion:**
```rust
// Check if a value is compatible with a column type
let text_type = ColumnType::Text { max_len: Some(100) };
assert!(text_type.is_compatible(&Value::Text("hello".into())));
assert!(text_type.is_compatible(&Value::Null)); // NULL compatible with any type
// Coerce values between types
let int_type = ColumnType::Integer;
let coerced = int_type.coerce(Value::Real(3.7))?; // Value::Integer(3)
let coerced = int_type.coerce(Value::Boolean(true))?; // Value::Integer(1)
let text_type = ColumnType::Text { max_len: None };
let coerced = text_type.coerce(Value::Integer(42))?; // Value::Text("42")
```
**Supported Coercions:**
| Integer | Boolean | 0 = false, non-zero = true |
| Real | Integer | Truncates decimal |
| Boolean | Integer | false = 0, true = 1 |
| Integer | Real | Exact conversion |
| Integer/Real | Text | String representation |
| Integer | Timestamp | Unix milliseconds |
### 2. Error Types (`error.rs`)
Comprehensive error handling with context and intelligent suggestions.
```rust
pub enum Error {
// User-facing errors (actionable)
TableNotFound { table: String, suggestion: Option<String> },
ColumnNotFound { column: String, table: String, suggestion: Option<String> },
AmbiguousColumn { column: String, tables: String },
IndexNotFound { index: String, table: String },
SyntaxError { message: String, line: usize, column: usize },
TypeError { expected: String, actual: String },
UniqueViolation { table: String, column: String },
PrimaryKeyViolation { table: String },
NotNullViolation { table: String, column: String },
ForeignKeyViolation { details: String },
TransactionConflict,
TransactionEnded,
SavepointNotFound { name: String },
TableAlreadyExists { table: String },
IndexAlreadyExists { index: String },
ReadOnly,
DatabaseLocked,
InvalidQuery { message: String },
Query(QueryError),
Unsupported { feature: String },
// Internal errors
CorruptedPage(PageId),
CorruptedWal { message: String },
InvalidPageType { page_id: PageId, page_type: u8 },
PageNotFound(PageId),
DoubleFree(PageId),
BufferPoolFull,
InvalidDatabaseFile { message: String },
VersionMismatch { file_version: u32, expected: u32 },
Internal(String),
// I/O errors
Io(std::io::Error),
FileNotFound { path: String },
Serialization(String),
// Compression errors
CompressionError { message: String },
DecompressionError { message: String, expected_size: usize },
}
```
**Error Helpers:**
```rust
// Check if error is retriable (TransactionConflict, DatabaseLocked)
if error.is_retriable() {
// Retry the operation
}
// Check if user error vs system error
if error.is_user_error() {
// Show to user
} else {
// Log and handle internally
}
// Access rich query error context
if let Some(query_err) = error.as_query_error() {
println!("{}", query_err.format_with_context());
}
```
**Rich Query Errors with Context:**
```rust
pub struct QueryError {
pub message: String,
pub sql: String,
pub line: usize,
pub column: usize,
pub suggestion: Option<String>,
pub help: Option<String>,
pub span_length: usize,
}
```
Query errors produce formatted output similar to Rust compiler errors:
```
Error: Column 'naem' not found
--> query:1:8
|
|
= suggestion: Did you mean 'name'?
```
**Levenshtein Distance for Suggestions:**
```rust
use featherdb_core::{levenshtein_distance, find_best_match, suggest_keyword};
// Calculate edit distance between strings
let distance = levenshtein_distance("naem", "name"); // 2
// Find best match from candidates
let columns = vec!["name", "email", "age", "id"];
let suggestion = find_best_match("naem", &columns, 2); // Some("name")
// Suggest corrections for SQL keyword typos
let keyword = suggest_keyword("SELEKT"); // Some("SELECT")
let keyword = suggest_keyword("FORM"); // Some("FROM")
let keyword = suggest_keyword("WHRE"); // Some("WHERE")
```
The `suggest_keyword` function handles 100+ common SQL keyword typos including:
- SELECT, FROM, WHERE, INSERT, UPDATE, DELETE
- CREATE, TABLE, INDEX, DROP, ALTER
- JOIN, LEFT, RIGHT, INNER, OUTER
- ORDER, GROUP, HAVING, LIMIT, OFFSET
- PRIMARY, FOREIGN, UNIQUE, CONSTRAINT
- BEGIN, COMMIT, ROLLBACK, TRANSACTION
### 3. Configuration (`lib.rs`)
Database configuration with builder pattern.
```rust
pub struct Config {
pub page_size: usize, // Default: 4096
pub buffer_pool_pages: usize, // Default: 16384 (64MB with 4KB pages)
pub path: PathBuf,
pub create_if_missing: bool, // Default: true
pub enable_wal: bool, // Default: true
pub sync_on_commit: bool, // Default: true (deprecated, use wal_config)
pub encryption: EncryptionConfig,
pub wal_config: WalGroupCommitConfig,
pub compression: CompressionConfig,
pub eviction_policy: EvictionPolicyType,
}
```
**Encryption Configuration:**
```rust
pub enum EncryptionConfig {
None,
Password(String), // Derives key using Argon2id
Key([u8; 32]), // Direct 256-bit key
}
```
**WAL Sync Modes:**
```rust
pub enum WalSyncMode {
Immediate, // Sync on every commit (safest, slowest) - default
GroupCommit, // Batch commits into single fsync (2-5x faster)
NoSync, // No sync - data loss on crash (fastest)
}
pub struct WalGroupCommitConfig {
pub sync_mode: WalSyncMode,
pub group_commit_interval_ms: u64, // Default: 10ms
pub group_commit_max_batch: usize, // Default: 1000 records
}
```
**Compression Configuration:**
```rust
pub enum CompressionType {
None,
Lz4, // Fast compression
Zstd { level: i32 }, // Level 1-22 (higher = better compression)
}
pub struct CompressionConfig {
pub compression_type: CompressionType,
pub threshold: usize, // Minimum size to compress (default: 512 bytes)
}
```
**Eviction Policy Types:**
```rust
pub enum EvictionPolicyType {
Clock, // Second-chance algorithm - simple and fast (default)
Lru2, // Tracks last 2 access times - 5-15% higher hit rate
Lirs, // Adapts to workload patterns
}
```
**Builder Pattern:**
```rust
let config = Config::new("mydb.db")
.create_if_missing(true)
.page_size(8192)
.buffer_pool_size_mb(256)
.with_password("secret")
// WAL configuration
.with_group_commit()
.group_commit_interval_ms(20)
.group_commit_max_batch(2000)
// Compression
.with_zstd_compression_level(9)
.compression_threshold(256)
// Eviction policy
.with_lru2_eviction();
// Or use specific compression presets
let config = Config::new("mydb.db")
.with_lz4_compression(); // Fast compression
.with_zstd_compression(); // Default level (3)
// Check configuration
assert!(config.is_encrypted());
assert!(config.is_compressed());
```
### 4. Core Identifiers
Type-safe identifiers for database objects:
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct PageId(pub u64);
impl PageId {
pub const INVALID: PageId = PageId(u64::MAX);
pub fn is_valid(&self) -> bool;
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct TransactionId(pub u64);
impl TransactionId {
pub const NONE: TransactionId = TransactionId(0);
pub fn next(&self) -> TransactionId;
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord, Default)]
pub struct Lsn(pub u64); // Log Sequence Number
impl Lsn {
pub const ZERO: Lsn = Lsn(0);
pub fn next(&self) -> Lsn;
}
```
All identifiers implement `From<u64>` and `Into<u64>` for easy conversion.
### 5. Row Traits (`row.rs`)
Traits for converting between Rust types and database rows.
```rust
/// Convert a single Rust value to/from database Value
pub trait ToValue {
fn to_value(&self) -> Value;
fn column_type() -> ColumnType;
}
pub trait FromValue: Sized {
fn from_value(value: &Value) -> Result<Self>;
}
/// Convert a Rust struct to/from database rows
pub trait ToRow {
fn to_values(&self) -> Vec<Value>;
fn column_names() -> &'static [&'static str];
}
pub trait FromRow: Sized {
fn from_values(values: &[Value]) -> Result<Self>;
}
/// Table schema information
pub trait TableSchema {
fn table_name() -> &'static str;
fn columns() -> Vec<ColumnDef>;
fn primary_key() -> &'static [&'static str];
}
/// Column definition for schema generation
pub struct ColumnDef {
pub name: &'static str,
pub column_type: ColumnType,
pub primary_key: bool,
pub not_null: bool,
pub unique: bool,
pub default: Option<Value>,
}
```
**Built-in ToValue/FromValue Implementations:**
- `bool`
- `i32`, `i64`
- `f64`
- `String`, `&str`
- `Vec<u8>`
- `Option<T>` where T: ToValue/FromValue
**Usage with derive macro:**
```rust
use featherdb::Table;
#[derive(Table)]
struct User {
#[primary_key]
id: i64,
name: String,
email: Option<String>,
}
// Automatically implements ToRow, FromRow, TableSchema
let user = User { id: 1, name: "Alice".into(), email: None };
let values = user.to_values();
let restored = User::from_values(&values)?;
```
### 6. Serialization Format
Values are serialized with a type tag followed by the data:
| Null | 0 | (no data) |
| Boolean | 1 | 1 byte (0 or 1) |
| Integer | 2 | 8 bytes little-endian i64 |
| Real | 3 | 8 bytes little-endian f64 |
| Text | 4 | 4 bytes length (LE u32) + UTF-8 bytes |
| Blob | 5 | 4 bytes length (LE u32) + raw bytes |
| Timestamp | 6 | 8 bytes little-endian i64 |
**Serialized Sizes:**
- Null: 1 byte
- Boolean: 2 bytes
- Integer/Real/Timestamp: 9 bytes
- Text/Blob: 5 + content length
### 7. Constants
System-wide constants:
```rust
pub mod constants {
pub const MAGIC: &[u8; 10] = b"FEATHERDB\0";
pub const FORMAT_VERSION: u32 = 1;
pub const DEFAULT_PAGE_SIZE: usize = 4096;
pub const SUPERBLOCK_SIZE: usize = 512;
pub const PAGE_HEADER_SIZE: usize = 64;
pub const SLOT_SIZE: usize = 6; // offset:u16, len:u16, flags:u8, reserved:u8
pub const SCHEMA_ROOT_PAGE: u64 = 1;
pub const FREELIST_ROOT_PAGE: u64 = 2;
pub const FIRST_DATA_PAGE: u64 = 3;
}
```
## Usage Examples
### Working with Values
```rust
use featherdb_core::Value;
// Create values
let int_val = Value::Integer(42);
let text_val = Value::Text("hello".into());
let null_val = Value::Null;
// Type checking
assert!(!int_val.is_null());
assert_eq!(int_val.as_i64(), Some(42));
assert_eq!(text_val.as_str(), Some("hello"));
// Comparison (NULL is less than everything)
assert!(Value::Integer(1) < Value::Integer(2));
assert!(Value::Null < Value::Integer(0));
// Cross-type numeric comparison
assert!(Value::Integer(3) < Value::Real(3.5));
// Serialization
let mut buf = Vec::new();
int_val.serialize(&mut buf);
let restored = Value::deserialize(&mut &buf[..])?;
assert_eq!(int_val, restored);
// Memory estimation for GC
let size = text_val.size_estimate(); // 24 + string length
```
### Error Handling with Suggestions
```rust
use featherdb_core::{Error, Result, find_best_match, QueryError};
fn lookup_column(name: &str, columns: &[&str], table: &str) -> Result<usize> {
for (i, col) in columns.iter().enumerate() {
if *col == name {
return Ok(i);
}
}
// Find suggestion using Levenshtein distance
let suggestion = find_best_match(name, columns, 2).map(String::from);
Err(Error::ColumnNotFound {
column: name.into(),
table: table.into(),
suggestion,
})
}
// Rich query errors with formatting
let err = QueryError::new("Column 'naem' not found", "SELECT naem FROM users", 1, 8)
.with_span(4)
.with_suggestion("Did you mean 'name'?")
.with_help("Available columns: id, name, email");
println!("{}", err.format_with_context());
```
### Configuration
```rust
use featherdb_core::{Config, WalGroupCommitConfig, WalSyncMode, EvictionPolicyType};
// Simple configuration
let config = Config::new("app.db");
// Full configuration with all options
let config = Config::new("app.db")
.create_if_missing(true)
.page_size(8192)
.buffer_pool_size_mb(256)
.with_password("secret")
.with_group_commit()
.group_commit_interval_ms(20)
.with_zstd_compression_level(9)
.with_lru2_eviction();
// Custom WAL configuration
let wal_config = WalGroupCommitConfig::new()
.sync_mode(WalSyncMode::GroupCommit)
.group_commit_interval_ms(15)
.group_commit_max_batch(500);
let config = Config::new("app.db")
.wal_config(wal_config);
```
## Testing
```bash
# Run all core tests (using Make)
make test-crate CRATE=featherdb-core
# Or with cargo directly
cargo test -p featherdb-core
# Run with output
cargo test -p featherdb-core -- --nocapture
# Run coverage (from project root)
make coverage # or: cargo llvm-cov --workspace
```
## Design Principles
1. **Minimal Dependencies**: Only essential crates (thiserror, serde, bytes)
2. **No Circular Dependencies**: Core is at the bottom of the dependency graph
3. **Type Safety**: Strong typing for identifiers prevents mixing
4. **Serialization**: All types can round-trip to bytes
5. **Error Context**: Errors include enough context for debugging and suggestions
6. **Builder Pattern**: Configuration uses fluent builder API
## Public Exports
```rust
// From lib.rs
pub use error::{Error, QueryContext, QueryError, Result, find_best_match, levenshtein_distance, suggest_keyword};
pub use row::{ColumnDef, FromRow, FromValue, TableSchema, ToRow, ToValue};
pub use value::{ColumnType, Value};
pub struct PageId(pub u64);
pub struct TransactionId(pub u64);
pub struct Lsn(pub u64);
pub enum EncryptionConfig { None, Password(String), Key([u8; 32]) }
pub enum CompressionType { None, Lz4, Zstd { level: i32 } }
pub struct CompressionConfig { ... }
pub enum WalSyncMode { Immediate, GroupCommit, NoSync }
pub struct WalGroupCommitConfig { ... }
pub enum EvictionPolicyType { Clock, Lru2, Lirs }
pub struct Config { ... }
pub mod constants { ... }
```
## Future Improvements
- [ ] JSON value type
- [ ] Decimal/numeric type for financial data
- [ ] Array/nested types