# Contributing to RLM-RS
Thank you for your interest in contributing to `rlm-rs`! This document provides guidelines and instructions for contributing.
## Code of Conduct
Please be respectful and constructive in all interactions. We welcome contributors of all backgrounds and experience levels.
## Getting Started
### Prerequisites
- **Rust 1.88+** (2024 edition)
- **cargo-deny** for supply chain security checks
```bash
# Install Rust (if needed) via rustup
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Install cargo-deny
cargo install cargo-deny
```
### Setting Up the Development Environment
```bash
# Clone the repository
git clone https://github.com/zircote/rlm-rs.git
cd rlm-rs
# Build the project
cargo build
# Run tests
cargo test
# Run the full CI check
make ci
```
## Development Workflow
### Branch Strategy
- `main` - Stable release branch
- Feature branches - `feature/<description>`
- Bug fixes - `fix/<description>`
### Making Changes
1. **Fork the repository** and create a branch from `main`
2. **Write your code** following the style guidelines below
3. **Add tests** for any new functionality
4. **Run the full CI check** before submitting:
   ```bash
   make ci
   ```
   This runs:
   - `cargo fmt -- --check` (formatting)
   - `cargo clippy --all-targets --all-features` (linting)
   - `cargo test` (tests)
   - `cargo doc --no-deps` (documentation)
   - `cargo deny check` (supply chain)
### Commit Messages
Use clear, descriptive commit messages:
```
<type>: <short description>
<optional longer description>
```
Types:
- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation changes
- `refactor`: Code refactoring
- `test`: Adding or updating tests
- `chore`: Maintenance tasks
Examples:
```
feat: add parallel chunking strategy
fix: handle UTF-8 boundary in fixed chunker
docs: update API reference for Storage trait
```
## Code Style Guidelines
### General Rules
- **Line length**: 100 characters maximum
- **Edition**: Rust 2024
- **Unsafe code**: Forbidden unless explicitly justified with comments
- **Panics**: Not allowed in library code (`unwrap`, `expect`, `panic!`)
### Formatting
Code must pass `cargo fmt`:
```bash
# Check formatting
cargo fmt -- --check
# Auto-format
cargo fmt
```
### Linting
Code must pass strict clippy lints:
```bash
cargo clippy --all-targets --all-features -- -D warnings
```
The project uses pedantic and nursery lints. Key rules enforced:

| Lint | Policy |
|------|--------|
| `unwrap_used` | Denied - use `Result` instead |
| `expect_used` | Denied - use `Result` instead |
| `panic` | Denied - handle errors gracefully |
| `todo` | Denied - complete implementation |
| `dbg_macro` | Denied - remove debug macros |
| `print_stdout` | Denied - use proper logging |
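
As an illustration of the style these lints push toward, here is a minimal sketch that turns a missing value into an error instead of calling `unwrap` or `expect`. The names `LookupError` and `require` are hypothetical and are not part of `rlm-rs`; a real error type in this project would presumably also implement `Display` and `std::error::Error`.

```rust
/// Hypothetical error type, for illustration only (not part of rlm-rs).
#[derive(Debug, PartialEq, Eq)]
pub enum LookupError {
    /// No entry with the requested key exists.
    MissingKey(String),
    /// The key exists but its value is empty.
    EmptyValue(String),
}

/// Looks up `key` in a list of key/value pairs without panicking.
pub fn require<'a>(pairs: &[(&'a str, &'a str)], key: &str) -> Result<&'a str, LookupError> {
    // `find` yields an `Option`; `ok_or_else` converts `None` into an error
    // that `?` propagates, so no `unwrap` or `expect` is needed.
    let value = pairs
        .iter()
        .find(|pair| pair.0 == key)
        .map(|pair| pair.1)
        .ok_or_else(|| LookupError::MissingKey(key.to_string()))?;

    // Invalid data is reported through `Err` rather than `panic!`.
    if value.is_empty() {
        return Err(LookupError::EmptyValue(key.to_string()));
    }
    Ok(value)
}
```

Callers then decide how to react to each variant, for example by matching on `LookupError::MissingKey` to fall back to a default.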
### Error Handling
Always use `Result` types for fallible operations:
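```rust
use std::num::ParseIntError;

// Illustrative types; substitute the real value and error types.
pub type Value = i64;

#[derive(Debug)]
pub enum ParseError {
    EmptyInput,
    InvalidNumber(ParseIntError),
}

// Good: every failure is reported through `Result`
pub fn parse(input: &str) -> Result<Value, ParseError> {
    if input.is_empty() {
        return Err(ParseError::EmptyInput);
    }
    let value = input.parse::<Value>().map_err(ParseError::InvalidNumber)?;
    Ok(value)
}

// Bad: panics on invalid input
pub fn parse_unchecked(input: &str) -> Value {
    input.parse().unwrap() // Never do this
}
```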
Use `thiserror` for custom error types:
```rust
use thiserror::Error;

#[derive(Error, Debug)]
pub enum MyError {
    #[error("invalid input: {0}")]
    InvalidInput(String),

    #[error("operation failed")]
    OperationFailed {
        #[source]
        source: std::io::Error,
    },
}
```
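
For readers unfamiliar with the derive, the sketch below shows roughly what it saves you from writing by hand, using only the standard library. It is an approximation for illustration, not the exact code `thiserror` generates.

```rust
use std::error::Error;
use std::fmt;
use std::io;

#[derive(Debug)]
pub enum MyError {
    InvalidInput(String),
    OperationFailed { source: io::Error },
}

// Corresponds to the `#[error("...")]` attributes.
impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::InvalidInput(msg) => write!(f, "invalid input: {msg}"),
            Self::OperationFailed { .. } => write!(f, "operation failed"),
        }
    }
}

// Corresponds to the `#[source]` attribute: exposes the underlying cause.
impl Error for MyError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::InvalidInput(_) => None,
            Self::OperationFailed { source } => Some(source),
        }
    }
}
```

Keeping the cause reachable through `source()` lets callers and log output walk the full error chain.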
### Documentation
All public items must have documentation:
```rust
/// Processes the input data according to the configuration.
///
/// # Arguments
///
/// * `input` - The data to process.
/// * `config` - Processing configuration.
///
/// # Returns
///
/// The processed result.
///
/// # Errors
///
/// Returns [`Error::InvalidInput`] if the input is malformed.
///
/// # Examples
///
/// ```rust
/// use rlm_rs::{process, Config};
///
/// let result = process("data", &Config::default())?;
/// assert!(!result.is_empty());
/// # Ok::<(), rlm_rs::Error>(())
/// ```
pub fn process(input: &str, config: &Config) -> Result<Output, Error> {
    // implementation
}
```
### Ownership and Borrowing
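Prefer borrowing over taking ownership when the function does not need to own its input:
```rust
// Good: borrows the input, so the caller keeps its data
pub fn process(data: &[u8]) -> Vec<u8> {
    data.to_vec() // placeholder body
}

// Avoid: takes ownership unnecessarily
pub fn process_owned(data: Vec<u8>) -> Vec<u8> {
    data // placeholder body
}
```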
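
A borrowed slice parameter also makes the function usable with several input types. The sketch below uses a hypothetical helper, `strip_nul_bytes`, which is not part of `rlm-rs`.

```rust
/// Hypothetical helper: returns a copy of `data` with all NUL bytes removed.
pub fn strip_nul_bytes(data: &[u8]) -> Vec<u8> {
    data.iter().copied().filter(|&byte| byte != 0).collect()
}

/// Shows that a `Vec<u8>`, an array, and a byte-string literal can all be
/// passed by reference, and that the caller's vector stays usable afterwards.
pub fn borrowing_demo() -> (Vec<u8>, Vec<u8>, Vec<u8>, usize) {
    let owned = vec![1u8, 0, 2];
    let from_vec = strip_nul_bytes(&owned);
    let from_array = strip_nul_bytes(&[0u8, 7, 0]);
    let from_literal = strip_nul_bytes(b"a\0b");
    // `owned` was only borrowed, so it can still be read here.
    (from_vec, from_array, from_literal, owned.len())
}
```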
### Const Functions
Use `const fn` where possible:
```rust
#[must_use]
pub const fn new() -> Self {
    Self {
        size: DEFAULT_SIZE,
        overlap: DEFAULT_OVERLAP,
    }
}
```
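
The benefit of `const fn` is that the result can be computed at compile time. The sketch below assumes a hypothetical `ChunkConfig` type and made-up default values; neither is taken from the `rlm-rs` source.

```rust
// Hypothetical defaults, chosen only for this example.
pub const DEFAULT_SIZE: usize = 1024;
pub const DEFAULT_OVERLAP: usize = 64;

/// Hypothetical configuration type, not part of rlm-rs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ChunkConfig {
    pub size: usize,
    pub overlap: usize,
}

impl ChunkConfig {
    #[must_use]
    pub const fn new() -> Self {
        Self {
            size: DEFAULT_SIZE,
            overlap: DEFAULT_OVERLAP,
        }
    }

    /// Distance between the starts of consecutive chunks.
    /// `saturating_sub` avoids an overflow panic when `overlap > size`.
    #[must_use]
    pub const fn stride(&self) -> usize {
        self.size.saturating_sub(self.overlap)
    }
}

impl Default for ChunkConfig {
    fn default() -> Self {
        Self::new()
    }
}

// Both constants are evaluated by the compiler, which is only possible
// because `new` and `stride` are `const fn`.
pub const DEFAULT_CONFIG: ChunkConfig = ChunkConfig::new();
pub const DEFAULT_STRIDE: usize = DEFAULT_CONFIG.stride();
```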
## Testing
### Test Organization
- **Unit tests**: Inside `src/*.rs` with `#[cfg(test)]` modules
- **Integration tests**: `tests/` directory
- **Doc tests**: Examples in documentation
### Writing Tests
```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_success_case() {
        let result = function_under_test(valid_input);
        assert_eq!(result, expected_output);
    }

    #[test]
    fn test_error_case() {
        let result = function_under_test(invalid_input);
        assert!(matches!(result, Err(Error::InvalidInput(_))));
    }
}
```
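
Test functions may also return `Result`, which lets them use `?` instead of `unwrap`. The following is a minimal, self-contained sketch; `parse_range` and `RangeError` are hypothetical names invented for this example.

```rust
/// Hypothetical error type for the example below.
#[derive(Debug, PartialEq, Eq)]
pub enum RangeError {
    MissingSeparator,
    NotANumber,
    Reversed,
}

/// Hypothetical function under test: parses a `start..end` byte range.
pub fn parse_range(input: &str) -> Result<(usize, usize), RangeError> {
    let (start, end) = input
        .split_once("..")
        .ok_or(RangeError::MissingSeparator)?;
    let start = start.parse::<usize>().map_err(|_| RangeError::NotANumber)?;
    let end = end.parse::<usize>().map_err(|_| RangeError::NotANumber)?;
    if start > end {
        return Err(RangeError::Reversed);
    }
    Ok((start, end))
}

#[cfg(test)]
mod range_tests {
    use super::*;

    // Returning `Result` means a failure in `parse_range` fails the test
    // through `?`, with no `unwrap` required.
    #[test]
    fn parses_valid_range() -> Result<(), RangeError> {
        let (start, end) = parse_range("3..10")?;
        assert_eq!((start, end), (3, 10));
        Ok(())
    }

    #[test]
    fn rejects_reversed_range() {
        assert_eq!(parse_range("10..3"), Err(RangeError::Reversed));
    }
}
```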
### Property-Based Testing
For complex invariants, use `proptest`:
```rust
use proptest::prelude::*;

proptest! {
    #[test]
    fn chunk_boundaries_valid(content in ".{1,1000}") {
        let chunker = FixedChunker::with_size(100);
        let chunks = chunker.chunk(1, &content, None).unwrap();
        for chunk in chunks {
            prop_assert!(chunk.byte_range.end <= content.len());
        }
    }
}
```
### Running Tests
```bash
# Run all tests
cargo test
# Run with output
cargo test -- --nocapture
# Run specific test
cargo test test_name
# Run tests for a specific module
cargo test chunking::
```
## Adding New Features
### Adding a New Chunking Strategy
1. Create `src/chunking/my_strategy.rs`:
   ```rust
   use crate::chunking::traits::{Chunker, ChunkMetadata};
   use crate::core::Chunk;
   use crate::error::Result;

   pub struct MyChunker {
       chunk_size: usize,
   }

   impl MyChunker {
       #[must_use]
       pub const fn new() -> Self {
           Self {
               chunk_size: super::DEFAULT_CHUNK_SIZE,
           }
       }
   }

   impl Chunker for MyChunker {
       fn chunk(
           &self,
           buffer_id: i64,
           text: &str,
           metadata: Option<&ChunkMetadata>,
       ) -> Result<Vec<Chunk>> {
           // Implementation
       }

       fn name(&self) -> &'static str {
           "my-strategy"
       }

       fn description(&self) -> &'static str {
           "Description of my chunking strategy"
       }
   }
   ```
2. Export in `src/chunking/mod.rs`:
   ```rust
   pub mod my_strategy;
   pub use my_strategy::MyChunker;
   ```
3. Add to `create_chunker` factory function.
4. Add tests and documentation.
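
When filling in the body of `chunk`, keep in mind that byte offsets must fall on UTF-8 character boundaries. The sketch below shows one possible way to compute such ranges with the standard library alone. The function `chunk_ranges` is hypothetical, returns plain byte ranges rather than the project's `Chunk` type, and is not how the existing chunkers are necessarily implemented.

```rust
use std::ops::Range;

/// Splits `text` into consecutive byte ranges of at most `max_bytes` each,
/// never cutting a UTF-8 code point in half. A character wider than
/// `max_bytes` gets a range of its own. Empty input yields no ranges.
pub fn chunk_ranges(text: &str, max_bytes: usize) -> Vec<Range<usize>> {
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < text.len() {
        // Tentative end, clamped to the end of the text.
        let mut end = start.saturating_add(max_bytes).min(text.len());
        // Back up until `end` sits on a character boundary.
        while end > start && !text.is_char_boundary(end) {
            end -= 1;
        }
        if end == start {
            // The next character does not fit in `max_bytes`: move forward
            // to the next boundary so the loop always makes progress.
            end = start + 1;
            while !text.is_char_boundary(end) {
                end += 1;
            }
        }
        ranges.push(start..end);
        start = end;
    }
    ranges
}
```

Because every range starts and ends on a boundary, slicing `&text[range]` cannot panic, and concatenating the slices reproduces the original text.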
### Adding a New CLI Command
1. Add variant to `Commands` enum in `src/cli/parser.rs`:
   ```rust
   #[derive(Subcommand, Debug)]
   pub enum Commands {
       // ...existing commands...

       /// Description of my command.
       MyCommand {
           /// Argument description.
           #[arg(short, long)]
           my_arg: String,
       },
   }
   ```
2. Implement handler in `src/cli/commands.rs`:
   ```rust
   Commands::MyCommand { my_arg } => {
       // Implementation
       Ok("Output".to_string())
   }
   ```
3. Add tests and update CLI reference documentation.
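
To show how the enum variant and the handler arm fit together, here is a standalone sketch that compiles without `clap`. The `execute` function and `CommandError` type are hypothetical stand-ins; the real dispatcher in `src/cli/commands.rs` may have a different name, signature, and error type.

```rust
/// Stand-in for the clap-derived enum; the derive attributes are omitted so
/// the sketch compiles with the standard library alone.
#[derive(Debug)]
pub enum Commands {
    MyCommand { my_arg: String },
}

/// Hypothetical handler error, for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum CommandError {
    EmptyArgument,
}

/// Hypothetical dispatcher: each arm returns the text the CLI should print.
pub fn execute(command: &Commands) -> Result<String, CommandError> {
    match command {
        Commands::MyCommand { my_arg } => {
            // Validate input and report problems through `Err`, not a panic.
            if my_arg.is_empty() {
                return Err(CommandError::EmptyArgument);
            }
            Ok(format!("my-command received: {my_arg}"))
        }
    }
}
```

Returning the output as a `String` rather than printing inside the handler keeps the arm easy to unit test.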
## Pull Request Process
1. **Ensure all checks pass**:
   ```bash
   make ci
   ```
2. **Update documentation** if needed:
   - README.md for user-facing changes
   - docs/ for detailed documentation
   - Code comments for internal changes
3. **Create a pull request** with:
   - Clear title describing the change
   - Description of what and why
   - Link to related issues (if any)
4. **Address review feedback** promptly
### PR Checklist
- [ ] Code follows style guidelines
- [ ] All tests pass (`cargo test`)
- [ ] Clippy passes (`cargo clippy`)
- [ ] Format is correct (`cargo fmt`)
- [ ] Documentation updated (if needed)
- [ ] No new warnings introduced
## Reporting Issues
### Bug Reports
Include:
- Rust version (`rustc --version`)
- OS and version
- Steps to reproduce
- Expected vs actual behavior
- Relevant logs or error messages
### Feature Requests
Include:
- Use case description
- Proposed solution (if any)
- Alternatives considered
## Project Structure
```
src/
├── lib.rs              # Library entry point
├── main.rs             # CLI entry point
├── error.rs            # Error types
├── core/               # Core domain types
│   ├── buffer.rs       # Buffer type
│   ├── chunk.rs        # Chunk type
│   └── context.rs      # Context/variables
├── chunking/           # Chunking strategies
│   ├── traits.rs       # Chunker trait
│   ├── fixed.rs        # Fixed chunker
│   ├── semantic.rs     # Semantic chunker
│   └── parallel.rs     # Parallel chunker
├── storage/            # Persistence
│   ├── traits.rs       # Storage trait
│   └── sqlite.rs       # SQLite backend
├── io/                 # File I/O
│   ├── reader.rs       # File reading
│   └── unicode.rs      # Unicode utilities
└── cli/                # CLI layer
    ├── parser.rs       # Argument parsing
    ├── commands.rs     # Command handlers
    └── output.rs       # Output formatting

tests/
└── integration_test.rs

docs/
├── architecture.md     # Internal architecture
├── cli-reference.md    # CLI documentation
└── api.md              # Library API reference
```
## License
By contributing, you agree that your contributions will be licensed under the MIT License.
## Questions?
- Open an issue for questions
- Check existing issues and documentation first
Thank you for contributing!