kizzasi-tokenizer 0.1.0

# Contributing to kizzasi-tokenizer

Thank you for your interest in contributing to kizzasi-tokenizer! This document provides guidelines and instructions for contributing.

## Development Setup

### Prerequisites

- Rust 1.70+ (stable)
- Git
- (Optional) Docker for local CI testing

### Initial Setup

1. Clone the repository:
```bash
git clone https://github.com/cool-japan/kizzasi.git
cd kizzasi/crates/kizzasi-tokenizer
```

2. Install development tools:
```bash
make install-tools
```

This installs:
- `cargo-tarpaulin` - Code coverage
- `cargo-criterion` - Benchmarking
- `critcmp` - Benchmark comparison
- `cargo-audit` - Security auditing
- `cargo-udeps` - Unused dependency detection
- `cargo-minimal-versions` - Minimum version testing

3. Run tests to verify setup:
```bash
make test
```

## Development Workflow

### Before Making Changes

1. Create a feature branch:
```bash
git checkout -b feature/my-feature
```

2. Ensure all tests pass:
```bash
make test
```

### Making Changes

1. Write code following Rust conventions:
   - Use `cargo fmt` to format code
   - Run `cargo clippy` to catch common mistakes
   - Add documentation for public APIs
   - Write tests for new functionality

2. Run pre-commit checks:
```bash
make pre-commit
```

This runs:
- Code formatting check
- Clippy lints
- All tests

### Testing

We have comprehensive testing requirements:

#### Unit Tests
```bash
make test-unit
```

#### Integration Tests
```bash
make test
```

#### Property-Based Tests
```bash
make test-proptest
```

#### Benchmarks
```bash
make bench
```

### Code Coverage

Generate a coverage report:
```bash
make coverage
```

Open `coverage/index.html` in your browser to view the report.

**Coverage Requirements:**
- New code should have >80% coverage
- Critical paths should have >95% coverage
- Property-based tests are encouraged for mathematical properties

### Performance Testing

Check for performance regressions:
```bash
make check-perf
```

This compares your branch's performance against `master`. Regressions >10% will fail the check.

### Documentation

Build and view documentation:
```bash
make docs
```

**Documentation Requirements:**
- All public APIs must have rustdoc comments
- Include examples in doc comments
- Use `//!` for module-level documentation
- Add usage examples to `examples/` directory

## Pull Request Process

### Before Submitting

1. Ensure all checks pass:
```bash
make ci-local
```

2. Update documentation if needed
3. Add examples for new features
4. Update CHANGELOG.md (if applicable)

### Submitting a PR

1. Push your branch:
```bash
git push origin feature/my-feature
```

2. Create a pull request on GitHub

3. Fill out the PR template with:
   - Description of changes
   - Related issues
   - Testing performed
   - Breaking changes (if any)

### PR Review Process

Your PR will be automatically checked for:
- ✅ All tests passing on Ubuntu, macOS, and Windows
- ✅ Clippy lints passing
- ✅ Code formatting
- ✅ Documentation building without warnings
- ✅ No performance regressions >150%
- ✅ Code coverage maintained or improved

Manual review will check for:
- Code quality and style
- Test coverage
- Documentation completeness
- API design
- Performance implications

## Code Style Guidelines

### Rust Style

Follow the [Rust API Guidelines](https://rust-lang.github.io/api-guidelines/):

```rust
// Good: Clear, documented, tested
/// Encodes a signal into tokens using linear quantization.
///
/// # Arguments
///
/// * `signal` - Input signal to encode
/// * `bits` - Number of bits for quantization (1-16)
///
/// # Returns
///
/// Encoded tokens as a 1D array
///
/// # Examples
///
/// ```
/// use kizzasi_tokenizer::LinearQuantizer;
///
/// let quantizer = LinearQuantizer::new(8, -1.0, 1.0)?;
/// let tokens = quantizer.encode(&signal)?;
/// # Ok::<(), kizzasi_tokenizer::TokenizerError>(())
/// ```
pub fn encode(&self, signal: &Array1<f32>) -> TokenizerResult<Array1<f32>> {
    // Implementation
}
```

### Naming Conventions

- **Types**: `PascalCase` (e.g., `LinearQuantizer`)
- **Functions**: `snake_case` (e.g., `encode_batch`)
- **Constants**: `SCREAMING_SNAKE_CASE` (e.g., `DEFAULT_BITS`)
- **Module files**: `snake_case` (e.g., `linear_quantizer.rs`)

### Error Handling

Use the `TokenizerError` enum for all errors:

```rust
// Good: Specific error with context
if signal.is_empty() {
    return Err(TokenizerError::invalid_input(
        "encode",
        "Signal cannot be empty"
    ));
}

// Bad: Generic error
if signal.is_empty() {
    return Err(TokenizerError::InvalidConfig("empty".into()));
}
```

### Testing Guidelines

```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_encode_decode_roundtrip() {
        // Setup
        let quantizer = LinearQuantizer::new(8, -1.0, 1.0).unwrap();
        let signal = Array1::from_vec(vec![0.0, 0.5, -0.5, 1.0]);

        // Execute
        let encoded = quantizer.encode(&signal).unwrap();
        let decoded = quantizer.decode(&encoded).unwrap();

        // Verify
        assert_eq!(decoded.len(), signal.len());
        for (orig, dec) in signal.iter().zip(decoded.iter()) {
            assert!((orig - dec).abs() < 0.01); // Within quantization error
        }
    }
}
```

## Benchmarking

### Writing Benchmarks

Add benchmarks to `benches/comprehensive_benchmarks.rs`:

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn bench_my_feature(c: &mut Criterion) {
    let mut group = c.benchmark_group("my_feature");

    for size in [64, 256, 1024, 4096] {
        group.bench_function(format!("size_{}", size), |b| {
            let data = vec![0.0f32; size];
            b.iter(|| {
                // Benchmark code
                black_box(my_feature(&data))
            });
        });
    }

    group.finish();
}

criterion_group!(benches, bench_my_feature);
criterion_main!(benches);
```

### Running Benchmarks

```bash
# Run all benchmarks
make bench

# Save baseline
make bench-baseline

# Compare with baseline
make bench-compare
```

## CI/CD Pipeline

### Automated Checks

Every push and PR triggers:

1. **Test Suite** (`ci.yml`)
   - Multi-platform testing (Ubuntu, macOS, Windows)
   - Multiple Rust versions (stable, beta, nightly)
   - Clippy and rustfmt checks
   - Documentation build
   - Miri undefined behavior detection
   - Security audit
   - Minimal versions check
   - Unused dependencies check

2. **Benchmarks** (`benchmark.yml`)
   - Performance benchmarking
   - Regression detection
   - Memory profiling
   - Throughput analysis

3. **Coverage** (`coverage.yml`)
   - Code coverage with Tarpaulin
   - Coverage diff for PRs
   - Line-by-line coverage with grcov

### Local CI Testing

Test workflows locally with [act](https://github.com/nektos/act):

```bash
# Install act
brew install act  # macOS
# or curl -s https://raw.githubusercontent.com/nektos/act/master/install.sh | sudo bash

# Run CI tests
act -j test

# Run benchmarks
act -j benchmark

# Run coverage
act -j coverage
```

Or use the Makefile:

```bash
make ci-local
```

## Release Process

Releases are automated through GitHub Actions:

### Version Bump

1. Update version in `Cargo.toml`:
```toml
[package]
version = "0.2.0"  # Bump version
```

2. Update CHANGELOG.md with changes

3. Commit and tag:
```bash
git commit -am "chore: bump version to 0.2.0"
git tag kizzasi-tokenizer-v0.2.0
git push --tags
```

4. GitHub Actions will:
   - Run all tests
   - Build for all platforms
   - Create GitHub release
   - Publish to crates.io
   - Deploy documentation

### Pre-Release Checklist

Use the Makefile for a comprehensive check:

```bash
make release-checklist
```

This runs:
- Clean build
- All CI checks
- Performance regression check
- Coverage report
- Publish dry-run

## Getting Help

- **Questions**: Open a [Discussion](https://github.com/cool-japan/kizzasi/discussions)
- **Bugs**: Open an [Issue](https://github.com/cool-japan/kizzasi/issues)
- **Security**: Email security@cool-japan.org

## Code of Conduct

Be respectful, inclusive, and collaborative. We follow the [Rust Code of Conduct](https://www.rust-lang.org/policies/code-of-conduct).

## License

By contributing, you agree that your contributions will be licensed under the same license as the project (check LICENSE file).

## Additional Resources

- [Rust API Guidelines](https://rust-lang.github.io/api-guidelines/)
- [Rust Book](https://doc.rust-lang.org/book/)
- [Criterion.rs Guide](https://bheisler.github.io/criterion.rs/book/)
- [Property-based testing with proptest](https://proptest-rs.github.io/proptest/)

Thank you for contributing! 🦀