onecode 0.1.0

Rust bindings for ONEcode - a data representation format for genomic data
Documentation
# BUG: get_sequence_name() Can't Read GDB Skeleton in .1aln Files

## Problem

`OneFile::get_sequence_name()` and `get_all_sequence_names()` always return `None`/empty, even when the .1aln file DOES contain GDB skeleton data with 'S' (scaffold) records.

## Evidence

### 1. The .1aln File Has Scaffolds

Using ONEview shows the data IS present:

```bash
$ ~/fastga-rs/deps/fastga/ONEview /tmp/with_gdb.1aln | grep "^[gS]" | head -20
g
S 13 SGDref#1#chrI
C 230218
S 14 SGDref#1#chrII
C 813184
S 15 SGDref#1#chrIII
C 316620
...
```

The file has:
- 136 'S' (scaffold) records
- All inside a 'g' (group) object
- Each 'S' line has a string (the sequence name)

### 2. But get_sequence_name() Returns None

```rust
let mut file = OneFile::open_read("with_gdb.1aln", Some(&schema), Some("aln"), 1)?;

// Try to get sequence 0
match file.get_sequence_name(0) {
    Some(name) => println!("Found: {}", name),
    None => println!("NOT FOUND"),  // <-- Always hits this
}

// Try to get all
let all = file.get_all_sequence_names();
println!("Count: {}", all.len());  // <-- Always returns 0
```

**Output:**
```
NOT FOUND
Count: 0
```

## Root Cause

Looking at `/home/erik/.cargo/git/checkouts/onecode-rs-ae9ceae2b231d338/21626bd/src/file.rs:534-561`:

```rust
pub fn get_sequence_name(&mut self, seq_id: i64) -> Option<String> {
    // Save current position
    let saved_line = self.line_number();

    // Go to the beginning of the file to scan for S lines
    unsafe {
        // Try to goto the start of S objects (if indexed)
        if ffi::oneGoto(self.ptr, 'S' as i8, 0) {  // <-- THIS FAILS
            let mut current_id = 0i64;
            loop {
                let line_type = ffi::oneReadLine(self.ptr) as u8 as char;
                if line_type == '\0' {
                    break;
                }
                if line_type == 'S' {
                    ...
                }
            }
        }
    }
    None  // <-- Always returns None because oneGoto failed
}
```

**The problem:** `oneGoto(self.ptr, 'S' as i8, 0)` returns `false` because:

1. 'S' is NOT a top-level object type ('O' in the schema)
2. 'S' is a GROUP member ('G' in the schema): `~ G S 0`
3. 'S' lines are INSIDE 'g' group objects
4. oneGoto() can't jump directly to 'S' records

## The Schema

From ONEview output:

```
~ O g 0                       groups scaffolds into a GDB skeleton
~ G S 0                         collection of scaffolds constituting a GDB
~ O S 1 6 STRING              id for a scaffold
```

This means:
- `g` is an **object** type (`O g 0`) - can be indexed
- `S` is a **group member** (`G S 0`) - INSIDE 'g' objects
- `S` has one STRING field - the sequence name

## Correct Implementation

To read 'S' records, we need to:

1. Navigate to the 'g' (group) object
2. Read lines until we hit 'S' records
3. Extract the STRING field from each 'S' line

```rust
pub fn get_all_sequence_names(&mut self) -> HashMap<i64, String> {
    let mut names = HashMap::new();
    let saved_line = self.line_number();

    unsafe {
        // Go to the 'g' group object (not 'S' directly!)
        if ffi::oneGoto(self.ptr, 'g' as i8, 0) {
            let mut current_id = 0i64;

            loop {
                let line_type = ffi::oneReadLine(self.ptr) as u8 as char;

                if line_type == '\0' {
                    break; // EOF
                }

                if line_type == 'S' {
                    // 'S' has one STRING field
                    if let Some(name) = self.string() {
                        names.insert(current_id, name.to_string());
                        current_id += 1;
                    }
                }

                // Stop when we hit the next 'g' or reach 'A' (alignments)
                if line_type == 'g' || line_type == 'A' {
                    break;
                }
            }

            // Restore position
            let _ = ffi::oneGoto(self.ptr, (*self.ptr).lineType, saved_line);
        }
    }

    names
}
```

## Testing the Fix

Once fixed, this should work:

```rust
let mut file = OneFile::open_read("with_gdb.1aln", Some(&schema), Some("aln"), 1)?;

let all_names = file.get_all_sequence_names();
assert_eq!(all_names.len(), 136); // 136 scaffolds in test file

// Check specific names
assert_eq!(all_names.get(&0), Some(&"SGDref#1#chrI".to_string()));
assert_eq!(all_names.get(&1), Some(&"SGDref#1#chrII".to_string()));

// Individual lookup
let name = file.get_sequence_name(0)?;
assert_eq!(name, "SGDref#1#chrI");
```

## Priority

**CRITICAL** - This is the final blocker for pure .1aln filtering in sweepga.

Without this fix:
- ❌ Can't get real sequence names from .1aln
- ❌ Must convert .1aln → PAF to get names
- ❌ Can't group alignments by chromosome pairs properly

With this fix:
- ✅ Pure .1aln filtering works
- ✅ No PAF intermediate needed
- ✅ Sweepga can use X-field identity directly

## Files to Fix

1. `/home/erik/.cargo/git/checkouts/onecode-rs-ae9ceae2b231d338/21626bd/src/file.rs`
   - Line 534: `get_sequence_name()`
   - Line 570: `get_all_sequence_names()`

Both need to navigate to 'g' object first, then scan for 'S' lines.

## Additional Note

The same issue likely affects other group members in ONE files. The pattern of "navigate to parent object, then scan for group members" should be documented for future API users.