bed-reader 0.2.13

bed-reader
==========

[<img alt="github" src="https://img.shields.io/badge/github-bed--reader-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/fastlmm/bed-reader)
[<img alt="crates.io" src="https://img.shields.io/crates/v/bed-reader.svg?style=for-the-badge&color=fc8d62&logo=rust" height="20">](https://crates.io/crates/bed-reader)
[<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-bed--reader-66c2a5?style=for-the-badge&labelColor=555555&logoColor=white&logo=" height="20">](https://docs.rs/bed-reader)
[<img alt="build status" src="https://img.shields.io/github/workflow/status/fastlmm/bed-reader/CI/master?style=for-the-badge" height="20">](https://github.com/fastlmm/bed-reader/actions?query=branch%3Amaster)

Read and write the PLINK BED format, simply and efficiently.

Features
--------

* Fast and multi-threaded
* Supports many indexing methods. Slice data by individuals (samples) and/or SNPs (variants).
* The [Python-facing API](https://pypi.org/project/bed-reader/) for this library is used by [PySnpTools](https://github.com/fastlmm/PySnpTools), [FaST-LMM](https://github.com/fastlmm/FaST-LMM), and [PyStatGen](https://github.com/pystatgen).
* Supports [PLINK 1.9](https://www.cog-genomics.org/plink2/formats).

Examples
--------

*Sample files available* [*here on GitHub*](https://github.com/fastlmm/bed-reader/tree/master/bed_reader/tests/data).

Read all genotype data from a .bed file.

```rust
use ndarray as nd;
use bed_reader::{Bed, ReadOptions, assert_eq_nan};

let file_name = "small.bed";
let mut bed = Bed::new(file_name)?;
let val = ReadOptions::builder().f64().read(&mut bed)?;

assert_eq_nan(
    &val,
    &nd::array![
        [1.0, 0.0, f64::NAN, 0.0],
        [2.0, 0.0, f64::NAN, 2.0],
        [0.0, 1.0, 2.0, 0.0]
    ],
);
```
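
The values `0.0`, `1.0`, `2.0`, and `NAN` above correspond to the 2-bit genotype codes stored in a .bed file. As a rough illustration, the following std-only sketch decodes one byte of a SNP-major PLINK 1.9 .bed file. It assumes allele 1 is being counted, and `decode_bed_byte` is a hypothetical helper written for this README, not part of the bed-reader API or its actual implementation.

```rust
/// Decode one byte of a SNP-major PLINK 1.9 .bed file into four genotype values.
/// Hypothetical helper for illustration only; not part of the bed-reader API.
/// Assumes the count of allele 1 is wanted.
fn decode_bed_byte(byte: u8) -> [f64; 4] {
    let mut out = [0.0; 4];
    for (i, slot) in out.iter_mut().enumerate() {
        // Each sample takes two bits, starting at the low-order bits.
        let code = (byte >> (2 * i)) & 0b11;
        *slot = match code {
            0b00 => 2.0,   // homozygous for allele 1
            0b10 => 1.0,   // heterozygous
            0b11 => 0.0,   // homozygous for allele 2
            _ => f64::NAN, // 0b01: missing genotype
        };
    }
    out
}
```

For example, `decode_bed_byte(0b11_10_01_00)` yields `2.0`, `NAN`, `1.0`, `0.0` for the four samples packed into that byte.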

Read every second individual (sample) and SNPs (variants) from 20 to 30.

```rust
use ndarray::s;
use bed_reader::{Bed, ReadOptions};

let file_name = "some_missing.bed";
let mut bed = Bed::new(file_name)?;
let val = ReadOptions::builder()
    .iid_index(s![..;2])
    .sid_index(20..30)
    .f64()
    .read(&mut bed)?;

assert!(val.dim() == (50, 10));
```
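
The shape `(50, 10)` follows from the two index expressions. The std-only sketch below reproduces that arithmetic without touching bed-reader or ndarray. `output_dim` is a hypothetical helper, and the count of 100 individuals used in the usage note is an assumption about the sample file.

```rust
use std::ops::Range;

/// Compute the (individuals, SNPs) shape produced by taking every
/// `iid_step`-th individual and a contiguous range of SNPs.
/// Hypothetical helper for illustration; `iid_step` must be non-zero.
fn output_dim(iid_count: usize, iid_step: usize, sid_range: Range<usize>) -> (usize, usize) {
    // Every `iid_step`-th individual, like `s![..;iid_step]`.
    let rows = (0..iid_count).step_by(iid_step).count();
    // A half-open range selects `end - start` SNPs, like `20..30`.
    let cols = sid_range.len();
    (rows, cols)
}
```

Assuming 100 individuals, `output_dim(100, 2, 20..30)` gives `(50, 10)`, matching the assertion above.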

List the first 5 individual (sample) ids, the first 5 SNP (variant) ids,
and every unique chromosome. Then, read every genomic value in chromosome 5.

```rust
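use std::collections::HashSet;
use ndarray::s;
use bed_reader::{Bed, ReadOptions};

// Assumed to be the same sample file as in the previous example.
let file_name = "some_missing.bed";
let mut bed = Bed::new(file_name)?;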
println!("{:?}", bed.iid()?.slice(s![..5])); // Outputs ndarray: ["iid_0", "iid_1", "iid_2", "iid_3", "iid_4"]
println!("{:?}", bed.sid()?.slice(s![..5])); // Outputs ndarray: ["sid_0", "sid_1", "sid_2", "sid_3", "sid_4"]
println!("{:?}", bed.chromosome()?.iter().collect::<HashSet<_>>());
// Outputs: {"12", "10", "4", "8", "19", "21", "9", "15", "6", "16", "13", "7", "17", "18", "1", "22", "11", "2", "20", "3", "5", "14"}
let val = ReadOptions::builder()
    .sid_index(bed.chromosome()?.map(|elem| elem == "5"))
    .f64()
    .read(&mut bed)?;

assert!(val.dim() == (100, 6));
```
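
The chromosome filter above passes a boolean mask as the SNP index: one `true`/`false` per SNP, keeping the `true` positions. A minimal std-only sketch of that idea follows. `chromosome_mask` and `selected_positions` are hypothetical helpers, and the chromosome list in the usage note is made-up data, not taken from the sample file.

```rust
/// Build a boolean mask marking the entries equal to `target`.
/// Hypothetical helper for illustration; not part of the bed-reader API.
fn chromosome_mask(chromosomes: &[&str], target: &str) -> Vec<bool> {
    chromosomes.iter().map(|c| *c == target).collect()
}

/// Return the positions at which the mask is true.
fn selected_positions(mask: &[bool]) -> Vec<usize> {
    mask.iter()
        .enumerate()
        .filter(|(_, keep)| **keep)
        .map(|(i, _)| i)
        .collect()
}
```

With the made-up list `["1", "5", "5", "12", "5"]` and target `"5"`, the mask is `[false, true, true, false, true]` and the selected positions are `[1, 2, 4]`, so three SNP columns would be read.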

Additional Links
----------------

- [**Questions via Email**](mailto:fastlmm-dev@python.org)
- [**Discussion**](https://github.com/fastlmm/bed-reader/discussions/)
- [**Bug reports**](https://github.com/fastlmm/bed-reader/issues)