data-gov 0.3.1

Rust client library and CLI for data.gov - search, explore, and download U.S. government datasets
Documentation
# data-gov

High-level Rust client and CLI for [data.gov](https://data.gov). It wraps the low-level [`data-gov-ckan`](../data-gov-ckan/) crate with download helpers, an interactive REPL, and ergonomic configuration.

## Requirements

- Rust **1.90+** (Rust 2024 edition)
- Cargo and git

```bash
rustup toolchain install stable
rustup default stable
```

## Add to your project

Use the published crate from crates.io:

```toml
[dependencies]
data-gov = "0.2.0"
tokio = { version = "1", features = ["full"] }
```

Working inside this repository? You can still use a path dependency in `Cargo.toml`:

```toml
data-gov = { path = "../data-gov" }
```

Need unreleased features between tags? Swap in the git dependency form instead:

```toml
data-gov = { git = "https://github.com/dspadea/data-gov-rs", package = "data-gov" }
```

### CLI install

```bash
git clone https://github.com/dspadea/data-gov-rs.git
cd data-gov-rs/data-gov
cargo install --path .
```

The `data-gov` binary is then available on your PATH.

## Highlights

- 🔍 Search data.gov with optional organization / format filters
- 📦 Retrieve dataset metadata and enumerate downloadable resources
- ⬇️ Download individual resources or entire datasets with progress bars
- 🏛️ List organisations and query autocomplete endpoints
- 🖥️ Interactive REPL with colour-aware output and shebang-friendly scripts

## Library quick start

```rust
use data_gov::DataGovClient;

#[tokio::main]
async fn main() -> data_gov::Result<()> {
    let client = DataGovClient::new()?;

    let results = client.search("climate change", Some(10), None, None, None).await?;
    println!("Found {} datasets", results.count.unwrap_or(0));

    let dataset = client.get_dataset("consumer-complaint-database").await?;
    println!("Dataset: {}", dataset.title.as_deref().unwrap_or(&dataset.name));

    let resources = DataGovClient::get_downloadable_resources(&dataset);
    if let Some(resource) = resources.first() {
        let path = client.download_resource(resource, None).await?;
        println!("Downloaded to {path:?}");
    }

    Ok(())
}
```

## CLI overview

```
data-gov search "climate change" 5
data-gov show electric-vehicle-population-data
data-gov download electric-vehicle-population-data 0                           # Download by index
data-gov download electric-vehicle-population-data "Comma Separated Values File"  # Download by name (quoted)
data-gov download electric-vehicle-population-data csv                         # Partial match (unquoted)
data-gov list organizations
```

Key defaults:

- **Interactive mode:** `data-gov` launches a REPL that stores downloads under `~/Downloads/<dataset>/`
- **Non-interactive mode:** Commands run directly in your current directory (`./<dataset>/`)
- Override download location with `--download-dir`, toggle colours with `--color`, and silence progress bars via `NO_PROGRESS=1`

### Command reference

| Command | Purpose |
| ------- | ------- |
| `search <query> [limit]` | Full-text search with optional result cap |
| `show <dataset_id>` | Inspect dataset details and resources |
| `download <dataset_id> [index\|name]` | Download all resources, or a specific resource by index or name (partial match, use quotes for multi-word names) |
| `list organizations` | List publishing organisations |
| `setdir <path>` | Change the active download directory (REPL only) |
| `info` | Display current configuration |
| `help`, `quit` | Help and exit commands |

### Automation

The REPL accepts stdin, so shebang scripts work out of the box:

```bash
#!/usr/bin/env data-gov
# Simple automation example
search "electric vehicle" 3
show electric-vehicle-population-data
download electric-vehicle-population-data "Comma Separated Values File"    # Download by name (quoted)
quit
```

See [`../examples/scripting`](../examples/scripting) for ready-made scripts such as `download-epa-climate.sh` and `list-orgs.sh`.

### Solr query syntax

The `search` method and CLI `search` command accept free-text queries that are
interpreted by CKAN's Solr backend. You can use simple text, wildcards, phrases,
and boolean operators. Examples:

- `search "climat*"` (wildcard)
- `search "\"air quality\""` (phrase search)
- `search "climate AND (temperature OR precipitation)"` (boolean)

If you need advanced, fielded filters use the lower-level CKAN client via
`data_gov::ckan::CkanClient::package_search` and pass an `fq` filter string.

## Configuration

```rust
use data_gov::{DataGovClient, DataGovConfig, OperatingMode};

let config = DataGovConfig::new()
    .with_mode(OperatingMode::CommandLine)
    .with_download_dir("./data")
    .with_api_key("your-api-key")
    .with_max_concurrent_downloads(5)
    .with_progress(true);

let client = DataGovClient::with_config(config)?;
```

Configuration covers the underlying CKAN settings, download directory logic, concurrency, progress output, and colour preferences.

## Development

```bash
cd data-gov-rs
cargo test -p data-gov
cargo run -p data-gov --example demo
```

The crate re-exports `data-gov-ckan` as `data_gov::ckan`, making the lower-level client available when you need direct CKAN access.

## Contributing & license

- Fork, branch, add tests, run `cargo test`, open a PR
- Licensed under [Apache 2.0]../LICENSE


## Disclaimer & license

This is an independent project and is not affiliated with data.gov or any government agency. For authoritative information, refer to the official [data.gov](https://www.data.gov/) portal.

Licensed under the [Apache License 2.0](LICENSE).