§NPPES (National Plan and Provider Enumeration System) Data Library
A Rust library for working with NPPES healthcare provider data.
§Features
- High performance: Efficient parsing of large datasets with progress tracking
- Builder pattern for loading datasets
- Querying and statistical analysis
- Multiple export formats: JSON, CSV, SQL, and more
- Download capability: Direct download from CMS servers
- Full text search: Advanced search capabilities
- Indexing for provider lookups
- Modular design: Load only the data required
- Type-safe data structures with validation
§Quick Start
use nppes::prelude::*;
// Load all NPPES data from a directory
let dataset = NppesDataset::load_standard("./data")?;
// Query providers
let ca_cardiologists = dataset
    .query()
    .state("CA")
    .specialty("Cardiology")
    .active_only()
    .execute();
println!("Found {} cardiologists in California", ca_cardiologists.len());
// Export a filtered subset (note: this predicate matches on mailing-address state only)
dataset.export_subset(
    "ca_cardiologists.json",
    |p| p.mailing_address.state.as_ref().map(|s| s.as_code()) == Some("CA"),
    ExportFormat::Json,
)?;
§Loading Data
§Using the Builder Pattern
let dataset = NppesDatasetBuilder::new()
    .main_data("data/npidata_pfile_20240101-20240107.csv")
    .taxonomy_reference("data/nucc_taxonomy_240.csv")
    .other_names("data/othername_pfile_20240101-20240107.csv")
    .skip_invalid_records(true)
    .build()?;
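To illustrate what a flag like `skip_invalid_records(true)` usually implies, here is a minimal standalone sketch. It does not use this crate: the `Row` type, the two-column `npi,state` layout, and the `parse_row` and `load_rows` helpers are all hypothetical, and the naive comma split ignores CSV quoting. The crate's actual behavior may differ.

```rust
/// Hypothetical minimal record: an NPI plus a state code.
#[derive(Debug, PartialEq)]
struct Row {
    npi: String,
    state: String,
}

/// Parse one `npi,state` line. Rejects rows with the wrong number of
/// fields or an NPI that is not exactly ten ASCII digits.
fn parse_row(line: &str) -> Result<Row, String> {
    let mut fields = line.split(',').map(str::trim);
    let (npi, state) = match (fields.next(), fields.next(), fields.next()) {
        (Some(npi), Some(state), None) => (npi, state),
        _ => return Err(format!("expected 2 fields: {line:?}")),
    };
    if npi.len() != 10 || !npi.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("invalid NPI: {npi:?}"));
    }
    Ok(Row {
        npi: npi.to_string(),
        state: state.to_string(),
    })
}

/// Load every non-empty line. With `skip_invalid` set, bad rows are counted
/// and dropped; otherwise the first bad row aborts the whole load.
fn load_rows(input: &str, skip_invalid: bool) -> Result<(Vec<Row>, usize), String> {
    let mut rows = Vec::new();
    let mut skipped = 0;
    for line in input.lines().filter(|l| !l.trim().is_empty()) {
        match parse_row(line) {
            Ok(row) => rows.push(row),
            Err(_) if skip_invalid => skipped += 1,
            Err(e) => return Err(e),
        }
    }
    Ok((rows, skipped))
}
```

Returning the skipped count alongside the rows keeps the lenient mode observable, so a caller can still notice when an unexpectedly large share of a file was dropped.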
§Download from CMS
// Download the latest NPPES data directly from CMS
let dataset = NppesDatasetBuilder::download_latest().await?;
§Memory Estimation
// Estimate memory requirements before loading
let estimate = NppesReader::estimate_memory_usage("data/npidata_pfile.csv")?;
println!("Estimated memory usage: {}", estimate.estimated_memory_human);
§Querying Data
§Find Providers by Criteria
// Find all active primary care physicians in New York
let ny_pcps = dataset
    .query()
    .state("NY")
    .entity_type(EntityType::Individual)
    .specialty("Primary Care")
    .active_only()
    .execute();
// Get a single provider by NPI (O(1) lookup if indexed)
if let Some(provider) = dataset.get_by_npi(&Npi::new("1234567890".to_string())?) {
    println!("Provider: {}", provider.display_name());
}
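An NPI is a ten-digit identifier whose last digit is a Luhn check digit computed over the prefix `80840` followed by the first nine digits. The sketch below shows that check in isolation. The function name `npi_check_digit_ok` is hypothetical, and this page does not say whether `Npi::new` performs the same validation. If it does, a placeholder such as `1234567890` would be rejected, while `1234567893` passes.

```rust
/// Returns true if `npi` is ten ASCII digits and its last digit is the Luhn
/// check digit computed over the prefix "80840" followed by the NPI.
fn npi_check_digit_ok(npi: &str) -> bool {
    if npi.len() != 10 || !npi.bytes().all(|b| b.is_ascii_digit()) {
        return false;
    }
    // The fixed "80840" prefix always contributes 24 to the Luhn sum
    // (8 + 0 + 8 + 2*4 + 0), so it is folded in as a constant.
    let mut sum = 24u32;
    // Walk from the rightmost digit; every second digit is doubled.
    for (i, b) in npi.bytes().rev().enumerate() {
        let digit = u32::from(b - b'0');
        sum += if i % 2 == 1 {
            let doubled = digit * 2;
            if doubled > 9 { doubled - 9 } else { doubled }
        } else {
            digit
        };
    }
    sum % 10 == 0
}
```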
§Statistical Analysis
// Get dataset statistics
let stats = dataset.statistics();
stats.print_summary();
// Use analytics engine for advanced queries
let analytics = dataset.analytics();
let top_states = analytics.top_states_by_provider_count(10);
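A "top N states" query amounts to grouping, counting, and sorting. The following standalone sketch shows one way to do that with the standard library. The `top_states` function is hypothetical and is not the crate's `top_states_by_provider_count`; in particular, breaking ties by state code is an assumption made here so the result is deterministic.

```rust
use std::collections::HashMap;

/// Count occurrences of each state code and return the `n` largest groups,
/// ordered by descending count, then by state code to break ties.
fn top_states<'a>(
    states: impl IntoIterator<Item = &'a str>,
    n: usize,
) -> Vec<(&'a str, usize)> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for state in states {
        *counts.entry(state).or_insert(0) += 1;
    }
    let mut ranked: Vec<(&str, usize)> = counts.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
    ranked.truncate(n);
    ranked
}
```

For very large inputs with a small `n`, a bounded heap would avoid sorting every group, but with only a few dozen distinct state codes a full sort is simpler.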
§Exporting Data
§Export to Different Formats
// Export to JSON
dataset.export_json("providers.json")?;
// Export to JSON Lines (streaming format)
dataset.export_json_lines("providers.jsonl")?;
// Export to normalized CSV files
dataset.export_csv("providers.csv")?;
// Export to SQL
dataset.export_sql("providers.sql", SqlDialect::PostgreSQL)?;
// Export filtered subset
dataset.export_subset(
    "texas_organizations.json",
    |p| {
        p.entity_type == Some(EntityType::Organization)
            && p.mailing_address.state.as_ref().map(|s| s.as_code()) == Some("TX")
    },
    ExportFormat::Json,
)?;
§Configuration
§Using Configuration
// Use a predefined configuration profile
let config = NppesConfig::performance();
nppes::config::set_global_config(config);
// Or build your own
let config = ConfigBuilder::new()
    .progress_bar(false)
    .validation_level(ValidationLevel::Basic)
    .skip_invalid_records(true)
    .build();
§Performance Considerations
- Use indexes for fast lookups
- Parallel processing is enabled by default for faster data loading and queries
- Skip invalid records for resilient parsing
- Estimate memory requirements before loading large files
- Enable progress bars for long operations
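The first point above, using indexes for fast lookups, can be sketched without the crate. The example builds a hash index from NPI to record position once, then answers lookups in average O(1) time instead of scanning the whole vector. `Provider` and `IndexedProviders` are hypothetical types for illustration only, and how this library stores its own index is not described on this page.

```rust
use std::collections::HashMap;

/// Hypothetical provider record, reduced to two fields for illustration.
struct Provider {
    npi: String,
    name: String,
}

/// Owns the records plus an NPI -> position index built once up front.
struct IndexedProviders {
    providers: Vec<Provider>,
    by_npi: HashMap<String, usize>,
}

impl IndexedProviders {
    /// Build the index in a single pass over the records.
    fn new(providers: Vec<Provider>) -> Self {
        let by_npi = providers
            .iter()
            .enumerate()
            .map(|(i, p)| (p.npi.clone(), i))
            .collect();
        Self { providers, by_npi }
    }

    /// Average O(1) hash lookup instead of an O(n) scan over `providers`.
    fn get_by_npi(&self, npi: &str) -> Option<&Provider> {
        self.by_npi.get(npi).map(|&i| &self.providers[i])
    }
}
```

The trade-off is extra memory for the map and a one-time build cost, which pays off once more than a handful of lookups are made against a large dataset.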
§NPPES Data Files
The library supports the following NPPES file types:
- Main Data File: npidata_pfile_YYYYMMDD-YYYYMMDD.csv
- Other Names: othername_pfile_YYYYMMDD-YYYYMMDD.csv
- Practice Locations: pl_pfile_YYYYMMDD-YYYYMMDD.csv
- Endpoints: endpoint_pfile_YYYYMMDD-YYYYMMDD.csv
- Taxonomy Reference: nucc_taxonomy_XXX.csv
Data files are available at: https://download.cms.gov/nppes/NPI_Files.html
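The dated file names listed above follow a regular `<prefix>_YYYYMMDD-YYYYMMDD.csv` shape, so a file can be classified from its name alone. The sketch below shows one way to do that with the standard library. `FileKind`, `DatedFile`, and `parse_file_name` are hypothetical names, not part of this crate; the taxonomy reference file uses a different pattern and is deliberately not matched. The dates are checked only for being eight digits, not for being valid calendar dates.

```rust
#[derive(Debug, PartialEq)]
enum FileKind {
    MainData,
    OtherNames,
    PracticeLocations,
    Endpoints,
}

#[derive(Debug, PartialEq)]
struct DatedFile {
    kind: FileKind,
    start: String, // YYYYMMDD
    end: String,   // YYYYMMDD
}

/// Classify a file name such as `npidata_pfile_20240101-20240107.csv`.
/// Returns `None` for anything that does not match a known dated pattern.
fn parse_file_name(name: &str) -> Option<DatedFile> {
    let stem = name.strip_suffix(".csv")?;
    // Split at the last underscore: "<prefix>" and "<start>-<end>".
    let (prefix, range) = stem.rsplit_once('_')?;
    let kind = match prefix {
        "npidata_pfile" => FileKind::MainData,
        "othername_pfile" => FileKind::OtherNames,
        "pl_pfile" => FileKind::PracticeLocations,
        "endpoint_pfile" => FileKind::Endpoints,
        _ => return None,
    };
    let (start, end) = range.split_once('-')?;
    let is_date = |s: &str| s.len() == 8 && s.bytes().all(|b| b.is_ascii_digit());
    if !(is_date(start) && is_date(end)) {
        return None;
    }
    Some(DatedFile {
        kind,
        start: start.to_string(),
        end: end.to_string(),
    })
}
```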
Re-exports§
pub use error::NppesError;
pub use error::Result;
pub use error::ErrorContext;
pub use error::ExportFormat;
Modules§
- analytics
- Analytics and querying functionality for NPPES data
- config
- Configuration support for NPPES library
- constants
- NPPES data constants
- cookbook
- Common recipes and utility functions
- data_types
- Data type definitions for NPPES records
- dataset
- Unified dataset API for NPPES data
- download
- Download functionality for NPPES data from the internet
- error
- Enhanced error handling for NPPES data library operations
- export
- Export functionality for NPPES data
- prelude
- Prelude module for convenient imports
- reader
- Enhanced CSV reader for NPPES data files
- schema
- Schema definitions for NPPES data files