# Worker Service — Index
Centralised registry of workforce and professional identities:
clinicians, contractors, drivers, hospital staff, and field engineers.
Each record carries credential, licence, and professional-identifier
fields (NPI, DEA, board licence, employee number) alongside
healthcare-aware demographics. The service provides probabilistic and
deterministic matching, real-time and batch deduplication, HIPAA-grade
audit, GDPR Article 15 export, and a FHIR R5 Practitioner surface.

This page is a **navigation aid with worked examples**. For canonical
behaviour, read [`spec.md`](spec.md).
## Documentation map
| Document | Purpose |
|---|---|
| [`spec.md`](spec.md) | **Single source of truth.** What the system does, how it is built, NFRs, tasks (§13), open questions (§16). |
| [`README.md`](README.md) / [`CLAUDE.md`](CLAUDE.md) | User-facing intro — must stay consistent with the spec. |
| [`AGENTS.md`](AGENTS.md) | Agent-facing entry point — `AGENTS/*` directory + shared docs. |
| [`AGENTS/spec-driven-development.md`](AGENTS/spec-driven-development.md) | The SDD discipline this crate practises. |
| [`AGENTS/models.md`](AGENTS/models.md) | Field-by-field domain model reference. |
| [`AGENTS/matching.md`](AGENTS/matching.md) | Match weights, components, deterministic rules, Soundex. |
| [`AGENTS/restful.md`](AGENTS/restful.md) | Endpoint catalogue + library API. |
| [`AGENTS/testing.md`](AGENTS/testing.md) | Unit / integration / benchmark layout. |
| [`agents/share/*`](../agents/share/) | Project-wide cross-crate references. |
## Quick start
```bash
# REST + gRPC API
cargo run --release
# Web UI (Loco / Tera / HTMX / Alpine / Lily)
cargo run --bin web # → http://0.0.0.0:5150
PORT=5180 cargo run --bin web
# Tests
cargo test --lib # unit (~99)
DATABASE_URL=… cargo test --tests # integration (needs PostgreSQL)
cargo bench # Criterion (matching / search / validation)
```
## URL surface (REST)
| Method | Path | Purpose |
|---|---|---|
| GET | `/api/health` | Liveness |
| POST | `/api/workers` | Create — `409` on detected duplicate |
| GET | `/api/workers/{id}` | Read |
| PUT | `/api/workers/{id}` | Update |
| DELETE | `/api/workers/{id}` | Soft delete |
| GET | `/api/workers/search` | Full-text / fuzzy / phonetic |
| POST | `/api/workers/match` | Score against candidates |
| POST | `/api/workers/check-duplicates` | Real-time dup check |
| POST | `/api/workers/merge` | Merge survivor + duplicate |
| POST | `/api/workers/deduplicate` | Batch dedup scan |
| GET | `/api/workers/{id}/masked` | Privacy view |
| GET | `/api/workers/{id}/export` | GDPR Art. 15 export |
| GET | `/api/workers/{id}/audit` | Per-record audit |
| GET | `/api/audit/recent` | System-wide recent audit |
| GET | `/api/audit/user` | Per-user audit |

The FHIR R5 Practitioner API is mounted under `/fhir/Practitioner/*`. See
[`AGENTS/restful.md`](AGENTS/restful.md) for full parameters.
## Worked examples
### Create a worker (clinician with NPI)
```bash
curl -X POST http://localhost:8080/api/workers \
-H 'content-type: application/json' \
-d '{
"name": { "family": "Patel", "given": ["Nisha"], "prefix": ["Dr."] },
"birth_date": "1985-07-04",
"gender": "female",
"identifiers": [
{ "identifier_type": "NPI", "system": "http://hl7.org/fhir/sid/us-npi", "value": "1234567893" }
],
"documents": [{
"document_type": "PROFESSIONAL_LICENSE",
"number": "CA-MD-12345",
"issuing_country": "US",
"issuing_authority": "Medical Board of California",
"issue_date": "2012-06-15",
"expiry_date": "2026-06-14",
"verified": true
}],
"managing_organization": "11111111-1111-1111-1111-111111111111"
}'
```
If the new record scores above the match threshold against an existing
worker, the API returns `409 Conflict` with the candidate matches and
their per-component scores (an exact NPI match short-circuits to
`score = 1.00`).
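The short-circuit can be pictured with a small sketch. It is an illustration under assumptions, not the crate's implementation: `Candidate`, `score_pair`, `is_conflict`, and the injected `name_score` closure are hypothetical names, and treating the threshold as inclusive is a guess. Only the rule "exact NPI match gives 1.00" comes from the text.

```rust
/// Hypothetical, simplified record: only the fields this sketch needs.
#[derive(Debug, Clone)]
pub struct Candidate {
    pub family: String,
    pub npi: Option<String>,
}

/// Score two records. An exact NPI match is treated as deterministic and
/// returns 1.0 without consulting any probabilistic component.
pub fn score_pair(
    a: &Candidate,
    b: &Candidate,
    name_score: impl Fn(&str, &str) -> f64,
) -> f64 {
    if let (Some(x), Some(y)) = (&a.npi, &b.npi) {
        if x == y {
            return 1.0; // deterministic short-circuit
        }
    }
    // Otherwise fall back to the caller-supplied (assumed) name comparison.
    name_score(&a.family, &b.family)
}

/// Assumption: a create request conflicts when the score reaches the threshold.
pub fn is_conflict(score: f64, threshold: f64) -> bool {
    score >= threshold
}
```

With the default `MATCHING_THRESHOLD` of `0.7`, any pair sharing an NPI would be reported as a conflict regardless of how the names compare.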
### Check for duplicates without creating
```bash
curl -X POST http://localhost:8080/api/workers/check-duplicates \
-H 'content-type: application/json' \
-d '{
"name": { "family": "Patel", "given": ["Nisha"] },
"identifiers": [
{ "identifier_type": "NPI", "system": "http://hl7.org/fhir/sid/us-npi", "value": "1234567893" }
]
}'
```
### Search
```bash
curl "http://localhost:8080/api/workers/search?q=Patel\
&limit=10&offset=0&fuzzy=true&phonetic=true&mask_sensitive=true"
```
### Match against existing records
```bash
curl -X POST http://localhost:8080/api/workers/match \
-H 'content-type: application/json' \
-d '{
"name": { "family": "Patell", "given": ["Nisha"] },
"birth_date": "1985-07-04",
"threshold": 0.7
}'
```
Returns ranked candidates with `score`, `match_quality`, and a
per-component `breakdown`.
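To make the relationship between `breakdown`, `score`, and `threshold` concrete, here is a minimal sketch of weighted scoring followed by ranking. The `Component` struct, the function names, and any weights passed in are hypothetical; the real weights and components are documented in [`AGENTS/matching.md`](AGENTS/matching.md).

```rust
/// One entry of a hypothetical score breakdown.
#[derive(Debug, Clone)]
pub struct Component {
    pub name: &'static str,
    pub score: f64,  // similarity in [0, 1]
    pub weight: f64, // relative importance (illustrative values only)
}

/// Weighted average of the component scores; 0.0 when no weight is present.
pub fn weighted_score(components: &[Component]) -> f64 {
    let total: f64 = components.iter().map(|c| c.weight).sum();
    if total == 0.0 {
        return 0.0;
    }
    components.iter().map(|c| c.score * c.weight).sum::<f64>() / total
}

/// Drop candidates below `threshold`, then sort best-first.
pub fn rank(mut scored: Vec<(String, f64)>, threshold: f64) -> Vec<(String, f64)> {
    scored.retain(|(_, s)| *s >= threshold);
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

A weighted average keeps the overall score in the same `[0, 1]` range as its components, which is why a single `threshold` such as `0.7` can be applied uniformly.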
### Merge
```bash
curl -X POST http://localhost:8080/api/workers/merge \
-H 'content-type: application/json' \
-d '{
"main_worker_id": "11111111-1111-1111-1111-111111111111",
"duplicate_worker_id": "22222222-2222-2222-2222-222222222222",
"merge_reason": "Confirmed duplicate — same NPI, different employee numbers"
}'
```
Credentials transfer from the duplicate to the survivor, the
duplicate's primary name is appended to the survivor as a "former"
alias, and a `Replaces` link is written.
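The three merge effects can be sketched on a deliberately flattened record. `Record`, `Alias`, and `merge` are hypothetical stand-ins for the crate's `Worker` and merge types, and skipping credentials the survivor already holds is an assumption rather than documented behaviour.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Alias {
    pub name: String,
    pub usage: String,
}

/// Hypothetical flattened worker record for illustration.
#[derive(Debug, Clone)]
pub struct Record {
    pub id: String,
    pub primary_name: String,
    pub aliases: Vec<Alias>,
    pub credentials: Vec<String>,
    pub replaces: Vec<String>, // ids of records this one supersedes
}

pub fn merge(survivor: &mut Record, duplicate: &mut Record) {
    // 1. Move credentials across, skipping any the survivor already holds.
    for cred in duplicate.credentials.drain(..) {
        if !survivor.credentials.contains(&cred) {
            survivor.credentials.push(cred);
        }
    }
    // 2. Keep the duplicate's primary name as a "former" alias.
    survivor.aliases.push(Alias {
        name: duplicate.primary_name.clone(),
        usage: "former".to_string(),
    });
    // 3. Record the link from survivor to the record it replaces.
    survivor.replaces.push(duplicate.id.clone());
}
```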
### Batch deduplication
```bash
curl -X POST http://localhost:8080/api/workers/deduplicate \
-H 'content-type: application/json' \
-d '{
"threshold": 0.70,
"auto_merge_threshold": 0.95,
"max_candidates": 50
}'
```
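One plausible reading of the two thresholds in this request is a three-way triage: pairs at or above `auto_merge_threshold` merge automatically, pairs between the two thresholds go to human review, and the rest are ignored. The sketch below encodes that reading; it is an assumption, and `Action` and `triage` are hypothetical names. See [`spec.md`](spec.md) for the canonical behaviour.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Ignore,    // below `threshold`: not considered a duplicate
    Review,    // between the thresholds: queue for a human decision
    AutoMerge, // at or above `auto_merge_threshold`
}

/// Classify one candidate pair by its match score.
pub fn triage(score: f64, threshold: f64, auto_merge_threshold: f64) -> Action {
    if score >= auto_merge_threshold {
        Action::AutoMerge
    } else if score >= threshold {
        Action::Review
    } else {
        Action::Ignore
    }
}
```

Under this reading, the `Review` outcome would correspond to a `Pending` `ReviewQueueItem` and `AutoMerge` to an `AutoMerged` one.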
### GDPR Article 15 export
```bash
# Replace {id} with the worker's UUID.
curl "http://localhost:8080/api/workers/{id}/export"
```
### Masked worker view
```bash
# Replace {id} with the worker's UUID.
curl "http://localhost:8080/api/workers/{id}/masked"
```
The masked view hides workforce-specific sensitive fields (SSN, tax ID,
DEA, home address) by default.
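As an illustration of per-field masking, the sketch below keeps the last few characters of a value and stars out the letters and digits before them. The function name and the "keep the last N" policy are assumptions for illustration; the crate's `mask_worker` may apply different rules per field.

```rust
/// Mask every letter and digit except the last `keep` characters.
/// Separators such as '-' are left in place so the shape stays readable.
/// Values no longer than `keep` are masked entirely to avoid leaking them.
pub fn mask_keep_last(value: &str, keep: usize) -> String {
    let chars: Vec<char> = value.chars().collect();
    let visible_from = if chars.len() > keep {
        chars.len() - keep
    } else {
        chars.len()
    };
    chars
        .iter()
        .enumerate()
        .map(|(i, c)| {
            if i < visible_from && c.is_alphanumeric() {
                '*'
            } else {
                *c
            }
        })
        .collect()
}
```

For example, `mask_keep_last("123-45-6789", 4)` yields `***-**-6789`.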
### FHIR R5 Practitioner
```bash
# Create
curl -X POST http://localhost:8080/fhir/Practitioner \
-H 'content-type: application/fhir+json' \
-d '{
"resourceType": "Practitioner",
"identifier": [{ "system": "http://hl7.org/fhir/sid/us-npi", "value": "1234567893" }],
"name": [{ "family": "Patel", "given": ["Nisha"], "prefix": ["Dr."] }],
"gender": "female",
"birthDate": "1985-07-04"
}'
# Read (replace {id} with the resource id)
curl -H 'accept: application/fhir+json' http://localhost:8080/fhir/Practitioner/{id}
# Search: see AGENTS/restful.md for the supported parameters
```
## Library API examples
### Match two workers
```rust
use worker_service::matching::{ProbabilisticMatcher, WorkerMatcher};
use worker_service::models::*;
let a = Worker::new(HumanName::new("Patel", ["Nisha"]), Gender::Female);
let b = Worker::new(HumanName::new("Patell", ["Nisha"]), Gender::Female);
let matcher = ProbabilisticMatcher::with_defaults();
let result = matcher.match_workers(&a, &b);
println!("score={:.3} quality={:?}", result.score, result.quality);
for (k, v) in &result.breakdown {
println!(" {k}: {v:.3}");
}
```
### Validate and normalise
```rust
use worker_service::validation::{validate_worker, normalize_phone, standardize_address};
// `worker` is a `Worker`, built as in the matching example above.
let errs = validate_worker(&worker);
assert!(errs.is_empty(), "validation failed: {errs:?}");
let phone = normalize_phone("(555) 010-9999", "US"); // → "+15550109999"
```
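The phone example can be reproduced with a small standalone sketch of E.164-like normalisation. This is not the crate's `normalize_phone`: the name `normalize_phone_sketch`, the `Option` return type, and the tiny region table (US and CA only) are assumptions made to keep the example short.

```rust
/// Strip formatting and prefix a country calling code.
/// Returns `None` for empty input or a region this sketch does not know.
pub fn normalize_phone_sketch(raw: &str, region: &str) -> Option<String> {
    let digits: String = raw.chars().filter(|c| c.is_ascii_digit()).collect();
    if digits.is_empty() {
        return None;
    }
    // Input already in international form: keep its own country code.
    if raw.trim_start().starts_with('+') {
        return Some(format!("+{digits}"));
    }
    // Illustrative region table; a real implementation would cover far more.
    let country_code = match region {
        "US" | "CA" => "1",
        _ => return None,
    };
    // An 11-digit national number starting with 1 already includes the code.
    if digits.len() == 11 && digits.starts_with(country_code) {
        return Some(format!("+{digits}"));
    }
    Some(format!("+{country_code}{digits}"))
}
```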
### Privacy mask + GDPR export
```rust
use worker_service::privacy::{mask_worker, export_worker_data, has_active_consent};
use worker_service::models::consent::ConsentType;
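// `worker` is a `Worker`; `worker_consents` is assumed to hold that
// worker's `Consent` records.
let masked = mask_worker(&worker);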
let export = export_worker_data(&worker);
let ok = has_active_consent(&worker_consents, ConsentType::DataSharing);
```
## Configuration
| Variable | Purpose | Default |
|---|---|---|
| `DATABASE_URL` | PostgreSQL connection string | _required_ |
| `DATABASE_MIN_CONNECTIONS` / `DATABASE_MAX_CONNECTIONS` | Pool sizes | `2` / `10` |
| `SERVER_HOST` | REST bind address | `0.0.0.0` |
| `SERVER_PORT` | REST port | `8080` |
| `PORT` | Web UI port (`cargo run --bin web`) | `5150` |
| `SEARCH_INDEX_PATH` | Tantivy index directory | `./search_index` |
| `MATCHING_THRESHOLD` | Default match cutoff | `0.7` |
| `OTLP_ENDPOINT` | OpenTelemetry collector | `http://localhost:4317` |
| `OTLP_SERVICE_NAME` | OTel `service.name` | `worker-service` |
| `RUST_LOG` | `tracing-subscriber` filter | `info,worker_service=info` |
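
A minimal sketch of how a few of these variables and their defaults might be read. The `Settings` struct and both function names are hypothetical and do not describe the crate's actual `Config`; falling back silently on unparsable values is also an assumption, and a real loader may prefer to fail fast.

```rust
use std::env;

/// Hypothetical subset of the service configuration.
#[derive(Debug, Clone, PartialEq)]
pub struct Settings {
    pub server_host: String,
    pub server_port: u16,
    pub matching_threshold: f64,
    pub search_index_path: String,
}

/// Parse an optional raw value, falling back to `default` when it is
/// missing or does not parse.
fn parse_or<T: std::str::FromStr>(raw: Option<String>, default: T) -> T {
    raw.and_then(|v| v.parse().ok()).unwrap_or(default)
}

/// Build settings from any key lookup, which keeps the logic testable
/// without touching the process environment.
pub fn load_settings(get: impl Fn(&str) -> Option<String>) -> Settings {
    Settings {
        server_host: get("SERVER_HOST").unwrap_or_else(|| "0.0.0.0".to_string()),
        server_port: parse_or(get("SERVER_PORT"), 8080),
        matching_threshold: parse_or(get("MATCHING_THRESHOLD"), 0.7),
        search_index_path: get("SEARCH_INDEX_PATH")
            .unwrap_or_else(|| "./search_index".to_string()),
    }
}

/// Read the same settings from the real environment.
pub fn settings_from_env() -> Settings {
    load_settings(|key| env::var(key).ok())
}
```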
## Project layout
```
src/
├── lib.rs # Library root
├── api/ # REST, FHIR R5 Practitioner, gRPC API layers
├── models/ # Worker, HumanName, Identifier, Document, EmergencyContact, …
├── matching/ # algorithms (name, DOB, gender, address, identifier, tax-ID, document, phonetic)
├── search/ # Tantivy index + query
├── db/ # SeaORM models + repositories + audit
├── streaming/ # Event publishing (InMemory + Fluvio stub)
├── validation/ # Validation + normalisation
├── privacy/ # Masking + GDPR export + consent
├── config/ # Env loading + Config struct
├── observability/ # OpenTelemetry setup
├── web/ # Loco app + Tera views + Axum web router
├── bin/web.rs # cargo run --bin web
└── error.rs
assets/views/ # Tera templates (HTMX + Alpine + Lily)
assets/static/ # lily.css, htmx.min.js, alpine.min.js
config/ # development.yaml, test.yaml, production.yaml
migrations/ # SeaORM up.sql / down.sql pairs
tests/ # Integration tests
benches/ # Criterion benchmarks
AGENTS/ # Reference documentation
```
## Key types
| Type | Module | Purpose |
|---|---|---|
| `Worker` | `models::worker` | Core worker identity record |
| `HumanName` | `models::worker` | Structured name |
| `Gender` | `models::mod` | Male / Female / Other / Unknown |
| `Identifier` | `models::identifier` | External IDs (MRN, SSN, DL, NPI, PPN, TAX, Other) |
| `IdentityDocument` | `models::document` | Credentials / licences / passports |
| `EmergencyContact` | `models::emergency_contact` | Name + relationship + telecom + address |
| `Address` / `ContactPoint` | `models::mod` | Shared shapes |
| `Consent` | `models::consent` | GDPR consent record |
| `MergeRequest` / `MergeResponse` / `MergeRecord` | `models::merge` | Merge contract + persisted record |
| `ReviewQueueItem` | `models::review_queue` | Pending / Confirmed / Rejected / AutoMerged |
| `MatchResult` / `MatchScoreBreakdown` | `matching::mod` | Score + per-component detail |
## Key functions
| Function | Module | Purpose |
|---|---|---|
| `match_workers` | `matching::mod` | Match two workers with weighted scoring |
| `find_matches` | `matching::mod` | Match a worker against a candidate list |
| `match_name` | `matching::algorithms` | Jaro-Winkler + Levenshtein name comparison |
| `match_dob` | `matching::algorithms` | Date proximity with tolerance |
| `match_address` | `matching::algorithms` | Weighted address comparison |
| `match_tax_id` | `matching::algorithms` | Exact tax-ID match (short-circuit) |
| `match_document` | `matching::algorithms` | Document type + number match |
| `soundex` | `matching::phonetic` | 4-char phonetic |
| `validate_worker` | `validation` | Required + format checks |
| `normalize_phone` | `validation` | E.164-like normalisation |
| `standardize_address` | `validation` | Title-case city, uppercase region, expand abbreviations |
| `mask_worker` | `privacy` | Per-field masking |
| `export_worker_data` | `privacy` | GDPR Article 15 export |
| `has_active_consent` | `privacy` | Consent check utility |
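
For readers unfamiliar with the 4-character phonetic code behind `soundex`, here is a standalone sketch of the classic American Soundex rules. It is not the crate's implementation (the name `soundex_sketch` is hypothetical), and whether `matching::phonetic` follows the same H/W rule is an assumption.

```rust
/// Soundex digit for a consonant; vowels, H, W, and Y have no code.
fn code(c: char) -> Option<char> {
    match c {
        'B' | 'F' | 'P' | 'V' => Some('1'),
        'C' | 'G' | 'J' | 'K' | 'Q' | 'S' | 'X' | 'Z' => Some('2'),
        'D' | 'T' => Some('3'),
        'L' => Some('4'),
        'M' | 'N' => Some('5'),
        'R' => Some('6'),
        _ => None,
    }
}

/// First letter plus three digits, zero-padded; empty for input with no letters.
pub fn soundex_sketch(name: &str) -> String {
    let mut letters = name
        .chars()
        .filter(|c| c.is_ascii_alphabetic())
        .map(|c| c.to_ascii_uppercase());
    let first = match letters.next() {
        Some(c) => c,
        None => return String::new(),
    };
    let mut out = String::from(first);
    let mut prev = code(first);
    for c in letters {
        if out.len() == 4 {
            break;
        }
        let cur = code(c);
        if let Some(digit) = cur {
            // Adjacent letters with the same code collapse into one digit.
            if cur != prev {
                out.push(digit);
            }
        }
        // H and W are transparent: they do not separate equal codes.
        if c != 'H' && c != 'W' {
            prev = cur;
        }
    }
    while out.len() < 4 {
        out.push('0');
    }
    out
}
```

Both `Patel` and `Patell` encode to `P340`, which is why a phonetic search for one can surface the other.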
## Status & roadmap
- **Status** — see [`spec.md §14`](spec.md#14-implementation-status).
- **Tasks** — see [`spec.md §13`](spec.md#13-tasks), including the
credential-expiry warning workflow (T-7) and role / assignment
history timeline (T-8).
- **Roadmap** — see [`spec.md §15`](spec.md#15-roadmap).
- **Open questions** — see [`spec.md §16`](spec.md#16-open-questions).
## Compliance
| Standard | Coverage |
|---|---|
| HIPAA | Audit log, soft delete, encryption-at-rest, access controls |
| GDPR Art. 15 | `/api/workers/{id}/export` |
| GDPR Art. 17 | Soft delete + consent revocation |
| HL7 FHIR R5 | Practitioner resource bidirectional conversion |
| ISO/IEC 27001 | Operational controls (deployment-side) |
## License
Dual-licensed: MIT OR Apache-2.0.