api_claude 0.4.0

Claude API for accessing Anthropic's large language models (LLMs).
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
# spec

- **Name:** api_claude
- **Version:** 0.8
- **Date:** 2025-11-07
- **Status:** Production Ready ✅
- **Test Coverage:** 435/435 tests passing (100%)
- **System Specification:** [../../spec.md]../../spec.md

### Project Overview

The `api_claude` crate provides a comprehensive, type-safe Rust client library for interacting with Anthropic's Claude API services. This specification defines the architecture, requirements, and standards for the implementation.

**Architecture Decision**: This API crate is designed as a **stateless HTTP client** with no persistence requirements. All operations are direct HTTP calls to the Claude API without local data storage, caching, or state management beyond request/response handling.

**Governing Principle**: **"Thin Client, Rich API"** - Expose all server-side functionality transparently while maintaining zero client-side intelligence or **automatic** behaviors. **Key Distinction**: The principle prohibits **automatic/implicit** behaviors but explicitly **allows and encourages** **explicit/configurable** enterprise reliability features.

**Note:** This specification must be implemented in accordance with the ecosystem-wide requirements defined in the [System Specification](../../spec.md).

### Vocabulary

- **API Client**: The main library interface that coordinates all interactions with Anthropic Claude services
- **Environment**: Configuration object that encapsulates API credentials, base URLs, and connection parameters
- **Secret Management**: Secure handling of API keys using `secrecy` crate and workspace integration
- **Streaming**: Real-time delivery of generated content via Server-Sent Events
- **Tool Calling**: Function calling capabilities for integrating external tools and APIs
- **Vision Support**: Image understanding and multimodal content processing capabilities
- **Rate Limiting**: Request throttling mechanisms to manage API usage and prevent abuse
- **Circuit Breaker**: Fault tolerance pattern that prevents cascading failures by monitoring service health
- **Retry Logic**: Automatic retry mechanisms with exponential backoff for transient failures

### Scope and Objectives

### 1.1 Primary Objectives
- Provide complete coverage of Anthropic Claude API endpoints
- Ensure type safety and compile-time error detection
- Support both synchronous and asynchronous operations
- Implement robust error handling and retry mechanisms
- Maintain security best practices for credential management
- Support Server-Sent Events streaming for real-time applications

### 1.2 API Coverage Requirements
The client must support all major Anthropic Claude API endpoints:

### Core Endpoints
- **Messages API**: Create and manage conversational interactions
- **Count Tokens API**: Count message tokens before API calls for cost estimation
- **Models API**: Model listing and information retrieval
- **Content Generation**: Text generation with various parameters
- **Tool Calling**: Function calling and tool integration
- **Vision**: Image understanding capabilities
- **Streaming**: Server-sent events for incremental responses

### Advanced Features
- **System Prompts**: Separate system message handling
- **Message Roles**: User, assistant, and system role management
- **Safety Settings**: Content filtering and safety controls
- **Context Windows**: Support for large context lengths (200k+ tokens)

### 2. Architecture Design

### 2.1 Core Components

### Client Layer
```rust
pub struct Client {
    secret : Secret,
    environment : Environment,
}
```

### Environment Management
```rust
pub struct Environment {
    pub base_url : String,
    pub api_version : String,
    pub timeout : Duration,
}
```

### Secret Management
```rust
pub struct Secret(SecretString);
```
- Integration with `secrecy` crate for credential protection
- Multiple loading mechanisms: environment variables, files, workspace secrets
- Format validation for API keys (sk-ant- prefix)
- Audit trail for security monitoring

### 2.2 Module Organization

### Private Namespace Pattern
All modules must follow the wTools ecosystem `mod_interface` pattern:
```rust
mod private {}

crate::mod_interface! {
  layer client;
  layer environment;
  layer error;
  layer secret;  
  layer messages;
}
```

### Component Structure
- **Client**: `src/client.rs` - HTTP client wrapper and API operations
- **Environment**: `src/environment.rs` - Configuration management
- **Error Handling**: `src/error.rs` - Unified error types
- **Secret Management**: `src/secret.rs` - API key handling
- **Messages**: `src/messages.rs` - Message format definitions

### 2.3 Error Handling Strategy

### Error Types
```rust
#[derive(Debug, Clone, PartialEq)]
pub enum AnthropicError {
  Api(ApiError),           // API-returned errors
  Http(String),            // HTTP transport errors  
  Network(String),         // Network connectivity issues
  Timeout(String),         // Request timeout errors
  InvalidArgument(String), // Client-side validation failures
  Authentication(String),  // API key issues
  RateLimit(String),       // Rate limiting errors
}
```

### Integration with error_tools
- Leverage `error_tools` crate for ecosystem consistency
- Automatic derive implementations where possible
- Structured error propagation

### 3. Quality Standards

### 3.1 Testing Requirements

### Unit Testing
- Minimum 80% code coverage
- Test all public APIs
- Mock external dependencies for isolated component testing
- Focus on edge cases and error conditions
- Must not test API integration behavior

### ⚠️ CRITICAL POLICY: NO MOCKING ALLOWED IN INTEGRATION TESTS ⚠️

**ZERO TOLERANCE FOR MOCK TESTING IN INTEGRATION TESTS**: This crate maintains a strict **NO MOCKING ALLOWED** policy for all integration tests. This policy applies only to integration tests and is non-negotiable.

### Integration Testing: MANDATORY STRICT FAILURE POLICY
- Feature-gated integration tests: `#![cfg(feature = "integration")]`
- **🚫 ABSOLUTE PROHIBITION**: No fake API keys, mock servers, or simulated responses
- **🚫 ZERO MOCK TOLERANCE**: Any mock usage in integration tests is considered a critical policy violation
- **✅ REAL API ONLY**: Integration tests MUST use real Anthropic API endpoints exclusively
- **💥 IMMEDIATE FAILURE REQUIREMENT**: Tests MUST fail immediately and loudly if:
  - API secrets are not available (no graceful fallbacks or silent skips)
  - Network connectivity issues occur
  - API authentication fails
  - Any API endpoint returns errors
  - Mock dependencies are detected
- **📋 DOCUMENTATION MANDATE**: Every integration test file must contain mandatory policy documentation
- **🔐 CREDENTIAL REQUIREMENT**: All integration tests must use `Client::from_workspace()` or equivalent real credential loading

### Policy Enforcement
- **Code Review Requirement**: All PRs must be reviewed for integration test mock usage violations
- **Automated Scanning**: CI/CD pipeline must scan for prohibited patterns in integration tests:
  - `sk-ant-test-` fake API key patterns in integration tests
  - Mock server implementations in integration tests
  - Hardcoded JSON responses mimicking API format in integration tests
  - Simulated error responses in integration tests
- **Violation Response**: Any discovered mock usage in integration tests requires immediate remediation

### Test Organization
```
tests/
├── tests.rs         # Component-level tests
└── inc/
    ├── basic_test.rs
    └── ...
```

### 3.2 Code Quality Standards

### Documentation Requirements
- All public APIs must have rustdoc comments
- Include usage examples in documentation
- Maintain up-to-date README with examples
- API reference documentation generation

### Linting and Formatting
- Custom clippy configuration for project-specific rules
- Strict adherence to wTools codestyle patterns
- Zero-warning builds required

### 4. Enterprise Reliability Features

### 4.1 Overview

The Claude API client supports optional enterprise reliability features that enhance production robustness. These features align with the **"Thin Client, Rich API"** governing principle by requiring **explicit configuration** and **transparent operation**.

**Key Requirements**:
- **Feature Gating**: All enterprise features behind cargo features (`retry`, `circuit_breaker`, `rate_limiting`, `failover`, `health_checks`)
- **Explicit Configuration**: Developer must explicitly enable and configure each feature
- **Transparent Operation**: Method names clearly indicate additional behaviors
- **Zero Overhead**: No runtime cost when features are disabled

### 4.2 Configurable Retry Logic

**Cargo Feature**: `retry`

**Requirements**:
- Exponential backoff with jitter for connection failures
- Configurable maximum retry attempts and timeouts
- Error classification (retryable vs non-retryable errors)
- Transparent method naming: `execute_with_retries()`

**Configuration Pattern**:
```rust
let client = Client::builder()
  .api_key("sk-ant-...")
  .max_retries(3)                    // Explicitly configured
  .enable_retry_logic(true)          // Explicitly enabled
  .retry_backoff_multiplier(2.0)     // Optional: backoff configuration
  .build()?;

// Method name clearly indicates retry behavior
client.execute_with_retries(request).await?;
```

### 4.3 Circuit Breaker Pattern

**Cargo Feature**: `circuit_breaker`

**Requirements**:
- Configurable failure thresholds and timeout periods
- State transitions: Closed → Open → Half-Open → Closed
- Automatic recovery testing with configurable success thresholds
- Thread-safe implementation using `Arc<Mutex<>>` patterns

**Configuration Pattern**:
```rust
let client = Client::builder()
  .api_key("sk-ant-...")
  .circuit_breaker_failure_threshold(5)     // Explicitly configured
  .circuit_breaker_timeout(Duration::from_secs(60))
  .enable_circuit_breaker(true)             // Explicitly enabled
  .build()?;

// Method name clearly indicates circuit breaker behavior
client.execute_with_circuit_breaker(request).await?;
```

### 4.4 Rate Limiting

**Cargo Feature**: `rate_limiting`

**Requirements**:
- Token bucket or sliding window algorithms
- Configurable request rates and burst limits
- Request queuing and delay mechanisms
- Support for different rate limits per endpoint type

**Configuration Pattern**:
```rust
let client = Client::builder()
  .api_key("sk-ant-...")
  .rate_limit_requests_per_second(10.0)     // Explicitly configured
  .rate_limit_burst_size(20)                // Optional: burst configuration
  .enable_rate_limiting(true)               // Explicitly enabled
  .build()?;

// Method name clearly indicates rate limiting behavior
client.execute_with_rate_limiting(request).await?;
```

### 4.5 Failover Support

**Cargo Feature**: `failover`

**Requirements**:
- Multi-endpoint failover configuration
- Multiple failover strategies: Priority, RoundRobin, Random, Sticky
- Endpoint health tracking (Healthy, Degraded, Unhealthy, Unknown)
- Context tracking to prevent retry loops
- Thread-safe implementation

**Configuration Pattern**:
```rust
let failover_config = FailoverConfig::builder()
  .add_endpoint("https://api.anthropic.com", 1)  // Primary
  .add_endpoint("https://backup.anthropic.com", 2)  // Secondary
  .strategy(FailoverStrategy::Priority)
  .max_attempts(3)
  .build()?;

let manager = FailoverManager::new(failover_config);
let endpoint = manager.select_endpoint(context)?;
```

**Implemented**: ✅ Complete (Task 358, 4 strategies, 17 tests)

### 4.6 Health Checks

**Cargo Feature**: `health-checks`

**Requirements**:
- Stateless endpoint health monitoring (no background processes)
- Multiple strategies: Ping (HEAD request), LightweightApi (OPTIONS request)
- Response time-based status determination
- Configurable thresholds for healthy/degraded/unhealthy states
- Concurrent multi-endpoint checking capability
- Metrics aggregation (healthy/degraded/unhealthy percentages)

**Configuration Pattern**:
```rust
let config = HealthCheckConfig::builder()
  .strategy(HealthCheckStrategy::Ping)
  .timeout(Duration::from_secs(5))
  .healthy_threshold(Duration::from_millis(200))
  .degraded_threshold(Duration::from_millis(1000))
  .build()?;

let result = HealthChecker::check_endpoint(&endpoint_url, &config).await?;
```

**Implemented**: ✅ Complete (Task 359, 2 strategies, 14 tests)

### 4.7 Prompt Caching

**Implementation**: Integrated in core client (`src/client.rs`)

**Requirements**:
- CacheControl directive support for system prompts and messages
- Usage metadata tracking (cache_creation_input_tokens, cache_read_input_tokens)
- Cost savings calculation (~90% on cached tokens)
- Ephemeral cache control type
- Cache hit/miss scenario handling

**Configuration Pattern**:
```rust
let request = CreateMessageRequest::builder()
  .model("claude-sonnet-4-5-20250929")
  .max_tokens(1000)
  .system_with_cache(
    "You are an expert Rust developer...",
    CacheControl::ephemeral()
  )
  .messages(messages)
  .build();

// Check cache usage in response
if let Some(cache_read) = response.usage.cache_read_input_tokens {
  println!("Cache hit! Saved {} tokens", cache_read);
}
```

**Implemented**: ✅ Complete (Tasks 716-717, ~90% cost savings, 18 tests)

### 4.8 Enterprise Features Integration

**Unified Enterprise Execution**:
```rust
// Single method that applies all explicitly enabled enterprise features
client.execute_with_enterprise_features(request).await?;
```

**Feature Combinations**:
- Features can be enabled independently or in combination
- Each feature maintains its own configuration and state
- Thread-safe concurrent operation across all features
- Optional metrics integration for monitoring and observability

**Enterprise Reliability Status**: ✅ 100% Complete (5/5 features)
- Retry Logic ✅
- Circuit Breaker ✅
- Rate Limiting ✅
- Failover ✅
- Health Checks ✅

### 4.9 Implementation Standards

**Thread Safety**:
- All enterprise features must be thread-safe for concurrent use
- Use `Arc<Mutex<>>` patterns for shared state management
- Avoid blocking operations in async contexts

**Error Handling**:
- Enterprise features must integrate with existing `AnthropicError` types
- Clear error messages indicating which enterprise feature caused failures
- Proper error propagation without masking underlying API errors

**Testing Requirements**:
- Unit tests for each enterprise feature behind appropriate cargo features
- Integration tests validating feature combinations
- Performance tests ensuring zero overhead when features disabled

### 5. Security Requirements

### 5.1 Credential Management

### Secret Storage
- Never log or expose actual secret values
- Use `secrecy::SecretString` for in-memory protection
- Support multiple credential sources with priority order:
  1. **Workspace secrets**: `workspace_root/secret/-secrets.sh` (Primary location - see [Secret Directory Policy]../../secret_directory_policy.md)
  2. **Environment variables**: `ANTHROPIC_API_KEY` (Fallback)
  3. **Runtime**: programmatic setting (Direct)

**Critical**: The secret directory MUST be named `secret/` at workspace root (NO dot prefix). workspace_tools 0.6.0 uses `secret/` directly.

### Validation Standards
```rust
fn validate_api_key_format(secret: &str) -> Result< (), AnthropicError > {
  // Enforce Anthropic API key format requirements
  // - Must start with "sk-ant-"
  // - Length constraints: reasonable bounds
  // - Character set validation
}
```

### Audit Trail
- Track all secret exposure events
- Hash-based identification without credential leakage
- Global exposure counter for monitoring
- Call stack tracking for debugging

### 5.2 Network Security
- TLS/HTTPS enforcement for all communications
- Certificate validation
- Request signing and authentication
- Rate limiting and retry with exponential backoff

### 6. Performance Requirements

### 6.1 Async-First Design
- All I/O operations must be asynchronous
- Support for `tokio` runtime
- Efficient connection pooling
- Minimal allocation overhead

### 6.2 Streaming Support
- Server-sent events for incremental responses
- Backpressure handling
- Graceful connection management
- Event parsing for `content_block_delta` and `message_stop` events

### 6.3 Compilation Performance
- Build times under 60 seconds for full compilation
- Incremental compilation optimization
- Minimal dependency feature usage
- Parallel compilation support

### 7. Dependency Management

### 7.1 Core Dependencies
```toml
[dependencies]
# wTools ecosystem
mod_interface = { workspace = true }
error_tools = { workspace = true }
workspace_tools = { workspace = true, features = ["secret_management"] }

# HTTP and async
reqwest = { workspace = true, features = ["json"] }
tokio = { workspace = true, features = ["fs", "macros"] }
futures-core = { workspace = true }
futures-util = { workspace = true }

# Serialization
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
serde_with = { workspace = true }
```

### 7.2 Optional Features
- `integration`: Integration test support
- `streaming`: Server-sent events support ✅
- `tools`: Tool calling support (included in default features) ✅
- `vision`: Image understanding support ✅
- `authentication`: Advanced authentication functionality ✅
- `content-generation`: Advanced content generation functionality ✅
- `model-management`: Model management functionality ✅
- `error-handling`: Enhanced error handling functionality (included in default features) ✅
- `retry-logic`: Configurable retry mechanisms ✅
- `circuit-breaker`: Circuit breaker fault tolerance ✅
- `rate-limiting`: Token bucket rate limiting ✅
- `failover`: Multi-endpoint failover support ✅
- `health-checks`: Endpoint health monitoring ✅
- `request-caching`: Prompt caching support (integrated in core) ✅
- `count-tokens`: Token counting before API calls for cost estimation ✅
- `streaming-control`: Pause/resume/cancel control for streaming operations ✅
- `compression`: HTTP request/response compression for bandwidth optimization ✅
- `enterprise-quota`: Client-side usage tracking and quota management ✅
- `dynamic-config`: Runtime configuration hot-reloading with file watching ✅
- `sync-api`: Synchronous API wrapper ✅
- `curl-diagnostics`: CURL command generation ✅
- `general-diagnostics`: Comprehensive diagnostics ✅

### 8. Tool Calling Refactoring

### 8.1 Feature Gating and Modular Design
The tool calling functionality has been comprehensively refactored with proper feature gating:

- All tool-related types are conditionally compiled under the `tools` feature
- Feature is included in default features for seamless usage
- Clean separation between core message handling and tool-specific functionality
- Backward compatibility maintained through conditional compilation

### 8.2 Enhanced Validation
Comprehensive validation has been implemented for tool calling:

### Tool Definition Validation
- **Name validation**: Alphanumeric, underscore, hyphen only; max 64 characters
- **Description validation**: Non-empty, max 1024 characters  
- **Duplicate detection**: Prevents duplicate tool names in requests
- **Schema validation**: JSON schema format validation
- **Limits enforcement**: Maximum 64 tools per request

### Tool Choice Validation
- **Reference validation**: Ensures specific tool choices reference existing tools
- **Consistency checks**: Validates tool_choice is only used with tools array
- **Format validation**: Proper tool choice structure enforcement

### Request-Level Validation
```rust
// Comprehensive validation in CreateMessageRequest::validate()
if let Some(ref tools) = self.tools {
    // Check for empty tools array
    // Validate each tool definition
    // Check for duplicate tool names
    // Enforce tool limits
}
```

### 8.3 Improved Error Handling
Enhanced error messages provide actionable feedback:
- Clear validation failure descriptions
- Specific tool name references in error messages
- Structured error types for different validation failures
- Integration with existing error handling framework

### 8.4 Helper Methods and Builder Patterns
New convenience methods for tool definition creation:
```rust
// Simple tool with no parameters
let tool = ToolDefinition::simple("calculator", "Basic math operations");

// Tool with typed parameters  
let tool = ToolDefinition::with_properties(
    "weather", 
    "Get weather information",
    serde_json::json!({
        "location": { "type": "string", "description": "City name" },
        "units": { "type": "string", "enum": ["celsius", "fahrenheit"] }
    }),
    vec!["location".to_string()]
);
```

### 9. Vision Support Refactoring

### 9.1 Feature Gating and Modular Design
The vision support functionality has been comprehensively refactored with proper feature gating:

- All vision-related types are conditionally compiled under the `vision` feature
- Feature is optional and not included in default features for minimal compilation
- Clean separation between core message handling and vision-specific functionality
- Backward compatibility maintained through conditional compilation

### 9.2 Enhanced Type System
Comprehensive type system for vision content:

### ImageContent Structure
- **Type validation**: Always "image" type with compile-time safety
- **Source management**: Integrated ImageSource handling
- **Helper constructors**: Convenient methods for common image formats
- **Validation methods**: Runtime validation with actionable error messages

### ImageSource Validation
- **Format validation**: Only "base64" source type supported
- **Media type validation**: Supports JPEG, PNG, GIF, WebP formats
- **Content validation**: Non-empty data with basic base64 format checking
- **Size estimation**: Approximate byte size calculation from base64 length

### 9.3 Builder Pattern Integration
Vision content seamlessly integrates with message builders:

```rust
#[cfg(feature = "vision")]
let message = Message::user_with_image(
    "What's in this image?",
    ImageContent::jpeg("base64_image_data_here")
);

#[cfg(feature = "vision")]
let multi_image_message = Message::user_with_images(
    "Compare these images",
    vec![
        ImageContent::jpeg("first_image"),
        ImageContent::png("second_image")
    ]
);
```

### 9.4 Validation Framework
Comprehensive validation ensures data integrity:

### Content-Level Validation
```rust
// Automatic validation during content creation
let image = ImageContent::jpeg("valid_base64_data");
image.validate()?; // Returns Result< (), error_tools::Error >

// Helper validation methods
assert!(image.is_valid());
assert_eq!(image.media_type(), "image/jpeg");
```

### Source-Level Validation
```rust
// Direct source validation
let source = ImageSource::png("image_data");
source.validate()?;

// Format and size checks
assert!(source.is_valid_base64());
let estimated_bytes = source.estimated_size_bytes();
```

### 9.5 Message Integration
Vision content works seamlessly with existing message patterns:

### Multi-Modal Messages
- Text and image content in single messages
- Multiple images per message supported
- Proper serialization to Anthropic API format
- Type-safe content enumeration

### Content Type System
- `Content::Image` variant conditionally compiled with vision feature
- Helper methods like `is_image()` feature-gated appropriately
- Consistent API patterns across all content types

### 10. Streaming Support Refactoring

### 10.1 Enhanced Error Handling and Modularity
The streaming support functionality has been comprehensively refactored with improved error handling and modularity:

- **Conditional error handling**: Supports both enhanced `AnthropicError` and simplified `error_tools::Error`
- **Feature compatibility**: Proper integration with existing error-handling feature flag
- **Enhanced validation**: Comprehensive validation for all streaming types
- **Robust parsing**: Improved SSE parsing with better error messages

### 10.2 Comprehensive Type System
Enhanced streaming type system with full validation and helper methods:

### StreamMessage Structure
- **Helper constructors**: Convenient creation methods with validation
- **Status checking**: Methods to check completion state and content presence
- **Validation framework**: Complete validation with actionable error messages
- **Content management**: Methods to manage and inspect content blocks

### StreamContentBlock Enhancements
- **Feature gating**: Tool use blocks properly gated under `tools` feature
- **Type safety**: Helper methods for content type checking and access
- **Validation**: Content-specific validation for text and tool use blocks
- **Builder patterns**: Convenient constructors for different content types

### StreamDelta Improvements
- **Type-safe construction**: Helper methods for different delta types
- **Content access**: Safe methods to access delta content
- **Validation**: Delta-specific validation with proper error handling
- **Feature compatibility**: Tool-related deltas properly feature-gated

### 10.3 Enhanced SSE Parsing
Robust Server-Sent Events parsing with comprehensive error handling:

```rust
// Parse SSE events with validation
let events = parse_sse_events(&sse_data)?;

// Enhanced event construction with validation
let event = StreamEvent::content_block_delta(0, StreamDelta::new_text("Hello"));
event.validate()?;
```

### Parsing Improvements
- **Input validation**: Parameter validation before parsing
- **Enhanced error messages**: Detailed error information with context
- **Content validation**: Automatic validation of parsed content
- **Unknown event handling**: Helpful error messages for unsupported events

### Error Resilience
- **Conditional error types**: Support for both error handling modes
- **Validation integration**: Automatic validation during parsing
- **Context preservation**: Error messages include relevant context
- **Recovery strategies**: Graceful handling of malformed events

### 10.4 Builder Pattern Integration
Streaming types seamlessly integrate with builder patterns:

```rust
// Stream message construction
let message = StreamMessage::new(
    "msg_123",
    "message", 
    "user",
    "claude-sonnet-4-5-20250929",
    usage_stats
);

// Content block creation with validation
let content_block = StreamContentBlock::new_text("Hello, world!");
content_block.validate()?;

// Delta construction
let delta = StreamDelta::new_text("incremental text");
```

### 10.5 Feature Integration
Streaming support properly integrates with other features:

### Tool Integration
- Tool use content blocks feature-gated under `tools` feature
- Tool-related deltas conditionally compiled
- Consistent patterns with message tool handling

### Error Handling Integration
- Conditional compilation based on `error-handling` feature
- Seamless fallback to `error_tools::Error` when needed
- Consistent error patterns across the crate

### Validation Framework
- Comprehensive validation for all streaming types
- Integration with existing validation patterns
- Actionable error messages for debugging

### 11. API Design Patterns

### 11.1 Client Creation
```rust
// Direct secret creation
let client = Client::new(secret)?;

// Load from environment variable (ANTHROPIC_API_KEY)
let client = Client::from_env()?;

// Load from workspace secrets (workspace_root/secret/-secrets.sh with fallback to env)
// See Secret Directory Policy: ../../secret_directory_policy.md
let client = Client::from_workspace()?;
```

### 11.2 Message Format
Anthropic-specific message structure:
```rust
{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 1024,
  "system": "System prompt here",
  "messages": [
    {
      "role": "user",
      "content": "User message content"
    }
  ]
}
```

### 11.3 Streaming Response Handling
Server-Sent Events format:
```rust
data: {"type": "content_block_delta", "delta": {"text": "Hello"}}
data: {"type": "message_stop", "stop_reason": "end_turn"}
data: [DONE]
```

### 11.4 Streaming Control (streaming-control feature)

The `streaming-control` feature provides pause/resume/cancel capabilities for streaming responses.

**Architecture:**
- `StreamState`: Enum tracking stream state (Running, Paused, Cancelled)
- `StreamControl`: Thread-safe handle for controlling stream operations
- `ControlledStream<S>`: Wrapper implementing `Stream` trait with buffering

**Key Features:**
- **Client-side buffering**: Events are buffered when paused using `VecDeque`
- **Configurable buffer limits**: Oldest events dropped when buffer is full (FIFO)
- **Immediate error delivery**: Errors bypass buffering and are returned immediately
- **Thread-safe**: Uses `Arc<Mutex<>>` for state sharing across threads
- **Zero-cost when disabled**: Feature-gated to avoid overhead

**Usage Pattern:**
```rust
let (controlled_stream, control) = ControlledStream::new(inner_stream, buffer_limit);

// In one task: control the stream
control.pause()?;
// ... later ...
control.resume()?;
// ... or ...
control.cancel();

// In another task: consume the stream
while let Some(event) = controlled_stream.next().await {
    // Process events
}
```

**Implementation Details:**
- Pause: Buffers incoming events, returns `Poll::Pending` to consumer
- Resume: Delivers buffered events before polling inner stream
- Cancel: Clears buffer and stops all event delivery
- State checks: `is_paused()`, `is_cancelled()`, `is_running()`, `buffer_size()`

**Dependencies:**
- Requires `streaming` feature (depends on `StreamEvent` type)
- Located in `src/streaming_control.rs` (314 lines)

### 11.5 HTTP Compression (compression feature)

The `compression` feature provides gzip compression for request/response bodies to reduce bandwidth usage and improve performance for large prompts.

**Architecture:**
- `CompressionConfig`: Configuration for compression level and size thresholds
- `compress()`: Compress request bodies using gzip
- `decompress()`: Decompress gzip-compressed responses
- `is_gzip()`: Detect gzip-compressed data
- `add_compression_headers()`: Add appropriate HTTP headers

**Example:**
```rust
use api_claude::{ compress, decompress, CompressionConfig };

// Configure compression
let config = CompressionConfig::new()
    .with_level(6)          // 0-9, where 6 is balanced
    .with_min_size(1024);   // Only compress if >= 1KB

// Compress request body
let data = "Large prompt text...".repeat(1000);
let compressed = compress(data.as_bytes(), &config)?;

// Decompress response (if server returns compressed)
let decompressed = decompress(&compressed_response)?;
```

**Key Features:**
- ~60-80% size reduction for text content
- Configurable compression levels (0=none, 6=default, 9=best)
- Minimum size threshold to avoid compressing small data
- Only uses compression if it actually reduces size
- Automatic gzip magic number detection

**Dependencies:**
- Uses `flate2` crate for gzip compression/decompression
- Located in `src/compression.rs` (200+ lines)

### 11.6 Enterprise Quota Management (enterprise-quota feature)

The `enterprise-quota` feature provides client-side tracking and management of API usage and costs for production deployments.

**Architecture:**
- `UsageMetrics`: Tracks request counts, tokens, and costs
- `QuotaConfig`: Defines quota limits (daily/monthly)
- `CostCalculator`: Calculates costs based on model pricing
- `QuotaManager`: Enforces quotas and tracks usage
- `QuotaExceededError`: Returned when quotas are exceeded

**Example:**
```rust
use api_claude::{ QuotaManager, QuotaConfig };

// Configure quotas
let config = QuotaConfig::new()
    .with_daily_requests(1000)
    .with_daily_tokens(1_000_000)
    .with_daily_cost(50.0)          // $50/day limit
    .with_monthly_cost(1500.0);     // $1500/month limit

let manager = QuotaManager::new(config);

// Record usage before making API call
match manager.record_usage("claude-3-5-sonnet-latest", 1_000, 500) {
    Ok(()) => {
        // Quota OK, proceed with API call
    }
    Err(e) => {
        // Quota exceeded
        eprintln!("Quota exceeded: {}", e.message);
    }
}

// Export metrics
let json = manager.export_json()?;
println!("{}", json);
```

**Key Features:**
- Real-time quota enforcement before API calls
- Automatic cost calculation using current model pricing
- Per-model usage tracking
- Daily and monthly quota limits
- JSON export for monitoring systems
- Thread-safe for concurrent access

**Cost Calculation:**
- Claude 3.5 Sonnet: $3/M input, $15/M output
- Claude 3 Opus: $15/M input, $75/M output
- Claude 3 Haiku: $0.25/M input, $1.25/M output

**Dependencies:**
- Uses `parking_lot` for thread-safe RwLock
- Uses `chrono` for timestamps
- Located in `src/enterprise_quota.rs` (495 lines)

### 11.7 Dynamic Configuration (dynamic-config feature)

The `dynamic-config` feature provides runtime configuration changes with file system monitoring for zero-downtime updates.

**Architecture:**
- `RuntimeConfig`: Hot-reloadable configuration struct
- `ConfigWatcher`: File system watcher for automatic reloading
- Thread-safe updates via `Arc<RwLock<>>`
- JSON-based configuration format
- Configuration validation before applying changes

**Example:**
```rust
use api_claude::{ ConfigWatcher, RuntimeConfig };
use std::path::PathBuf;

// Create config file watcher
let config_path = PathBuf::from("/etc/claude/config.json");
let initial_config = RuntimeConfig::new();
let watcher = ConfigWatcher::new(config_path, initial_config)?;

// Get current config (thread-safe)
let config = watcher.config();
println!("Base URL: {}", config.base_url);
println!("Timeout: {:?}", config.timeout());

// Config automatically reloads when file changes
// Or manually reload
watcher.reload()?;

// Programmatic update
let new_config = RuntimeConfig {
    base_url: "https://custom.api.com".to_string(),
    timeout_ms: 120_000,
    max_retries: 5,
    ..RuntimeConfig::new()
};
watcher.update(new_config)?;
```

**Configuration Options:**
- `base_url`: API endpoint URL (default: "https://api.anthropic.com")
- `api_version`: API version string (default: "2023-06-01")
- `timeout_ms`: Request timeout in milliseconds (default: 300000)
- `enable_retry`: Enable retry logic (default: true)
- `max_retries`: Maximum retry attempts (default: 3, max: 10)
- `enable_circuit_breaker`: Enable circuit breaker (default: true)
- `circuit_breaker_threshold`: Failure threshold (default: 5)
- `enable_rate_limiting`: Enable rate limiting (default: false)
- `rate_limit_rps`: Requests per second limit (default: 10)

**Key Features:**
- Zero-downtime configuration updates
- Automatic file watching and reloading
- Configuration validation before applying
- A/B testing different configurations
- Thread-safe concurrent access
- JSON configuration format

**Dependencies:**
- Uses `notify` crate for file system watching
- Uses `parking_lot` for thread-safe RwLock
- Located in `src/dynamic_config.rs` (300+ lines)

### 12. Release and Versioning

### 12.1 Semantic Versioning
- **Major**: Breaking API changes, Anthropic API version updates
- **Minor**: New features, endpoint additions, backward-compatible changes
- **Patch**: Bug fixes, security updates, documentation improvements

### 12.2 Release Process
1. Automated testing pipeline (unit + integration)
2. Documentation generation and validation
3. Changelog generation
4. Version bumping and tagging
5. Crate publication to crates.io

### 13. Development Workflow

### 13.1 Task Management
Structured task tracking following wTools ecosystem patterns:
- Task prioritization by advisability score
- Clear acceptance criteria
- Status tracking with proper transitions
- Outcome documentation for completed tasks

### 13.2 Code Review Requirements
- All changes must pass automated testing
- Code review for architecture and security aspects
- Documentation updates for public API changes
- Performance impact assessment

### 14. Migration and Compatibility

### 14.1 Anthropic API Version Support
- Support for latest Anthropic Claude API version
- Graceful handling of API deprecations
- Version-specific feature flags where necessary

### 14.2 Backward Compatibility
- Semantic versioning adherence
- Deprecation warnings before breaking changes
- Migration guides for major version updates
- Legacy API support where reasonable

### Implementation Status

As of version 0.7 (2025-10-07):

**✅ Completed Features (Production Ready)**:
- ✅ Core message API (complete HTTP implementation, comprehensive message handling)
- ✅ Secret management (workspace integration, environment variables, validation)
- ✅ Tool calling (20 tests, complete implementation with validation)
- ✅ Vision support (8 tests, multimodal content processing)
- ✅ Streaming responses (5 tests, SSE with tool calling and vision)
- ✅ Prompt caching (18 tests, ~90% cost savings, tasks 716-717)
- ✅ Dynamic configuration (20+ tests, workspace + environment)
- ✅ Enterprise reliability (100% complete):
  - ✅ Retry logic (exponential backoff with jitter)
  - ✅ Circuit breaker (fault tolerance pattern)
  - ✅ Rate limiting (token bucket algorithm)
  - ✅ Failover (4 strategies, 17 tests, task 358)
  - ✅ Health checks (2 strategies, 14 tests, task 359)
- ✅ Sync API (synchronous wrapper for blocking use cases)
- ✅ Diagnostics (CURL generation, comprehensive error diagnostics)

**Test Coverage**: 435/435 tests passing (100%)
- Unit tests
- Integration tests (real API)
- Serialization validation
- Error handling
- Performance benchmarks

**API Feature Coverage**: 62% (26/42 API functionality features implemented)
**Cargo Features**: 21 feature flags defined (see Cargo.toml `[features]` section)
**Production Status**: ✅ Production Ready

**✅ Recently Completed (2025-10-11)**:
- Batch Messages API (tasks 706-707): COMPLETE
  - ✅ Complete batch type system with validation
  - ✅ HTTP client methods (create, retrieve, list, cancel)
  - ✅ Comprehensive test suite (9 tests, 5 integration tests)
  - ✅ All tests passing with zero regressions
- Enhanced Retry Logic (tasks 700-701): COMPLETE
  - ✅ Anthropic rate limit header integration (6 headers)
  - ✅ Server-provided retry-after duration support
  - ✅ Rate limit info parsing (requests/tokens remaining, limits, reset times)
  - ✅ Usage percentage calculations for intelligent retry decisions
  - ✅ Comprehensive test suite (16 tests: header parsing, retry logic, integration)
  - ✅ Error handling improvements (8 new error classification tests)

**🔄 Planned Future Enhancements**:
- Enhanced Circuit Breaker (tasks 702-703): Advanced failure tracking (moved to backlog - conflicts with explicit patterns)
- Enhanced Rate Limiting (tasks 704-705): API header-based rate limit tracking (moved to backlog - conflicts with explicit patterns)
- Enterprise Method Integration (tasks 710-711): Unified enterprise feature methods
- Streaming Control (task 361): Enhanced SSE control operations
- Model Tuning (task 364): Fine-tuning capabilities (managed service only)
- Model Deployment (task 365): Model hosting features (managed service only)

**Known API Limitations**:
- ❌ Embeddings: Not offered by Anthropic (architecture ready for future)
- ❌ WebSocket: Anthropic API uses SSE only (tasks 360 moved to backlog)
- ❌ Audio: Not available in Anthropic API
- ❌ Model Tuning/Deployment: Managed service only (tasks 364-365 moved to backlog)

This specification serves as the authoritative source for all development decisions and implementation priorities for the `api_claude` crate.