ai 0.4.0

Simple to use LLM library for Rust with streaming, tool calling, OAuth helpers, and a lightweight agent loop
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
# ai

Simple to use LLM library for Rust with streaming, tool calling, OAuth helpers,
and a lightweight agent loop, inspired by [`pi`](https://github.com/earendil-works/pi).

## Table of Contents

- [Supported Providers]#supported-providers
- [Installation]#installation
- [Quick Start]#quick-start
- [Tools]#tools
  - [Defining Tools]#defining-tools
  - [Handling Tool Calls]#handling-tool-calls
  - [Streaming Tool Calls with Partial JSON]#streaming-tool-calls-with-partial-json
  - [Validating Tool Arguments]#validating-tool-arguments
  - [Complete Event Reference]#complete-event-reference
- [Image Input]#image-input
- [Image Generation]#image-generation
  - [Basic Image Generation]#basic-image-generation
  - [Notes and Limitations]#notes-and-limitations
- [Thinking/Reasoning]#thinkingreasoning
  - [Unified Interface]#unified-interface-streamsimplecompletesimple
  - [Provider-Specific Options]#provider-specific-options-streamcomplete
  - [Streaming Thinking Content]#streaming-thinking-content
- [Stop Reasons]#stop-reasons
- [Error Handling]#error-handling
  - [Aborting Requests]#aborting-requests
  - [Continuing After Abort]#continuing-after-abort
  - [Debugging Provider Payloads]#debugging-provider-payloads
- [APIs, Models, and Providers]#apis-models-and-providers
  - [Faux provider for tests]#faux-provider-for-tests
  - [Providers and Models]#providers-and-models
  - [Querying Providers and Models]#querying-providers-and-models
  - [Custom Models]#custom-models
  - [OpenAI Compatibility Settings]#openai-compatibility-settings
  - [Thread Safety]#thread-safety
  - [Type Safety]#type-safety
- [Cross-Provider Handoffs]#cross-provider-handoffs
  - [How It Works]#how-it-works
  - [Example: Multi-Provider Conversation]#example-multi-provider-conversation
  - [Provider Compatibility]#provider-compatibility
- [Context Serialization]#context-serialization
- [Browser Usage]#browser-usage
  - [Browser Compatibility Notes]#browser-compatibility-notes
  - [Environment Variables]#environment-variables
  - [Checking Environment Variables]#checking-environment-variables
- [OAuth Providers]#oauth-providers
  - [CLI Login]#cli-login
  - [Programmatic OAuth]#programmatic-oauth
  - [Login Flow Example]#login-flow-example
  - [Using OAuth Tokens]#using-oauth-tokens
  - [Provider Notes]#provider-notes
- [Agent Core]#agent-core
  - [Installation]#agent-installation
  - [Quick Start]#agent-quick-start
  - [Core Concepts]#core-concepts
  - [Event Flow]#event-flow
  - [prompt_text() Event Sequence]#prompt_text-event-sequence
  - [With Tool Calls]#with-tool-calls
  - [continue_run() Event Sequence]#continue_run-event-sequence
  - [Event Types]#event-types
  - [Agent Options]#agent-options
  - [Agent State]#agent-state
  - [Methods]#methods
  - [Session and Thinking Budgets]#session-and-thinking-budgets
  - [Steering and Follow-up]#steering-and-follow-up
  - [Custom Message Types]#custom-message-types
  - [Tools]#agent-tools
  - [Tool Error Handling]#agent-tool-error-handling
  - [Proxy Usage]#proxy-usage
  - [Low-Level API]#low-level-api
- [Development]#development
  - [Adding a New Provider]#adding-a-new-provider
- [License]#license

## Supported Providers

- **OpenAI** via Chat Completions and Responses
- **Anthropic** via Messages
- **GitHub Copilot** through OAuth-backed OpenAI/Anthropic-compatible routes
- **Azure Foundry and other compatible endpoints** through provider handles with
  explicit `base_url`, headers, and compatibility settings

The active built-in stream APIs are:

- `openai-completions`
- `openai-responses`
- `anthropic-messages`

The active built-in provider handles are focused on `openai`, `anthropic`, and
`github_copilot`. Azure Foundry, Ollama, vLLM, and other compatible endpoints
can use configured provider handles with explicit `base_url`, HTTP headers, and
compatibility settings.

Broad native provider-specific APIs outside OpenAI, Anthropic, GitHub Copilot,
and custom compatible routing are not part of the active built-in provider
surface. PRs to add support for additional providers are welcome.

Image generation is not exposed yet. Chat image input and image blocks in tool
results are still supported by the regular chat APIs.

## Installation

```bash
cargo add ai
cargo add tokio --features macros,rt-multi-thread
cargo add futures serde_json
```

This crate uses Tokio-compatible async APIs. The examples use
`#[tokio::main]`, which requires Tokio's `macros` and `rt-multi-thread`
features. The examples also use `futures::StreamExt` for stream iteration and
`serde_json::json` for JSON Schema values.

## Quick Start

```rust
use ai::{complete_simple, providers::openai, Context, Message, Result};

#[tokio::main]
async fn main() -> Result<()> {
    let openai = openai::from_env()?;
    let model = openai.model("gpt-5.5").build()?;

    let context = Context::builder()
        .system_prompt("You are a helpful assistant.")
        .message(Message::user_text("What is the capital of France?"))
        .build();

    let message = complete_simple(model, context, None).await?;
    println!("{message:?}");
    Ok(())
}
```

### Streaming

Use `stream_simple` when the UI should update as tokens arrive. The
`futures::StreamExt` import is only needed for `.next().await`.

```rust
use futures::StreamExt;

use ai::{providers::openai, stream_simple, AssistantMessageEvent, Context, Message, Result};

#[tokio::main]
async fn main() -> Result<()> {
    let openai = openai::from_env()?;
    let model = openai.model("gpt-5.5").build()?;
    let context = Context::builder()
        .message(Message::user_text("Write a haiku about Rust."))
        .build();

    let mut events = stream_simple(model, context, None)?;
    while let Some(event) = events.next().await {
        if let AssistantMessageEvent::TextDelta { delta, .. } = event? {
            print!("{delta}");
        }
    }

    Ok(())
}
```

### Provider Handles

Use `openai::from_env()` for OpenAI Responses. Use `openai::builder()` when
selecting Chat Completions or an OpenAI-compatible endpoint.

#### OpenAI Responses

```rust
use ai::providers::openai;

let openai_responses_from_env = openai::from_env()?;

let openai_responses_with_key = openai::builder()
    .api_key(Some("sk-..."))
    .responses()
    .build()?;
```

#### OpenAI Chat Completions

```rust
use ai::providers::openai;

let openai_chat_with_key = openai::builder()
    .api_key(Some("sk-..."))
    .chat_completions()
    .build()?;

let ollama_chat = openai::builder()
    .base_url("http://localhost:11434/v1")
    .chat_completions()
    .build()?;
```

#### Anthropic

```rust
use ai::providers::anthropic;

let anthropic_from_env = anthropic::from_env()?;

let anthropic_with_key = anthropic::builder()
    .api_key("sk-ant-...")
    .build()?;
```

### Dynamic Provider Choice

Provider handles are trait objects when the application wants to choose a
backend at runtime.

```rust
use ai::{complete_simple, providers::openai, Context, Message, Provider, Result};

#[tokio::main]
async fn main() -> Result<()> {
    let (provider, model_id): (Box<dyn Provider>, &str) =
        if std::env::var("OPENAI_API_KEY").is_ok() {
            (Box::new(openai::from_env()?), "gpt-5.5")
        } else {
            let local = openai::builder()
                .provider_id("ollama")
                .base_url("http://localhost:11434/v1")
                .chat_completions()
                .build()?;
            (Box::new(local), "gemma3")
        };

    let model = provider.model(model_id).build()?;
    let context = Context::builder()
        .message(Message::user_text("Summarize Rust ownership."))
        .build();

    let message = complete_simple(model, context, None).await?;
    println!("{message:?}");
    Ok(())
}
```

## Tools

Tools enable LLMs to interact with external systems. This crate uses JSON
Schema values for tool definitions and provides validation helpers for tool
calls.

### Defining Tools

```rust
use ai::Tool;
use serde_json::json;

let weather_tool = Tool::builder("get_weather")
    .description("Get current weather for a location.")
    .parameters(json!({
        "type": "object",
        "properties": {
            "location": { "type": "string" },
            "units": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "default": "celsius"
            }
        },
        "required": ["location"]
    }))
    .build()?;
```

### Handling Tool Calls

Tool results use content blocks and can include both text and images.

```rust
use ai::{AssistantContent, Message, ToolResultContent, ToolResultMessage};

if let AssistantContent::ToolCall(call) = block {
    let result = run_weather_lookup(&call.arguments).await;
    context.messages.push(Message::ToolResult(ToolResultMessage {
        tool_call_id: call.id,
        tool_name: call.name,
        content: vec![ToolResultContent::text(result)],
        details: None,
        is_error: false,
        timestamp: 0,
    }));
}
```

### Streaming Tool Calls with Partial JSON

During streaming, tool call arguments are progressively parsed as they arrive.
This enables real-time UI updates before the complete arguments are available.

```rust
use ai::AssistantMessageEvent;

match event {
    AssistantMessageEvent::ToolCallDelta {
        content_index,
        delta,
        partial,
    } => {
        println!("tool block {content_index} delta: {delta}");
        println!("partial message: {partial:?}");
    }
    AssistantMessageEvent::ToolCallEnd { tool_call, .. } => {
        println!("tool completed: {} {:?}", tool_call.name, tool_call.arguments);
    }
    _ => {}
}
```

Important notes about partial tool arguments:

- During `ToolCallDelta`, arguments may be incomplete.
- Fields may be missing or partially parsed.
- String values may be truncated mid-word.
- Arrays and nested objects may be incomplete.
- Always validate final tool arguments before executing external effects.
- Use `content_index` to associate events with the right assistant content block.

### Validating Tool Arguments

When using the agent loop, tool arguments are validated before execution. When
implementing your own loop with `stream` or `complete`, use
`validate_tool_call` or `validate_tool_arguments`.

```rust
use ai::{validate_tool_call, Tool, ToolCall};

let validated = validate_tool_call(&tools, &tool_call)?;
```

### Complete Event Reference

All streaming events emitted during assistant message generation:

| Event | Description | Key Properties |
| --- | --- | --- |
| `Start` | Stream begins | `partial`: initial assistant message structure |
| `TextStart` | Text block starts | `content_index`: position in content array |
| `TextDelta` | Text chunk received | `delta`, `content_index` |
| `TextEnd` | Text block complete | `content`, `content_index` |
| `ThinkingStart` | Thinking block starts | `content_index` |
| `ThinkingDelta` | Thinking chunk received | `delta`, `content_index` |
| `ThinkingEnd` | Thinking block complete | `content`, `content_index` |
| `ToolCallStart` | Tool call begins | `content_index` |
| `ToolCallDelta` | Tool arguments stream | `delta`, `partial` |
| `ToolCallEnd` | Tool call complete | `tool_call` |
| `Done` | Stream complete | `reason`, `message` |
| `Error` | Error occurred | `reason`, `error` |

Streaming events for different content blocks are not guaranteed to be
contiguous. Consumers should use `content_index` to associate deltas and end
events with their blocks.

## Image Input

Models with vision capabilities can process images. Check `model.input` for
`ModelInput::Image`. If you pass images to a non-vision model, the message
transform layer downgrades unsupported image content to text placeholders.

```rust
use ai::{
    providers::openai, Context, ImageContent, Message, ModelInput, UserContent,
    UserMessage, UserMessageContent,
};

let openai = openai::from_env()?;
let model = openai.model("gpt-5.5").build()?;
if model.input.contains(&ModelInput::Image) {
    println!("model supports vision");
}

let context = Context {
    messages: vec![Message::User(UserMessage {
        content: UserMessageContent::Parts(vec![
            UserContent::text("What is in this image?"),
            UserContent::Image(ImageContent {
                data: "...base64...".to_string(),
                mime_type: "image/png".to_string(),
            }),
        ]),
        timestamp: 0,
    })],
    ..Default::default()
};
```

## Image Generation

Image generation is not exposed yet.

### Basic Image Generation

No Rust image-generation API is exposed at the moment. Keep using the regular
chat APIs for image input and tool-result image blocks.

### Notes and Limitations

The active Rust provider surface is focused on chat/agent behavior for OpenAI
Chat Completions, OpenAI Responses, Anthropic Messages, and GitHub
Copilot-compatible routing.

## Thinking/Reasoning

Many models support thinking or reasoning content. Check `model.reasoning` and
use `get_supported_thinking_levels` to inspect supported levels.

### Unified Interface (streamSimple/completeSimple)

Rust exports these as `stream_simple` and `complete_simple`.

```rust
use ai::{
    complete_simple, providers::anthropic, Context, Message, ModelThinkingLevel,
    SimpleStreamOptions,
};

let anthropic = anthropic::from_env()?;
let model = anthropic.model("claude-sonnet-4-5").build()?;
let options = SimpleStreamOptions {
    reasoning: Some(ModelThinkingLevel::Medium),
    ..Default::default()
};

let response = complete_simple(
    model,
    Context {
        messages: vec![Message::user_text("Solve: 2x + 5 = 13")],
        ..Default::default()
    },
    Some(options),
)
.await?;
```

### Provider-Specific Options (stream/complete)

`stream_simple` and `complete_simple` are the preferred app-level APIs,
They take `SimpleStreamOptions`, resolve the model's API, and map common
options such as reasoning, cache retention, API key, cancellation, payload
hooks, retry settings, and provider options onto the selected provider.

`stream` and `complete` are the lower-level APIs. Use them when you need the
non-simple `StreamOptions` shape or direct provider-option forwarding. For
provider-specific escape hatches, place fields in
`StreamOptions::provider_options` using provider option names such as
`toolChoice`, `serviceTier`, or `thinkingDisplay`.

The crate root also exports scoped direct provider stream functions:

- `stream_openai_completions` / `stream_simple_openai_completions`
- `stream_openai_responses` / `stream_simple_openai_responses`
- `stream_anthropic` / `stream_simple_anthropic`

Provider modules expose typed provider options for direct provider calls:

- `providers::openai_completions::OpenAICompletionsOptions`
- `providers::openai_responses::OpenAIResponsesOptions`
- `providers::anthropic::AnthropicOptions`

### Streaming Thinking Content

Thinking content streams through `ThinkingStart`, `ThinkingDelta`, and
`ThinkingEnd` events. Completed messages store thinking blocks as
`AssistantContent::Thinking`.

## Stop Reasons

Every `AssistantMessage` includes a `stop_reason` field that indicates how the
generation ended:

- `Stop` - Normal completion
- `Length` - Output hit the maximum token limit
- `ToolUse` - Model is calling tools and expects tool results
- `Error` - An error occurred during generation
- `Aborted` - Request was cancelled

`AssistantMessage` may also include `response_id`, a provider-specific response
or message identifier when the underlying API exposes one.

## Error Handling

Setup failures before a stream exists are returned as `Error` values. Once a
provider stream exists, provider-declared failures and cancellation are
surfaced as terminal `AssistantMessageEvent::Error` events carrying the final
assistant message. The `complete_*` helpers return that final assistant message;
check `message.stop_reason` for `StopReason::Error` or `StopReason::Aborted`.
Transport or decoder failures that cannot be represented as provider messages
still return `Err`.

### Aborting Requests

Use a Tokio cancellation token to abort in-flight requests. Prefer direct
struct initialization over mutating default options:

```rust
use ai::{SimpleStreamOptions, StreamOptions};
use tokio_util::sync::CancellationToken;

let token = CancellationToken::new();
let options = SimpleStreamOptions {
    stream: StreamOptions {
        cancellation_token: Some(token.clone()),
        ..Default::default()
    },
    ..Default::default()
};

token.cancel();
```

### Continuing After Abort

Abort produces an assistant message with `StopReason::Aborted`. The transform
layer drops aborted assistant turns before follow-up messages so conversations
can continue cleanly.

### Debugging Provider Payloads

Use `StreamOptions::on_payload` and `StreamOptions::on_response` hooks to
inspect or override provider payloads and observe raw provider responses. The
hooks are supported by `stream`, `complete`, `stream_simple`, and
`complete_simple`.

## APIs, Models, and Providers

Provider handles build executable models. Built-in language model APIs include:

- **`anthropic-messages`**: Anthropic Messages API
- **`openai-completions`**: OpenAI Chat Completions API
- **`openai-responses`**: OpenAI Responses API

`register_faux_provider()` is legacy support for the crate's own unit tests.
New application code should prefer provider handles and custom provider
implementations instead of registering global providers.

### Faux provider for tests

`register_faux_provider()` registers a temporary in-memory provider for tests.
It is opt-in and not part of the built-in provider set.

### Providers and Models

A provider offers models through a specific API. In this crate:

- **Anthropic** models use `anthropic-messages`.
- **OpenAI** models use `openai-completions` or `openai-responses`.
- **GitHub Copilot** models use OAuth-backed OpenAI/Anthropic-compatible
  routes.
- **Azure Foundry** and other compatible endpoints are configured as custom
  models by choosing an active API, setting `base_url`, and filling
  `ModelCompat` where the endpoint differs from the default request shape.

Built-in provider handles create executable model values directly from string
IDs. There is no built-in model catalog; applications that need one should keep
it in application state and build models through configured provider handles.

### Querying Providers and Models

```rust
use ai::{providers::{github_copilot, openai}, Provider};

let provider = openai::from_env()?;
let capabilities = provider.capabilities();
let model = provider.model("gpt-5.5").build()?;

let copilot = github_copilot::builder()
    .api_key("...")
    .anthropic_messages()
    .build()?;
let claude = copilot.model("claude-opus-4.5").build()?;
```

### Custom Models

You can create provider-bound models for local inference servers or custom
endpoints:

```rust
use ai::{
    providers::openai, stream_simple, Context, Message, SimpleStreamOptions,
};

let provider = openai::builder()
    .provider_id("azure-foundry")
    .api_key(Some("..."))
    .base_url("https://example.services.ai.azure.com/openai/v1")
    .chat_completions()
    .build()?;
let model = provider.model("gpt-5.5").build()?;

let stream = stream_simple(
    model,
    Context {
        messages: vec![Message::user_text("What is the capital of France?")],
        ..Default::default()
    },
    Some(SimpleStreamOptions::default()),
)?;
```

The same pattern works for local inference servers such as Ollama, vLLM, and LM
Studio when they expose an OpenAI-compatible chat endpoint.

Some OpenAI-compatible servers do not understand the `developer` role used for
reasoning-capable models. For those endpoints, build the model with compat
metadata so the system prompt is sent as a `system` message instead. If the
server also does not support `reasoning_effort`, disable that compat flag too.

### OpenAI Compatibility Settings

The `openai-completions` API is implemented by many providers with minor
differences. `ModelCompat` stores compatibility metadata for explicit custom
models, but the active built-in surface does not infer broad provider-specific
behavior from provider names or base URLs.

Set model-builder compat metadata when the target OpenAI-compatible endpoint
needs payload differences such as non-standard reasoning, cache-control,
max-token, or tool-result behavior.

### Thread Safety

Provider handles are regular cloneable values. Build them during application
startup, pass them where needed, and create executable model values with
`provider.model(id).build()?`.

### Type Safety

Public types are serializable with `serde` where they represent portable
context or message state.

## Cross-Provider Handoffs

The library supports handoffs between OpenAI, Anthropic, and GitHub
Copilot-compatible models within the same conversation.

### How It Works

When messages from one provider are sent to a different provider, the crate
transforms them for compatibility:

- User and tool-result messages are passed through.
- Assistant messages from the same provider/API are preserved as-is.
- Assistant messages from different providers have thinking blocks converted to
  text with `<thinking>` tags where needed.
- Tool calls and regular text are preserved.

### Example: Multi-Provider Conversation

```rust
use ai::{complete_simple, providers::{anthropic, openai}, Context, Message, SimpleStreamOptions};

let mut context = Context {
    messages: vec![Message::user_text("What is 25 * 18?")],
    ..Default::default()
};

let anthropic = anthropic::from_env()?;
let claude = anthropic.model("claude-sonnet-4-5").build()?;
let claude_response =
    complete_simple(claude, context.clone(), Some(SimpleStreamOptions::default())).await?;
context.messages.push(Message::Assistant(claude_response));

let openai = openai::from_env()?;
let gpt = openai.model("gpt-5.5").build()?;
context.messages.push(Message::user_text("Is that calculation correct?"));
let gpt_response =
    complete_simple(gpt, context.clone(), Some(SimpleStreamOptions::default())).await?;
context.messages.push(Message::Assistant(gpt_response));
```

### Provider Compatibility

All active providers can handle shared text, tool calls, tool results including
images, thinking/reasoning blocks after transformation, and aborted messages
with partial content.

## Context Serialization

`Context`, `Message`, assistant content, tool calls, and tool results implement
`Serialize` and `Deserialize`, so context can be persisted or handed to another
process.

```rust
use ai::Context;

let serialized = serde_json::to_string(&context)?;
let restored: Context = serde_json::from_str(&serialized)?;
```

If the context contains images encoded as base64, those are serialized too.
Serialized user, assistant, and tool-result messages include stable `role`
fields, including assistant messages nested in stream events.

## Browser Usage

This Rust crate is server/native focused and does not provide browser-specific
packaging. Pass API keys explicitly through options or use environment
variables on the server.

### Browser Compatibility Notes

Not applicable to this Rust crate.

### Environment Variables

| Provider | Environment variables |
| --- | --- |
| `openai` | `OPENAI_API_KEY` |
| `anthropic` | `ANTHROPIC_OAUTH_TOKEN`, then `ANTHROPIC_API_KEY` |
| `github-copilot` | `COPILOT_GITHUB_TOKEN` |

Explicit API keys in `StreamOptions` take precedence over environment lookup.

### Checking Environment Variables

```rust
use ai::get_env_api_key;

let key = get_env_api_key("openai");
```

## OAuth Providers

The OAuth registry includes:

- **Anthropic** (Claude Pro/Max subscription)
- **GitHub Copilot** (Copilot subscription)

### CLI Login

This crate exposes login primitives for applications that want to provide their
own login UI.

### Programmatic OAuth

```rust
use ai::{get_oauth_provider, OAuthCredentials};

let provider = get_oauth_provider("github-copilot").expect("provider");
```

### Login Flow Example

Use `login_anthropic` or `login_github_copilot` with `OAuthLoginCallbacks` to
drive the login UI from your application.

```rust
use ai::{providers::github_copilot, OAuthLoginCallbacks, Result};

async fn login() -> Result<()> {
    let callbacks = OAuthLoginCallbacks::builder()
        .on_device_code(|info| {
            eprintln!("Open {} and enter {}", info.verification_uri, info.user_code);
        })
        .on_prompt(|_| async { Ok(String::new()) })
        .build();

    let credentials = github_copilot::oauth().login(callbacks).await?;
    // Persist credentials here.

    Ok(())
}
```

### Using OAuth Tokens

Use provider-specific helpers to turn stored credentials into the API key used
by provider builders or stream options. For GitHub Copilot, the helper refreshes
expired credentials and returns `new_credentials`; persist those back to your
auth store.

```rust
use ai::{providers::github_copilot, OAuthCredentials, Result};

async fn provider_from_stored_credentials(
    credentials: OAuthCredentials,
) -> Result<github_copilot::GitHubCopilot> {
    let oauth_key = github_copilot::get_oauth_api_key(&credentials).await?;

    // Persist oauth_key.new_credentials here.

    github_copilot::builder()
        .api_key(oauth_key.api_key)
        .base_url(github_copilot::base_url_for_credentials(
            &oauth_key.new_credentials,
        ))
        .responses()
        .build()
}
```

### Provider Notes

**GitHub Copilot**: OAuth helpers and dynamic request headers are included.
Some Copilot model ids use vendor names, but they are routed through the active
OpenAI/Anthropic-compatible APIs in this crate. Other native provider APIs are
not registered.

**Anthropic**: OAuth follows the Claude Pro/Max OAuth flow.

## Agent Core

Stateful agent support is part of this crate. It runs model turns, executes
registered tools, appends tool results, and continues until the assistant
stops, an error occurs, or a hook asks the loop to stop.

### Agent Installation

No separate crate is required. The agent core lives in `ai`.

```bash
cargo add ai
```

### Agent Quick Start

```rust
use ai::{providers::anthropic, Agent, AgentEvent, AgentOptions, AssistantMessageEvent, Result};

#[tokio::main]
async fn main() -> Result<()> {
    let anthropic = anthropic::from_env()?;
    let model = anthropic.model("claude-sonnet-4-5").build()?;
    let agent = Agent::new(AgentOptions::new(model));

    agent.set_system_prompt("You are a helpful assistant.").await;

    let subscription = agent.subscribe(async |event, _cancellation_token| {
        if let AgentEvent::MessageUpdate {
            assistant_message_event: AssistantMessageEvent::TextDelta { delta, .. },
            ..
        } = event
        {
            print!("{delta}");
        }

        Ok(())
    });

    agent.prompt_text("Hello!", Vec::new()).await?;
    subscription.unsubscribe();

    Ok(())
}
```

### Core Concepts

#### AgentMessage vs LLM Message

`AgentMessage` is the same portable `Message` enum used by the shared LLM API.
It can contain standard LLM messages (`User`, `Assistant`, `ToolResult`) and
`Custom` app-owned messages.

LLMs only understand user, assistant, and tool-result messages. The
`convert_to_llm` function bridges this gap by filtering or transforming custom
messages before each provider call.

#### Message Flow

```text
AgentMessage[] -> transform_context() -> AgentMessage[] -> convert_to_llm() -> Message[] -> LLM
                    (optional)                           (required)
```

`transform_context` is intended for pruning, compaction, or external context
injection. `convert_to_llm` filters or converts app-owned messages.
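
To make the two-stage pipeline concrete, here is a minimal, standard-library-only sketch. `AgentMsg`, `LlmMsg`, and both functions are hypothetical stand-ins written for illustration; they are not this crate's types or signatures, and the "summary" conversion policy is an assumption.

```rust
// Hypothetical stand-ins for illustration; these are NOT the crate's types.
#[derive(Debug, Clone, PartialEq)]
enum AgentMsg {
    User(String),
    Assistant(String),
    ToolResult(String),
    // App-owned entry, for example a UI note the model should never see.
    Custom { kind: String, text: String },
}

#[derive(Debug, Clone, PartialEq)]
enum LlmMsg {
    User(String),
    Assistant(String),
    ToolResult(String),
}

// Optional stage: prune history, here by keeping only the last `max` entries.
fn transform_context(messages: Vec<AgentMsg>, max: usize) -> Vec<AgentMsg> {
    let skip = messages.len().saturating_sub(max);
    messages.into_iter().skip(skip).collect()
}

// Required stage: drop or convert custom entries so only LLM roles remain.
fn convert_to_llm(messages: &[AgentMsg]) -> Vec<LlmMsg> {
    messages
        .iter()
        .filter_map(|message| match message {
            AgentMsg::User(text) => Some(LlmMsg::User(text.clone())),
            AgentMsg::Assistant(text) => Some(LlmMsg::Assistant(text.clone())),
            AgentMsg::ToolResult(text) => Some(LlmMsg::ToolResult(text.clone())),
            // Assumed policy: "summary" entries become user text, the rest are dropped.
            AgentMsg::Custom { kind, text } if kind == "summary" => {
                Some(LlmMsg::User(format!("Summary so far: {text}")))
            }
            AgentMsg::Custom { .. } => None,
        })
        .collect()
}
```

Calling `convert_to_llm(&transform_context(history, 3))` mirrors the diagram: pruning happens on agent messages first, and only the converted result would reach a provider.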

### Event Flow

The agent emits events for UI updates. Understanding the event sequence helps
build responsive interfaces.

#### prompt_text() Event Sequence

When you call `prompt_text("Hello")`, the wrapper emits this core sequence:

```text
prompt_text("Hello")
|- agent_start
|- turn_start
|- message_start   user
|- message_end     user
|- message_start   assistant
|- message_update  assistant delta
|- message_end     assistant
|- turn_end
`- agent_end
```
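
As a rough illustration of how a UI might consume this sequence, the sketch below accumulates assistant text between `message_start` and `message_end`. The `Event` and `Speaker` enums are simplified, hypothetical models of the events listed above, not the crate's `AgentEvent` type.

```rust
// Simplified, hypothetical event model; the real AgentEvent carries more data.
#[derive(Debug, Clone, PartialEq)]
enum Speaker {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
enum Event {
    AgentStart,
    TurnStart,
    MessageStart(Speaker),
    MessageUpdate(String), // assistant text delta
    MessageEnd(Speaker),
    TurnEnd,
    AgentEnd,
}

// Returns one string per completed assistant message.
fn collect_assistant_text(events: &[Event]) -> Vec<String> {
    let mut finished = Vec::new();
    let mut current: Option<String> = None;

    for event in events {
        match event {
            // A new assistant message opens an empty buffer.
            Event::MessageStart(Speaker::Assistant) => current = Some(String::new()),
            // Deltas are appended only while an assistant message is open.
            Event::MessageUpdate(delta) => {
                if let Some(buffer) = current.as_mut() {
                    buffer.push_str(delta);
                }
            }
            // Closing the message moves the buffer into the finished list.
            Event::MessageEnd(Speaker::Assistant) => {
                if let Some(buffer) = current.take() {
                    finished.push(buffer);
                }
            }
            _ => {}
        }
    }

    finished
}
```

A real interface would typically render each delta as it arrives instead of waiting for `message_end`; the buffering here only makes the ordering explicit.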

#### With Tool Calls

If the assistant calls tools, the loop emits `tool_execution_start`, optional
`tool_execution_update`, `tool_execution_end`, then a tool-result
`message_start` / `message_end`. If the batch does not terminate, the next turn
starts and the model receives the tool results.

Tool execution mode is configurable:

- `Parallel` is the default. Preflight runs sequentially, allowed tools execute
  concurrently, completion events emit as each tool finalizes, and persisted
  tool-result messages remain in assistant source order.
- `Sequential` executes tool calls one by one.
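
The ordering guarantee of the parallel mode can be sketched with plain threads. This is an illustration of the idea only, assuming hypothetical `Call` and `run_parallel` names; it does not model preflight and says nothing about how the crate schedules tools internally.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical tool call: an id plus a simulated duration in milliseconds.
struct Call {
    id: &'static str,
    millis: u64,
}

// Runs every call on its own thread.
// Returns (ids in completion order, outputs in source order).
fn run_parallel(calls: Vec<Call>) -> (Vec<&'static str>, Vec<String>) {
    let (tx, rx) = mpsc::channel();
    let mut slots: Vec<Option<String>> = vec![None; calls.len()];

    let handles: Vec<_> = calls
        .into_iter()
        .enumerate()
        .map(|(index, call)| {
            let tx = tx.clone();
            thread::spawn(move || {
                thread::sleep(Duration::from_millis(call.millis));
                // Report completion as soon as this tool finishes.
                tx.send((index, call.id, format!("{} done", call.id)))
                    .expect("receiver is alive");
            })
        })
        .collect();
    // Drop the original sender so the receive loop ends when workers finish.
    drop(tx);

    let mut completion_order = Vec::new();
    for (index, id, output) in rx {
        completion_order.push(id); // completion events follow finish time
        slots[index] = Some(output); // results are stored by source position
    }
    for handle in handles {
        handle.join().expect("worker panicked");
    }

    (completion_order, slots.into_iter().flatten().collect())
}
```

With a slow first call and a fast second call, the completion order will usually list the fast call first, while the returned outputs always follow the order in which the calls were given.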

`before_tool_call` runs after `tool_execution_start` and validated argument
parsing. `after_tool_call` runs after tool execution and before final tool
events. Tool results can set `terminate = true`; the loop stops early only when
every finalized result in the batch terminates.
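
The early-stop rule reduces to an "all results agree" check. The sketch below uses a hypothetical `FinalizedResult` struct and assumes that an empty batch never terminates, a case the text above does not cover.

```rust
// Hypothetical finalized tool result; only the relevant field is modeled.
struct FinalizedResult {
    terminate: bool,
}

// The loop stops early only when every finalized result asks to terminate.
// Assumption: an empty batch does not terminate.
fn batch_terminates(results: &[FinalizedResult]) -> bool {
    !results.is_empty() && results.iter().all(|result| result.terminate)
}
```

One non-terminating result in the batch is enough to keep the loop going, so the model still gets a chance to react to that tool's output.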

Low-level loop callers can set `should_stop_after_turn` to stop gracefully after
the current turn completes. It runs after `turn_end`, before steering/follow-up
queues are polled, and before another model request starts.

#### continue_run() Event Sequence

`continue_run()` resumes from existing context without adding a new message.
Use it for retries after errors. The last message in context must be a user or
tool-result message, not an assistant message.
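
A caller can check this precondition before retrying. The helper below is a hypothetical sketch over a simplified `MessageKind` enum, not part of the crate; treating an empty context as an error is also an assumption.

```rust
// Simplified, hypothetical view of the last message in the context.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MessageKind {
    User,
    Assistant,
    ToolResult,
}

// Returns Ok when resuming is allowed, or a reason when it is not.
fn can_continue(history: &[MessageKind]) -> Result<(), String> {
    match history.last() {
        Some(MessageKind::User) | Some(MessageKind::ToolResult) => Ok(()),
        Some(MessageKind::Assistant) => {
            Err("last message is an assistant message".to_string())
        }
        // Assumption: there is nothing to resume from without any messages.
        None => Err("no messages to continue from".to_string()),
    }
}
```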

#### Event Types

| Event | Description |
| --- | --- |
| `AgentStart` | Agent begins processing |
| `AgentEnd` | Final event for the run. The run does not settle until awaited subscribers for this event finish |
| `TurnStart` | New turn begins: one LLM call plus tool executions |
| `TurnEnd` | Turn completes with assistant message and tool results |
| `MessageStart` | Any message begins: user, assistant, or tool result |
| `MessageUpdate` | Assistant-only update containing the underlying assistant stream event |
| `MessageEnd` | Message completes |
| `ToolExecutionStart` | Tool begins |
| `ToolExecutionUpdate` | Tool streams progress |
| `ToolExecutionEnd` | Tool completes |

`Agent::subscribe` listeners are awaited in registration order. `agent_end`
means no more loop events will be emitted, but `wait_for_idle` and
`prompt_text` settle only after awaited final listeners finish.

### Agent Options

`AgentOptions` contains:

- `initial_state`: system prompt, model, thinking level, tools, and messages.
- `convert_to_llm`: converts agent messages to LLM messages.
- `transform_context`: prunes, compacts, or injects context before conversion.
- `steering_mode` and `follow_up_mode`: queue handling behavior.
- `stream_fn`: custom stream function for proxy backends.
- `session_id`: forwarded through `SimpleStreamOptions`.
- `tool_execution`: parallel or sequential tool execution.
- `before_tool_call` and `after_tool_call`: preflight and postprocess hooks.
- `prepare_next_turn`: updates context, model, or thinking level before another
  turn starts.
- `options`: transport, retry, cancellation, payload hooks, provider options,
  thinking budgets, and API key defaults.

```rust
use ai::{Agent, AgentOptions, AgentState, ModelThinkingLevel, ToolExecutionMode};

let initial_state = AgentState::builder(model.clone())
    .system_prompt("You are a helpful assistant.")
    .thinking_level(ModelThinkingLevel::Medium)
    .tools(tools)
    .messages(messages)
    .build();

let agent = Agent::new(
    AgentOptions::builder(model)
        .initial_state(initial_state)
        .tool_execution(ToolExecutionMode::Parallel)
        .session_id("session-123")
        .build(),
);
```

### Agent State

`AgentState` contains the system prompt, active model, thinking level, tools,
message history, streaming status, pending tool call IDs, and the latest error
message.

```rust
let state = AgentState::builder(model)
    .system_prompt("You are a helpful assistant.")
    .thinking_level(ModelThinkingLevel::Medium)
    .message(Message::user_text("Hello"))
    .build();
```

During streaming, `streaming_message` contains the current partial assistant
message. `is_streaming` remains true until the run fully settles, including
awaited `agent_end` subscribers.

### Methods

#### Prompting

```rust
agent.prompt_text("Hello", Vec::new()).await?;
agent.prompt_messages(vec![Message::user_text("Hello")]).await?;
agent.continue_run().await?;
```

`continue_run` resumes from current context. The last message must be a user or
tool-result message.

#### State Management

Use the state and option mutation helpers to update system prompt, model,
thinking level, tools, messages, session ID, queues, hooks, and tool execution
mode. `reset` returns the agent to its initial state.

```rust
agent.set_system_prompt("New prompt").await;
agent.set_model(model).await;
agent.set_thinking_level(ModelThinkingLevel::Medium).await;
agent.set_tools(tools).await;
agent.set_tool_execution(ToolExecutionMode::Sequential).await;
agent.set_messages(new_messages).await;
agent.push_message(message).await;
agent.reset().await;
```

#### Session and Thinking Budgets

`AgentOptions::session_id` is forwarded to providers that support prompt-cache
or session affinity behavior. `AgentOptions::options.thinking_budgets` is
applied by the simple-stream option builder before each model call.

```rust
agent.set_session_id(Some("session-123".to_string())).await;
agent.set_thinking_budgets(Some(ThinkingBudgets {
    minimal: Some(128),
    low: Some(512),
    medium: Some(1024),
    high: Some(2048),
}));
```

#### Control

```rust
agent.abort().await;
agent.wait_for_idle().await;
```

#### Events

```rust
let subscription = agent.subscribe(|event, cancellation_token| async move {
    if matches!(event, AgentEvent::AgentEnd { .. }) {
        flush_session_state(cancellation_token).await?;
    }

    Ok(())
});

subscription.unsubscribe();
```

Keep the subscription handle alive while the listener remains registered.
Dropping the handle also unsubscribes.
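
The drop-to-unsubscribe behavior is a common RAII pattern. The following synchronous, single-threaded sketch shows one way such a handle can work; `Emitter`, `Registry`, and `Subscription` are hypothetical types for illustration and do not describe how this crate implements its handle.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;
use std::rc::Rc;

type Listener = Box<dyn Fn(&str)>;

#[derive(Default)]
struct Registry {
    next_id: u64,
    // BTreeMap keeps listeners ordered by id, which is registration order.
    listeners: BTreeMap<u64, Listener>,
}

#[derive(Clone, Default)]
struct Emitter {
    inner: Rc<RefCell<Registry>>,
}

// Handle that removes its listener when dropped.
struct Subscription {
    id: u64,
    registry: Rc<RefCell<Registry>>,
}

impl Emitter {
    fn subscribe(&self, listener: impl Fn(&str) + 'static) -> Subscription {
        let mut registry = self.inner.borrow_mut();
        let id = registry.next_id;
        registry.next_id += 1;
        registry.listeners.insert(id, Box::new(listener));
        Subscription { id, registry: Rc::clone(&self.inner) }
    }

    fn emit(&self, event: &str) {
        for listener in self.inner.borrow().listeners.values() {
            listener(event);
        }
    }

    fn listener_count(&self) -> usize {
        self.inner.borrow().listeners.len()
    }
}

impl Subscription {
    // Consuming the handle triggers Drop, which performs the removal.
    fn unsubscribe(self) {}
}

impl Drop for Subscription {
    fn drop(&mut self) {
        self.registry.borrow_mut().listeners.remove(&self.id);
    }
}
```

Because removal lives in `Drop`, binding the handle to `_` or letting it go out of scope silently stops delivery, which is why the handle should be stored for as long as the listener is needed.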

### Steering and Follow-up

Steering messages let you interrupt the agent while it is running. Follow-up
messages let you queue work after the agent would otherwise stop.

When steering messages are detected after a turn completes:

1. All tool calls from the current assistant message have already finished.
2. Steering messages are injected.
3. The LLM responds on the next turn.

Follow-up messages are checked only when there are no more tool calls and no
steering messages.
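
The priority between the two queues can be sketched as a decision made after each turn. `Next` and `after_turn` are hypothetical names, and draining a whole queue at once is an assumption; the actual delivery behavior depends on `steering_mode` and `follow_up_mode`.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
enum Next {
    // Run another model turn, injecting these messages first (may be empty).
    Continue(Vec<String>),
    Stop,
}

// Decides what happens once a turn has ended and its tool calls have finished.
fn after_turn(
    had_tool_calls: bool,
    steering: &mut VecDeque<String>,
    follow_up: &mut VecDeque<String>,
) -> Next {
    // Steering is polled first and is injected before the next model call.
    let steer: Vec<String> = steering.drain(..).collect();
    if !steer.is_empty() {
        return Next::Continue(steer);
    }
    // Tool results still need a model turn, so follow-ups keep waiting.
    if had_tool_calls {
        return Next::Continue(Vec::new());
    }
    // Only when the agent would otherwise stop are follow-ups considered.
    let queued: Vec<String> = follow_up.drain(..).collect();
    if !queued.is_empty() {
        return Next::Continue(queued);
    }
    Next::Stop
}
```

In this model a follow-up queued during a tool-heavy run stays untouched until a turn ends with no tool calls and no steering input.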

### Custom Message Types

Use `Message::Custom` for app-specific agent transcript entries. Custom
messages are retained in agent state, then filtered or converted by
`convert_to_llm` before provider calls.

### Agent Tools

Agent tools implement the `AgentTool` trait. `definition()` returns the shared
`Tool` schema, `label()` provides UI text, `execution_mode()` can force a whole
batch to run sequentially, `prepare_arguments()` can reshape model arguments
before validation, and `execute()` performs the tool work.

```rust
use ai::{
    providers::openai, Agent, AgentOptions, AgentToolBuilder, AgentToolResult, Result,
};
use serde_json::{json, Value};

#[tokio::main]
async fn main() -> Result<()> {
    let weather_tool = AgentToolBuilder::new("get_weather")
        .description("Get current weather for a city.")
        .parameters(json!({
            "type": "object",
            "properties": {
                "city": { "type": "string" }
            },
            "required": ["city"]
        }))
        .label("Weather")
        .execute(|args| async move {
            let city = args
                .get("city")
                .and_then(Value::as_str)
                .unwrap_or("unknown city");

            Ok(AgentToolResult::text(format!(
                "The weather in {city} is 68F and clear."
            )))
        })
        .build()?;

    let openai = openai::from_env()?;
    let model = openai.model("gpt-5.5").build()?;

    let agent = Agent::new(
        AgentOptions::builder(model)
            .tool(weather_tool)
            .build(),
    );

    agent
        .prompt_text("What is the weather in Seattle?", Vec::new())
        .await?;

    Ok(())
}
```

Implement `AgentTool` directly when a tool needs state, custom argument
preparation, an execution mode override, cancellation handling, or streaming
updates.

```rust
use ai::{AgentTool, AgentToolResult, AgentToolUpdateCallback, Tool};
use async_trait::async_trait;
use serde_json::Value;
use tokio_util::sync::CancellationToken;

struct WeatherTool;

#[async_trait]
impl AgentTool for WeatherTool {
    fn definition(&self) -> Tool {
        Tool::builder("get_weather")
            .description("Get current weather for a city.")
            .build()
            .expect("valid weather tool")
    }

    fn label(&self) -> &str {
        "Weather"
    }

    async fn execute(
        &self,
        _tool_call_id: &str,
        args: Value,
        _cancellation_token: Option<CancellationToken>,
        _on_update: Option<AgentToolUpdateCallback>,
    ) -> ai::AgentResult<AgentToolResult> {
        let city = args
            .get("city")
            .and_then(Value::as_str)
            .unwrap_or("unknown city");

        Ok(AgentToolResult::text(format!(
            "The weather in {city} is 68F and clear."
        )))
    }
}
```

#### Agent Tool Error Handling

Tool failures should return an error from `execute()`. The loop catches that
error and reports a tool-result message with `is_error = true`.
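
Conceptually, the loop folds a tool's `Result` into a message the model can read. The sketch below shows that mapping with a hypothetical `ToolResultMessage` struct and plain `String` errors; the crate's real message and error types are richer.

```rust
// Hypothetical, simplified tool-result message.
#[derive(Debug, PartialEq)]
struct ToolResultMessage {
    tool_call_id: String,
    text: String,
    is_error: bool,
}

// Success and failure both become a message; only the flag differs.
fn to_tool_result(tool_call_id: &str, outcome: Result<String, String>) -> ToolResultMessage {
    match outcome {
        Ok(text) => ToolResultMessage {
            tool_call_id: tool_call_id.to_string(),
            text,
            is_error: false,
        },
        // The error text is kept so the model can see what went wrong.
        Err(reason) => ToolResultMessage {
            tool_call_id: tool_call_id.to_string(),
            text: reason,
            is_error: true,
        },
    }
}
```

Returning an error instead of panicking or hiding the failure in normal text lets the model decide whether to retry, try another tool, or explain the problem.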

Return `terminate = true` from `execute()` or `after_tool_call` to hint that
the agent should stop after the current tool batch. This only takes effect when
every finalized tool result in the batch is terminating.

### Proxy Usage

For proxy backends, pass a custom `StreamFn` through `AgentOptions::stream_fn`
or directly to `agent_loop`. The function receives the selected `Model`, the
converted `Context`, and `SimpleStreamOptions`.

### Low-Level API

Use `agent_loop` or `agent_loop_continue` when you want an event stream, and
`run_agent_loop` or `run_agent_loop_continue` when you want to await the whole
loop directly.

```rust
use futures::StreamExt;

use ai::{
    agent_loop, agent_loop_continue, providers::openai, AgentContext, AgentLoopConfig, Message,
    Result,
};

#[tokio::main]
async fn main() -> Result<()> {
    let openai = openai::from_env()?;
    let model = openai.model("gpt-5.5").build()?;

    let context = AgentContext::builder()
        .system_prompt("You are helpful.")
        .build();
    let config = AgentLoopConfig::new(model);
    let user_message = Message::user_text("Hello");

    // Fresh run: prompt messages are passed separately from the existing context.
    let mut events = agent_loop(
        vec![user_message.clone()],
        context.clone(),
        config.clone(),
        None,
        None,
    );
    while let Some(event) = events.next().await {
        println!("{event:?}");
    }

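    // Continue from existing context: no new message is added, so the context
    // must already end with a user or tool-result message.
    let mut context = context;
    context.messages.push(user_message);
    let mut events = agent_loop_continue(context, config, None, None)?;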
    while let Some(event) = events.next().await {
        println!("{event:?}");
    }

    Ok(())
}
```

Low-level streams are observational: they preserve event order, but the loop
does not wait for your async event handling to settle before it moves on to
later phases. Use `Agent` when listeners must finish processing a message
before tool preflight begins.

## Development

```bash
cargo fmt --all --check
cargo check -p ai --all-targets
cargo clippy -p ai --all-targets -- -D warnings
cargo test -p ai
cargo test -p ai --lib
cargo test -p ai --doc
cargo test -p ai --tests
cargo test --workspace
```

This crate currently keeps Rust test coverage in module-level unit tests under
`src`; there is no `crates/ai/tests` integration-test directory at the moment.

### Adding a New Provider

Adding a new LLM provider generally requires changes across multiple files:

#### 1. Core Types (`src/types.rs`)

- Add the API identifier if the provider needs a new transport shape.
- Create provider-specific options where direct provider calls need typed
  options.
- Add or extend compatibility metadata only when the payload behavior differs.

#### 2. Provider Implementation (`src/providers/`)

Create a provider module that exports:

- `stream_<provider>()`
- `stream_simple_<provider>()`
- Provider-specific options
- Message conversion from `Context` to provider payload
- Tool conversion if the provider supports tools
- Response parsing into standardized assistant events

#### 3. Provider Factory

- Implement `Provider` for the configured provider handle.
- Return model builders from `model(id)` and future capability builders.
- Add root-level exports in `src/lib.rs` when the provider should be public.

#### 4. Runtime API

- Implement the capability runtime trait carried by the built model, such as
  `LanguageModelApi`.
- Map provider capability, cost, input, context-window, and reasoning metadata
  onto the shared `Model` type.

#### 5. Tests

Create or update tests for streaming, tool use, token usage, abort behavior,
context overflow, empty messages, Unicode handling, tool-result edge cases,
image input and tool-result images (if applicable), and cross-provider handoff.

#### 6. Agent Integration

If the provider needs agent-specific behavior, update this crate's agent tests
and examples directly.

#### 7. Documentation

Update this README with provider scope, authentication, provider-specific
options, and environment variables.

## License

MIT