//! # trueno-gpu: Pure Rust PTX Generation for NVIDIA CUDA
//!
//! Generate PTX assembly directly from Rust - no LLVM, no nvcc, no external dependencies.
//!
//! ## Philosophy
//!
//! **Own the Stack** - Build everything from first principles for complete control,
//! auditability, and reproducibility.
//!
//! ## Quick Start
//!
//! ```rust
//! use trueno_gpu::ptx::{PtxModule, PtxKernel, PtxType};
//!
//! // Configure a PTX module header (PTX ISA version, target architecture, address size)
//! let module = PtxModule::new()
//!     .version(8, 0)
//!     .target("sm_70")
//!     .address_size(64);
//!
//! let ptx_source = module.emit();
//! assert!(ptx_source.contains(".version 8.0"));
//! ```
//!
//! ## Modules
//!
//! - [`ptx`] - PTX code generation (builder pattern)
//! - [`driver`] - CUDA driver API (minimal FFI, optional)
//! - [`kernels`] - Hand-optimized GPU kernels
//! - [`memory`] - GPU memory management
//! - [`backend`] - Multi-backend abstraction
// ============================================================================
// Development-phase lint allows - to be addressed incrementally
// ============================================================================
// Allow dead code during development - will be used as API expands
// Allow precision loss in non-critical floating point calculations
// Allow possible truncation - we handle 64-bit correctly
// Allow format push string - not a critical performance path
// Allow doc markdown for code references - these are placeholders
// Allow missing errors doc during initial development
// Allow unnecessary literal bound for backend trait
// Allow manual div_ceil - will use std when stabilized
// Allow missing panics doc during initial development
// Allow cast_lossless - we intentionally use as for u32->u64
// Allow uninlined format args - stylistic preference
// Allow map_unwrap_or - more readable with map().unwrap_or()
// Allow redundant closure for method calls - clearer intent
// Allow unused self - methods will use self as API expands
// Allow expect_used in tests and non-critical paths
// Allow too_many_lines during development - will be refactored
// Allow needless_range_loop - clearer intent in some algorithms
// Allow float_cmp in tests where exact comparison is intended
// Allow unused comparisons - some are defensive checks
// Allow unwrap_used in tests
// Allow cast_sign_loss - we know values are positive
// Allow field_reassign_with_default - clearer test setup
// Allow panic in tests
// Allow manual_range_contains - clearer in assertions
// Allow default_constructed_unit_structs
// Allow clone_on_copy - clearer intent
// Allow absurd_extreme_comparisons - defensive checks
// Allow no_effect_underscore_binding - intentional in tests
// Allow must_use_candidate - methods may return values not always needed
// Allow manual_find - clearer intent in some cases
// Allow type_complexity - complex return types for tuples
// Allow range_plus_one - clearer in some contexts
// Allow map_clone - clearer intent
// Allow manual_is_multiple_of - not yet stabilized
// Allow items_after_statements - const definitions in kernels
// Allow doc_lazy_continuation - doc formatting
// Allow useless_vec in tests - clearer intent
// Allow similar names - k_h vs kt_h are semantically distinct (key vs key-transposed)
// Allow many single char names - standard matrix notation (a, b, m, n, k)
// Allow doc nested refdefs - acceptable in list items
// Allow cloned instead of copied - semantic clarity
// Allow too many arguments - GPU APIs require many parameters
// Allow explicit lifetimes - clearer for complex lifetime relationships
// Allow manual slice size calculation - clearer intent
/// Error types for trueno-gpu operations
/// E2E visual testing framework for GPU kernels
/// WASM visual testing bindings (requires viz feature)
pub use ;
pub use ;
// NOTE: ComputeBrick is available from the trueno crate, not trueno-gpu
// This is because trueno optionally depends on trueno-gpu (not vice versa)
// Usage: `use trueno::brick::{ComputeBrick, ComputeBackend, TokenBudget};`
// See: trueno/src/brick.rs for the full brick architecture