Linear attention using a random feature approximation (Performer-style).
Complexity: O(n * k * d), where n = sequence length, d = feature dimension, and k = number of random features.
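To make the complexity claim concrete, here is a minimal sketch of the Performer-style computation using positive random features. It is not this module's implementation: the names `Lcg`, `random_features`, `feature_map`, and `linear_attention_sketch` are hypothetical, the feature map is an assumed choice (the softmax-kernel estimator exp(w . x - |x|^2 / 2) / sqrt(k)), and the hand-rolled generator exists only to avoid external crates. The keys and values are summarized once in a k-by-d matrix, so the n-by-n attention matrix is never formed.

```rust
/// Tiny deterministic generator so the sketch needs no external crates (hypothetical helper).
struct Lcg(u64);

impl Lcg {
    /// Uniform sample in [0, 1) from the top 53 bits of a 64-bit LCG state.
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }

    /// Standard normal sample via the Box-Muller transform.
    fn next_normal(&mut self) -> f64 {
        let u1 = self.next_f64().max(1e-12); // avoid ln(0)
        let u2 = self.next_f64();
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Draws k random projection vectors of dimension d with Gaussian entries.
fn random_features(k: usize, d: usize, seed: u64) -> Vec<Vec<f64>> {
    let mut rng = Lcg(seed);
    (0..k)
        .map(|_| (0..d).map(|_| rng.next_normal()).collect())
        .collect()
}

/// Positive random features: phi(x)[r] = exp(w_r . x - |x|^2 / 2) / sqrt(k).
fn feature_map(x: &[f64], w: &[Vec<f64>]) -> Vec<f64> {
    let half_sq_norm = x.iter().map(|a| a * a).sum::<f64>() / 2.0;
    let scale = 1.0 / (w.len() as f64).sqrt();
    w.iter()
        .map(|w_r| {
            let proj: f64 = w_r.iter().zip(x).map(|(a, b)| a * b).sum();
            (proj - half_sq_norm).exp() * scale
        })
        .collect()
}

/// Approximate attention in O(n * k * d): one pass over keys/values, one over queries.
fn linear_attention_sketch(
    q: &[Vec<f64>],
    k: &[Vec<f64>],
    v: &[Vec<f64>],
    w: &[Vec<f64>],
) -> Vec<Vec<f64>> {
    let num_features = w.len();
    let d_v = v[0].len();

    // kv[r][c] = sum_j phi(k_j)[r] * v_j[c];  z[r] = sum_j phi(k_j)[r]
    let mut kv = vec![vec![0.0; d_v]; num_features];
    let mut z = vec![0.0; num_features];
    for (k_j, v_j) in k.iter().zip(v) {
        let phi_k = feature_map(k_j, w);
        for r in 0..num_features {
            z[r] += phi_k[r];
            for c in 0..d_v {
                kv[r][c] += phi_k[r] * v_j[c];
            }
        }
    }

    // out_i = (phi(q_i)^T kv) / (phi(q_i)^T z)
    q.iter()
        .map(|q_i| {
            let phi_q = feature_map(q_i, w);
            let denom: f64 = phi_q.iter().zip(&z).map(|(a, b)| a * b).sum();
            (0..d_v)
                .map(|c| {
                    let num: f64 = (0..num_features).map(|r| phi_q[r] * kv[r][c]).sum();
                    num / denom
                })
                .collect()
        })
        .collect()
}
```

A call such as `linear_attention_sketch(&q, &k, &v, &random_features(16, 2, 42))` returns one output row per query. Because every feature is positive, each output row is a weighted average of the value rows, which keeps the normalization well defined.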
Structs§
- LinearAttention - Linear attention with random feature maps
Enums§
- KernelType - Kernel type for linear attention