LLM Attention Kernels and GPU Primitives

Jun 5, 2026 · 1 min read · k4i

A series index for LLM attention kernels and GPU primitives: fused softmax, online softmax, FlashAttention, PagedAttention kernels, Triton/CUDA, and memory-access optimization.