Transformer

From Absolute Positional Encoding to RoPE: Why Position Can Be a Rotation

📅 May 28, 2026 · ☕ 10 min read · ✍️ k4i

A step-by-step explanation of positional encoding in Transformers, from absolute embeddings to sinusoidal encodings, Euler's formula, and rotary position embeddings.

Estimating Compute and Memory Requirements for LLM Training and Inference

📅 May 27, 2026 · ☕ 17 min read · ✍️ k4i

A back-of-the-envelope framework for estimating LLM training FLOPs, inference FLOPs, weight memory, KV cache, and training memory.

Why KV Cache Works in LLM Inference

📅 Apr 20, 2026 · ☕ 9 min read · ✍️ k4i

Why the key-value cache avoids redundant computation in autoregressive decoding, and the memory/compute tradeoffs it introduces.