Papers, explained.
Talking-head videos and audiobook-style explainers of seminal and trending research papers — with notation walk-throughs, intuition, and runnable code where possible.
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
A linear-time alternative to Transformers based on input-selective state-space models; competitive with similarly sized Transformers at scales up to a few billion parameters.
Constitutional AI: Harmlessness from AI Feedback
Replaces human harmlessness labels with AI-generated critiques and revisions guided by a written constitution, followed by reinforcement learning from AI feedback (RLAIF).
Direct Preference Optimization (DPO)
Aligning language models with human preferences without an explicit reward model: DPO turns RLHF's KL-constrained objective into a simple classification loss on preference pairs.
Denoising Diffusion Probabilistic Models
Foundational work on diffusion models for image generation: gradually add noise to data, then learn to reverse the process step by step.
LoRA: Low-Rank Adaptation of Large Language Models
Parameter-efficient fine-tuning via low-rank decomposition of weight updates, reducing trainable parameters by up to 10,000× (reported for GPT-3 175B) with no added inference latency.
Attention Is All You Need
The paper that introduced the Transformer architecture — covering scaled dot-product attention, multi-head attention, positional encodings, and the original encoder-decoder design.
Want full library access?
Premium subscribers get all video transcripts, audiobooks, and code repositories.
View plans