PREMIUM SECTION

Papers, explained.

Talking-head videos and audiobook-style explainers of seminal and trending research papers — with notation walk-throughs, intuition, and runnable code where possible.

audio · 26 min

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Gu, Dao · 2023

A linear-time alternative to Transformers built on input-selective state-space models; competitive with similarly sized Transformers at the scales tested (up to about 3B parameters).

FREE PREVIEW

video · 55 min

Constitutional AI: Harmlessness from AI Feedback

Bai, Kadavath, Kundu et al. · 2022

Replaces human harmlessness labels with AI-generated critiques and revisions guided by a written constitution, then trains further with reinforcement learning from AI feedback (RLAIF).

PREMIUM

video · 37 min

Direct Preference Optimization (DPO)

Rafailov, Sharma, Mitchell et al. · 2023

Aligns language models with human preferences without training an explicit reward model: DPO derives a simple classification-style loss from the closed-form solution to RLHF's KL-constrained reward-maximization problem.

PREMIUM

video · 38 min

Denoising Diffusion Probabilistic Models

Ho, Jain, Abbeel · 2020

Foundational work on diffusion models for image generation: a forward process gradually adds Gaussian noise to the data, and a network learns to reverse it step by step.

PREMIUM

audio · 21 min

LoRA: Low-Rank Adaptation of Large Language Models

Hu, Shen, Wallis et al. · 2021

Parameter-efficient fine-tuning via low-rank decomposition of weight updates, reducing trainable parameters by up to 10,000× (reported for GPT-3 175B) with no added inference latency.

PREMIUM

video · 42 min

Attention Is All You Need

Vaswani, Shazeer, Parmar et al. · 2017

The paper that introduced the Transformer architecture — covering scaled dot-product attention, multi-head attention, positional encodings, and the original encoder-decoder design.

PREMIUM

Want full library access?

Premium subscribers get all video transcripts, audiobooks, and code repositories.