
Constitutional AI: Harmlessness from AI Feedback

Bai, Kadavath, Kundu, Askell, Kernion, Jones, et al. · 2022 · DOI 10.48550/arXiv.2212.08073
SUMMARY

Replaces human harmlessness labels with AI-generated critiques and revisions guided by a written constitution, then trains the model further with reinforcement learning from AI feedback (RLAIF).
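The critique-and-revision step can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: `generate` is a hypothetical stand-in for any language model call (prompt in, text out), the prompt wording is invented for the example, and the loop applies every principle in order, whereas the paper samples principles at random. The RLAIF stage is not covered.

```python
from typing import Callable, List

# Hypothetical model interface: takes a prompt string, returns generated text.
Generate = Callable[[str], str]


def critique_and_revise(prompt: str, principles: List[str], generate: Generate) -> str:
    """Draft a response, then critique and revise it once per principle."""
    response = generate(prompt)
    for principle in principles:
        # Ask the model to critique its own response against one principle.
        critique = generate(
            f"Prompt: {prompt}\nResponse: {response}\n"
            f"Critique the response according to this principle: {principle}"
        )
        # Ask the model to rewrite the response in light of that critique.
        response = generate(
            f"Prompt: {prompt}\nResponse: {response}\nCritique: {critique}\n"
            "Rewrite the response to address the critique."
        )
    return response
```

In this sketch the final revised response would serve as a supervised fine-tuning target, so no human harmlessness label is needed for that example.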
