AI & ML interests

Frontier alignment research to ensure the safe development and deployment of advanced AI systems.

Recent Activity

chrisjcundy  updated a dataset about 13 hours ago
AlignmentResearch/deceptive-followup-v19
chrisjcundy  published a dataset about 13 hours ago
AlignmentResearch/deceptive-followup-v19
chrisjcundy  updated a dataset about 14 hours ago
AlignmentResearch/deceptive-followup-v17
View all activity

AlignmentResearch 's collections 4

The Obfuscation Atlas
Obfuscated Policy, Obfuscated Activations, Blatant Deception, and Honest models trained in the Obfuscation Atlas paper.
The Obfuscation Altas
Obfuscated Policy, Obfuscated Activations, Blatant Deception, and Honest models trained in the Obfuscation Atlas paper