Director, AI Alignment and Interpretability (Remote)
Indexed description
CrowdStrike
Posted an hour ago
Remote or Hybrid, USA
195K-290K Annually
Senior level
Cloud • Computer Vision • Information Technology • Sales • Security • Cybersecurity

Lead and conduct mechanistic interpretability and alignment research for security-specialized AI. Develop methods to read model internals, detect misuse signals, design training interventions and evaluation frameworks, publish original research, and recruit and mentor a lean research team.

Top Skills: Activation Patching, Adversarial Evaluation, Alignment Evaluations, Causal Tracing, Circuit Analysis, Feature Visualization, Large Language Models, Mechanistic Interpretability, Probing Classifiers, Red Teaming