Model Interpretability Jobs
Explore Model Interpretability roles on Inference Jobs and apply today.
2w ago
[Expression of Interest] Research Manager, Interpretability
Anthropic
San Francisco, California, United States (Hybrid)
$350K – $500K Yearly
2w ago
Research Engineer, Interpretability
Anthropic
San Francisco, California, United States (Hybrid)
$315K – $560K Yearly
2w ago
Senior Research Scientist, Reward Models
Anthropic
San Francisco, California, United States (Hybrid)
$350K – $500K Yearly
3mo ago
Member of Technical Staff - Safety Lead
Reflection AI
San Francisco, California, United States (On-site)
2w ago
Research Scientist, Interpretability
Anthropic
San Francisco, California, United States (Hybrid)
$350K – $850K Yearly
3mo ago
Researcher, Interpretability
OpenAI
San Francisco, California, United States (On-site)
$310K – $460K Yearly
4w ago
Communications Manager, Research
Anthropic
New York, New York, United States (Hybrid)
$185K – $255K Yearly
3mo ago
Research Scientist, Societal Impacts
Anthropic
San Francisco, California, United States (Hybrid)
$350K – $850K Yearly
2w ago
Machine Learning Engineer, Safeguards
Anthropic
San Francisco, California, United States (Hybrid)
$350K – $500K Yearly
2mo ago
Research Engineer / Research Scientist, Tokens
Anthropic
New York, New York, United States (Hybrid)
$350K – $500K Yearly
2w ago
Research Engineer / Scientist, Alignment Science, London
Anthropic
London, England, United Kingdom (Hybrid)
£260K – £370K Yearly
4w ago
Research Scientist, AI Controls and Monitoring
Scale
San Francisco, California, United States (On-site)
$197.4K – $246.8K Yearly
2mo ago
Model Quality Software Engineer, Claude Code
Anthropic
San Francisco, California, United States (Hybrid)
$320K – $485K Yearly