Model Interpretability Jobs
Explore Model Interpretability roles on Inference Jobs and apply today.
3mo ago
Technical Program Manager – Adversarial Model Research
OpenAI
San Francisco, California, US
$230K – $285K Yearly
1mo ago
Research Lead, Training Insights
Anthropic
San Francisco, California, United States (Hybrid)
$850K – $850K Yearly
2mo ago
Model Policy Manager, Chemical & Biological Risk
OpenAI
San Francisco, California, US
$207K – $295K Yearly
3mo ago
TLM, Machine Learning, Integrity
OpenAI
San Francisco, California, United States (On-site)
$405K – $490K Yearly
1mo ago
Principal Architect, Performance Analysis and Modeling
d-Matrix
Santa Clara, California, United States (Hybrid)
$190K – $280K Yearly
3mo ago
Senior Research Engineer, Model Evaluation
Cohere
Toronto, Ontario, Canada or Remote (Canada + 2 more)
3mo ago
Research Product Manager, Model Behaviors
Anthropic
San Francisco, California, United States (Hybrid)
$305K – $385K Yearly
2w ago
[Expression of Interest] Research Scientist/Engineer, Honesty
Anthropic
New York, New York, United States (Hybrid)
$350K – $500K Yearly
2mo ago
Machine Learning Engineer - Model Evaluations, Public Sector
Scale
San Francisco, California, United States (On-site)
$216.3K – $300.3K Yearly
2mo ago
Research Engineer, AI Observability
Anthropic
San Francisco, California, United States (Hybrid)
$320K – $405K Yearly
3mo ago
Research Engineer, Frontier Evals & Environments
OpenAI
San Francisco, California, United States (On-site)
$200K – $370K Yearly
3mo ago
Member of Technical Staff - Post-Training
Reflection AI
San Francisco, California, United States (On-site)
5d ago
Data Scientist, Finance Forecasting
Anthropic
San Francisco, California, United States (Hybrid)
$270K – $320K Yearly
2mo ago
Research Engineering Manager - Model Training
Perplexity
San Francisco, California, United States (On-site)
$300K – $470K Yearly
4w ago
Machine Learning Engineer, Integrity
OpenAI
San Francisco, California, United States (On-site)
$266K – $555K Yearly