RL/Evaluation Jobs
Browse 30 RL/Evaluation jobs on Inference Jobs.
30 jobs
3w ago
AN
Research Engineer, Machine Learning (RL Velocity)
Anthropic
San Francisco, California, United States (Hybrid) $500K – $850K Yearly
3w ago
AN
Research Engineer, RL Infrastructure and Reliability (Knowledge Work)
Anthropic
San Francisco, California, United States (Hybrid) $350K – $850K Yearly
7d ago
AA
Senior AI Researcher - Reinforcement learning (f/m/d)
Aleph Alpha
Heidelberg, Baden-Württemberg, Germany (Hybrid)
3w ago
CO
Member of Technical Staff, Integration/RL Team (Research Engineer)
Cohere
Paris, FR or Remote (Eastern Time Zone, United States + 27 more)
3w ago
XA
Member of Technical Staff - Post-Training and RL
xAI
Palo Alto, California, United States (On-site) $180K – $600K Yearly
2d ago
AN
Research Engineer, Performance RL
Anthropic
San Francisco, California, United States (Hybrid) $350K – $850K Yearly
6d ago
AN
Research Engineer, Code RL (Reinforcement Learning)
Anthropic
San Francisco, California, United States (Hybrid) $500K – $850K Yearly
2d ago
XA
Member of Technical Staff - RL Infrastructure [data, evals, agent]
xAI
Palo Alto, California, United States (On-site) $180K – $440K Yearly
3w ago
OP
Research Engineer/Research Scientist, RL/Reasoning
OpenAI
San Francisco, California, United States (Hybrid) $295K – $445K Yearly
2d ago
LA
Forward Deployed Engineer, RL Environments
Labelbox
San Francisco, California, United States (Hybrid) $140K – $200K Yearly
2d ago
TM
Research Engineer, Infrastructure, RL Systems
Thinking Machines Lab
San Francisco, California, United States (On-site) $350K – $475K Yearly
4w ago
SC
Research Scientist, Safety Post Training
Scale
San Francisco, California, United States (On-site) $216K – $270K Yearly
6d ago
OP
Researcher, Agent Post-Training, Personality
OpenAI
San Francisco, California, United States (On-site) $295K – $445K Yearly
3w ago
OP
Researcher, Artifacts - Agent Post-Training
OpenAI
California, United States (Remote) $250K – $380K Yearly
3w ago
AN
Research Engineer, Machine Learning (RL Velocity)
Anthropic
London, England, United Kingdom (Hybrid) £370K – £630K Yearly