Production Reliability Jobs
Explore Production Reliability roles on Inference Jobs and apply today.
2mo ago
Data Center Incident Program Manager
OpenAI
United States or Remote (United States)
$125.6K – $228K Yearly
1w ago
Research Engineer, Production Model Post Training
Anthropic
San Francisco, California, United States (Hybrid)
$350K – $500K Yearly
2mo ago
Technical Program Manager, Infrastructure
Anthropic
San Francisco, California, United States (Hybrid)
$290K – $365K Yearly
3mo ago
Software Engineer, Fleet Hardware Health
OpenAI
San Francisco, California, United States (On-site)
$255K – $490K Yearly
4w ago
Facilities Technical Manager - Afton
CoreWeave
Afton, Texas, United States (On-site)
$122K – $163K Yearly
2w ago
Software Engineer, Full Stack
Thinking Machines Lab
San Francisco, California, United States (On-site)
$350K – $475K Yearly
6d ago
Senior Staff Engineer, Cloud Site Operations
Crusoe
San Francisco, California, United States (On-site)
$179K – $218K Yearly
4w ago
Facilities Technical Manager - Richmond
CoreWeave
Richmond, Virginia, United States (Hybrid)
$122K – $163K Yearly
2mo ago
Infrastructure Engineering Lead, IT
OpenAI
San Francisco, California, United States (On-site)
$225K – $275K Yearly
2mo ago
Manager, Field Operations - Spark
Crusoe
Denver, Colorado, United States (On-site)
$140.3K – $170K Yearly
4w ago
Technical Product Manager (Cluster Experience)
Nebius
Centrum, Amsterdam, North Holland, NL or Remote (Netherlands + 3 more)
3w ago
Commissioning and Quality Program Manager - Stargate
OpenAI
San Francisco, California, United States (On-site)
$164K – $268K Yearly