Failure Detection Jobs
Explore Failure Detection roles on Inference Jobs and apply today.
2mo ago
Technical Program Manager, Quality and Reliability
Harvey
San Francisco, California, United States (On-site)
$200K – $275K Yearly
4w ago
Senior Site Reliability Engineer — Token Factory (Inference Platform)
Nebius
United States + 4 more (Remote)
3w ago
Trust and Safety - Senior Data Scientist
Scale
San Francisco, California, United States (On-site)
$198K – $247.5K Yearly
2mo ago
Data Center Incident Program Manager
OpenAI
United States or Remote (United States)
$125.6K – $228K Yearly
2mo ago
Senior System Software Engineer, Data Center Diagnostics
NVIDIA
Santa Clara, California, United States (On-site)
$152K – $287.5K Yearly
2w ago
Principal Embedded SW/FW Engineer (Bringup) - Bengaluru, multiple vacancies
Graphcore
Bengaluru, Karnataka, India (On-site)
3w ago
Mechanical Team Manager, ETH and IB Switches Domain
NVIDIA
Yokne'am, Northern District, Israel (On-site)
2mo ago
Thermal-Mechanical Manufacturing Engineer
OpenAI
San Francisco, California, United States (Hybrid)
$123K – $285K Yearly
3mo ago
Protection Scientist Engineer, Intelligence and Investigations
OpenAI
London, England, United Kingdom (On-site)
2w ago