Failure Detection Jobs
Browse 89 Failure Detection jobs on Inference Jobs.
61-80 of 89 jobs
2w ago
Senior Networking Solution Test Engineer, AI Cluster Debugging
NVIDIA
Yokne'am, Northern District, Israel (Hybrid)
2w ago
Senior Software Engineer, AI Resiliency
NVIDIA
Redmond, Washington, United States (On-site)
$184k – $287.5k Yearly
3w ago
Senior Reliability Engineer
NVIDIA
Santa Clara, California, United States (Hybrid)
$168k – $264.5k Yearly
2w ago
Software Engineer, Frontier Systems
OpenAI
San Francisco, California, United States (On-site)
$295k – $440k Yearly
6d ago
Senior Site Reliability Engineer — Token Factory (Inference Platform)
Nebius
Netherlands + 4 more (Remote)
3w ago
Technical Program Manager, Safeguards – Infrastructure & Evals
Anthropic
San Francisco, California, United States (Hybrid)
$290k – $365k Yearly