Cluster Reliability Jobs
Browse 219 Cluster Reliability jobs on Inference Jobs.
1w
Security Production Engineer
CoreWeave
Livingston, New Jersey, United States (Hybrid)
$139k – $275k Yearly
2w
Staff Production Engineer, Security
CoreWeave
Livingston, New Jersey, United States (Hybrid)
$188k – $275k Yearly
4d
Senior SRE Engineer - Cloud Operations (Remote, Americas Only)
Qdrant
Berlin, Berlin, Germany or Remote (Americas)
3w
Backend Software Engineer - B2B Connectors
OpenAI
San Francisco, California, United States (On-site)
$230k – $385k Yearly
4w
Technical Program Manager, Reliability Engineering
Anthropic
San Francisco, California, United States (Hybrid)
$290k – $365k Yearly
2w
Production Engineer, Storage
Crusoe
San Francisco, California, United States (On-site)
$166k – $201k Yearly
2w
AI Infra Engineer (San Francisco)
Perplexity
San Francisco, California, United States (On-site)
$210k – $385k Yearly
1w
Principal Datacenter Resiliency Architect, RAS Features and Modeling
NVIDIA
Santa Clara, California, United States (On-site)
$272k – $431.3k Yearly
3w
Manager, Field Operations - Spark
Crusoe
Denver, Colorado, United States (On-site)
$140.3k – $170k Yearly
1w
AI Training Infrastructure Engineer - Helix Team
Figure
San Jose, California, United States (On-site)
$150k – $350k Yearly
2w
Senior Software Engineer, Kubernetes
CoreWeave
Livingston, New Jersey, United States (Hybrid)
$120k – $176k Yearly
2w
Senior Software Engineer, AI Resiliency
NVIDIA
Redmond, Washington, United States (On-site)
$184k – $287.5k Yearly