Cluster Reliability Jobs

Browse 219 Cluster Reliability jobs on Inference Jobs.

81-100 of 219 jobs

1w

Security Production Engineer

CoreWeave

Livingston, New Jersey, United States (Hybrid) $139k – $275k Yearly
2w

Staff Production Engineer, Security

CoreWeave

Livingston, New Jersey, United States (Hybrid) $188k – $275k Yearly
3w

Backend Software Engineer - B2B Connectors

OpenAI

San Francisco, California, United States (On-site) $230k – $385k Yearly
4w

Technical Program Manager, Reliability Engineering

Anthropic

San Francisco, California, United States (Hybrid) $290k – $365k Yearly
2w

Datacenter Hardware Engineer, HPC

Mistral AI

Île de Ré, Charente-Maritime, France (On-site)
1w

Infrastructure Engineer

Descript

California, United States (Remote) $191k – $250k Yearly
2w

Production Engineer, Storage

Crusoe

San Francisco, California, United States (On-site) $166k – $201k Yearly
2w

AI Infra Engineer (San Francisco)

Perplexity

San Francisco, California, United States (On-site) $210k – $385k Yearly
1w

Senior Production Engineer

CoreWeave

Livingston, New Jersey, United States (Hybrid)
3w

Member of Technical Staff

xAI

Palo Alto, California, United States (On-site)
3w

DC Role Template

Nebius

Béthune, Pas-de-Calais, France (On-site)
1w

Principal Datacenter Resiliency Architect, RAS Features and Modeling

NVIDIA

Santa Clara, California, United States (On-site) $272k – $431.3k Yearly
3w

Manager, Field Operations - Spark

Crusoe

Denver, Colorado, United States (On-site) $140.3k – $170k Yearly
1w

AI Training Infrastructure Engineer - Helix Team

Figure

San Jose, California, United States (On-site) $150k – $350k Yearly
2w

Senior Software Engineer, Kubernetes

CoreWeave

Livingston, New Jersey, United States (Hybrid) $120k – $176k Yearly
2w

Senior Software Engineer, AI Resiliency

NVIDIA

Redmond, Washington, United States (On-site) $184k – $287.5k Yearly