Cluster Reliability Jobs

Browse 225 Cluster Reliability jobs on Inference Jobs.

41-60 of 225 jobs

Posted 1w ago

Reliability Engineer (All Levels)

Figure

San Jose, California, United States (On-site)

$120k – $250k Yearly

Posted 4w ago

Site Reliability Engineering Intern, Summer 2026

Crusoe

San Francisco, California, United States (On-site)

Posted 3w ago

Senior Site Reliability Engineer, Managed AI

Crusoe

San Francisco, California, United States (On-site)

$172k – $209k Yearly

Posted 1w ago

Research Engineer, Infrastructure, RL Systems

Thinking Machines Lab

San Francisco, California, United States (On-site)

$350k – $475k Yearly

Posted 2w ago

Senior Silicon Reliability Engineer

NVIDIA

Santa Clara, California, United States (On-site)

$168k – $264.5k Yearly

Posted 3w ago

Senior Site Reliability Engineer

Heidi

Sydney, New South Wales, Australia (Hybrid)

Posted 3d ago

Senior Data Scientist – EDA Datacenter Observability and Reliability

NVIDIA

Santa Clara, California, United States (Hybrid)

$184k – $356.5k Yearly

Posted 4w ago

Staff Site Reliability Engineer

Figure

San Jose, California, United States (On-site)

$175k – $250k Yearly

Posted 2w ago

Director, Global Network Reliability Engineering

NVIDIA

Santa Clara, California, United States (On-site)

$268k – $408.3k Yearly

Posted 3d ago

Manager, Software Verification

NVIDIA

California, United States (Hybrid)

$224k – $431.3k Yearly

Posted 1w ago

Site Reliability Engineer - xAI Technical Operations

xAI

Palo Alto, California, United States (On-site)

$180k – $400k Yearly

Posted 2w ago

Software Engineer, Infrastructure Reliability

OpenAI

San Francisco, California, United States (On-site)

$255k – $385k Yearly

Posted 2w ago

Reliability/DFX Engineer

OpenAI

San Francisco, California, United States (On-site)

$285k – $460k Yearly

Posted 3d ago

Senior Site Reliability Engineer - HPC

NVIDIA

Santa Clara, California, United States (On-site)

$152k – $287.5k Yearly

Posted 3d ago

Senior Software Engineer - Deep Learning Compiler Verification and Infrastructure

NVIDIA

Santa Clara, California, United States (On-site)

$140k – $224.3k Yearly

Posted 5d ago

Senior Reliability Engineer - LPU Packaging

NVIDIA

Santa Clara, California, United States (On-site)

$168k – $310.5k Yearly