Fault-tolerant Systems Jobs

Browse 11 Fault-tolerant Systems jobs on Inference Jobs.

2w

Software Engineer, Platform Systems

OpenAI

San Francisco, California, United States (On-site)
$310k – $460k Yearly

2w

Senior Software Engineer, AI Resiliency

NVIDIA

Redmond, Washington, United States (On-site)
$184k – $287.5k Yearly

2w

Senior Software Engineer, Observability

Together AI

San Francisco, California, United States (Hybrid)
$160k – $260k Yearly

1w

Reliability/DFX Engineer

OpenAI

San Francisco, California, United States (On-site)
$285k – $460k Yearly

2w

Distributed Systems Engineer

Magic

San Francisco, California, United States (On-site)
$225k – $550k Yearly

5d

Engineering Manager, Kernel Reliability

Cerebras

Sunnyvale, California, United States (On-site)

10h

Senior System Architect, Infrastructure Reliability

NVIDIA

Santa Clara, California, United States (Hybrid)
$184k – $356.5k Yearly

4d

Principal Datacenter Resiliency Architect, RAS Features and Modeling

NVIDIA

Santa Clara, California, United States (On-site)
$272k – $431.3k Yearly