Failure handling Jobs

Browse 60 Failure handling jobs on Inference Jobs.

41-60 of 60 jobs
3d ago

Senior Hardware Customer Quality Engineer

NVIDIA

Santa Clara, California, United States (On-site) · $168k – $322k Yearly
3w ago

Incident Manager

Crusoe

San Francisco, California, United States (On-site) · $136.1k – $165k Yearly
2w ago

Operations Engineering Manager, Fleet Reliability

CoreWeave

Dublin, Dublin, Ireland (Hybrid) · €97k – €130k Yearly
4d ago

L3 Function Head

Nebius

Amsterdam, North Holland, Netherlands (On-site)
3w ago

Data Center Incident Program Manager

OpenAI

United States or Remote (United States) · $125.6k – $228k Yearly
1w ago

Infrastructure Operations Program Manager

CoreWeave

London, England, United Kingdom (Hybrid) · £60k – £80k Yearly
3w ago

Senior Software Engineer, AI Resiliency

NVIDIA

Redmond, Washington, United States (On-site) · $184k – $287.5k Yearly
4d ago

Senior Hardware Support Engineer

Nebius

United States (Remote) · $125k – $180k Yearly
1w ago

Manager, Bare Metal Support Engineering

CoreWeave

London, England, United Kingdom (Hybrid) · £103k – £137k Yearly
4w ago

Technical Program Manager, Safeguards – Infrastructure & Evals

Anthropic

San Francisco, California, United States (Hybrid) · $290k – $365k Yearly
4w ago

Technical Program Manager, Reliability Engineering

Anthropic

San Francisco, California, United States (Hybrid) · $290k – $365k Yearly
4d ago

Senior Hardware Support Engineer

Nebius

United States (Remote) · $125k – $180k Yearly
4w ago

Technical User Operations Specialist - Weekend Coverage

Harvey

United States or Remote (United States) · $94k – $126k Yearly
2w ago

Senior Hardware Engineer, GPU & PCIe

CoreWeave

Livingston, New Jersey, United States (Hybrid) · $150k – $250k Yearly
4d ago

Senior Hardware Support Engineer

Nebius

United States (Remote) · $125k – $180k Yearly
2d ago

L3 Support Engineer

Nebius

Béthune, Pas-de-Calais, France (On-site)