GPU Inference Jobs

Browse 617 GPU Inference jobs on Inference Jobs.

1w

Software Engineer, Inference – AMD GPU Enablement

OpenAI

San Francisco, California, United States (On-site) | $325k – $490k Yearly

2w

LLM Inference Engineer

Periodic Labs

Menlo Park, California, United States (On-site)

6d

GPU Systems Engineer – HPC / Parallel Computing

Vast.ai

San Francisco, California, United States (On-site) | $160k – $320k Yearly

6d

Inference Runtime, Engineering Manager

OpenAI

San Francisco, California, United States (On-site) | $455k – $555k Yearly

3w

Inference Frontend

Cerebras

Sunnyvale, California, United States (On-site)

1w

LLM Inference Engineer

Hippocratic AI

Palo Alto, California, United States (On-site)

2w

Principal Software Engineer - Inference as a Service

NVIDIA

Santa Clara, California, United States (On-site) | $248k – $391k Yearly

2w

Senior Software Engineer - Inference as a Service

NVIDIA

Santa Clara, California, United States (On-site) | $200k – $391k Yearly

2w

Inference Engineering Manager

Perplexity

San Francisco, California, United States (On-site) | $300k – $385k Yearly

2w

Inference Technical Lead, Sora

OpenAI

San Francisco, California, United States (Hybrid) | $380k Yearly

2w

AI Inference Engineer (San Francisco)

Perplexity

San Francisco, California, United States (On-site) | $210k – $385k Yearly

6d

LLM Inference Frameworks and Optimization Engineer

Together AI

San Francisco, California, United States (On-site) | $160k – $230k Yearly

3d

Senior AI Inference Compiler Engineer

NVIDIA

Santa Clara, California, United States (On-site) | $152k – $241.5k Yearly

2w

AI Inference Engineer (London)

Perplexity

London, England, United Kingdom (On-site)

2w

Engineering Manager - Forward Deployed Engineering (LLM)

Baseten

San Francisco, California, United States (On-site) | $220k – $285k Yearly

6d

GPU Cluster Architect

Nebius

United States (Remote) | $150k – $180k Yearly

3w

Member of Technical Staff, Inference

xAI

Palo Alto, California, United States (On-site) | $180k – $440k Yearly

2w

Member of Technical Staff - GPU Infrastructure

Reflection AI

San Francisco, California, United States (On-site)