Inference Engineer jobs in San Francisco, California, United States

Discover Inference Engineer roles in San Francisco, California, United States on Inference Jobs and apply today.

20 jobs

Posted 2w ago

AI Inference Engineer (San Francisco)

Perplexity

San Francisco, California, United States (On-site)

$210k – $385k Yearly

Posted 2w ago

Inference Engineering Manager

Perplexity

San Francisco, California, United States (On-site)

$300k – $385k Yearly

Posted 1w ago

Machine Learning Engineer - Inference

Together AI

San Francisco, California, United States (On-site)

$160k – $230k Yearly

Posted 1w ago

LLM Inference Frameworks and Optimization Engineer

Together AI

San Francisco, California, United States (On-site)

$160k – $230k Yearly

Posted 1w ago

Inference Runtime, Engineering Manager

OpenAI

San Francisco, California, United States (On-site)

$455k – $555k Yearly

Posted 1w ago

Senior Site Reliability Engineer — Token Factory (Inference Platform)

Nebius

Netherlands + 4 more (Remote)

Posted 2w ago

Member of Engineering (Inference)

Poolside

United Kingdom or Remote (Europe + 1 more)

Posted 2w ago

Inference Technical Lead, Sora

OpenAI

San Francisco, California, United States (Hybrid)

$380k Yearly

Posted 3w ago

Principal Engineer, AI Inference Reliability

Cerebras

United States + 1 more (Remote)

Posted 2w ago

Software Engineer, Inference – AMD GPU Enablement

OpenAI

San Francisco, California, United States (On-site)

$325k – $490k Yearly

Posted 2w ago

Member of Engineering (Pre-training and inference software)

Poolside

United Kingdom or Remote (Europe, Middle East, and Africa, North America)

Posted 2w ago

Software Engineer, Load Balancing - Inference

OpenAI

San Francisco, California, United States (On-site)

$325k – $490k Yearly

Posted 2w ago

Full-Stack Software Engineer, Inference

Cohere

Toronto, Ontario, Canada or Remote (Canada + 2 more)

Posted 3w ago

Software Engineer, Inference Deployment

Anthropic

San Francisco, California, United States (Hybrid)

$320k – $485k Yearly

Posted 6d ago

Engineering Manager, Inference

Anthropic

San Francisco, California, United States (Hybrid)

$425k – $560k Yearly

Posted 2w ago

Fullstack Engineer, Applied AI

LangChain

San Francisco, California, United States (On-site)

$170k – $195k Yearly

Posted 2w ago

Engineering Manager - Forward Deployed Engineering (LLM)

Baseten

San Francisco, California, United States (On-site)

$220k – $285k Yearly

Posted 1w ago

Research Engineer, Infrastructure, Inference

Thinking Machines Lab

San Francisco, California, United States (On-site)

$350k – $475k Yearly

Posted 2w ago

Software Engineer, Model Inference

OpenAI

San Francisco, California, United States (On-site)

$325k – $490k Yearly

Posted 1w ago

GPU Systems Engineer – HPC / Parallel Computing

Vast.ai

San Francisco, California, United States (On-site)

$160k – $320k Yearly