Low-Latency Inference Jobs
Explore Low-Latency Inference roles on Inference Jobs and apply today.
Posted 3mo ago
Inference Technical Lead, Sora
OpenAI
San Francisco, California, United States (Hybrid) · $380K Yearly
Posted 3mo ago
Machine Learning Intern - Dynamic KV-Cache Modeling for Efficient LLM Inference
d-Matrix
Santa Clara, California, United States or Remote (California, United States) · $30 – $59 Hourly
Posted 4w ago
Engineering Manager, Inference Routing and Performance
Anthropic
San Francisco, California, United States (Hybrid) · $405K – $485K Yearly
Posted 1mo ago
Sr. Software Engineer, Inference
Anthropic
London, England, United Kingdom (Hybrid) · £225K – £325K Yearly
Posted 2mo ago
Staff Software Engineer, Inference
Anthropic
London, England, United Kingdom (Hybrid) · £325K – £390K Yearly
Posted 3mo ago
Software Engineer, Model Inference
OpenAI
San Francisco, California, United States (On-site) · $325K – $490K Yearly
Posted 2w ago
Senior/Staff Software Engineer, Inference
Anthropic
San Francisco, California, United States (Hybrid) · $300K – $485K Yearly
Posted 17h ago
Senior Software Engineer - AI Inference
NVIDIA
Santa Clara, California, United States (On-site) · $152K – $287.5K Yearly
Posted 3mo ago
Software Engineer, Inference – AMD GPU Enablement
OpenAI
San Francisco, California, United States (On-site) · $325K – $490K Yearly
Posted 2mo ago
Senior Software Engineer – TensorRT Edge-LLM
NVIDIA
Santa Clara, California, United States (Hybrid) · $152K – $287.5K Yearly
Posted 3w ago
Research, Audio Expertise
Thinking Machines Lab
San Francisco, California, United States (On-site) · $350K – $475K Yearly