Low Latency Optimization Jobs
Explore Low Latency Optimization roles on Inference Jobs and apply today.
3mo ago
Audio Inference Engineer, Model Efficiency
Cohere
New York, United States or Remote (New York, United States + 3 more)
3mo ago
Software Engineer, Model Inference
OpenAI
San Francisco, California, United States (On-site)
$325K – $490K Yearly
3mo ago
Software Engineer - Model Performance
Baseten
San Francisco, California, United States (On-site)
$150K – $250K Yearly
4w ago
AI Researcher, Core ML
Together AI
San Francisco, California, United States (On-site)
$200K – $280K Yearly
1mo ago
Senior DL Algorithms Engineer - Inference Performance
NVIDIA
Santa Clara, California, United States (On-site)
$184K – $356.5K Yearly
2mo ago
Senior Machine Learning Engineer, Quantized Inference
NVIDIA
Redmond, Washington, United States (On-site)
$152K – $287.5K Yearly
4w ago
Engineering Manager, Inference Routing and Performance
Anthropic
San Francisco, California, United States (Hybrid)
$405K – $485K Yearly
2w ago
Distinguished Engineer - Inference Serving Network and Storage
Graphcore
Austin, Texas, United States (On-site)
3mo ago
Software Engineer, Model Performance Tooling
Baseten
Canada or Remote (Canada + 1 more)
C$130K – C$200K Yearly
4w ago
Senior Backend Engineer, Inference Platform
Together AI
San Francisco, California, United States (On-site)
$160K – $250K Yearly
2mo ago
Senior Applied Deep Learning Research Scientist, Efficiency
NVIDIA
Santa Clara, California, United States (On-site)
$192K – $356.5K Yearly
1mo ago
AI Inference Performance Engineer - New College Grad 2026
NVIDIA
Santa Clara, California, United States (On-site)
$124K – $241.5K Yearly
3mo ago
Member of Technical Staff - ML Performance
Modal
New York, United States (On-site)
$150K – $270K Yearly