
Low Latency Optimization Jobs in the United States

Discover Low Latency Optimization roles in the United States on Inference Jobs and apply today.

4w ago

LLM Inference Frameworks and Optimization Engineer

Together AI

San Francisco, California, United States (On-site) · $160K – $230K Yearly

2mo ago

Senior GPU Low Power Architect

NVIDIA

Santa Clara, California, United States (On-site) · $136K – $264.5K Yearly

3mo ago

Senior Software Engineer, Voice Agent

Decagon

San Francisco, California, United States (On-site) · $250K – $330K Yearly

4w ago

TL, Research Inference

OpenAI

San Francisco, California, United States (On-site) · $380K – $555K Yearly

3mo ago

Staff Software Engineer, Voice Agent

Decagon

San Francisco, California, United States (On-site) · $300K – $375K Yearly

2mo ago

Senior Software Engineer – TensorRT Edge-LLM

NVIDIA

Santa Clara, California, United States (Hybrid) · $152K – $287.5K Yearly

2w ago

TPU Kernel Engineer

Anthropic

San Francisco, California, United States (Hybrid) · $280K – $850K Yearly

3mo ago

ML Model Serving Engineer

Sesame

San Francisco, California, United States (On-site) · $175K – $280K Yearly

2mo ago

Research Engineer, Core ML

Together AI

San Francisco, California, United States (On-site) · $200K – $280K Yearly

3mo ago

Member of Technical Staff, Model Efficiency

Cohere

New York, United States or Remote (New York, United States + 3 more)

1mo ago

ML Runtime Optimization Engineer

Applied Intuition

Sunnyvale, California, United States (On-site) · $159.1K – $199.3K Yearly