vLLM Jobs in California, United States
Discover vLLM roles in California, United States on Inference Jobs and apply today.
2mo ago
Senior System Software Engineer - Dynamo-Triton Inference Server
NVIDIA
Santa Clara, California, United States (On-site) $152K – $241.5K Yearly
2mo ago
Member of Technical Staff, Model Evaluation
xAI
Palo Alto, California, United States (On-site) $180K – $440K Yearly
4w ago
Senior Backend Engineer, Inference Platform
Together AI
San Francisco, California, United States (On-site) $160K – $250K Yearly
3mo ago
Member of Technical Staff - GPU Infrastructure
Reflection AI
San Francisco, California, United States (On-site)
3w ago
Staff Engineer - Perf and Benchmarking
CoreWeave
Sunnyvale, California, United States (Hybrid) $206K – $333K Yearly
1mo ago
Staff Full Stack Software Engineer
Perplexity
San Francisco, California, United States (On-site) $220K – $405K Yearly
2mo ago
Senior Software Engineer - Deep Learning Compiler Verification and Infrastructure
NVIDIA
Santa Clara, California, United States (On-site) $140K – $224.3K Yearly
3mo ago
Audio Inference Engineer, Model Efficiency
Cohere
New York, United States or Remote (New York, United States + 3 more)
2mo ago
Software Engineer, Developer Productivity, AI Tools
Thinking Machines Lab
San Francisco, California, United States (On-site) $350K – $475K Yearly
3mo ago
Inference Engineering Manager
Perplexity
San Francisco, California, United States (On-site) $300K – $385K Yearly
3w ago
Sr. Software Engineer - Perf and Benchmarking
CoreWeave
Sunnyvale, California, United States (Hybrid) $139K – $204K Yearly
3mo ago
Deep Learning Senior Engineer, End-To-End Autonomous Driving
NVIDIA
Santa Clara, California, United States (On-site) $184K – $356.5K Yearly
3mo ago
Senior Power Analysis and Optimization Engineer, AI-LLM Systems
NVIDIA
Santa Clara, California, United States (On-site) $136K – $264.5K Yearly
3mo ago
Backend Software Engineer (Evals) – Support Automation Engineering
OpenAI
San Francisco, California, United States (On-site) $255K – $405K Yearly
3mo ago
TLM, Machine Learning, Integrity
OpenAI
San Francisco, California, United States (On-site) $405K – $490K Yearly
3mo ago
Machine Learning Intern - Dynamic KV-Cache Modeling for Efficient LLM Inference
d-Matrix
Santa Clara, California, United States or Remote (California, United States) $30 – $59 Hourly
1mo ago
HW Design Verification Intern
d-Matrix
Santa Clara, California, United States (Hybrid) $30 – $60 Yearly