vLLM Jobs in San Francisco, California, United States
Discover vLLM roles in San Francisco, California, United States on Inference Jobs and apply today.
2mo ago
Software Engineer — GPU Networking & Distributed Systems
Baseten
San Francisco, California, United States (On-site) · $150K – $250K Yearly
2w ago
Senior Machine Learning Engineer, Voice AI
Together AI
San Francisco, California, United States (On-site) · $200K – $260K Yearly
4w ago
Machine Learning Engineer - Inference
Together AI
San Francisco, California, United States (On-site) · $160K – $230K Yearly
3mo ago
Software Engineer - Model Performance
Baseten
San Francisco, California, United States (On-site) · $150K – $250K Yearly
4w ago
Senior Backend Engineer, Inference Platform
Together AI
San Francisco, California, United States (On-site) · $160K – $250K Yearly
3mo ago
Member of Technical Staff - GPU Infrastructure
Reflection AI
San Francisco, California, United States (On-site)
1mo ago
Staff Full Stack Software Engineer
Perplexity
San Francisco, California, United States (On-site) · $220K – $405K Yearly
3mo ago
Audio Inference Engineer, Model Efficiency
Cohere
New York, United States or Remote (New York, United States + 3 more)
2mo ago
Software Engineer, Developer Productivity, AI Tools
Thinking Machines Lab
San Francisco, California, United States (On-site) · $350K – $475K Yearly
3mo ago
Inference Engineering Manager
Perplexity
San Francisco, California, United States (On-site) · $300K – $385K Yearly
3mo ago
Senior Product Engineer - Training Platform
Baseten
San Francisco, California, United States (On-site) · $200K – $275K Yearly
3mo ago
Backend Software Engineer (Evals) – Support Automation Engineering
OpenAI
San Francisco, California, United States (On-site) · $255K – $405K Yearly
2w ago
Machine Learning Systems Engineer, RL Engineering
Anthropic
San Francisco, California, United States (Hybrid) · $500K – $850K Yearly
3mo ago
TLM, Machine Learning, Integrity
OpenAI
San Francisco, California, United States (On-site) · $405K – $490K Yearly
3mo ago
Machine Learning Intern - Dynamic KV-Cache Modeling for Efficient LLM Inference
d-Matrix
Santa Clara, CA, United States or Remote (California, United States) · $30 – $59 Hourly