Low-latency RPC Jobs
Browse 65 Low-latency RPC jobs on Inference Jobs.
41-60 of 65 jobs
6d
Software Engineer, Sandboxing (Systems)
Anthropic
San Francisco, California, United States (Hybrid) $300k – $405k Yearly
4w
Senior HPC and AI Networking Performance Research and Analysis Engineer
NVIDIA
Shanghai, Shanghai, China (On-site)
2w
Software Engineer - Model APIs
Baseten
San Francisco, California, United States (On-site) $150k – $230k Yearly
4d
Software Engineer — GPU Networking & Distributed Systems
Baseten
San Francisco, California, United States (On-site) $150k – $250k Yearly
2w
AI Inference Engineer (San Francisco)
Perplexity
San Francisco, California, United States (On-site) $210k – $385k Yearly
6d
LLM Inference Frameworks and Optimization Engineer
Together AI
San Francisco, California, United States (On-site) $160k – $230k Yearly
6d
Research Engineer, Infrastructure, Numerics
Thinking Machines Lab
San Francisco, California, United States (On-site) $350k – $475k Yearly
6d
Research Engineer, Infrastructure, Inference
Thinking Machines Lab
San Francisco, California, United States (On-site) $350k – $475k Yearly
2w
Member of Technical Staff, Model Efficiency
Cohere
New York, New York, United States or Remote (New York, United States + 3 more)
6d
Manager, AI Networking Performance Research and Analysis
NVIDIA
Yokneam Ilit, Northern District, Israel (Hybrid)
2w
Senior Software Engineer – TensorRT Edge-LLM
NVIDIA
Santa Clara, California, United States (Hybrid) $152k – $287.5k Yearly
3w
Member of Technical Staff, Inference
xAI
Palo Alto, California, United States (On-site) $180k – $440k Yearly