Cache Optimization Jobs
Browse 328 Cache Optimization jobs on Inference Jobs.
4w
Machine Learning Intern - Dynamic KV-Cache Modeling for Efficient LLM Inference
d-Matrix
Campbell, California, United States or Remote (California, United States)
$30 – $59 Hourly
1w
Member of Technical Staff, Model Efficiency
Cohere
New York, New York, United States or Remote (New York, United States + 3 more)
1w
Software Engineer, Kernel Development and Optimization
Tenstorrent
Gdańsk, Pomeranian Voivodeship, Poland (Hybrid)
2w
ML Runtime Optimization Engineer - Lead
Applied Intuition
Sunnyvale, California, United States (On-site)
$199.3k – $264.5k Yearly
5d
LLM Inference Frameworks and Optimization Engineer
Together AI
San Francisco, California, United States (On-site)
$160k – $230k Yearly
5d
ML Runtime Optimization Engineer
Applied Intuition
Mountain View, California, United States (On-site)
$159.1k – $199.3k Yearly
5d
Senior Design Optimization Engineer - LPU Packaging
NVIDIA
Santa Clara, California, United States (Hybrid)
$184k – $345k Yearly
2w
Senior Performance Architect - Heterogeneous Workload Optimization
NVIDIA
Santa Clara, California, United States (Hybrid)
$184k – $356.5k Yearly
5d
Senior/Staff Software Engineer, Inference
Anthropic
New York, New York, United States (Hybrid)
$300k – $485k Yearly
1w
Research Engineer, Core ML
Together AI
San Francisco, California, United States (On-site)
$200k – $280k Yearly
4d
Senior ASIC Physical Design Engineer, Cache Coherent Interconnects
NVIDIA
Santa Clara, California, United States (Hybrid)
$136k – $264.5k Yearly
5d
Research Engineer, Infrastructure, Inference
Thinking Machines Lab
San Francisco, California, United States (On-site)
$350k – $475k Yearly
1w
Senior Staff ML Researcher - LLM Algorithmic Optimization
d-Matrix
Bengaluru, Karnataka, India (Hybrid)
₹4M – ₹6M Yearly