LLM Inference Optimization Jobs

Browse 450 LLM Inference Optimization jobs on Inference Jobs.

5d

LLM Inference Frameworks and Optimization Engineer

Together AI

San Francisco, California, United States (On-site) $160k – $230k Yearly

1w

LLM Inference Engineer

Hippocratic AI

Palo Alto, California, United States (On-site)

2w

LLM Inference Engineer

Periodic Labs

Menlo Park, California, United States (On-site)

1w

Engineering Manager - Forward Deployed Engineering (LLM)

Baseten

San Francisco, California, United States (On-site) $220k – $285k Yearly

1w

Senior Staff ML Researcher - LLM Algorithmic Optimization

d-Matrix

Bengaluru, Karnataka, India (Hybrid) ₹4M – ₹6M Yearly

1w

Staff Research Engineer, Model Efficiency

Cohere

New York, New York, United States (Hybrid)

2w

AI Inference Engineer (San Francisco)

Perplexity

San Francisco, California, United States (On-site) $210k – $385k Yearly

4w

Machine Learning Intern - Dynamic KV-Cache Modeling for Efficient LLM Inference

d-Matrix

Campbell, California, United States or Remote (California, United States) $30 – $59 Hourly

5d

Senior/Staff Software Engineer, Inference

Anthropic

New York, New York, United States (Hybrid) $300k – $485k Yearly

3d

Senior AI Inference Compiler Engineer

NVIDIA

Santa Clara, California, United States (On-site) $152k – $241.5k Yearly

5d

Senior Software Engineer, Inference

Anthropic

Dublin, Dublin, Ireland (Hybrid) €235k – €295k Yearly

5d

Research Engineer, Infrastructure, Inference

Thinking Machines Lab

San Francisco, California, United States (On-site) $350k – $475k Yearly

8h

Senior Software Engineer, Quantized Inference

NVIDIA

Redmond, Washington, United States (On-site) $152k – $287.5k Yearly

1w

Member of Technical Staff, Model Efficiency

Cohere

New York, New York, United States or Remote (New York, United States + 3 more)

2w

AI Inference Engineer (London)

Perplexity

London, England, United Kingdom (On-site)

2w

Senior Software Engineer – TensorRT Edge-LLM

NVIDIA

Santa Clara, California, United States (Hybrid) $152k – $287.5k Yearly