1. Home
  2. Jobs
  3. Inference Infrastructure

Inference Infrastructure Jobs

Browse 955 Inference Infrastructure jobs on Inference Jobs.

121-140 of 955 jobs

4wD-

Machine Learning Intern - Dynamic KV-Cache Modeling for Efficient LLM Inference

d-Matrix

Campbell, California, United States or Remote (California, United States)$30 – $59 Hourly
2wNV

Senior Deep Learning Performance Architect

NVIDIA

California, United States (Hybrid)$152k – $287.5k Yearly
4dNV

Senior Compiler Engineer, AI Inference Performance

NVIDIA

Santa Clara, California, United States (On-site)$152k – $241.5k Yearly
2wOP

Software Engineer, Model Inference

OpenAI

San Francisco, California, United States (On-site)$325k – $490k Yearly
3wGR

Senior Staff Engineer

Graphcore

Bristol, England, United Kingdom (On-site)
2wD-

Senior Runtime Systems Engineer

d-Matrix

Santa Clara, California, United States (Hybrid)
3dCO

Senior Software Engineer I, Inference

CoreWeave

Sunnyvale, California, United States (Hybrid)$139k – $204k Yearly
2wNV

Senior Deep Learning Engineer

NVIDIA

Warszawa, Masovian Voivodeship, Poland (Hybrid)zł 292.5k – zł 507k Yearly
1wNV

Senior ML Framework Performance Engineer - AI for Science at Scale

NVIDIA

Santa Clara, California, United States (On-site)$184k – $287.5k Yearly
6dCE

Senior Research Engineer - Inference ML

Cerebras

Sunnyvale, California, United States (Hybrid)
3wXA

Member of Technical Staff, Model Evaluation

xAI

Palo Alto, California, United States (On-site)$180k – $440k Yearly
4wNV

Product Manager - BioNeMo Inference

NVIDIA

New York, New York, United States (On-site)$168k – $258.8k Yearly
4dNV

Software Engineer, TensorRT Specialized Platforms - New College Grad 2025

NVIDIA

Santa Clara, California, United States (On-site)$124k – $195.5k Yearly
2wSE

ML Engineer

Sesame

New York, New York, United States (On-site)$190k – $320k Yearly
1dNV

Senior Systems Software Engineer - Deep Learning Solutions

NVIDIA

Toronto, Ontario, Canada (On-site)C$225k – C$275k Yearly
3wNV

Senior Technical Program Manager, Deep Learning Libraries

NVIDIA

Santa Clara, California, United States (On-site)$168k – $322k Yearly
2wCO

Staff Research Engineer, Model Efficiency

Cohere

New York, New York, United States (Hybrid)
2wMA

Research Engineer

Magic

San Francisco, California, United States (On-site)$225k – $550k Yearly