LLM Inference Optimization jobs

Explore LLM Inference Optimization roles on Inference Jobs and apply today.

161-180 of 447 jobs

6d

Senior Software Engineer, AI Inference Systems

NVIDIA · Toronto, Ontario, Canada (Hybrid) · C$170k – C$275k Yearly

2w

Member of Technical Staff - ML Performance

Modal · New York, New York, United States (On-site) · $150k – $270k Yearly

3w

Software Engineer, Inference Deployment

Anthropic · San Francisco, California, United States (Hybrid) · $320k – $485k Yearly

2w

Software Engineer, Load Balancing - Inference

OpenAI · San Francisco, California, United States (On-site) · $325k – $490k Yearly

2w

Senior Software Engineer - VLM Microservices for Neural Reconstruction

NVIDIA · Santa Clara, California, United States (On-site) · $152k – $287.5k Yearly

2w

UK Internship Program

Perplexity · London, England, United Kingdom (Hybrid)

5d

AI Engineer

Lovable · Stockholm, Stockholm, Sweden (On-site)

1w

Senior ML Framework Performance Engineer - AI for Science at Scale

NVIDIA · Santa Clara, California, United States (On-site) · $184k – $287.5k Yearly

2d

Senior Systems Software Engineer - Deep Learning Solutions

NVIDIA · Toronto, Ontario, Canada (On-site) · C$225k – C$275k Yearly

1w

AI Researcher, Core ML

Together AI · San Francisco, California, United States (On-site) · $160k – $230k Yearly

4w

Member of Technical Staff, Model Evaluation

xAI · Palo Alto, California, United States (On-site) · $180k – $440k Yearly

2w

Search Machine Learning Research Engineer (Berlin)

Perplexity · Berlin, Berlin, Germany (On-site)

2w

Senior ML Engineer (Token Factory)

Nebius · Europe + 6 more (Remote)

1w

Infrastructure Engineer, ML Systems

Applied Compute · San Francisco, California, United States (On-site)

2w

Software Engineer, Model Performance Tooling

Baseten · Canada or Remote (Canada + 1 more) · C$130k – C$200k Yearly

2w

Software Engineer - Model API's

Baseten · San Francisco, California, United States (On-site) · $150k – $230k Yearly

3w

Research Engineering Manager - Model Training

Perplexity · San Francisco, California, United States (On-site) · $300k – $470k Yearly

2w

Member of Technical Staff, MLE (Korea)

Cohere · Seoul, Seoul, South Korea or Remote (South Korea)

2w

Deep Learning Performance Architect - Intern - 2026

NVIDIA · Shanghai, Shanghai, China (On-site)

2w

Internship - Machine Learning Research Engineer (Berlin)

Perplexity · Berlin, Berlin, Germany (On-site)
