Low Latency Acceleration Jobs

Browse 41 Low Latency Acceleration jobs on Inference Jobs.

21-40 of 41 jobs

3w ago

TPU Kernel Engineer

Anthropic

San Francisco, California, United States (Hybrid)

$280K – $850K Yearly

Kernel Engineering

Machine Learning Engineering

TPU

ML Systems

5d ago

Senior Performance Engineer - LLM Inference Frameworks

NVIDIA

Yokne'am, Northern District, Israel (On-site)

Performance Engineering

LLM Inference

Python

PyTorch

3w ago

Senior Deep Learning Solution Architect

NVIDIA

Beijing, Beijing, China (Hybrid)

Deep Learning

Solution Architecture

Deep Learning

LLM Inference

3w ago

Member of Technical Staff - Inference Research

Modal

New York, United States (On-site)

$150K – $350K Yearly

Engineering

ML Infrastructure Engineering

LLM Inference

Speculative Decoding

4w ago

Performance & Systems Engineer, Codex

OpenAI

San Francisco, California, United States (Hybrid)

$295K – $445K Yearly

Systems Engineering

Performance Engineering

LLM Inference

Cloud Orchestration

3w ago

Solutions Architect - CPU and LPU

NVIDIA

Beijing, Beijing, China (On-site)

Solutions Architecture

AI Infrastructure

NVIDIA Grace

NVIDIA Vera

2w ago

RISC-V AI / HPC & Agentic Software Engineer

Tenstorrent

Taiwan (Remote)

RISC-V Software Engineering

AI/HPC Engineering

RISC-V

HPC

7d ago

Accelerated Computing GPU Product Manager

NVIDIA

Santa Clara, California, United States (On-site)

$168K – $327.8K Yearly

Product Management

Technical Product Management

Accelerated Computing

GPU

4w ago

LLM Inference Engineer

Hippocratic AI

Palo Alto, California, United States (On-site)

LLM Engineering

Machine Learning Engineering

LLM Inference

Distributed Serving

6h ago

Staff Software Engineer, ML Performance & Systems

fal.ai

San Francisco, California, United States (On-site)

$180K – $250K Yearly

Staff Software Engineer

ML Infrastructure Engineer

PyTorch

TensorRT

2w ago

Sr. Staff Machine Learning Researcher - Model Training & Optimization

Tenstorrent

Toronto, Ontario, Canada (Hybrid)

$100K – $500K Yearly

ML Models

Machine Learning Research

Python

PyTorch

7d ago

Senior Software Architect, AI Networking

NVIDIA

Tel Aviv-Yafo, Tel Aviv District, Israel (On-site)

Software Architecture

AI/ML Engineering

Distributed Systems

C++

4w ago

Research Engineer, Infrastructure, Kernels

Thinking Machines Lab

San Francisco, California, United States (On-site)

$350K – $475K Yearly

Machine Learning Infrastructure

AI Research Engineer

CUDA

CuTe

2w ago

Senior DL Algorithms Engineer - Inference Performance

NVIDIA

Santa Clara, California, United States (On-site)

$152K – $287.5K Yearly

Deep Learning Engineer

Algorithms Engineer

Deep Learning

Inference

3d ago

Software Engineer – Performance Profiling

Etched

San Jose, California, United States (On-site)

$150K – $275K Yearly

Software Engineering

Performance Engineering

C++

Rust

6d ago

TL, Research Inference

OpenAI

San Francisco, California, United States (On-site)

$380K – $555K Yearly

Research Engineering

Machine Learning Infrastructure

High-Performance Inference

Model Execution

3w ago

Principal LLM Inference Engineer

d-Matrix

United States (Remote)

$195K – $285K Yearly

LLM Inference Engineering

AI Infrastructure Engineer

Python

C/C++

7d ago

AI Inference Performance Engineer

NVIDIA

Santa Clara, California, United States (On-site)

$152K – $241.5K Yearly

AI Infrastructure Engineer

Performance Engineer

TensorRT-LLM

SGLang

3w ago

Member of Technical Staff, Inference

xAI

Palo Alto, California, United States (On-site)

$180K – $440K Yearly

Machine Learning Engineer

AI Infrastructure

Rust

C++

4w ago

Member of Technical Staff - ML Performance

Modal

New York, United States (On-site)

$150K – $350K Yearly

Engineering

Machine Learning Engineer

PyTorch

vLLM
