
AI Inference Engineer (London)

Perplexity
Posted on: Feb 16, 2026
Location: London, England, United Kingdom (On-site)
Employment type: Full-time

We are looking for an AI Inference Engineer to join our growing team. Our current stack is Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes. You will have the opportunity to work on large-scale deployment of machine learning models for real-time inference.

Responsibilities

  • Develop APIs for AI inference that will be used by both internal and external customers

  • Benchmark and address bottlenecks throughout our inference stack

  • Improve the reliability and observability of our systems and respond to system outages

  • Explore novel research and implement LLM inference optimizations

Qualifications

  • Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)

  • Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization)

  • Understanding of GPU architectures or experience with GPU kernel programming using CUDA

Final offer amounts are determined by multiple factors, including experience and expertise.

Equity: In addition to the base salary, equity may be part of the total compensation package.

Perplexity is an AI-powered answer engine that provides accurate, real-time answers to questions backed by credible sources and citations.
