About

Baseten builds AI infrastructure for deploying and scaling models in production, with work that spans everything from kernel-level inference optimization to developer tooling. The platform ships daily and measures success by the real-world impact of the AI products running on it rather than by vanity metrics. Engineers embed directly with customers to surface operational bottlenecks and then work to eliminate them. The work ranges from TensorRT-LLM and CUDA kernel tuning to building developer tools that reduce deployment friction.

The stack centers on inference at scale: TensorRT-LLM and PyTorch for model execution, NVIDIA Triton Inference Server for serving, Kubernetes (EKS) with Karpenter for autoscaling, and Knative for event-driven workloads on AWS EC2. Infrastructure decisions prioritize shipping velocity over process: small teams with real ownership iterate rapidly on production reliability, latency (including tail latency), and cost efficiency. Docker containerization and PostgreSQL round out the core operational dependencies.

The team is internationally distributed and made up of engineers and designers who take craft seriously without posturing. Customer-embedded engineering informs both platform architecture and developer-experience tradeoffs, creating tight feedback loops between what happens in deployment and how the infrastructure evolves. Since its founding, the company has favored hands-on problem solving and rapid iteration over abstraction layers that delay learning from production.

Open roles at Baseten

Explore 58 open positions at Baseten and find your next opportunity.

Software Engineer - Core Product

Baseten

San Francisco, California, US or Remote (United States)

$165K – $330K Yearly

5d ago

Engineering Manager - Model Performance

Baseten

San Francisco, California, US or Remote (California, United States + 1 more)

$260K – $380K Yearly

5d ago

Executive Recruiter

Baseten

Worldwide (Remote)

$180K – $210K Yearly

5d ago

Software Engineer, Model Performance Tooling

Baseten

CA or Remote (Canada + 1 more)

$160K – $200K Yearly

5d ago

Sales Manager - Emerging

Baseten

New York, United States (Hybrid)

$300K – $340K Yearly

5d ago

Account Executive - AI Native: Strategic

Baseten

New York, US or Remote (Worldwide)

$230K – $300K Yearly

5d ago

Engineering Manager, Cloud Platform

Baseten

Worldwide (Remote)

$165K – $330K Yearly

5d ago

Site Reliability Engineer (SRE)

Baseten

San Francisco, California, United States (Hybrid)

$165K – $330K Yearly

5d ago

GPU Kernel Engineer

Baseten

San Francisco, California, US or Remote (United States)

$180K – $360K Yearly

5d ago

Software Engineer - Internal Platform

Baseten

San Francisco, California, US or Remote (United States + 1 more)

$165K – $330K Yearly

5d ago

Forward Deployed SRE

Baseten

San Francisco, California, US or Remote (Worldwide)

$135K – $285K Yearly

5d ago

Sales Manager - Strategic

Baseten

San Francisco, California, United States (Hybrid)

$320K – $360K Yearly

5d ago

Account Executive - AI Native: Emerging

Baseten

San Francisco, California, US or Remote (United States)

$180K – $230K Yearly

5d ago

Engineering Manager - Forward Deployed Engineering (LLM)

Baseten

San Francisco, California, US or Remote (Worldwide)

$260K – $380K Yearly

5d ago

Senior Product Engineer - Training Platform

Baseten

San Francisco, California, US or Remote (Worldwide)

$165K – $330K Yearly

5d ago

Sales Development Representative

Baseten

San Francisco, California, US or Remote (Worldwide)

$80K – $110K Yearly

5d ago

Applied AI Inference Engineer

Baseten

San Francisco, California, US or Remote (California, United States + 1 more)

$165K – $330K Yearly

5d ago

Software Engineer - AI Enablement

Baseten

San Francisco, California, United States (On-site)

$150K – $230K Yearly

2w ago

Software Engineer — GPU Networking & Distributed Systems

Baseten

San Francisco, California, United States (On-site)

$150K – $250K Yearly

2w ago

Solution Architect

Baseten

San Francisco, California, United States (On-site)

$165K – $275K Yearly

2w ago

Similar companies

Together AI

Together AI is a research-driven AI cloud infrastructure provider enabling developers and enterprises to train, fine-tune, and deploy open-source generative AI models at scale.

48 jobs

Braintrust

Braintrust is the AI observability platform helping teams measure, evaluate, and improve AI in production. Trusted by companies like Notion, Stripe, Zapier, Vercel, and Ramp.

32 jobs

Modal

Modal is a serverless compute platform for AI and data teams that enables running compute-intensive workloads like ML inference, fine-tuning, and batch jobs with instant GPU access and usage-based pricing.

28 jobs

Runpod

Runpod provides cloud infrastructure for AI developers, offering GPU computing services for training, deploying, and scaling AI models.

18 jobs

Lambda

Lambda is an AI-only company providing cloud GPUs, on-demand clusters, and hardware for AI training and inference, building the infrastructure powering AI services used by hundreds of millions of people.

14 jobs

Bento

Bento provides an open-source framework and enterprise platform for deploying and operating AI/ML model inference in production with control over performance, scaling, and operational complexity.