Compression Jobs
Browse 40 Compression jobs on Inference Jobs.
6d
Software Engineer — GPU Networking & Distributed Systems
Baseten
San Francisco, California, United States (On-site) $150k – $250k Yearly
4w
Data/Infrastructure Advocate Engineer - US Remote
Hugging Face
New York, New York, United States or Remote (New York, United States)
2w
Software Engineer - Model Performance
Baseten
San Francisco, California, United States (On-site) $150k – $250k Yearly
1w
Senior Design Optimization Engineer - LPU Packaging
NVIDIA
Santa Clara, California, United States (Hybrid) $184k – $345k Yearly
2w
Compiler Verification Engineer, Compute Performance – GPU
NVIDIA
Austin, Texas, United States (On-site) $140k – $224.3k Yearly
2w
Software Engineer, Caching Infrastructure
OpenAI
San Francisco, California, United States (On-site) $255k – $405k Yearly
4w
Fullstack Engineer - Companions
xAI
Palo Alto, California, United States (On-site) $180k – $440k Yearly
2w
Engineering Manager - Forward Deployed Engineering (LLM)
Baseten
San Francisco, California, United States (On-site) $220k – $285k Yearly
1w
Senior Compiler Engineer - Compute Front-End
NVIDIA
Santa Clara, California, United States (On-site) $152k – $287.5k Yearly
3w
Inference Compiler and Frontend Engineer – Dubai
Cerebras
Dubai, Dubai, United Arab Emirates (On-site)
3w
Senior Software Engineer - Inference as a Service
NVIDIA
Santa Clara, California, United States (On-site) $200k – $391k Yearly
5d
Software Engineer II - Artifact Management
CoreWeave
Livingston, New Jersey, United States (Hybrid) $109k – $160k Yearly
2w
Software Engineer, Load Balancing - Inference
OpenAI
San Francisco, California, United States (On-site) $325k – $490k Yearly
2w
Senior Software Engineer – TensorRT Edge-LLM
NVIDIA
Santa Clara, California, United States (Hybrid) $152k – $287.5k Yearly