SambaNova

About

SambaNova builds a full-stack AI inference platform centered on custom dataflow chips (Reconfigurable Dataflow Units, or RDUs) and a three-tier memory architecture designed to address latency and energy-efficiency bottlenecks in generative AI deployment. The architecture targets enterprise and government workloads that require on-premises or sovereign deployment: customers fine-tune open-source models behind their own firewalls and retain full ownership of data and models. The platform powers sovereign AI data centers across Australia, Europe, and the UK, with an emphasis on avoiding lock-in to proprietary inference services.

The technical approach uses custom dataflow technology rather than GPU-based architectures, trading ecosystem maturity for claimed improvements in inference throughput and energy efficiency at scale. The three-tier memory design addresses the memory bandwidth constraints common in transformer inference. The platform supports PyTorch-based model fine-tuning and deployment workflows, with integration points through Python and C++ APIs. Operational complexity stems from full-stack ownership of hardware, software, and deployment infrastructure, which requires coordination across chip design, systems software, and model serving layers.
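To illustrate why memory bandwidth matters for transformer inference, here is a rough back-of-envelope sketch. It assumes a simplified model in which generating each token requires reading every weight from memory once, and it ignores batching, KV-cache traffic, and compute limits. The function name and all numbers are hypothetical and do not describe any SambaNova hardware.

```python
def max_decode_tokens_per_second(
    param_count: float,
    bytes_per_param: float,
    bandwidth_bytes_per_s: float,
) -> float:
    """Bandwidth-bound ceiling on single-stream decode rate.

    Assumes every generated token streams all model weights from
    memory exactly once (a simplification, not a measured result).
    """
    weight_bytes = param_count * bytes_per_param
    return bandwidth_bytes_per_s / weight_bytes


# Hypothetical inputs: a 70-billion-parameter model stored in 16-bit
# weights, served from memory that delivers 2e12 bytes per second.
rate = max_decode_tokens_per_second(
    param_count=70e9,
    bytes_per_param=2,
    bandwidth_bytes_per_s=2e12,
)
print(f"{rate:.1f} tokens/s upper bound")
```

Under these assumed numbers the ceiling is set entirely by how fast weights can be moved, which is the kind of constraint a tiered memory design aims to relieve.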

The stack includes standard ML tooling (PyTorch, Python) alongside proprietary components for the RDU runtime and memory management. Build and CI infrastructure uses Bazel and CircleCI; artifact management runs through Google Artifact Registry and JFrog. The deployment model targets enterprises that prioritize data sovereignty over cloud-based inference APIs, accepting higher operational overhead in exchange for control and latency predictability in on-premises workloads.

Open roles at SambaNova

Explore 3 open positions at SambaNova and find your next opportunity.

Software Engineer, ML Inference Performance

SambaNova

Palo Alto, California, United States (On-site)

4d ago

Hardware Design Engineer

SambaNova

United States (Remote)

2w ago

Similar companies

Tenstorrent

Tenstorrent is a next-generation computing company that builds computers for AI, developing AI Graph Processors, high-performance RISC-V CPUs, and configurable chiplets with an open-source software stack.

116 jobs

Cerebras

Cerebras Systems builds the world's fastest AI infrastructure with industry-leading speed, scale, and quality through wafer-scale AI chips.

87 jobs

Together AI

Together AI is a research-driven AI cloud infrastructure provider enabling developers and enterprises to train, fine-tune, and deploy open-source generative AI models at scale.

48 jobs

d-Matrix

d-Matrix builds purpose-built AI inference computing platforms to make generative AI commercially viable, efficient, and sustainable through digital in-memory compute technology.

43 jobs

Modal

Modal is a serverless compute platform for AI and data teams that enables running compute-intensive workloads like ML inference, fine-tuning, and batch jobs with instant GPU access and usage-based pricing.

28 jobs

Bento

Bento provides an open-source framework and enterprise platform for deploying and operating AI/ML model inference in production with control over performance, scaling, and operational complexity.