
SambaNova

About

SambaNova builds a full-stack AI inference platform centered on custom dataflow chips (Reconfigurable Dataflow Units, or RDUs) and a three-tier memory architecture designed to address latency and energy-efficiency bottlenecks in generative AI deployment. The architecture targets enterprise and government workloads requiring on-premises or sovereign deployment - fine-tuning open-source models behind customer firewalls with full retention of data and model ownership. The platform powers sovereign AI data centers across Australia, Europe, and the UK, with an emphasis on avoiding vendor lock-in to proprietary inference services.

The technical approach uses custom dataflow technology rather than GPU-based architectures, trading off ecosystem maturity for claimed improvements in inference throughput and energy consumption at scale. The three-tier memory design addresses memory bandwidth constraints common in transformer inference. The platform supports PyTorch-based model fine-tuning and deployment workflows, with integration points through Python and C++ APIs. Operational complexity centers on full-stack ownership - hardware, software, and deployment infrastructure - requiring coordination across chip design, systems software, and model serving layers.

The stack includes standard ML tooling (PyTorch, Python) alongside proprietary components for the RDU runtime and memory management. Build and CI infrastructure uses Bazel and CircleCI; artifact management through Google Artifact Registry and JFrog. The deployment model targets enterprises prioritizing data sovereignty over cloud-based inference APIs, introducing trade-offs in operational overhead versus control and latency predictability for on-premises workloads.

Open roles at SambaNova

Explore 4 open positions at SambaNova and find your next opportunity.


Full Stack Support Engineer

SambaNova

United States (Remote)

3d ago

Hardware Design Engineer

SambaNova

United States (Remote)

1w ago

Software Engineer

SambaNova

United States (Remote)

1w ago

Similar companies


EliseAI

EliseAI builds a unified conversational AI platform for property management and healthcare operations, automating workflows that span leasing tours, maintenance requests, patient scheduling, and intake forms. Founded in 2017, the company serves over 600 property owners and healthcare operators managing 5 million+ units, having raised $360 million in funding.

The platform consolidates functionality that would otherwise require multiple point solutions, addressing operational bottlenecks in high-volume, repetitive administrative tasks. In property management, this includes conversational AI for leasing tour coordination and maintenance request handling. In healthcare, the system automates patient scheduling and intake form collection. The technical approach centers on a single platform architecture rather than a collection of disconnected tools, with production deployment at scale across both industry verticals.

The engineering organization ships 175+ new features per year, reflecting a rapid iteration cycle informed by frontline user feedback. That release cadence suggests continuous deployment practices and tight feedback loops between product iteration and user-facing workflows, with an engineering culture that emphasizes shipping velocity and product development driven by operational constraints observed in production. Development priorities appear structured around reducing latency in administrative operations and improving throughput for organizations managing thousands of concurrent interactions across property portfolios or patient populations.

113 jobs

HappyRobot

HappyRobot is an AI workforce platform founded in 2023 that builds autonomous agents to handle end-to-end operational work across phone, email, messaging, and documents. The company focuses on logistics and industrial operations - supply chains, freight, and businesses that move physical goods - where complex, patterned work spans multiple communication channels and document formats. Rather than augmenting human workflows, HappyRobot's system is designed to own complete tasks autonomously, operating as an AI-native OS for operations. The platform has been deployed across over 150 enterprise customers, including DHL and Ryder, and the company has raised $62 million from investors including Y Combinator and Andreessen Horowitz.

The technical approach centers on building AI workers that can manage the operational complexity inherent in real-economy businesses: inbound calls that require looking up order status across internal systems, email threads with multi-party coordination, document processing that feeds into downstream workflows. The platform integrates natural language processing for conversational interfaces with document automation capabilities, handling the operational load that typically requires human judgment and context-switching. The stack is built on TypeScript, Next.js, and Go, suggesting a focus on both frontend orchestration and backend performance for production-scale operations.

The founding team - Pablo Palafox, Javier Palafox, and Luis Paarup - brings backgrounds in engineering and logistics, positioning the company to understand both the technical constraints of building reliable AI systems and the operational bottlenecks in its target industries. The AI-native positioning reflects a systems-level bet: that automating operations requires rethinking the entire operational stack rather than bolting AI onto existing software workflows. For engineers, the work involves building agents whose reliability and failure modes play out in production environments where downtime has direct business impact - missed shipments, delayed communications, operational backlogs.

95 jobs

Mirelo AI

Mirelo AI builds foundation models for generating synchronized audio for video content, targeting the latency and quality bottleneck in audio-for-video workflows. Founded in 2023 in Berlin, the company raised $41 million in seed funding co-led by Index Ventures and Andreessen Horowitz. Its models generate synchronized sound effects in seconds rather than the hours typically required for manual sound design, addressing production throughput constraints across gaming, film, social media, and broader visual content verticals.

The technical stack centers on PyTorch with transformer architectures, optimized for H100 and H200 GPUs using Nsight profiling and SLURM for cluster orchestration. The team sources from Google Brain, Amazon, Meta FAIR, Disney, ETH Zürich, and the Max Planck Institutes, combining AI research depth with domain expertise from musicians and product specialists. Co-founder and CEO CJ Simon-Gabriel previously worked at AWS Labs, where the founding team originated.

The core technical challenge is tight audio-visual synchronization at generation time - a constraint that spans model architecture design, latency optimization, and evaluation methodology. Production systems must handle variable-length video inputs while maintaining temporal coherence across generated audio, requiring careful trade-offs between generation speed, output quality, and computational cost. The company positions its models as infrastructure for visual content pipelines, treating audio generation as a systems problem rather than a standalone creative tool.
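The temporal-coherence constraint has a concrete arithmetic core: at a fixed audio sample rate, every video frame maps to an exact span of samples, and generated audio must land on those boundaries. A minimal sketch of that alignment (hypothetical helper names and a common 48 kHz rate assumed; this is not Mirelo's actual pipeline):

```python
# Hypothetical illustration of frame-to-sample alignment for
# audio-for-video generation; not Mirelo's implementation.

SAMPLE_RATE = 48_000  # assumed production audio sample rate (Hz)

def samples_for_video(n_frames: int, fps: float, sr: int = SAMPLE_RATE) -> int:
    """Total audio samples needed to cover a video of n_frames at fps."""
    duration_s = n_frames / fps
    return round(duration_s * sr)

def frame_boundaries(n_frames: int, fps: float, sr: int = SAMPLE_RATE) -> list[int]:
    """Sample index at which each video frame begins.

    Rounding each boundary independently (rather than accumulating a
    floating-point hop) keeps drift bounded to half a sample, however
    long the input video is.
    """
    return [round(i * sr / fps) for i in range(n_frames)]

# A 10-second clip at 24 fps needs exactly 480,000 samples at 48 kHz,
# and frame i starts at sample i * 2000.
total = samples_for_video(240, 24)   # 480000
starts = frame_boundaries(3, 24)     # [0, 2000, 4000]
```

At non-integer sample-per-frame rates such as 29.97 fps, the choice between per-boundary rounding and a cumulative hop is exactly where long-clip drift appears, which is one reason variable-length inputs complicate synchronization.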

8 jobs

Toma

Toma operates a voice AI platform for automotive dealerships, processing over 1,000,000 calls since launching in 2024. The system handles inbound phone operations - service scheduling, call routing, and follow-up automation - with safeguards designed to manage transfer latency and revenue leakage. The core technical challenge is maintaining conversational quality and intent detection accuracy across high-variance dealership scenarios (service appointments, parts inquiries, sales handoffs) while minimizing false transfers and dropped context. The platform implements transfer triggers, clawback mechanisms for mistimed handoffs, and follow-up alerts when human staff don't complete actions, addressing the operational complexity of human-AI transition points in production telephony.

Infrastructure runs on AWS with a TypeScript/Next.js frontend, PostgreSQL via Prisma for state management, and tRPC for type-safe API boundaries. The voice AI layer must handle real-time constraints - low-latency speech recognition and synthesis, sub-second intent classification - while managing concurrent call volume and dealership-specific context (inventory, scheduling systems, staff availability). Trade-offs center on model selection for conversational understanding versus inference cost at scale, and the reliability surface area of integrating with legacy dealership management systems.

Founded by engineers from Scale AI, Uber, Lyft, and Amazon; backed by Andreessen Horowitz and Y Combinator with $17 million in Series A funding. Deployment spans dealerships across the United States, including Pohanka Automotive Group, SCHOMP, Hudson Automotive Group, and Bergey's.

Primary bottlenecks likely involve tuning voice models for domain-specific terminology (vehicle makes, service codes, dealership jargon), managing tail latency in transfer decisions where milliseconds impact customer experience, and evaluating conversational success beyond simple call completion: did the AI correctly capture appointment details, route urgency appropriately, and preserve customer satisfaction? The system's value proposition hinges on converting missed calls and staff bottlenecks into captured revenue, which requires high precision on intent classification and low false-negative rates on transfer triggers to avoid revenue loss from mishandled interactions.
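The precision versus false-negative trade-off in transfer triggers can be pictured as a thresholding policy: revenue-critical or urgent intents get a lower bar to reach a human, accepting more false transfers to avoid mishandled bookings. A minimal, hypothetical sketch (intent labels, thresholds, and the latency budget are all invented for illustration; Toma's production system is TypeScript-based and certainly more involved):

```python
# Hypothetical sketch of a transfer-trigger policy for a dealership
# voice agent; illustrative only, not Toma's implementation.
from dataclasses import dataclass

@dataclass
class IntentResult:
    label: str         # e.g. "service_appointment", "parts_inquiry"
    confidence: float  # classifier confidence in [0, 1]
    urgency: float     # estimated caller urgency in [0, 1]

# Per-intent confidence thresholds (assumed values). Revenue-critical
# intents get a HIGHER bar: a marginal classification transfers to a
# human rather than risking a mishandled booking, trading transfer
# precision for a low false-negative rate on the trigger.
TRANSFER_THRESHOLDS = {
    "service_appointment": 0.90,
    "sales_handoff": 0.95,
    "parts_inquiry": 0.75,
}

def should_transfer(result: IntentResult, latency_ms: float,
                    latency_budget_ms: float = 800.0) -> bool:
    """True when the AI should hand the call to a human."""
    if result.urgency >= 0.8:            # urgent callers always reach a human
        return True
    if latency_ms > latency_budget_ms:   # blown tail-latency budget degrades UX
        return True
    threshold = TRANSFER_THRESHOLDS.get(result.label, 0.85)
    return result.confidence < threshold
```

Under this framing, clawback is the complementary mechanism: when a transfer fires that the policy later judges mistimed, the call is pulled back, so the trigger can stay aggressive without permanently losing the interaction.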

1 job