About

Sesame builds voice interfaces through tight integration of hardware, software, and machine learning, pursuing research in speech generation, personality modeling, and multimodal ML. The company operates large GPU clusters to support ambitious research programs aimed at making computers lifelike through natural voice interaction, with development cycles measured in days rather than quarters. Backed by a16z, Sequoia, Spark, and Matrix, the technical effort spans PyTorch-based model development alongside Android and iOS deployment, with infrastructure supporting rapid iteration from whiteboard concepts to production systems.

The engineering organization comprises an interdisciplinary team of long-tenured experts across machine learning, hardware, software, and entertainment backgrounds, operating from offices in San Francisco, Bellevue, and New York. Core technical domains include speech generation systems, personality modeling for voice companions, and multimodal ML architectures that coordinate audio and other sensory inputs. The product strategy emphasizes deliberate design choices to create voice interfaces that are nuanced and intimate rather than intrusive, with hardware engineering efforts targeting lightweight eyewear form factors for all-day wear.

Infrastructure and operational requirements center on GPU cluster management to support training and inference for speech models, alongside mobile platform engineering for real-time voice processing. The central technical challenge is crossing the uncanny valley in voice interaction: achieving low latency, naturalness, and contextual appropriateness simultaneously across diverse usage scenarios. Team composition reflects this: specialists in human-computer interaction work alongside ML researchers and hardware engineers to optimize the full stack from acoustic modeling through industrial design.

Open roles at Sesame

Explore 23 open positions at Sesame and find your next opportunity.

Audio System TPM

Sesame

Taipei City, Taipei, Taiwan (On-site)

2w ago

Sensing Systems Engineer

Sesame

San Francisco, California, United States (On-site)

$175K – $280K Yearly

2w ago

Technical Program Manager, Data

Sesame

San Francisco, California, United States (On-site)

$200K – $260K Yearly

2w ago

Similar companies

xAI

xAI is an AI company founded by Elon Musk in 2023 with the mission to advance scientific discovery and gain a deeper understanding of the universe through artificial intelligence.

172 jobs

Cerebras

Cerebras Systems builds the world's fastest AI infrastructure with industry-leading speed, scale, and quality through wafer-scale AI chips.

87 jobs

ElevenLabs

ElevenLabs is an AI audio research and deployment company building the most realistic voice AI platform, powering millions of developers, creators, and enterprises with text-to-speech, voice cloning, and conversational AI agents.

43 jobs

Cartesia

Cartesia builds real-time multimodal AI models, including the Sonic text-to-speech and Ink speech-to-text systems, to power next-generation voice applications.

22 jobs

Mirelo AI

Mirelo AI is a Berlin-based startup building AI foundation models that generate synchronized sound effects and music for videos in seconds, addressing the gap in audio technology for generative AI.

8 jobs

Reka

Reka is a frontier AI research and product company building unified multimodal foundation models that understand text, images, video, and audio to empower organizations and enterprises.