turbopuffer

About

turbopuffer is a serverless vector and full-text search database built on object storage. It separates compute from storage to address the trade-off between latency, cost, and throughput in production retrieval systems. The architecture uses tiered storage, with NVMe/SSD caching layered over object storage, to handle variable query patterns and burst load while avoiding the fixed cost of traditional in-memory vector databases.
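To make the tiered-storage idea concrete, here is a minimal sketch of a read-through cache in front of a slower backing store. It is an illustration only, not turbopuffer's implementation: the `TieredStore` class, the LRU eviction policy, and the plain dict standing in for object storage are all assumptions made for the example.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical read-through LRU cache over a slower backing store."""

    def __init__(self, backing, capacity):
        self.backing = backing      # stand-in for object storage (a dict here)
        self.capacity = capacity    # max entries kept in the fast tier
        self.cache = OrderedDict()  # stand-in for the NVMe/SSD tier
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            # Fast path: serve from the cache and mark as most recently used.
            self.cache.move_to_end(key)
            self.hits += 1
            return self.cache[key]
        # Slow path: fetch from the backing store, then populate the cache.
        self.misses += 1
        value = self.backing[key]
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            # Evict the least recently used entry.
            self.cache.popitem(last=False)
        return value


store = TieredStore({"a": 1, "b": 2, "c": 3}, capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    store.get(key)
print(store.hits, store.misses)
```

With a cache smaller than the working set, most reads in this access pattern fall through to the slow tier, which is the cache-miss penalty a tiered design has to budget for.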

The system handles 3.5T+ documents, 10M+ writes/s, and 25k+ queries/s, with support for hybrid search (vector + full-text) and metadata filtering. Serverless scaling means you pay for what you use, and separating compute from storage removes the need to over-provision either one. This matters for workloads with bursty traffic or datasets that grow unpredictably, both common in AI retrieval pipelines feeding assistants and agents.

The design makes explicit trade-offs around tail latency and operational complexity. Tiered storage introduces variable access costs and potential cache-miss penalties, so it requires careful tuning for your query profile. Integrating full-text search alongside vectors reduces the need for multiple systems, but hybrid scoring and ranking add computational overhead that increases per-query latency. Metadata filtering allows selective search without scanning the full corpus, which is critical for keeping query costs down in gated retrieval scenarios.
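As a rough illustration of hybrid scoring with metadata filtering, the sketch below filters two ranked result lists by a metadata predicate and merges them with reciprocal rank fusion (RRF). RRF is a common fusion technique chosen here for the example; the source does not say which scoring method turbopuffer uses. The function names, the sample documents, and the constant `k=60` are all hypothetical.

```python
def rrf_fuse(rankings, k=60):
    """Merge ranked lists of document ids with reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_search(vector_ranking, text_ranking, metadata, predicate):
    """Drop documents that fail the metadata predicate, then fuse the rest."""
    filtered = [
        [d for d in ranking if predicate(metadata[d])]
        for ranking in (vector_ranking, text_ranking)
    ]
    return rrf_fuse(filtered)


# Hypothetical sample data: ids ranked by each retriever, plus metadata.
metadata = {
    "d1": {"lang": "en"},
    "d2": {"lang": "de"},
    "d3": {"lang": "en"},
    "d4": {"lang": "en"},
}
vector_ranking = ["d1", "d2", "d3"]  # assumed output of a vector search
text_ranking = ["d3", "d4", "d1"]    # assumed output of a full-text search

result = hybrid_search(
    vector_ranking, text_ranking, metadata, lambda m: m["lang"] == "en"
)
print(result)
```

The fusion step touches every candidate from both retrievers, which is one source of the extra per-query work that hybrid ranking adds. Filtering before fusion keeps that candidate set small.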

Similar companies

Perplexity

Perplexity is an AI-powered answer engine that provides accurate, real-time answers to questions backed by credible sources and citations.

54 jobs
Modal

Modal is a serverless compute platform for AI and data teams that enables running compute-intensive workloads like ML inference, fine-tuning, and batch jobs with instant GPU access and usage-based pricing.

28 jobs
Pinecone

Pinecone is the leading vector database for building accurate and performant AI applications at scale in production.

2 jobs
Qdrant

Qdrant is an open-source vector database and similarity search engine written in Rust, powering AI applications with high-performance vector similarity search technology.

1 job
Weaviate

Weaviate is an open-source vector database for semantic search, RAG, and agentic AI workflows, with support for hybrid search and embeddings.

Bento

Bento provides an open-source framework and enterprise platform for deploying and operating AI/ML model inference in production with control over performance, scaling, and operational complexity.