
Stability AI

Similar companies


Graphcore

Graphcore, a British semiconductor company and wholly owned subsidiary of SoftBank Group, develops specialized AI compute hardware centered on its Intelligence Processing Unit (IPU). The IPU is a processor architecture designed specifically for machine-intelligence workloads rather than general-purpose computing. The company has built a complete AI compute stack spanning silicon design through datacenter infrastructure, including the Poplar software framework that sits atop the hardware.

Graphcore brought the first Wafer-on-Wafer AI processor to market, a packaging approach that addresses the bandwidth and latency constraints inherent in traditional chip-to-chip interconnects for AI workloads. The technical scope spans semiconductor engineering, processor design, and AI-specific optimization across both hardware and software layers: the engineering team works on silicon design, wafer-scale integration technology, and tools for AI model optimization. The software stack includes developer tools designed to extract performance from the IPU architecture, with ongoing work to optimize popular AI models for the platform. This systems-level approach targets the throughput and efficiency bottlenecks that emerge when running large-scale machine learning workloads on conventional processor architectures.

Under CEO Nigel Toon's leadership, Graphcore operates globally with teams of semiconductor, software, and AI specialists. The company's technology stack combines standard datacenter interfaces (PCIe, DDR, Ethernet) with proprietary elements like the IPU and Poplar software. The subsidiary structure under SoftBank provides backing for continued development of both the silicon and the software layers required to compete in AI compute infrastructure, where the trade-off between custom silicon development costs and performance gains defines commercial viability.

197 jobs

Perplexity

Perplexity operates an AI-powered answer engine processing over 150 million questions weekly across web, mobile, and enterprise platforms. Founded in 2022, the system combines real-time web search with multiple LLMs to deliver source-attributed answers. The architecture serves both consumer and enterprise workloads, with enterprise deployments requiring security guarantees for knowledge-worker use cases, including legal-research partnerships with organizations like Latham & Watkins.

The technical stack runs on AWS infrastructure with Terraform for provisioning, Python and Go for backend services, and PyTorch with DeepSpeed and FSDP for model training and inference. Data pipelines use dbt, SQL, Snowflake, and Databricks; frontend implementations use React and TypeScript, with Docker containerization and Open Policy Agent for access control. This architecture must meet tail-latency and throughput requirements for real-time search retrieval paired with LLM inference at consumer scale, while keeping source-credibility verification in the critical path.

The engineering focus centers on information-retrieval accuracy, model response quality, and citation reliability rather than advertising optimization. Production systems must balance inference cost against answer quality across multiple models, manage retrieval latency for real-time web indexing, and maintain reliability for both free-tier consumer traffic and enterprise SLA requirements. Pro-tier monetization suggests capacity-based or model-selection tiering rather than pure ad-based revenue.
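The retrieve-then-answer-with-citations pattern described above can be sketched in miniature. This is an illustrative toy, not Perplexity's actual pipeline: the term-overlap scoring, the corpus shape, and the prompt format are all assumptions standing in for real web retrieval and LLM inference.

```python
# Toy retrieval-augmented answering with numbered source attribution.
# Hypothetical sketch: scoring, corpus shape, and prompt format are assumed.

def score(query: str, doc: str) -> int:
    """Crude relevance: count of document tokens that appear in the query."""
    terms = set(query.lower().split())
    return sum(1 for tok in doc.lower().split() if tok in terms)

def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Return URLs of the top-k documents by term overlap."""
    ranked = sorted(corpus, key=lambda url: score(query, corpus[url]), reverse=True)
    return ranked[:k]

def build_prompt(query: str, corpus: dict[str, str], k: int = 2):
    """Assemble an LLM prompt with numbered sources plus a citation map,
    so a generated answer can carry [n]-style attributions back to URLs."""
    urls = retrieve(query, corpus, k)
    citations = {i + 1: url for i, url in enumerate(urls)}
    context = "\n".join(f"[{i}] {corpus[url]}" for i, url in citations.items())
    prompt = f"Answer using only the sources below; cite as [n].\n{context}\nQ: {query}"
    return prompt, citations
```

A real system replaces the scoring function with a live web index and passes the prompt to a model, but the citation map is the piece that keeps source attribution verifiable after generation.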

76 jobs

Descript

Descript builds a video and audio editing platform that replaces timeline-based manipulation with text-based editing: users cut and rearrange content by editing transcribed text rather than working directly with waveforms or video tracks. The system serves millions of creators, handling the full production pipeline from recording through collaborative editing to publication.

Core technical domains span machine learning for transcription and automated design, text-based editing interfaces built on React and TypeScript, and distributed collaboration infrastructure. The platform's architecture supports both solo and team workflows across time zones, with backend systems running on PostgreSQL and Redis. Technical focus areas include generative AI capabilities that create content from natural-language descriptions, automated design systems that reduce manual formatting work, and the fundamental text-to-media mapping that enables document-style editing of temporal content.

The team combines creator domain expertise with systems engineering, reflected in stated priorities around human-centered design and products that handle real production constraints rather than demo cases. The stack centers on TypeScript/React for client interfaces, Python for ML pipelines, and SQL-based data infrastructure with dbt for transformation logic; REST APIs provide integration points. Current engineering emphasis appears weighted toward extending ML capabilities (transcription accuracy, generative features, design automation) alongside the operational complexity of maintaining reliable performance at scale for collaborative real-time editing workflows.
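The text-to-media mapping at the heart of this model can be illustrated with a toy: if each transcript word carries a time range, deleting words from the "document" reduces to computing which media segments to keep. The data shape and function names below are hypothetical, not Descript's API.

```python
# Toy text-to-media mapping: transcript edits become media cut lists.
# Hypothetical sketch; the Word shape and keep_ranges name are assumptions.

from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float  # seconds into the recording
    end: float

def keep_ranges(words: list[Word], deleted: set[int]) -> list[tuple[float, float]]:
    """Return merged (start, end) media ranges covering every word that was
    not deleted from the transcript, coalescing contiguous spans."""
    ranges: list[tuple[float, float]] = []
    for i, w in enumerate(words):
        if i in deleted:
            continue
        if ranges and abs(ranges[-1][1] - w.start) < 1e-9:
            ranges[-1] = (ranges[-1][0], w.end)  # contiguous: extend last range
        else:
            ranges.append((w.start, w.end))
    return ranges
```

Deleting the filler word at index 0 of a three-word transcript, for instance, yields a single keep-range starting where the second word begins; a renderer can then splice the media from those ranges without the user ever touching a waveform.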

32 jobs

Pinecone

Pinecone operates a fully managed vector database service designed for production AI applications that store and retrieve high-dimensional embeddings. The system handles vector search at scale across recommendation systems, semantic search, and related ML-backed services. Founded by Edo Liberty, formerly a research director at AWS with prior experience building large-scale custom vector search systems, the company is credited with establishing the vector database category as a distinct infrastructure layer.

The technical stack centers on systems languages (Rust, Go, C++, and Python) with RocksDB as the storage engine and Kubernetes orchestration across AWS, GCP, and Azure. This architecture targets the operational complexity of managing embedding indices, query latency, and throughput at production scale, abstracting infrastructure decisions away from engineering teams deploying AI features.

The platform serves thousands of companies, positioning itself on ease of deployment and reduced time-to-production for vector-backed applications. The founding principle emphasizes accessibility for engineering teams of varying sizes, evolving the managed-service model to minimize the operational overhead of running vector workloads. Core focus areas include retrieval performance, reliability under production load, and the cost-efficiency trade-offs inherent to high-dimensional search systems.
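The core query semantics a vector database provides can be shown with a minimal sketch: exact top-k retrieval by cosine similarity over an in-memory index. Production systems like Pinecone use approximate nearest-neighbor indices and a persistent storage engine rather than this brute-force scan; the identifiers and structure here are illustrative assumptions.

```python
# Brute-force cosine top-k: the query semantics of a vector index,
# without the approximate indexing or persistence a real system needs.

import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], index: dict[str, list[float]], k: int = 3):
    """Return the k (id, score) pairs most similar to the query vector."""
    scored = [(vid, cosine(query, vec)) for vid, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The brute-force scan is O(n·d) per query, which is exactly the cost curve that approximate indices (and the latency/recall trade-offs the paragraph above alludes to) exist to bend at production scale.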

9 jobs