
About

Perplexity operates an AI-powered answer engine processing over 150 million questions weekly across web, mobile, and enterprise platforms. Founded in 2022, the company combines real-time web search with multiple LLMs to deliver source-attributed answers. The architecture serves both consumer and enterprise workloads, with enterprise deployments requiring security guarantees for knowledge-worker use cases, including legal-research partnerships with firms such as Latham & Watkins.

The technical stack runs on AWS infrastructure with Terraform for provisioning, Python and Go for backend services, and PyTorch with DeepSpeed and FSDP for model training and inference. Data pipelines use dbt, SQL, Snowflake, and Databricks. Frontends are built with React and TypeScript, with Docker for containerization and Open Policy Agent for access control. The architecture must meet tail-latency and throughput requirements for real-time search retrieval paired with LLM inference at consumer scale, while keeping source-credibility verification in the critical path.
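The retrieval-plus-generation flow with source attribution can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Perplexity's implementation; the `Source` type, prompt format, and citation check are all invented here:

```python
import re
from dataclasses import dataclass

@dataclass
class Source:
    url: str
    snippet: str

def build_prompt(question: str, sources: list[Source]) -> str:
    """Number each retrieved source so the model can cite it as [n]."""
    numbered = "\n".join(
        f"[{i}] {s.url}: {s.snippet}" for i, s in enumerate(sources, start=1)
    )
    return (
        "Answer using only the sources below, citing each claim as [n].\n\n"
        f"{numbered}\n\nQuestion: {question}\nAnswer:"
    )

def citations_valid(answer: str, sources: list[Source]) -> bool:
    """In-critical-path check: reject answers citing a source number
    that was never retrieved."""
    cited = {int(m) for m in re.findall(r"\[(\d+)\]", answer)}
    return cited.issubset(set(range(1, len(sources) + 1)))
```

The point of the sketch is structural: attribution is enforced by construction (sources numbered into the prompt) and verified before serving, rather than trusted from model output.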

The engineering focus centers on information-retrieval accuracy, model response quality, and citation reliability rather than advertising optimization. Production systems must balance inference cost against answer quality across multiple models, manage retrieval latency for real-time web indexing, and maintain reliability for both free-tier consumer traffic and enterprise SLA requirements. Pro-tier monetization suggests tiering by capacity or model selection rather than purely ad-based revenue.
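One common way such cost/quality balancing is implemented is a model router that sends paid traffic to a frontier model and triages free-tier traffic by a cheap complexity proxy. A hedged sketch; the model names, prices, and word-count heuristic below are invented for illustration, not Perplexity's actual routing:

```python
# Hypothetical catalog: model names and per-token prices are invented.
CATALOG = {
    "small":    {"cost_per_1k_tokens": 0.20},
    "frontier": {"cost_per_1k_tokens": 5.00},
}

def route_model(tier: str, query: str) -> str:
    """Pro users always get the frontier model; free-tier queries are
    triaged by a crude complexity proxy (word count) to cap cost."""
    if tier == "pro":
        return "frontier"
    return "frontier" if len(query.split()) > 25 else "small"

def estimated_cost(model: str, tokens: int) -> float:
    """Dollar cost of serving `tokens` tokens on the given model."""
    return CATALOG[model]["cost_per_1k_tokens"] * tokens / 1000
```

Real routers would use learned difficulty classifiers rather than length, but the tier-first, complexity-second structure is the same.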

Open roles at Perplexity

Explore 67 open positions at Perplexity and find your next opportunity.

Cloud Security Engineer

Perplexity

San Francisco, California, United States (On-site)

$210K – $385K Yearly · 3mo ago

Backend Software Engineer - Mobile (San Francisco, Palo Alto, New York, Belgrade, London)

Perplexity

San Francisco, California, United States (On-site)

$210K – $385K Yearly · 3mo ago

Backend Software Engineer - Search, Crawler Team (London, Belgrade, Berlin)

Perplexity

Belgrade, Serbia (On-site)

3mo ago

Search Senior Machine Learning Engineer (London, Belgrade, Berlin)

Perplexity

Belgrade, Serbia (On-site)

3mo ago

Senior C++ Developer - Search Core (London, Belgrade, Berlin)

Perplexity

Belgrade, Serbia (On-site)

3mo ago

Forward-Deployed Engineer - API Platform | London, NYC, Seattle, SF

Perplexity

New York, New York, United States (On-site)

$205K – $335K Yearly · 3mo ago

AI Engineer, Applied ML

Perplexity

San Francisco, California, United States (On-site)

$210K – $385K Yearly · 3mo ago

Search Rust Engineer (London, Belgrade, Berlin)

Perplexity

Belgrade, Serbia (On-site)

3mo ago

Enterprise Growth Lead

Perplexity

San Francisco, California, United States (On-site)

$200K – $400K Yearly · 3mo ago

Staff Backend Software Engineer - API Platform | NYC, Seattle, SF

Perplexity

San Francisco, California, United States (On-site)

$250K – $385K Yearly · 3mo ago

Internship - Machine Learning Research Engineer (Berlin)

Perplexity

Berlin, Germany (On-site)

3mo ago

Model Behavior Architect

Perplexity

San Francisco, California, United States (On-site)

$180K – $260K Yearly · 3mo ago

Search Golang Engineer (London, Belgrade, Berlin)

Perplexity

Belgrade, Serbia (On-site)

3mo ago

Product Marketing Manager

Perplexity

San Francisco, California, United States (On-site)

$220K – $290K Yearly · 3mo ago

Tech Lead Manager - Agents

Perplexity

San Francisco, California, United States (On-site)

$300K – $385K Yearly · 3mo ago

Strategic Finance Lead - Core

Perplexity

San Francisco, California, United States (On-site)

$170K – $230K Yearly · 3mo ago

Frontend Engineer - Design Systems

Perplexity

San Francisco, California, United States (On-site)

$210K – $385K Yearly · 3mo ago

Software Engineer - Security

Perplexity

San Francisco, California, United States or Remote (United States)

$210K – $385K Yearly · 3mo ago

Senior/Staff Web Platform Engineer | NYC, Seattle, SF

Perplexity

San Francisco, California, United States (On-site)

$250K – $385K Yearly · 3mo ago

AI Research Lead

Perplexity

San Francisco, California, United States (On-site)

$300K – $470K Yearly · 3mo ago

Similar companies

LangChain

LangChain operates an engineering platform and open source frameworks for building, testing, and deploying AI agents. The core offering comprises LangChain and LangGraph, open source frameworks providing pre-built architectures and access to 1,000+ integrations, alongside LangSmith, a commercial platform for observability, evaluation, and deployment of LLM systems. The frameworks see over 90 million combined downloads per month and are used by millions of developers worldwide, with named deployments at Replit, Clay, Cloudflare, Harvey, Rippling, Vanta, Workday, LinkedIn, and Coinbase.

The technical stack addresses the production bottlenecks of agent engineering: reliability through comprehensive observability, evaluation tooling to surface failure modes before deployment, and deployment infrastructure to move from prototype to production. LangSmith provides the operational layer for teams moving LLM systems into production environments, while the open source frameworks prioritize development velocity through pre-built components and extensive integration coverage. The architecture allows granular control over agent behavior while reducing the complexity of integrating external services and managing LLM system reliability at scale.

LangChain serves both major enterprises and startups building AI agents, with technical domains spanning agent engineering, LLM systems, observability, evaluation, and developer tooling. The company is led by CEO Harrison Chase and maintains a US headquarters, with a worldwide developer base. The dual model of open source frameworks and commercial platform reflects a focus on production-readiness and operational support for teams deploying agents at scale.

114 jobs

Cohere

Cohere builds enterprise-focused foundation models designed for production deployment, with emphasis on security, privacy, and operational trust. Founded in 2019 in Toronto, the company has raised nearly $1 billion and scaled to hundreds of employees worldwide. The technical focus spans semantic search, content generation, and customer-experience applications, domains where model reliability and data governance are non-negotiable constraints for enterprise adoption.

The company's architecture decisions reflect production realities over research novelty. Models are architected for deployment into regulated environments where data residency, access controls, and audit trails matter as much as accuracy metrics. This positioning addresses the gap between frontier model capabilities and enterprise operational requirements: latency SLAs, cost predictability, and compliance frameworks that prevent many organizations from operationalizing public AI APIs.

Cohere Labs has published over 100 papers and built a research community of 4,500+ researchers, signaling ongoing investment in foundational work rather than a pure application-layer focus. The team composition skews heavily toward researchers and engineers from academic backgrounds, which maps to the technical challenge space: building models that balance performance, safety constraints, and deployment flexibility across varied enterprise infrastructure.

106 jobs

Baseten

Baseten builds AI infrastructure for production deployment and scaling of models, with work spanning kernel-level optimization for inference performance through developer tooling. The platform ships daily, measuring success by the real-world impact of AI products running on it rather than vanity metrics. Engineers embed directly with customers to surface operational bottlenecks, then optimize obsessively; work ranges from TensorRT-LLM and CUDA kernel tuning to building developer tools that reduce deployment friction.

The stack centers on inference at scale: TensorRT-LLM and PyTorch for model execution, NVIDIA Triton Inference Server for serving, Kubernetes (EKS) with Karpenter for autoscaling, and Knative for event-driven workloads on AWS EC2. Infrastructure decisions prioritize shipping velocity over process: small teams with real ownership iterate rapidly on production reliability, latency (including tail behavior), and cost efficiency. Docker containerization and PostgreSQL round out core operational dependencies.

The team is internationally distributed, composed of engineers and designers who take craft seriously without performative posturing. Customer-embedded engineering informs both platform architecture and developer-experience tradeoffs, creating tight feedback loops between deployment reality and infrastructure evolution. From founding, the approach has centered on hands-on problem solving and rapid iteration rather than abstraction layers that delay production learning.

69 jobs

Runpod

RunPod operates an end-to-end AI infrastructure platform focused on GPU compute provisioning for model training, inference, and distributed agent orchestration. The platform serves over 500,000 developers, spanning solo practitioners to enterprise teams deploying at scale. Core infrastructure handles compute allocation, orchestration complexity, and operational overhead, positioning itself as accessible infrastructure rather than requiring deep systems expertise from users.

The technical stack centers on Go, Python, and TypeScript, with containerization through Docker and Kubernetes orchestration on Linux. Engineering domains span distributed systems, GPU compute scheduling, and developer tooling designed to abstract provisioning and scaling mechanics. The company emphasizes reducing operational friction: developers interact with compute resources without managing underlying cluster complexity or infrastructure provisioning bottlenecks.

RunPod maintains a remote-first structure with the team distributed across the U.S., Canada, Europe, and India. The platform's design reflects a systems-first approach to making GPU compute economically viable and operationally manageable, targeting workloads where cost, reliability, and time-to-deployment constrain AI development cycles.

26 jobs

Pinecone

Pinecone operates a fully managed vector database service designed for production AI applications requiring storage and retrieval of high-dimensional embeddings. The system handles vector search at scale across recommendation systems, semantic search, and related ML-backed services. Founded by Edo Liberty, formerly a research director at AWS with prior experience building custom vector search systems at large scale, the company is credited with establishing the vector database category as a distinct infrastructure layer.

The technical stack centers on systems languages (Rust, Go, C++, and Python) with RocksDB as the storage engine and Kubernetes orchestration across AWS, GCP, and Azure. This architecture targets the operational complexity of managing embedding indices, query latency, and throughput at production scale, abstracting infrastructure decisions from engineering teams deploying AI features.

The platform serves thousands of companies, positioning itself on ease of deployment and reduced time-to-production for vector-backed applications. The founding principle emphasizes accessibility for engineering teams of varying sizes, evolving the managed service model to minimize operational overhead in running vector workloads. Core focus areas include retrieval performance, reliability under production load, and cost-efficiency trade-offs inherent to high-dimensional search systems.
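What a managed vector database abstracts away is visible in the brute-force baseline: exact top-k retrieval by cosine similarity, which production systems replace with approximate indices to keep query latency bounded at scale. A minimal sketch under stated assumptions (the toy vectors and function names are illustrative, not Pinecone's API):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float],
          index: list[tuple[str, list[float]]],
          k: int = 2) -> list[str]:
    """Exact nearest-neighbor search over (id, vector) pairs.
    O(n * d) per query, which is why large deployments switch to
    approximate index structures instead of a linear scan."""
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

Everything beyond this scan (sharded indices, recall/latency trade-offs, persistence via engines like RocksDB) is the operational complexity the managed service takes on.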

9 jobs