Etched

About

Etched, founded in 2022, designs transformer-specific ASICs around a hard architectural bet: transformers are the dominant and durable abstraction for AI workloads, so the right move is to burn that assumption into silicon rather than preserve generality. Its first chip, Sohu, is a single-architecture ASIC built exclusively for transformer inference. The throughput claims are large: Etched cites over 500,000 tokens per second on Llama 70B and an order-of-magnitude improvement in both throughput and latency relative to NVIDIA's B200. The trade-off is explicit: Sohu cannot run non-transformer workloads, and the entire value proposition collapses if the architectural assumption does.

The performance claims, if they hold under production conditions, have direct implications for workloads where GPUs currently hit hard limits. Etched points to two in particular: real-time video generation models, where per-frame latency budgets are tight and sustained throughput requirements are high, and deep chain-of-thought reasoning agents, where long output sequences and large batch depths stress both memory bandwidth and end-to-end latency. Whether the claimed gains survive real deployment, across varied sequence lengths, batch sizes, quantization schemes, and serving topologies, is the evaluation question that matters most for operators considering adoption.
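For operators who want to make that evaluation question concrete, the sketch below enumerates a benchmark sweep over the four dimensions named above and shows the basic throughput arithmetic. Every value in the grid, along with the names `BenchmarkConfig`, `build_sweep`, `tokens_per_second`, and `per_stream_rate`, is an illustrative assumption and not part of any Etched tooling or published test matrix.

```python
from dataclasses import dataclass
from itertools import product

# Illustrative grid values only (assumptions, not Etched's test matrix).
SEQUENCE_LENGTHS = [512, 2048, 8192]
BATCH_SIZES = [1, 16, 128]
QUANT_SCHEMES = ["fp8", "bf16"]
TOPOLOGIES = ["single-node", "multi-node"]


@dataclass(frozen=True)
class BenchmarkConfig:
    """One point in the evaluation sweep (hypothetical structure)."""
    sequence_length: int
    batch_size: int
    quantization: str
    topology: str


def build_sweep():
    """Enumerate every combination of the four evaluation dimensions."""
    return [
        BenchmarkConfig(s, b, q, t)
        for s, b, q, t in product(
            SEQUENCE_LENGTHS, BATCH_SIZES, QUANT_SCHEMES, TOPOLOGIES
        )
    ]


def tokens_per_second(total_output_tokens: int, elapsed_seconds: float) -> float:
    """Aggregate throughput for one run: output tokens over wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_output_tokens / elapsed_seconds


def per_stream_rate(aggregate_tps: float, batch_size: int) -> float:
    """Tokens per second seen by one request if the batch shares throughput evenly."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return aggregate_tps / batch_size
```

Separating aggregate throughput from the per-stream rate matters because a headline tokens-per-second figure at a large batch size can coexist with a much lower rate for any single request, which is the number that latency-sensitive workloads actually experience.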

On the infrastructure side, Etched is partnering with Rambus on memory and interface technologies, which indicates where the bandwidth and signaling bottlenecks sit in a transformer-optimized design. The company has raised $120 million and carries a stated valuation of $5 billion, according to available reporting. Founders Gavin Uberti, Chris Zhu, and Robert Wachen lead the company, which is based in the US.

Open roles at Etched

Explore 25 open positions at Etched and find your next opportunity.

Inference Software Engineer

Etched

Cupertino, California, United States (On-site)

2w ago

Front-End Power Engineer

Etched

Cupertino, California, United States (On-site)

2w ago

Power Optimization Engineer

Etched

Cupertino, California, United States (On-site)

2w ago

Electrical Engineer, Hardware Systems

Etched

Cupertino, California, United States (On-site)

2w ago

Head of Legal

Etched

Cupertino, California, United States (On-site)

2w ago

Similar companies

Tenstorrent

Tenstorrent is a next-generation computing company that builds computers for AI, developing AI Graph Processors, high-performance RISC-V CPUs, and configurable chiplets with an open-source software stack.

116 jobs

Cerebras

Cerebras Systems builds the world's fastest AI infrastructure with industry-leading speed, scale, and quality through wafer-scale AI chips.

87 jobs

d-Matrix

d-Matrix builds purpose-built AI inference computing platforms to make generative AI commercially viable, efficient, and sustainable through digital in-memory compute technology.

43 jobs

SambaNova

SambaNova Systems is a full-stack AI infrastructure company delivering the fastest and most energy-efficient AI inference platform through custom RDU chips and software, enabling enterprises to deploy sovereign AI with complete data control.

3 jobs

Speedata

Speedata builds custom silicon (C200 APU) to accelerate analytics and AI data processing by executing Apache Spark SQL operations in hardware.

Inferact

Inferact commercializes vLLM, an open-source LLM inference engine built by its founders, to reduce inference latency, cost, and serving complexity at scale.