
Hugging Face

About

Hugging Face, founded in 2016, operates an open-source platform where the machine learning community collaborates on models, datasets, and applications. The company develops and distributes open-source AI and machine learning tools and libraries while maintaining active engagement with a developer community of millions worldwide. Its technical focus spans machine learning, artificial intelligence research, open-source software, developer tooling, community platforms, and software engineering. Operations have been fully distributed and remote since founding, with teams shipping code daily through asynchronous communication and regular virtual gatherings.

The company's core outputs include open-source libraries maintained with public contribution pathways, publicly shared research, and direct community engagement via GitHub and social platforms. The organizational structure combines engineers, researchers, and community advocates working across artificial intelligence, machine learning, developer tools, open-source software, and research verticals. Development follows a daily release cadence, with progress shared to the developer community in real time.
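Those libraries are what most developers touch first. As a minimal usage sketch of the transformers library, the company's most widely used package (the task and model ID below are arbitrary public examples, not taken from this page):

```python
# Minimal usage sketch of the transformers library. The task and model ID
# are arbitrary public examples, not taken from this page.
from transformers import pipeline

# pipeline() pulls the model and tokenizer from the Hugging Face Hub on
# first use and caches them locally for later runs.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("Open-source collaboration makes this easy."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]
```

First use downloads and caches the weights locally; subsequent calls run against the cache without re-downloading.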

The operational model emphasizes accessibility of AI technology, with an open-source philosophy guiding technical decisions and impact measurement. The culture supports contributor growth, celebrates contributions, and values public learning and initiative. Employee autonomy extends to location independence within the distributed structure. Co-founded by Clément Delangue and Julien Chaumond, the company is headquartered in the US while serving a global developer base.

Open roles at Hugging Face

Explore 10 open positions at Hugging Face and find your next opportunity.

Wild Card
Hugging Face · US or Remote (United States) · 3w ago

Open-Source Machine Learning Engineer - International Remote
Hugging Face · New York, US or Remote (Worldwide) · 3w ago

Open-Source Machine Learning Engineer, AI for Robotics - Paris Office
Hugging Face · Paris, France (On-site) · 1mo ago

UX/UI Designer - US Remote
Hugging Face · United States (Remote) · 1mo ago

Cloud Machine Learning Evangelist - EMEA remote
Hugging Face · Paris, France or Remote (Europe, Middle East, and Africa) · 2mo ago

Cloud Machine Learning Engineer - EMEA remote
Hugging Face · Paris, France or Remote (Europe, Middle East, and Africa) · 2mo ago

Cloud Machine Learning Evangelist - US remote
Hugging Face · New York, United States or Remote (United States) · 2mo ago

Data/Infrastructure Advocate Engineer - US Remote
Hugging Face · New York, United States or Remote (New York, United States) · 2mo ago

Data/Infrastructure Advocate Engineer - EMEA Remote
Hugging Face · Paris, France or Remote (Europe, Middle East, and Africa) · 2mo ago

Cloud Machine Learning Engineer - US remote
Hugging Face · United States or Remote (United States) · 3mo ago

Similar companies


Heidi

Heidi builds an AI Care Partner that automates clinical documentation, form filling, and task management for clinicians worldwide. The system has returned over 18 million hours to clinicians in 18 months and currently supports more than 2 million patient visits weekly across 116 countries and 110+ languages. The company has raised nearly $100 million from Point72, Anthropic, and Blackbird, with a stated goal of halving the time required to deliver patient-first care. The core technical challenge sits at the intersection of multilingual NLP, healthcare informatics, and production reliability at global scale. The system must handle clinical documentation workflows across diverse regulatory environments, languages, and medical specialties while maintaining accuracy and latency requirements that directly impact clinician workflows. The stack spans TypeScript, React, and Next.js on the frontend with Node.js (NestJS, Express) and Python on the backend, using PostgreSQL and MongoDB for persistence and running on GCP and AWS infrastructure. The team includes clinicians, engineers, and designers, with most employees having healthcare backgrounds or direct experience with clinician burnout. The operational philosophy emphasizes shipping small, fast iteration cycles, and tolerance for failure in pursuit of reducing administrative burden. The Australia-based company operates globally with Docker-based deployments and CI/CD pipelines supporting continuous delivery across production environments.
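As a hedged illustration of the documentation workflow described above (the field names, the SoapNote shape, and draft_note are assumptions for this sketch, not Heidi's actual schema), the contract between a diarized visit transcript and a drafted note might look like:

```python
# Hedged sketch, not Heidi's schema: the shape of a multilingual
# clinical-documentation request and a stub for the drafting stage.
from dataclasses import dataclass, field

@dataclass
class VisitTranscript:
    visit_id: str
    language: str            # BCP 47 tag, e.g. "es-MX"; 110+ languages in production
    specialty: str           # e.g. "cardiology"; affects terminology handling
    turns: list[str] = field(default_factory=list)  # diarized clinician/patient turns

@dataclass
class SoapNote:
    subjective: str
    objective: str
    assessment: str
    plan: str

def draft_note(t: VisitTranscript) -> SoapNote:
    """Placeholder for the NLP stage: a real system would call a speech/LLM
    pipeline tuned per language and specialty. This stub only demonstrates
    the contract, not the model."""
    text = " ".join(t.turns)
    return SoapNote(subjective=text[:200], objective="", assessment="", plan="")
```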

112 jobs

Replit

Replit operates a web-based code editor and multiplayer computing environment used by millions for collaborative software development. The platform eliminates traditional barriers to application creation through natural language interfaces, allowing users to build applications without conventional development workflows - demonstrated by architectural decisions like removing the save button from their editor. The multiplayer environment serves as infrastructure for experimentation, sharing, and collaborative growth at scale. The company measures success by the number of people empowered to create software rather than vanity metrics, reflecting a systems-level focus on removing bottlenecks in developer onboarding and productivity. Technical decisions prioritize shipping velocity and operational autonomy: the culture emphasizes extreme ownership, radical bets, and bias toward action. Engineers operate with the latitude to pursue emergent ideas and question established patterns when friction appears in the development loop. The platform's architecture supports collaborative coding workflows at scale, handling millions of concurrent users across a shared computing environment. This requires managing trade-offs between multi-tenancy constraints, latency in collaborative editing, and operational complexity of maintaining compute resources for distributed development sessions. The technical focus centers on developer tools, web-based editing infrastructure, and the reliability challenges of real-time collaborative computing.
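To make the collaborative-editing trade-off concrete, here is a minimal last-writer-wins register with Lamport timestamps - deliberately simpler than whatever Replit actually runs, and not presented as their approach. Production multiplayer editors typically use OT or CRDT sequence types precisely because last-writer-wins drops concurrent edits:

```python
# Hedged sketch, not Replit's implementation: a last-writer-wins (LWW)
# register, the simplest model for reasoning about the consistency/latency
# trade-off in collaborative editing.
from dataclasses import dataclass

@dataclass
class LWWRegister:
    my_id: int               # this replica's id; ties break deterministically on it
    value: str = ""
    timestamp: int = 0       # Lamport clock of the write that produced `value`
    writer_id: int = -1      # id of the replica that wrote `value`

    def local_write(self, value: str) -> tuple[str, int, int]:
        self.timestamp += 1
        self.value, self.writer_id = value, self.my_id
        return (value, self.timestamp, self.my_id)  # payload to broadcast to peers

    def merge(self, value: str, timestamp: int, writer_id: int) -> None:
        # Keep the causally newest write, breaking ties on writer id so all
        # replicas converge to the same value. The losing concurrent write
        # is discarded, which is the data-loss hazard richer CRDTs avoid.
        if (timestamp, writer_id) > (self.timestamp, self.writer_id):
            self.value, self.writer_id = value, writer_id
        self.timestamp = max(self.timestamp, timestamp)  # advance Lamport clock
```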

76 jobs

Etched

Etched, founded in 2022, designs transformer-specific ASICs with a hard architectural bet: transformers are the dominant and durable abstraction for AI workloads, so the right move is to burn that assumption into silicon rather than preserve generality. Their first chip, Sohu, is a single-model ASIC built exclusively for transformer inference. The throughput numbers are significant - Etched claims over 500,000 tokens per second on Llama 70B and an order-of-magnitude improvement in both throughput and latency relative to NVIDIA's B200. The trade-off is explicit: Sohu cannot run non-transformer workloads, and the entire value proposition collapses if the architectural assumption does. The performance claims, if they hold under production conditions, have direct implications for workloads where GPUs currently hit hard limits. Etched points to two in particular: real-time video generation models, where per-frame latency budgets are tight and sustained throughput requirements are high, and deep chain-of-thought reasoning agents, where long output sequences and large batch depths stress both memory bandwidth and end-to-end latency. Whether the claimed gains survive real deployment - across varied sequence lengths, batch sizes, quantization schemes, and serving topologies - is the evaluation question that matters most for operators considering adoption. On the infrastructure side, Etched is partnering with Rambus on memory and interface technologies, which speaks to where the bandwidth and signaling bottlenecks sit in a transformer-optimized design. The company has raised $120 million and carries a stated valuation of $5 billion as of available reporting. Founders Gavin Uberti, Chris Zhu, and Robert Wachen lead the company out of the US.
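To make the throughput claim concrete, a hedged back-of-envelope (every input below is an illustrative assumption, not an Etched or NVIDIA figure): batch-1 decoding of a dense 70B-parameter model at 8-bit weights must stream roughly 70 GB of weights per token, so aggregate token throughput is bounded by memory bandwidth times batch depth.

```python
# Back-of-envelope bound for weight-bandwidth-limited decoding of a dense
# transformer. All inputs are illustrative assumptions, not vendor figures,
# and the model ignores KV-cache traffic, attention compute, and parallelism
# overheads, so it is an upper bound on what bandwidth alone permits.
params = 70e9            # dense parameter count (Llama-70B class)
bytes_per_param = 1.0    # 8-bit weights
bw = 8e12                # assumed aggregate memory bandwidth, bytes/s (8 TB/s)
batch = 512              # concurrent sequences sharing each weight read

weight_bytes_per_step = params * bytes_per_param      # ~70 GB per decode step
tokens_per_sec = bw / weight_bytes_per_step * batch   # steps/s * batch depth
print(f"{tokens_per_sec:,.0f} tokens/s")              # ~58,514 with these inputs
```

With these assumptions the bound lands near 58,000 tokens per second, which is why a 500,000-token-per-second figure implicitly hinges on batch depth, quantization, and memory bandwidth well beyond the inputs above - the same variables flagged as the evaluation question for operators.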

50 jobs

Interaction

Interaction is a Palo Alto-based startup building Poke, an AI assistant that operates entirely within iMessage and SMS. The architecture constrains the system to function through messaging protocols rather than native apps, requiring the assistant to parse natural language commands, maintain conversational state, and execute actions across text, email, and calendar integrations - all mediated through message-based I/O. This introduces latency and throughput considerations inherent to SMS delivery networks and iMessage's API surface, alongside constraints on the rich UI feedback mechanisms available to native applications. The company raised $15M in seed funding led by General Catalyst. The technical challenge centers on building proactive intelligence that surfaces relevant information from communication patterns while operating within the reliability and availability constraints of carrier networks and Apple's messaging infrastructure. Cross-platform integration across email and calendar systems adds complexity in authentication flows, permission models, and error handling when actions must be triggered through conversational interfaces rather than direct API calls. The team includes engineers from quantitative trading firms, MIT, and Cambridge, as well as international science olympiad competitors. The stack includes Next.js, React, and SwiftUI, suggesting server-side processing for NLP workloads with client components for any companion interfaces. Production success depends on handling edge cases in natural language understanding, managing state across asynchronous message exchanges, and maintaining consistent behavior despite variable network conditions and platform-specific limitations in both iOS and carrier SMS systems.
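The state-management problem is the most tractable to sketch. A hedged illustration, not Interaction's design (the Session shape, message ids, and pending_action flow are assumptions): conversational state keyed by phone number, with idempotent handling of carrier redelivery.

```python
# Hedged sketch, not Interaction's design: per-phone conversational state
# with dedupe for SMS redelivery. All field names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Session:
    phone: str
    seen_ids: set[str] = field(default_factory=set)   # dedupe carrier retries
    history: list[str] = field(default_factory=list)  # running conversational state
    pending_action: str | None = None                 # e.g. awaiting a confirmation

def handle_inbound(sessions: dict[str, Session], phone: str,
                   msg_id: str, body: str) -> str | None:
    """Returns a reply body, or None if this is a duplicate delivery."""
    s = sessions.setdefault(phone, Session(phone))
    if msg_id in s.seen_ids:   # carriers may redeliver; handlers must be idempotent
        return None
    s.seen_ids.add(msg_id)
    s.history.append(body)
    if s.pending_action == "confirm_meeting" and body.strip().lower() in {"yes", "y"}:
        s.pending_action = None
        return "Done - I put it on your calendar."
    # A real system would run NLP over s.history here to choose the next action.
    return "Got it."
```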

1 job

Toma

Toma operates a voice AI platform for automotive dealerships, processing over 1,000,000 calls since launching in 2024. The system handles inbound phone operations - service scheduling, call routing, and follow-up automation - with safeguards designed to manage transfer latency and revenue leakage. Core technical challenge: maintaining conversational quality and intent detection accuracy across high-variance dealership scenarios (service appointments, parts inquiries, sales handoffs) while minimizing false transfers and dropped context. The platform implements transfer triggers, clawback mechanisms for mistimed handoffs, and follow-up alerts when human staff don't complete actions, addressing the operational complexity of human-AI transition points in production telephony. Infrastructure runs on AWS with a TypeScript/Next.js frontend, PostgreSQL via Prisma for state management, and tRPC for type-safe API boundaries. The voice AI layer must handle real-time constraints - low-latency speech recognition and synthesis, sub-second intent classification - while managing concurrent call volume and dealership-specific context (inventory, scheduling systems, staff availability). Trade-offs center on model selection for conversational understanding versus inference cost at scale, and the reliability surface area of integrating with legacy dealership management systems. Founded by engineers from Scale AI, Uber, Lyft, and Amazon; backed by Andreessen Horowitz and Y Combinator with $17 million Series A funding. Deployment spans dealerships across the United States, including Pohanka Automotive Group, SCHOMP, Hudson Automotive Group, and Bergey's. Primary bottlenecks likely involve tuning voice models for domain-specific terminology (vehicle makes, service codes, dealership jargon), managing tail latency in transfer decisions where milliseconds impact customer experience, and evaluating conversational success beyond simple call completion: did the AI correctly capture appointment details, route urgency appropriately, and preserve customer satisfaction? The system's value proposition hinges on converting missed calls and staff bottlenecks into captured revenue, which requires high precision on intent classification and low false-negative rates on transfer triggers to avoid revenue loss from mishandled interactions.
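The transfer-trigger trade-off described above can be sketched directly. This is a hedged illustration, not Toma's logic; the thresholds and CallState fields are assumptions made for the example.

```python
# Hedged sketch, not Toma's logic: a transfer-trigger policy balancing false
# transfers (wasted staff time) against false negatives (lost revenue from
# mishandled calls). Thresholds and field names are assumptions.
from dataclasses import dataclass

@dataclass
class CallState:
    intent: str               # e.g. "service_scheduling", "sales", "unknown"
    intent_confidence: float  # classifier confidence, 0..1
    turns_without_progress: int
    caller_requested_human: bool

def should_transfer(c: CallState,
                    conf_floor: float = 0.55,
                    stall_limit: int = 3) -> bool:
    # Explicit requests always win: overriding the caller erodes trust.
    if c.caller_requested_human:
        return True
    # Low-confidence intent on a revenue-bearing call escalates. Lowering
    # conf_floor raises false negatives; raising it raises false transfers,
    # which is the precision/recall trade discussed above.
    if c.intent in {"sales", "service_scheduling"} and c.intent_confidence < conf_floor:
        return True
    # Stalled conversations escalate before the caller hangs up.
    return c.turns_without_progress >= stall_limit
```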

1 job

Slingshot AI

Slingshot AI operates as a mental health research lab building a foundation model for psychology and an accompanying therapy chatbot. The technical stack spans model development (PyTorch, TensorFlow, JAX) and production infrastructure (GCP, Kubernetes, Cloud Run, gRPC) with client applications in Flutter and Next.js/React. The team combines machine learning engineering, product development, and clinical research expertise, working with therapists and clinicians to align model behavior with therapeutic practices. The core technical challenge is training a domain-specific foundation model that supports user agency in mental health contexts - framing the product as a tool that helps users recognize their own capacity for change rather than an answer-dispensing assistant. This architectural constraint requires careful training objective design and evaluation frameworks that measure therapeutic alignment, not just task completion. The system operates at global scale through partnerships with mental health organizations, though specific throughput or latency metrics are not disclosed. Development follows rapid iteration cycles with emphasis on shipping velocity. The engineering stack reflects production priorities: Rust for performance-critical paths, typed languages (TypeScript, Kotlin) for application logic, and container orchestration for deployment. The team works within the constraint of adapting general-purpose ML infrastructure to specialized clinical requirements while maintaining operational reliability for users seeking mental health support.
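A crude way to see what "therapeutic alignment, not just task completion" means as an evaluation target, hedged heavily: the rubric terms, weights, and string-matching approach below are assumptions for illustration, not Slingshot's framework, which would presumably rely on clinician-labeled data and model-based judges rather than keyword matching.

```python
# Hedged sketch, not Slingshot's evaluation framework: a toy proxy that
# rewards reflective, question-led responses and penalizes directive
# answer-dispensing. Markers and weights are illustrative assumptions.
ADVICE_MARKERS = ("you should", "you must", "the answer is")
AGENCY_MARKERS = ("what do you think", "what would", "how do you feel",
                  "what matters to you")

def therapeutic_alignment_score(response: str) -> float:
    """Score in [0, 1]: higher for agency-supporting phrasing, lower for
    directive advice. A real framework would not use string matching."""
    text = response.lower()
    directive = sum(m in text for m in ADVICE_MARKERS)
    reflective = sum(m in text for m in AGENCY_MARKERS)
    return max(0.0, min(1.0, 0.5 + 0.25 * reflective - 0.25 * directive))

print(therapeutic_alignment_score("You should quit your job."))          # 0.25
print(therapeutic_alignment_score("What matters to you most about it?")) # 0.75
```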

1 job