About

NVIDIA, founded in 1993 by Jensen Huang, Chris Malachowsky, and Curtis Priem, is the world leader in accelerated computing. The company pioneered the GPU in 1999 - a specialized processor that performs massive numbers of calculations in parallel - enabling the gaming, high-performance computing, and AI workloads that define modern computing infrastructure. What began as a focused effort to bring interactive 3D graphics to gaming and multimedia markets has evolved into a platform underpinning production inference systems, autonomous vehicle perception pipelines, robotics control loops, and scientific computing clusters where throughput and latency constraints are paramount.

The company's core technical domains span GPU architecture, parallel computing primitives, and accelerated computing frameworks across gaming, high-performance computing, artificial intelligence, autonomous vehicles, robotics, healthcare technology, and scientific computing. NVIDIA's hardware and software stack addresses the fundamental bottleneck in data-intensive applications: transforming massive datasets into actionable insights and real-time outputs when traditional CPU-bound architectures cannot meet throughput or latency requirements. This places the company at the architectural core of systems whose inference workloads - whether serving LLMs at scale, running real-time computer vision for autonomous navigation, or processing scientific simulations - require specialized compute with predictable performance characteristics.

NVIDIA operates globally across industry verticals where accelerated computing creates measurable performance advantages: PC gaming, AI model training and inference, autonomous vehicle development, robotics deployment, healthcare imaging and analysis, and scientific research computing. The company's approach centers on solving computational problems where parallelism, memory bandwidth, and specialized instruction sets provide orders-of-magnitude improvements over general-purpose processors - precisely the characteristics that matter in production inference environments, where cost per token, p99 latency, and GPU utilization directly impact system economics and user experience.

Open roles at NVIDIA

Explore 820 open positions at NVIDIA and find your next opportunity.

Manager, Systems Software Engineering - NVLink and AI
NVIDIA · Santa Clara, California, United States (On-site)
$224k – $356.5k Yearly · 1w ago

MCU Firmware Engineer
NVIDIA · Taipei City, Taipei, Taiwan (On-site)
1w ago

Senior Solutions Architect, Simulation
NVIDIA · Shanghai, Shanghai, China (On-site)
1w ago

Senior Test Methodology Engineer
NVIDIA · Santa Clara, California, United States (On-site)
$132k – $253k Yearly · 1w ago

Senior Software Developer, AI Networking
NVIDIA · Texas, United States (Remote)
$184k – $356.5k Yearly · 1w ago

Senior Product Marketing Manager - Data Science
NVIDIA · Santa Clara, California, United States (Hybrid)
$152k – $287.5k Yearly · 1w ago

Senior PR Manager, Quantum Computing and CAE
NVIDIA · Santa Clara, California, United States (On-site)
$168k – $270.3k Yearly · 1w ago

Senior Research Scientist, Fundamental LLM Research for Knowledge, Reasoning, and Agents
NVIDIA · Santa Clara, California, United States (On-site)
$224k – $356.5k Yearly · 1w ago

Senior Manager, Networking Application
NVIDIA · Yokneam Ilit, Northern District, Israel (On-site)
1w ago

Manufacture System Design Engineer
NVIDIA · Shanghai, Shanghai, China (On-site)
1w ago

Senior Accelerated Computing Product Manager
NVIDIA · Santa Clara, California, United States (On-site)
$168k – $327.8k Yearly · 1w ago

Senior Developer Relations Manager - Supercomputing
NVIDIA · Tokyo, Tokyo Prefecture, Japan (On-site)
1w ago

Senior Solutions Architect, Networking - Hyperscale
NVIDIA · Santa Clara, California, United States (On-site)
$224k – $431.3k Yearly · 1w ago

Senior Network Performance Exploration Engineer
NVIDIA · Tel Aviv-Yafo, Tel Aviv District, Israel (On-site)
1w ago

Senior Director, Global Commodity Management - Optics
NVIDIA · Santa Clara, California, United States (On-site)
$292k – $442.8k Yearly · 1w ago

AI Compute Engineer
NVIDIA · Yokneam Ilit, Northern District, Israel (On-site)
1w ago

Senior Software Engineer, Robotics - Isaac Lab
NVIDIA · Santa Clara, California, United States (On-site)
$152k – $287.5k Yearly · 1w ago

Tapeout Mask Design Engineer
NVIDIA · Santa Clara, California, United States (Hybrid)
$104k – $207k Yearly · 1w ago

Senior System Software Engineer - Dynamo-Triton Inference Server
NVIDIA · Santa Clara, California, United States (On-site)
$152k – $241.5k Yearly · 1w ago

Similar companies

Nebius

Nebius is a Nasdaq-listed technology company (NBIS) building full-stack AI infrastructure from its Amsterdam headquarters, with GPU clusters deployed across Europe and the United States. Led by CEO Arkady Volozh, the company operates AI-optimized sustainable data centers - including a facility 60 kilometers from Helsinki and a new Vineland, New Jersey site - and has raised significant capital ($700 million from investors including Accel, NVIDIA, and Orbis). The engineering organization, numbering in the hundreds, maintains deep expertise in large-scale infrastructure and runs an in-house AI R&D team that dogfoods the platform to validate it against the requirements of production ML practitioners.

The infrastructure stack combines hyperscaler-style features with supercomputer-grade performance; ISEG, Nebius's supercomputer, ranks among the world's most powerful systems. The platform integrates NVIDIA GPUs with NVIDIA InfiniBand networking, exposing workload orchestration through both Kubernetes and Slurm. The operational layer includes standard observability (Prometheus, Grafana), data infrastructure (PostgreSQL, Apache Spark), and ML tooling (MLflow, vLLM, Triton, Ray), with infrastructure-as-code managed via Terraform. This architecture targets the latency, throughput, and reliability requirements of AI training and inference workloads at scale. The company has secured a multi-billion dollar agreement with Microsoft to deliver dedicated AI infrastructure from its Vineland data center.

Nebius serves startups, research institutes, and enterprises across healthcare and life sciences, robotics, finance, and entertainment verticals. The technical approach emphasizes production-grade infrastructure that handles the operational complexity of large-scale AI deployments - managing GPU utilization, network bottlenecks, and the cost-performance trade-offs inherent in serving diverse AI workloads from model training through inference serving.

233 jobs

Cartesia

Cartesia builds real-time multimodal AI models for voice applications, with production systems spanning text-to-speech and speech-to-text. The company emerged from Stanford's AI Lab, where the founding team - led by CEO Karan Goel - pioneered work on State Space Models (SSMs) before transitioning to commercial infrastructure. Their technical approach combines model innovation with systems engineering, focusing on the latency, throughput, and operational constraints that define production voice AI.

The core product line includes Sonic, a text-to-speech model designed for emotive, human-like output, and Ink, a recently launched speech-to-text system purpose-built for real-time voice applications. Both systems address the fundamental trade-offs in voice AI: achieving low-latency inference while maintaining quality at scale. The company's technical domains span foundation model development, real-time multimodal intelligence, and developer tooling - infrastructure that runs where users are rather than requiring server-side processing.

Cartesia's engineering stack runs on Python, Go, and TypeScript, supporting developers building voice interfaces that demand sub-second response times and reliable performance under production load. The team's research background in SSMs informs their approach to model efficiency and scalability, though the company now focuses on shipping production systems rather than pure research. Their stated mission centers on ubiquitous, interactive intelligence - systems that handle the operational complexity of real-time voice while remaining accessible to developers building conversational interfaces.

25 jobs

Gladia

Gladia operates speech-to-text APIs across two distinct workloads: real-time streaming at sub-300ms latency and asynchronous batch transcription, both supporting over 100 languages. The real-time path handles streaming audio with integrated speaker diarization, word-level timestamps, and sentiment analysis in the inference loop. The async path processes batch jobs with code-switching detection - single utterances spanning multiple languages - and comparable feature coverage. Over 150,000 users and 700 enterprise deployments (including VEED.IO, Circleback, Attention) generate production traffic against these endpoints.

The core technical challenge is maintaining sub-300ms end-to-end latency on the streaming path while running diarization and alignment models alongside the primary ASR stack. Meeting this threshold at scale - across 100+ language models with varying acoustic characteristics - requires careful management of model load times, batching strategies, and inference queue depth. The async API trades latency tolerance for throughput optimization on longer-form audio, though specific cost-per-hour or throughput metrics are not disclosed. Code-switching introduces additional complexity: language detection, model routing, and boundary stitching must occur without degrading transcription accuracy or introducing alignment artifacts at switch points.

Founded in 2022, the company raised a $16 million Series A from Sequoia Capital, XAnge, and New Wave. Founders Jean-Louis Quéguiner and Jonathan Soto positioned the service as audio infrastructure for voice-first platforms rather than a narrow transcription tool. The engineering focus centers on reliability and operational predictability across multilingual inference workloads - handling acoustic variability, speaker overlap, background noise, and model version rollouts without service degradation. Production deployment at this user scale surfaces edge cases in language detection, diarization boundary errors, and latency tail behavior that define the system's actual robustness beyond benchmarked WER numbers.

1 job