About

NVIDIA, founded in 1993 by Jensen Huang, Chris Malachowsky, and Curtis Priem, is the world leader in accelerated computing. The company pioneered the GPU in 1999, a specialized processor that performs many mathematical operations in parallel, enabling the gaming, high-performance computing, and AI workloads that define modern computing infrastructure. What began as a focused effort to bring interactive 3D graphics to gaming and multimedia markets has evolved into a platform underpinning production inference systems, autonomous-vehicle perception pipelines, robotics control loops, and scientific computing clusters where throughput and latency constraints are paramount.

The company's core technical domains span GPU architecture, parallel computing primitives, and accelerated computing frameworks across gaming, high-performance computing, artificial intelligence, autonomous vehicles, robotics, healthcare, and scientific computing. NVIDIA's hardware and software stack targets the fundamental bottleneck in data-intensive applications: turning massive datasets into real-time outputs where traditional CPU-bound architectures cannot meet throughput or latency requirements. This places the company at the architectural foundation of systems where inference workloads, whether serving LLMs at scale, running real-time computer vision for autonomous navigation, or processing scientific simulations, require specialized compute with predictable performance characteristics.

NVIDIA operates globally across industry verticals where accelerated computing creates measurable performance advantages: PC gaming, AI model training and inference, autonomous vehicle development, robotics deployment, healthcare imaging and analysis, and scientific research computing. The company's approach centers on solving computational problems where parallelism, memory bandwidth, and specialized instruction sets provide orders-of-magnitude improvements over general-purpose processors - the precise trade-offs that matter in production inference environments where cost per token, p99 latency, and GPU utilization directly impact system economics and user experience.
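Two of the production-economics metrics named above, p99 latency and cost per token, can be made concrete with a short sketch. This is plain Python for illustration only, not NVIDIA tooling; the request latencies and pricing figures are hypothetical.

```python
import math

def p99_latency_ms(latencies_ms):
    """Nearest-rank p99: 99% of observed requests completed at or below this value."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.99 * len(ordered))  # 1-based nearest-rank index
    return ordered[rank - 1]

def cost_per_token(gpu_hour_cost_usd, tokens_per_second):
    """USD per generated token for a GPU billed by the hour."""
    return gpu_hour_cost_usd / (tokens_per_second * 3600)

# Hypothetical request log: most requests are fast, a few sit in the tail.
latencies = [20.0 + i for i in range(98)] + [400.0, 950.0]
print(p99_latency_ms(latencies))      # a tail value, far above the mean
print(cost_per_token(2.50, 1200.0))   # $2.50/hr GPU sustaining 1200 tok/s
```

The point of the sketch is that p99 is dominated by the tail, not the average, which is why tail latency rather than mean latency drives capacity planning for inference serving.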

Open roles at NVIDIA

Explore 820 open positions at NVIDIA and find your next opportunity.

Senior GPU Low Power Architect
NVIDIA · Santa Clara, California, United States (On-site)
$136k – $264.5k yearly · 1w ago

GPU Architecture Engineer - New College Grad 2025
NVIDIA · Santa Clara, California, United States (On-site)
$124k – $241.5k yearly · 1w ago

Senior Technical Data Analyst - Operations E2E Data Intelligent Systems
NVIDIA · Santa Clara, California, United States (On-site)
$168k – $258.8k yearly · 1w ago

Senior Software Program Manager – CSP Engagements
NVIDIA · Santa Clara, California, United States (On-site)
$168k – $322k yearly · 1w ago

Sales Development Specialist
NVIDIA · München, Bavaria, Germany (On-site)
1w ago

Senior GPU Functional Modeling Architect
NVIDIA · Santa Clara, California, United States (On-site)
$152k – $287.5k yearly · 1w ago

Senior Software Engineer, Hardware-Oriented
NVIDIA · Yokneam Ilit, Northern District, Israel (On-site)
1w ago

Manager, Next-Generation AI Cluster Architecture
NVIDIA · Santa Clara, California, United States (On-site)
$224k – $356.5k yearly · 1w ago

Circuit Calibration Design Engineer
NVIDIA · Taipei City, Taipei, Taiwan (On-site)
1w ago

Senior Developer Technology Engineer, CPU Performance
NVIDIA · Santa Clara, California, United States (Hybrid)
$152k – $287.5k yearly · 1w ago

Manager, SOC Modelling
NVIDIA · Santa Clara, California, United States (On-site)
$224k – $431.3k yearly · 1w ago

Principal Hardware Functional Safety Expert
NVIDIA · Santa Clara, California, United States (Hybrid)
$272k – $431.3k yearly · 1w ago

Technical Program Manager – Silicon Solutions
NVIDIA · Santa Clara, California, United States (Hybrid)
$136k – $258.8k yearly · 1w ago

Manager, Embedded System Software - GPU Firmware
NVIDIA · Santa Clara, California, United States (On-site)
$224k – $356.5k yearly · 1w ago

Senior AI Developer Technology Engineer, Financial Sector
NVIDIA · Santa Clara, California, United States (Hybrid)
$152k – $241.5k yearly · 1w ago

Manager, Operations ADAS
NVIDIA · Seoul, South Korea (On-site)
1w ago

RTL Design Engineer, DFT
NVIDIA · Yokneam Ilit, Northern District, Israel (On-site)
1w ago

Senior Learning Partner
NVIDIA · Santa Clara, California, United States (On-site)
$136k – $270.3k yearly · 1w ago

Senior Integration Engineer
NVIDIA · Tel Aviv-Yafo, Tel Aviv District, Israel (On-site)
1w ago

Senior Software Engineer, DOCA Verification
NVIDIA · Yokneam Ilit, Northern District, Israel (Hybrid)
1w ago

Similar companies


Nebius

Nebius is a Nasdaq-listed technology company (NBIS) building full-stack AI infrastructure from its Amsterdam headquarters, with GPU clusters deployed across Europe and the United States. Led by CEO Arkady Volozh, the company operates AI-optimized sustainable data centers, including a facility 60 kilometers from Helsinki and a new Vineland, New Jersey site, and has raised significant capital ($700 million from investors including Accel, NVIDIA, and Orbis). The engineering organization, numbering in the hundreds, maintains deep infrastructure expertise and runs an in-house AI R&D team that dogfoods the platform to validate it against production ML practitioner requirements.

The infrastructure stack combines hyperscaler-scale features with supercomputer-grade performance. ISEG, Nebius's supercomputer, ranks among the world's most powerful systems. The platform integrates NVIDIA GPUs with NVIDIA InfiniBand networking, exposing workload orchestration through both Kubernetes and Slurm. The operational layer includes standard observability (Prometheus, Grafana), data infrastructure (PostgreSQL, Apache Spark), and ML tooling (MLflow, vLLM, Triton, Ray), with infrastructure-as-code managed via Terraform. This architecture targets the latency, throughput, and reliability requirements of AI training and inference workloads at scale.

The company has secured a multi-billion-dollar agreement with Microsoft to deliver dedicated AI infrastructure from its Vineland data center. Nebius serves startups, research institutes, and enterprises across healthcare and life sciences, robotics, finance, and entertainment. The technical approach emphasizes production-grade infrastructure that absorbs the operational complexity of large-scale AI deployments: managing GPU utilization, network bottlenecks, and the cost-performance trade-offs inherent in serving diverse AI workloads from model training through inference serving.
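Slurm-based orchestration of the kind described above typically takes the shape of a batch script like the following. This is a generic config fragment, not a Nebius-specific configuration: the job name, resource counts, and training entry point are all hypothetical placeholders.

```shell
#!/bin/bash
# Hypothetical multi-node GPU training job for a Slurm-managed cluster.
# All resource numbers and the training script are illustrative placeholders.
#SBATCH --job-name=llm-pretrain        # arbitrary example name
#SBATCH --nodes=2                      # two GPU nodes
#SBATCH --ntasks-per-node=8            # one task per GPU
#SBATCH --gres=gpu:8                   # request 8 GPUs on each node
#SBATCH --time=04:00:00                # wall-clock limit

# srun launches one training process per task across both nodes;
# train.py stands in for the user's actual entry point.
srun python train.py --config config.yaml
```

Submitting with `sbatch` hands the scheduler the resource shape of the job, which is what lets a shared cluster keep GPU utilization high across many tenants.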

233 jobs

Cartesia

Cartesia builds real-time multimodal AI models for voice applications, with production systems spanning text-to-speech and speech-to-text. The company emerged from Stanford's AI Lab, where the founding team, led by CEO Karan Goel, pioneered work on State Space Models (SSMs) before transitioning to commercial infrastructure. Their technical approach combines model innovation with systems engineering, focusing on the latency, throughput, and operational constraints that define production voice AI.

The core product line includes Sonic, a text-to-speech model designed for emotive, human-like output, and Ink, a recently launched speech-to-text system purpose-built for real-time voice applications. Both systems address the fundamental trade-off in voice AI: achieving low-latency inference while maintaining quality at scale. The company's technical domains span foundation model development, real-time multimodal intelligence, and developer tooling, including infrastructure that runs where users are rather than requiring server-side processing.

Cartesia's engineering stack runs on Python, Go, and TypeScript, supporting developers building voice interfaces that demand sub-second response times and reliable performance under production load. The team's research background in SSMs informs their approach to model efficiency and scalability, though the company now focuses on shipping production systems rather than pure research. Their stated mission centers on ubiquitous, interactive intelligence: systems that handle the operational complexity of real-time voice while remaining accessible to developers building conversational interfaces.
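The SSM lineage mentioned above reduces, in its simplest discrete linear form, to the recurrence x_k = a * x_(k-1) + b * u_k with readout y_k = c * x_k. A toy scalar sketch of that scan is shown below for intuition only; it is not Cartesia's architecture, and practical SSMs use learned matrices and structured parameterizations rather than scalars.

```python
def ssm_scan(a, b, c, inputs, x0=0.0):
    """Run a scalar linear state-space recurrence over an input sequence.

    x_k = a * x_(k-1) + b * u_k   (state update)
    y_k = c * x_k                 (readout of the updated state)
    """
    x, ys = x0, []
    for u in inputs:
        x = a * x + b * u   # fold the new input into the hidden state
        ys.append(c * x)    # emit the observation for this step
    return ys
```

Because each step depends only on a fixed-size state rather than the whole history, inference cost per token is constant, which is the property that makes this model family attractive for streaming, real-time workloads.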

25 jobs

Gladia

Gladia operates speech-to-text APIs across two distinct workloads: real-time streaming at sub-300ms latency and asynchronous batch transcription, both supporting over 100 languages. The real-time path handles streaming audio with integrated speaker diarization, word-level timestamps, and sentiment analysis in the inference loop. The async path processes batch jobs with code-switching detection (single utterances spanning multiple languages) and comparable feature coverage. Over 150,000 users and 700 enterprise deployments (including VEED.IO, Circleback, Attention) generate production traffic against these endpoints.

The core technical challenge is maintaining sub-300ms end-to-end latency on the streaming path while running diarization and alignment models alongside the primary ASR stack. Meeting this threshold at scale, across 100+ language models with varying acoustic characteristics, requires careful management of model load times, batching strategies, and inference queue depth. The async API trades latency tolerance for throughput optimization on longer-form audio, though specific cost-per-hour or throughput metrics are not disclosed. Code-switching introduces additional complexity: language detection, model routing, and boundary stitching must occur without degrading transcription accuracy or introducing alignment artifacts at switch points.

Founded in 2022, the company raised a $16 million Series A from Sequoia Capital, XAnge, and New Wave. Founders Jean-Louis Quéguiner and Jonathan Soto positioned the service as audio infrastructure for voice-first platforms rather than a narrow transcription tool. The engineering focus centers on reliability and operational predictability across multilingual inference workloads: handling acoustic variability, speaker overlap, background noise, and model version rollouts without service degradation. Production deployment at this user scale surfaces edge cases in language detection, diarization boundary errors, and latency tail behavior that define the system's actual robustness beyond benchmarked WER numbers.
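The boundary-stitching concern described above can be made concrete with a small sketch: given per-segment language tags from a detector, adjacent same-language segments are coalesced so that switch points fall only where the language actually changes. The tuple layout and field names here are assumptions for illustration, not Gladia's API.

```python
def stitch_segments(segments):
    """Merge adjacent transcript segments that share a language tag.

    Each segment is a (start_s, end_s, lang, text) tuple. Runs of segments
    in the same language are coalesced, so a code-switched utterance yields
    one segment per language run rather than one per model invocation.
    """
    merged = []
    for start, end, lang, text in segments:
        if merged and merged[-1][2] == lang:
            # Same language as the previous run: extend it in place.
            prev_start, _, _, prev_text = merged[-1]
            merged[-1] = (prev_start, end, lang, prev_text + " " + text)
        else:
            # Language changed: this is a genuine switch boundary.
            merged.append((start, end, lang, text))
    return merged
```

A real pipeline would also need to reconcile word-level timestamps and diarization labels across the stitched boundary, which is where the alignment artifacts mentioned above tend to appear.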

1 job