NVIDIA (nvidia.com)

Senior Performance Engineer

Yokne'am, Northern District, Israel | Full-time | Posted 9h ago

NVIDIA is seeking a highly skilled Senior Performance Engineer to join our Performance and R&D organizations. In this role, you will help build and evolve systems that support performance analysis, telemetry, and optimization for large-scale GPU- and CPU-based clusters used in AI and high-performance computing environments. You will work closely with hardware, networking, firmware, and software teams to collect, analyze, and interpret performance data from live systems. This is a fast-paced R&D environment where system behavior and requirements evolve rapidly, requiring adaptable engineering solutions and strong analytical thinking.

What you’ll be doing:

  • Profile, benchmark, and analyze AI and HPC workloads on GPU and CPU clusters

  • Explore performance characteristics of high-performance networking and collective communications (e.g., NCCL, RDMA, MPI, RoCE)

  • Identify performance bottlenecks across networking, compute, memory, and system architecture

  • Develop and enhance performance analysis, benchmarking, and diagnostic tools

  • Define performance test plans and establish expectations for new technologies and platforms

  • Collaborate across hardware, firmware, networking, systems, and software teams to provide actionable performance insights

  • Support telemetry collection and data refinement efforts to enable accurate performance analysis

  • Maintain high standards for data quality, reproducibility, and traceability of performance results

What we need to see:

  • B.Sc. or M.Sc. in Computer Science, Computer Engineering, Software Engineering, or equivalent experience

  • 5+ years of experience in performance analysis, systems engineering, or HPC/AI infrastructure

  • Demonstrated expertise in performance analysis techniques and methodologies

  • Hands-on experience with high-performance networking (RDMA, MPI, NCCL, congestion control)

  • Strong understanding of system performance metrics (latency, throughput, resource utilization)

  • Exposure to hardware, firmware, or embedded telemetry environments

  • Strong analytical, problem-solving, and communication skills

  • Ability to work effectively in cross-functional, fast-paced R&D teams

Ways to stand out from the crowd:

  • Knowledge of CUDA, NCCL internals, and congestion control algorithms

  • Deep system-level understanding of CPU architectures, GPUs, HCAs, memory, and PCIe

  • Experience with NVIDIA GPUs, CUDA, and deep learning frameworks such as PyTorch or TensorFlow

  • Experience with cloud platforms

  • Proficiency in Python; experience with Bash and C/C++ is a plus, as is strong experience working in Linux environments