NVIDIA (nvidia.com)

AI Computing Development Engineer, TensorRT and TensorRT-LLM

Shanghai, Shanghai, China | Full-time | Posted 1d ago

NVIDIA is hiring software engineers for its AI Computing team. Academic and commercial groups around the world are using GPUs to power a revolution in deep learning-powered AI, enabling breakthroughs in areas like generative AI, computer vision, speech recognition, recommender systems, and large-scale language and multimodal models. Join the team building the inferencing software (TensorRT/TensorRT-LLM) that will be used across our product lines. The ability to work in a fast-paced, delivery-focused environment is required, and excellent interpersonal skills are a must.

What you'll be doing:

  • Design and develop robust inferencing software (TensorRT/TensorRT-LLM) optimized for functionality and performance across platforms

  • Perform performance analysis, optimization, and tuning of deep learning inference workloads

  • Track academic and industry advancements in AI and integrate them into TensorRT/TensorRT-LLM as new or updated features

  • Provide feedback on architecture and hardware design and development

  • Collaborate across hardware, software, and research teams to shape the direction of machine learning inferencing across NVIDIA platforms

  • Own and deliver technical work with scope based on experience, ranging from complex features to substantial parts of larger projects, with increasing independence and technical leadership over time

  • Publish key technical results at leading scientific and engineering conferences

What we need to see:

  • Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)

  • Strong C/C++ or Python programming and software design experience, including debugging, performance profiling, and test design

  • 2+ years of work experience

  • Strong curiosity about artificial intelligence and familiarity with the latest developments in deep learning, including generative models, multimodal systems, and large neural networks

  • Experience working with deep learning frameworks such as PyTorch, TensorRT/TensorRT-LLM, NeMo, or vLLM

  • Proactive, self-driven, and able to work independently

  • Excellent written and verbal communication skills in English

  • Demonstrated ability, commensurate with experience, to take technical ownership, solve complex problems, and contribute effectively in cross-functional environments

NVIDIA is widely considered to be one of technology’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. Does the idea of contributing to and pushing the boundaries of state-of-the-art AI and compute systems excite you? Interested in getting exposure to the entire deep learning software stack? Come join us and help build the GPU-accelerated AI platform used worldwide.