NVIDIA is looking for a Senior Reliability Engineer to join our LPU packaging team.
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. We've built an incredible legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand and engage with the world! Doing what’s never been done before takes vision, innovation, and the world’s best talent.
As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their life's work. Come join the team and see how you can make a lasting impact on the world!
What you’ll be doing:
Own the package‑level reliability spec for assigned products
Define qualification requirements and pass/fail criteria for device/package‑level reliability (e.g., HTSL, TCT, UHAST, pre‑conditioning, JESD22 methods) and package‑on‑board/board‑level reliability (thermal cycling, shock/vibration, connector/cage interactions)
Lead materials and stackup selection (substrate, solder alloy, underfill, mold, lid, TIM, etc.) and DFR trade‑offs for new packages and 2nd sources
Evaluate the thermo‑mechanical and SI/PI impact of those choices
Analyze qual and stress data (including HTOL, package qual, SLT/system stress) and convert the findings into design, process, and material changes for the next revision
Understand how component FIT rolls up from the component level to the system level and to the overall availability requirement
What we need to see:
MS/PhD in Electrical Engineering, Materials Science, Mechanical Engineering, or related field, or equivalent experience.
8+ years of relevant experience
Strong background in IC packaging and board‑level reliability, with hands‑on experience in:
BGA/FCBGA, 2.5D/3D integration, HBM, or similar high‑power, high‑pin‑count packages
Background with JEDEC device/package reliability standards (e.g., HTSL, TCT, UHAST, pre‑conditioning, board‑level solder reliability)
Knowledge of system‑level stress or SLT and of interpreting logs/telemetry for reliability analysis
Solid understanding of FIT, MTTF/MTBF, AFR, and RAS concepts
Ways to stand out from the crowd:
Demonstrated ability to lead cross‑functional efforts across design, package R&D, SI/PI, validation, QA, and operations.
Experience with data center or cloud hardware and an understanding of rack‑ and cluster‑level availability targets and constraints.
Familiarity with how component FIT drives sparing, repair rate, and availability planning.
Strong communication skills, with the ability to present trade‑offs and reliability risks clearly to both technical and program stakeholders.
With competitive salaries and a generous benefits package, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant people in the world working for us and, due to unprecedented growth, our teams are rapidly growing. Are you passionate about becoming a part of a best-in-class team supporting the latest in GPU and AI technology? If so, we want to hear from you.
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 168,000 USD - 264,500 USD for Level 4, and 196,000 USD - 310,500 USD for Level 5. You will also be eligible for equity and benefits.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.