
Senior Software Engineer, AI Resiliency

NVIDIA Corporation

3/27/2025

US, CA, Santa Clara

Full-time

Salary: $184,000 - $287,500 per year


Job Description

NVIDIA is seeking a Senior Software Engineer for AI Resiliency to lead the development of AI software resiliency for AI supercomputers at massive scale.

Requirements

  • Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent experience
  • Proficiency in C++ and Python
  • 6+ years of relevant experience
  • Strong understanding of distributed systems concepts, parallel programming, and fault tolerance in large-scale computing environments
  • Familiarity with AI frameworks such as PyTorch, JAX/XLA, TensorFlow, or similar
  • Experience with debugging and profiling tools
  • Excellent problem-solving skills and ability to work in a fast-paced, highly collaborative environment

Responsibilities

  • Develop AI Software Resiliency Features
  • Hands-On Coding & Optimization
  • Fault Tolerance & Debugging
  • Collaborate Across Teams
  • Testing & Automation
  • Support Production Deployments

Benefits

  • Multiple relocation packages
  • Two weeklong shutdowns (mid-summer and year-end) in the US (in addition to PTO)
  • 8-week parental leave
  • 9 Employee Resource Groups
  • Annual bonus offering
  • Flexible work arrangements
  • Up to 6% 401(k) matching
