Compute Architect - Shanghai, China

3 days ago

NVIDIA

Are you passionate about compiler technology and computer architectures for deep learning? Do you thrive at the intersection of hardware and software? NVIDIA is seeking world-class compiler engineers and performance architects who are excited to push the boundaries of machine learning infrastructure. In this role, you will develop and optimize MLIR-based compiler infrastructure that powers our deep learning libraries and influences the direction of future GPU architectures. This position offers the opportunity to make a significant impact in a fast-moving, technology-focused company.

What You'll Be Doing:

Design, implement, and optimize MLIR-based compiler passes for deep learning and data analytics workloads.

Use compiler technologies to analyze and improve the performance of machine learning and deep learning algorithms on current and next-generation architectures.

Identify performance bottlenecks in compiler-generated code and propose creative solutions.

Collaborate with hardware architects and software teams to co-design features that maximize performance and efficiency.

Contribute to the evolution of NVIDIA’s deep learning compiler stack and libraries.

What We Need to See:

MS or PhD in Computer Science, Electrical Engineering, Mathematics, or a related field, or equivalent experience.

5+ years of work experience.

Proven experience developing compilers or compiler infrastructure, preferably with MLIR, LLVM, or similar frameworks.

Strong programming skills in C++ and Python.

Solid understanding of computer architecture, especially as it relates to performance optimization.

Experience optimizing code for CPUs or GPUs, including low-level programming (assembly, SIMD, or vectorization).

Experience with deep learning algorithms, especially matrix multiplication and convolution.

Ways to Stand Out from the Crowd:

Hands-on experience with MLIR, LLVM, or other modern compiler frameworks.

Deep understanding of parallel programming models and GPU architectures.

Strong communication and organizational skills.

Demonstrated ability to work collaboratively in a fast-paced, cross-functional environment.

If you are excited about building the next generation of machine learning compilers and want to work with world-class teams at the forefront of AI and hardware innovation, we want to hear from you!

#deeplearning
