Senior System Software Engineer, NCCL - Partner Enablement (m/f/x)
Resolving NCCL issues and performance analysis for AI/HPC applications. C/C++ and Linux expertise required. Work on new system tools and HPC methodologies.
Requirements
- B.S./M.S. degree in CS/CE or equivalent experience
- Excellent C/C++ programming skills
- Experience with engineering or academic research community supporting HPC or AI
- Practical experience with high performance networking
- Expertise in Linux fundamentals and a scripting language
- Familiar with containers, cloud provisioning and scheduling tools
- Adaptability and passion to learn new areas and tools
- Flexibility to work across different teams and time zones
- Experience conducting performance benchmarking and developing infrastructure on HPC clusters
- Prior system administration experience, especially for large clusters
- Experience debugging network configuration issues in large scale deployments
- Familiarity with CUDA programming and/or GPUs
- Good understanding of Machine Learning concepts
- Experience with Deep Learning Frameworks
- Deep understanding of technology
- Passionate about work
- Experience with parallel programming
- Experience with at least one communication runtime
Tasks
- Engage with partners and customers to resolve NCCL issues
- Conduct performance analysis of NCCL and DL applications
- Develop tools for isolating issues on new systems
- Support customers and teams on HPC methodologies
- Document NCCL processes and best practices
- Conduct trainings and webinars on NCCL
- Collaborate with internal teams across time zones
- Guide customers on running applications in multi-node clusters
- Automate issue isolation on cloud platforms
Work Experience
- 5 years
Education
- Master's degree
Languages
- English – Native
Tools & Technologies
- C/C++
- MPI
- NCCL
- UCX
- NVSHMEM
- Linux
- Python
- Docker
- Docker Swarm
- Kubernetes
- SLURM
- Ansible
- CUDA
- GPUs
- PyTorch
- TensorFlow
Benefits
Competitive Pay
- Highly competitive salaries
Other Benefits
- Extensive benefits package
Informal Culture
- Work environment that promotes diversity
- Work environment that promotes inclusion
Flexible Working
- Work environment that promotes flexibility
About the Company
CH01 NVIDIA Switzerland AG
Industry
IT
Description
NVIDIA has been defining computer graphics, PC gaming, and accelerated computing for more than 25 years.