HPC and AI Software Architect (m/w/x)
Description
In this role, you will design scalable software systems for distributed AI training and inference, optimizing for throughput, latency, and memory efficiency. You will collaborate with AI framework teams, enhance communication libraries, and co-design hardware features that accelerate data movement.
Requirements
- Ph.D. or equivalent industry experience in computer science, computer engineering, or a closely related field
- 2+ years of experience in systems programming, parallel or distributed computing, or high-performance data movement
- Strong programming background in C++, Python, and ideally CUDA or other GPU programming models
- Practical experience with AI frameworks (e.g., PyTorch, TensorFlow) and familiarity with communication libraries
- Experience in designing or optimizing software for high-throughput, low-latency systems
- Strong collaboration skills in a multinational, interdisciplinary environment
- Expertise with NCCL, Gloo, UCX, or similar libraries used in distributed AI workloads (see the illustrative sketch after this list)
- Background in networking and communication protocols, RDMA, collective communications, or accelerator-aware networking
- Deep understanding of large-model training, inference serving at scale, and the associated communication bottlenecks
- Knowledge of quantization, tensor/activation fusion, or memory optimization for inference
- Familiarity with infrastructure for deploying LLMs or transformer-based models, including sharding, pipelining, or hybrid parallelism
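For context, here is a minimal sketch of the kind of collective communication these requirements describe: an NCCL-backed all-reduce through torch.distributed. The single-node setup, the hard-coded master address and port, and the tensor shape are illustrative assumptions, not details from the posting.

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def run(rank: int, world_size: int) -> None:
    # Assumed single-node rendezvous; in a real cluster the launcher provides these.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    # Each rank contributes a distinct GPU tensor; all_reduce sums them in place,
    # so every rank ends up holding the same reduced values.
    x = torch.full((4,), float(rank + 1), device=f"cuda:{rank}")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)
    print(f"rank {rank}: {x.tolist()}")

    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    mp.spawn(run, args=(world_size,), nprocs=world_size)
```

Run as-is on a multi-GPU node, each rank prints the same summed vector; in production the rendezvous details would come from the job launcher rather than hard-coded environment variables.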
Education
Work Experience
2 years
Tasks
- Design and prototype scalable software systems for distributed AI training and inference
- Optimize throughput, latency, and memory efficiency (see the sketch after this list)
- Develop and evaluate enhancements to communication libraries such as NCCL, UCX, and UCC
- Collaborate with AI framework teams to improve communication-backend integration and performance
- Co-design hardware features that accelerate data movement for inference and model serving
- Contribute to the evolution of runtime systems and AI-specific protocol layers
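As a hedged illustration of the throughput and latency work listed above, the sketch below overlaps an asynchronous gradient all-reduce with communication-independent computation. The function name overlapped_step, the tensor arguments, and the assumption of an already-initialized NCCL process group are illustrative, not part of the role description.

```python
import torch
import torch.distributed as dist


def overlapped_step(grad: torch.Tensor, next_input: torch.Tensor,
                    weight: torch.Tensor) -> torch.Tensor:
    """Overlap a gradient all-reduce with compute that does not depend on it.

    Assumes an NCCL process group is already initialized and all tensors
    live on the current rank's GPU.
    """
    # Launch the collective asynchronously; NCCL progresses it on its own stream.
    work = dist.all_reduce(grad, op=dist.ReduceOp.SUM, async_op=True)

    # Do communication-independent work while the data moves.
    activations = next_input @ weight

    # Wait only at the point where the reduced gradient is actually needed.
    work.wait()
    grad /= dist.get_world_size()  # turn the sum into an average

    return activations
```

Hiding communication behind independent computation like this is one standard way to recover throughput when collectives would otherwise serialize against the compute stream.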
Tools & Technologies
Languages
English – Business Fluent
About the Company
NVIDIA
Industry
IT
Description
The company is developing groundbreaking solutions in Virtual Reality, Artificial Intelligence, Deep Learning, and Autonomous Vehicles.