HPC and AI Software Architect (m/f/x)
The role involves optimizing distributed AI training and inference systems, enhancing communication libraries (NCCL, UCX, UCC), and co-designing hardware features that accelerate data movement. A Ph.D. or equivalent industry experience in computer science is required, together with 2+ years in high-performance data movement or distributed computing. The work contributes directly to the company's AI, VR, and autonomous vehicle solutions.
Requirements
- Ph.D. or equivalent industry experience in computer science, computer engineering, or a closely related field
- 2+ years of experience in systems programming, parallel or distributed computing, or high-performance data movement
- Strong programming background in C++, Python, and ideally CUDA or other GPU programming models
- Practical experience with AI frameworks (e.g., PyTorch, TensorFlow) and familiarity with communication libraries
- Experience in designing or optimizing software for high-throughput, low-latency systems
- Strong collaboration skills in a multi-national, interdisciplinary environment
- Expertise with NCCL, Gloo, UCX, or similar libraries used in distributed AI workloads
- Background in networking and communication protocols, RDMA, collective communications, or accelerator-aware networking
- Deep understanding of large model training, inference serving at scale, and associated communication bottlenecks
- Knowledge of quantization, tensor/activation fusion, or memory optimization for inference
- Familiarity with infrastructure for deployment of LLMs or transformer-based models, including sharding, pipelining, or hybrid parallelism
Responsibilities
- Design and prototype scalable software systems for distributed AI training and inference
- Optimize throughput, latency, and memory efficiency
- Develop and evaluate enhancements to communication libraries like NCCL, UCX, and UCC
- Collaborate with AI framework teams to improve communication backend integration and performance
- Co-design hardware features to accelerate data movement for inference and model serving
- Contribute to the evolution of runtime systems and AI-specific protocol layers
Work experience
- 2 years
Education
- Doctorate / Ph.D.
Languages
- English (business fluent)
Tools & Technologies
- C++
- Python
- CUDA
- PyTorch
- TensorFlow
- NCCL
- Gloo
- UCX
About the company
NVIDIA
Industry
IT
Description
The company is developing groundbreaking solutions in Virtual Reality, Artificial Intelligence, Deep Learning, and Autonomous Vehicles.
Not quite the right fit?
- NVIDIA Switzerland AG: HPC and AI Software Architecture Intern (m/f/x). Full-time, internship, on-site only, Zürich
- NVIDIA Switzerland AG: Principal Software Architect, GPU Networking Research (m/f/x). Full-time, on-site only, senior, Zürich
- NVIDIA: Deep Learning Solutions Architect, Inference Optimization (m/f/x). Full-time, on-site only, senior, Zürich
- NVIDIA Switzerland AG: Research Scientist, ML Systems - PhD New College Grad (m/f/x). Full-time, on-site only, experienced, Zürich
- Hewlett Packard Enterprise: Research Engineer HPC/AI Focus Daedalus System (m/f/x). Full-time, on-site only, experienced, Basel, Zürich