Deep Learning Solutions Architect – Inference Optimization (m/f/x)
Optimize large-scale inference pipelines on GPU architectures for customer solutions in AI, VR, and autonomous vehicles (AV). The role requires knowledge of modern NLP/LLM architectures (transformer, diffusion) and proficiency with DevOps tools (Docker, Kubernetes), and involves direct engagement with customers.
Requirements
- MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields
- 5+ years of work or research experience with Python, C++, or other software development
- Work experience with and knowledge of modern NLP, including an understanding of transformer, state space, diffusion, and mixture-of-experts (MoE) model architectures
- Understanding of key libraries used for NLP/LLM training and/or deployment
- Proficient with DevOps tools including Docker, Kubernetes, and Singularity
- Demonstrated experience in running and debugging large-scale distributed deep learning training or inference processes
- Experience working with larger transformer-based architectures for NLP, CV, ASR, or other domains
- Experience applying NLP technology in production environments
- Enthusiasm for collaborating with various teams and departments
- Self-starter with a growth mindset and a passion for continuous learning
Responsibilities
- Work directly with key customers to understand their technology
- Provide optimal AI solutions for customer needs
- Analyze and optimize performance on GPU architecture systems
- Support optimization of large-scale inference pipelines
- Collaborate with Engineering, Product, and Sales teams
- Develop and plan suitable solutions based on customer requirements
- Gather customer feedback to enhance product features
- Conduct proof-of-concept evaluations
Work experience
- 5 years
Education
- Master's degree
Languages
- English – business fluent
Tools & Technologies
- vLLM
- SGLang
- Python
- C++
- Megatron-LM
- NeMo
- DeepSpeed
- TensorRT-LLM
- Triton Inference Server
- Docker
- Kubernetes
- Singularity
About the company
NVIDIA
Industry
IT
Description
The company is developing groundbreaking solutions in Virtual Reality, Artificial Intelligence, Deep Learning, and Autonomous Vehicles.