Deep Learning Solutions Architect – Inference Optimization (m/f/x)
Optimize large-scale inference pipelines on GPU architectures for customers building AI, VR, and autonomous-vehicle (AV) solutions. The role requires knowledge of modern NLP/LLM architectures (transformer, diffusion) and proficiency with DevOps tools (Docker, Kubernetes), and involves direct customer engagement.
Requirements
- MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields
- 5+ years of work or research experience in Python, C++, or other software development
- Work experience with and knowledge of modern NLP, including an understanding of transformer, state-space, diffusion, and mixture-of-experts (MoE) model architectures
- Understanding of key libraries used for NLP/LLM training and/or deployment
- Proficient with DevOps tools including Docker, Kubernetes, and Singularity
- Demonstrated experience in running and debugging large-scale distributed deep learning training or inference processes
- Experience working with large transformer-based architectures for NLP, CV, ASR, or other domains
- Experience applying NLP technology in production environments
- Enthusiasm for collaborating with various teams and departments
- Self-starter with a growth mindset and a passion for continuous learning
Tasks
- Work directly with key customers to understand their technology
- Provide optimal AI solutions for customer needs
- Analyze and optimize performance on GPU architecture systems
- Support optimization of large-scale inference pipelines
- Collaborate with Engineering, Product, and Sales teams
- Develop and plan suitable solutions based on customer requirements
- Gather customer feedback to enhance product features
- Conduct proof-of-concept evaluations
Work Experience
- 5 years
Education
- Master's degree
Languages
- English – Business Fluent
Tools & Technologies
- vLLM
- SGLang
- Python
- C++
- Megatron-LM
- NeMo
- DeepSpeed
- TensorRT-LLM
- Triton Inference Server
- Docker
- Kubernetes
- Singularity
About the Company
NVIDIA
Industry
IT
Description
The company is developing groundbreaking solutions in Virtual Reality, Artificial Intelligence, Deep Learning, and Autonomous Vehicles.