NVIDIA

5mo ago

Deep Learning Solutions Architect – Inference Optimization (m/f/x)

Zürich

Full-time, On-site, Senior

AI/ML

Data Science

Nejo AI Summary

Optimize large-scale inference pipelines on GPU architectures for AI/VR/AV customer solutions. Requires knowledge of modern NLP/LLM architectures (transformer, diffusion) and proficiency with DevOps tools (Docker, Kubernetes). The role involves direct engagement with customers on these solutions.

Requirements

MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields
5+ years of work or research experience with Python, C++, or other software development
Work experience with and knowledge of modern NLP, including an understanding of transformer, state-space, diffusion, and MoE model architectures
Understanding of key libraries used for NLP/LLM training and/or deployment
Proficient with DevOps tools including Docker, Kubernetes, and Singularity
Demonstrated experience in running and debugging large-scale distributed deep learning training or inference processes
Experience working with large transformer-based architectures for NLP, CV, ASR, or other domains
Applied NLP technology in production environments
Enthusiasm for collaborating with various teams and departments
Self-starter with a growth mindset and a passion for continuous learning

Tasks

Work directly with key customers to understand their technology
Provide optimal AI solutions for customer needs
Analyze and optimize performance on GPU architecture systems
Support optimization of large-scale inference pipelines
Collaborate with Engineering, Product, and Sales teams
Develop and plan suitable solutions based on customer requirements
Gather customer feedback to enhance product features
Conduct proof-of-concept evaluations

Work Experience

5 years

Education

Master's degree

Languages

English – Business Fluent

Tools & Technologies

vLLM
SGLang
Python
C++
Megatron-LM
NeMo
DeepSpeed
TensorRT-LLM
Triton Inference Server
Docker
Kubernetes
Singularity

Find the original job posting in its most current version here. Nejo automatically captured this job from the website of NVIDIA and processed the information on Nejo with the help of AI for you. Despite careful analysis, some information may be incomplete or inaccurate. Please always verify all details in the original posting! Content and copyrights of the original posting belong to the advertising company.
