Deep Learning Solutions Architect – Inference Optimization (m/f/x)
Optimize large-scale inference pipelines on GPU architectures for AI/VR/AV customer solutions. The role requires knowledge of modern NLP/LLM architectures (transformer, diffusion) and proficiency with DevOps tools (Docker, Kubernetes), and involves direct engagement with customers building these solutions.
Requirements
- MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields
- 5+ years of work or research experience with Python, C++, or other software development
- Work experience with and knowledge of modern NLP, including an understanding of transformer, state space, diffusion, and MoE model architectures
- Understanding of key libraries used for NLP/LLM training and/or deployment
- Proficient with DevOps tools including Docker, Kubernetes, and Singularity
- Demonstrated experience in running and debugging large-scale distributed deep learning training or inference processes
- Experience working with larger transformer-based architectures for NLP, CV, ASR, or other domains
- Experience applying NLP technology in production environments
- Enthusiasm for collaborating with various teams and departments
- Self-starter with a growth mindset and a passion for continuous learning
Responsibilities
- Work directly with key customers to understand their technology
- Provide optimal AI solutions for customer needs
- Analyze and optimize performance on GPU architecture systems
- Support optimization of large-scale inference pipelines
- Collaborate with Engineering, Product, and Sales teams
- Develop and plan suitable solutions based on customer requirements
- Gather customer feedback to enhance product features
- Conduct proof-of-concept evaluations
Work experience
- 5 years
Education
- Master's degree
Languages
- English – business fluent
Tools & Technologies
- vLLM
- SGLang
- Python
- C++
- Megatron-LM
- NeMo
- DeepSpeed
- TensorRT-LLM
- Triton Inference Server
- Docker
- Kubernetes
- Singularity
About the company
NVIDIA
Industry
IT
Description
The company is developing groundbreaking solutions in Virtual Reality, Artificial Intelligence, Deep Learning, and Autonomous Vehicles.