Solutions Architect, Cloud Inference Services (m/f/x)
Deploying E2E AI solutions for NVIDIA Cloud Partners, focusing on LLMs and Agentic Pipelines. Master's or Ph.D. in CS/AI or equivalent experience required. Support for AI services from training to inference.
Requirements
- Excellent verbal and written communication skills
- Excellent technical presentation skills in English
- Master's or Ph.D. in Computer Science or AI
- Equivalent experience in Computer Science or AI
- 5+ years industry/academic experience in ML/DL/Data Science
- Preference for DNN inference experience
- Work experience with modern LLM, VLM, diffusion architectures
- Emphasis on MoE architectures
- Understanding of DNN inference libraries
- Understanding of agentic pipeline development
- Excited to work with multiple levels and teams
- Collaboration with Engineering, Product, Sales, Marketing
- Strong analytical skills
- Strong problem-solving skills
- Self-starter drive for growth
- Passion for continuous learning
- Sharing findings across the team
- Strong time-management skills
- Strong organization skills
- Coordinating multiple initiatives and priorities
- Implementing new technology and products
- Experience with inference of large MoE architectures
- Experience with NLP, CV, or ASR inference
- Experience using DevOps technologies
- Understanding of HPC systems
- Understanding of data center design
- Understanding of high-speed InfiniBand interconnects
- Understanding of cluster storage
- Understanding of scheduling-related design and management
Tasks
- Help NVIDIA Cloud Partners integrate AI stacks
- Develop and deploy E2E AI solutions
- Support AI services from training to inference
- Participate in projects involving LLMs, VLMs, Physical-AI, Agentic Pipelines
- Coordinate between customers, marketing, business development, and engineering
- Work on proof-of-concept demonstrations
- Lead discussions with developers, product teams, and executives
- Encourage adoption of NVIDIA’s AI technology platform
- Simplify deployment of AI technology to production
- Engage with different roles within NVIDIA and partners
- Understand NCPs' technology and provide solutions
- Develop and demonstrate NLP and LLM solutions
- Integrate solutions into agentic pipelines
- Perform in-depth GPU system analysis and optimization
- Optimize end-to-end agentic pipelines
- Partner with Engineering, Product, and Sales teams
- Develop and plan suitable solutions for customers
- Enable product feature growth through customer feedback
- Build industry expertise in AI Cloud solutions
- Contribute to integrating NVIDIA technology in Enterprise Computing
Work Experience
- 5 years
Education
- Master's degree
Languages
- English – Native
Tools & Technologies
- Machine learning
- Deep learning
- Data science
- DNN inference
- LLM
- VLM
- Diffusion architectures
- MoE
- TRT-LLM
- Dynamo
- Red Hat Inference Server
- DevOps
- Docker
- Kubernetes
- Singularity
- HPC systems
- InfiniBand
- Cluster Storage
About the Company
NVIDIA Switzerland AG
Industry
IT
Description
NVIDIA is a leading technology company specializing in Deep Learning, Artificial Intelligence, and Supercomputing.