Senior ML Engineer - Token Factory (m/f/x)
Optimize LLM inference performance and implement novel speculative decoding architectures at a cloud computing provider serving the global AI economy. Experience profiling GPU workloads and an understanding of the GPU memory hierarchy are required. Flexible working arrangements.
Requirements
- Understanding of machine learning foundations
- Experience profiling GPU workloads
- Understanding of GPU memory hierarchy
- Familiarity with LLM architectures
- Understanding of neural network training
- Strong software engineering skills
- Experience with deep learning frameworks
- Proficiency in CI/CD and versioning
- Strong communication and leadership abilities
- Experience with open-source inference engines
- Experience with kernel languages
- Track record of delivering products
- Experience developing large distributed systems
- Open-source projects showcasing engineering prowess
- Excellent command of the English language
Responsibilities
- Identify LLM inference bottlenecks
- Drive production speedups
- Maximize performance for LLM architectures
- Support and optimize inference engines
- Implement novel speculative decoding architectures
- Optimize dense and MoE components
- Contribute to open-source inference engines
- Design low-precision training pipelines
- Productionize FP8 and NVFP4 inference
- Improve throughput and cost-efficiency
Work experience
- approx. 4-6 years
Education
- Bachelor's degree OR
- Master's degree
Languages
- English (business fluent)
Tools & Technologies
- Nsight
- PyTorch profiler
- Python
- CI/CD
- vLLM
- SGLang
- TensorRT-LLM
- Triton
- CuTe
- CUTLASS
- CUDA
Benefits
Flexible working
- Flexible working arrangements
Attractive compensation
- Competitive salary
Other benefits
- Comprehensive benefits package
Career and professional development
- Professional growth opportunities
Relaxed company culture
- Dynamic and collaborative work environment
About the company
Nebius
Industry
IT
Description
The company is leading a new era in cloud computing to serve the global AI economy by creating tools and resources for real-world challenges.