Nebius

Senior ML Engineer - Token Factory (m/f/x)

Berlin
Full-time, With Home Office, Senior
AI/ML
Data Science

Optimize LLM inference performance and implement novel speculative decoding architectures at a cloud computing provider for global AI. Experience profiling GPU workloads and an understanding of the GPU memory hierarchy are required. Flexible working arrangements are offered.

Requirements

  • Understanding of machine learning foundations
  • Experience profiling GPU workloads
  • Understanding of GPU memory hierarchy
  • Familiarity with LLM architectures
  • Understanding of neural network training
  • Strong software engineering skills
  • Experience with deep learning frameworks
  • Proficiency in CI/CD and versioning
  • Strong communication and leadership abilities
  • Experience with open-source inference engines
  • Experience with kernel languages
  • Track record of delivering products
  • Experience developing large distributed systems
  • Open-source projects showcasing engineering prowess
  • Excellent command of English language

Tasks

  • Identify LLM inference bottlenecks
  • Drive production speedups
  • Maximize performance for LLM architectures
  • Support and optimize inference engines
  • Implement novel speculative decoding architectures
  • Optimize dense and MoE components
  • Contribute to open-source inference engines
  • Design low-precision training pipelines
  • Productionize FP8 and NVFP4 inference
  • Improve throughput and cost-efficiency

Work Experience

  • approx. 4 - 6 years

Education

  • Bachelor's degree OR
  • Master's degree

Languages

  • English: Business Fluent

Tools & Technologies

  • Nsight
  • PyTorch profiler
  • Python
  • CI/CD
  • vLLM
  • SGLang
  • TensorRT-LLM
  • Triton
  • CuTe
  • CUTLASS
  • CUDA

Benefits

Flexible Working

  • Flexible working arrangements

Competitive Pay

  • Competitive salary

Other Benefits

  • Comprehensive benefits package

Career Advancement

  • Professional growth opportunities

Informal Culture

  • Dynamic and collaborative work environment
Find the original job posting in its most current version here. Nejo automatically captured this job from the website of Nebius and processed the information on Nejo with the help of AI for you. Despite careful analysis, some information may be incomplete or inaccurate. Please always verify all details in the original posting! Content and copyrights of the original posting belong to the advertising company.
