Aleph Alpha

Senior AI Software Engineer - Model Evaluation (m/w/x)

Heidelberg
Full-time · With Home Office · Senior
AI/ML

Evaluating foundation models for finance, manufacturing, and public administration clients. LLM evaluation, benchmark design, and Python skills required. 30 days vacation, subsidized transport ticket.

Requirements

  • Experience with LLM evaluation, benchmark design, dataset curation, and experimental design
  • Familiarity with statistical methods for evaluation and experiment design
  • Track record of shipping impactful technical work (research, infrastructure, or both)
  • Strong Python skills and comfort with ML tooling
  • Ability to reason about evaluation measurements and their relevance
  • Ownership mentality: seeing problems through from diagnosis to solution to deployment
  • Willingness to relocate to Heidelberg or travel regularly
  • Understanding of foundation model training (data, scale, architecture effects)
  • Experience with large-scale data processing or ML infrastructure
  • German language proficiency (helpful for evaluating German capabilities, not required)
  • PhD in machine learning, NLP, statistics, or related field (valued but not required)

Tasks

  • Define evaluation criteria for models
  • Build systems to measure model performance
  • Ensure training team has reliable evaluation signals
  • Select and implement evaluation benchmarks
  • Maintain dataset curation and scoring infrastructure
  • Develop and optimize evaluation pipelines
  • Ensure pipeline speed, reliability, and reproducibility
  • Design benchmark result aggregation
  • Create tools for interpretable results
  • Identify model capability gaps
  • Integrate benchmarks for measuring progress
  • Evaluate German language capabilities rigorously
  • Correlate pre-training metrics with performance

Work Experience

  • approx. 4 - 6 years

Education

  • Doctoral / PhD (valued but not required)

Languages

  • German: Basic

Tools & Technologies

  • LLM
  • Python
  • PyTorch
  • ML tooling
  • evaluation frameworks
  • distributed systems

Benefits

More Vacation Days

  • 30 days of paid vacation

Healthcare & Fitness

  • Access to fitness & wellness offerings via Wellhub

Mental Health Support

  • Mental health support through nilo.health

Retirement Plans

  • Substantially subsidized company pension plan

Public Transport Subsidies

  • Subsidized Germany-wide transportation ticket

Additional Allowances

  • Budget for additional technical equipment

Flexible Working

  • Flexible working hours
  • Hybrid working model

Competitive Pay

  • Virtual Stock Option Plan

Company Bike

  • JobRad® Bike Lease
