Senior AI Software Engineer - Model Evaluation (m/f/x)
Evaluate foundation models for clients in finance, manufacturing, and public administration. Requires LLM evaluation, benchmark design, and Python skills. Benefits include 30 days of vacation and a subsidized transport ticket.
Requirements
- Experience with LLM evaluation, benchmark design, dataset curation, and experimental design
- Familiarity with statistical methods for evaluation and experiment design
- Track record of shipping impactful technical work (research, infrastructure, or both)
- Strong Python skills and comfort with ML tooling
- Ability to reason about evaluation measurements and their relevance
- Ownership mentality: seeing problems through from diagnosis to solution to deployment
- Willingness to relocate to Heidelberg or travel regularly
- Understanding of foundation model training (data, scale, architecture effects)
- Experience with large-scale data processing or ML infrastructure
- German language proficiency (helpful for evaluating German capabilities, not required)
- PhD in machine learning, NLP, statistics, or related field (valued but not required)
Responsibilities
- Define evaluation criteria for models
- Build systems to measure model performance
- Ensure the training team has reliable evaluation signals
- Select and implement evaluation benchmarks
- Maintain dataset curation and scoring infrastructure
- Develop and optimize evaluation pipelines
- Ensure pipeline speed, reliability, and reproducibility
- Design benchmark result aggregation
- Create tools for interpretable results
- Identify model capability gaps
- Integrate benchmarks for measuring progress
- Evaluate German language capabilities rigorously
- Correlate pre-training metrics with performance
Work experience
- approx. 4-6 years
Education
- Doctorate / Ph.D.
Languages
- German – basic knowledge
Tools & Technologies
- LLM
- Python
- PyTorch
- ML tooling
- evaluation frameworks
- distributed systems
Benefits
More vacation days
- 30 days of paid vacation
Health & fitness offerings
- Access to fitness & wellness offerings via Wellhub
Mental health support
- Mental health support through nilo.health
Company pension plan
- Substantially subsidized company pension plan
Public transport tickets
- Subsidized Germany-wide transportation ticket
Other allowances
- Budget for additional technical equipment
Flexible working
- Flexible working hours
- Hybrid working model
Attractive compensation
- Virtual Stock Option Plan
Company bike
- JobRad® Bike Lease
About the company
Aleph Alpha
Industry
IT
Description
The company develops cutting-edge generative AI solutions with a strong emphasis on sovereignty, ethical development, and societal benefit.