Senior AI Researcher - Reinforcement Learning (m/f/x)
Large-scale experiments and code-base maintenance for general-purpose model methodology at an AI lab with 50+ researchers. Proven experience in multi-node LLM training and RL theory is required. Benefits include a Virtual Stock Option Plan and 30 days of vacation.
Requirements
- Deep understanding of Reinforcement Learning theory
- Experience with multi-node LLM training
- Familiarity with statistical evaluation methods
- Ability to analyze evaluation environments
- Strong Python and ML tooling skills
- Willingness to relocate or travel
- PhD in RL or equivalent research
- Contributions to top-tier RL venues
- Experience evaluating LLMs
Tasks
- Shape and improve underlying RL methodology
- Maintain a high-quality training code-base
- Conduct large-scale reinforcement learning experiments
- Derive hypotheses from experimental results
- Iterate on implementation and methodology
- Execute large-scale LLM training runs
- Analyze evaluation scores in depth
- Propose and implement performance improvements
- Maximize performance on internal benchmarks
- Identify and implement novel multi-turn RL approaches
- Stay current with bleeding-edge RL research
- Identify and resolve training infrastructure bottlenecks
- Optimize RL loops for large-scale training
- Partner with post-training teams on feedback
- Convert raw feedback into actionable training signals
- Ensure RL iterations improve downstream performance
Work Experience
- approx. 4 - 6 years
Education
- Doctoral / PhD
Languages
- English – Business Fluent
Tools & Technologies
- Python
- torch.distributed
- LLM
- ML tooling
Benefits
Flexible Working
- Flexible working hours
- Hybrid working model
Competitive Pay
- Virtual Stock Option Plan
More Vacation Days
- 30 days paid vacation
Healthcare & Fitness
- Fitness & wellness offerings
Mental Health Support
- Mental health support
Retirement Plans
- Subsidized company pension plan
Public Transport Subsidies
- Subsidized transportation ticket
Additional Allowances
- Technical equipment budget
Company Bike
- JobRad Bike Lease
About the Company
Aleph Alpha
Industry
IT
Description
The company develops cutting-edge generative AI solutions with a strong emphasis on sovereignty, ethical development, and societal benefit.
Not a perfect match?
- Buhl Data Service GmbH
Senior AI / Data Science Engineer (m/f/x)
Full-time | Home office possible | Senior | Mannheim
- SAP
Principal Machine Learning Expert / Development Architect (m/f/x)
Full-time | Home office possible | Senior | Walldorf
- Aleph Alpha
Senior Performance Engineer - Pretraining (m/f/x)
Full-time | Home office possible | Senior | Heidelberg
- Exxeta
Senior Data Scientist - Physical AI & Computer Vision (m/f/x)
Full-time | Home office possible | Senior | Berlin, Karlsruhe, Mannheim
- ABB AG
Senior Scientist – Agentic AI and Applications (m/f/x)
Full-time | Home office possible | Senior | Mannheim