The AI Job Search Engine
Research Scientist, Frontier (m/w/x)
Description
In this role, you will lead the development of cutting-edge post-training strategies for AI models, focusing on enhancing reasoning and instruction-following capabilities. You will collaborate across teams to ensure high-quality performance across different modalities.
Requirements
- PhD in machine learning, artificial intelligence, or computer science, or equivalent practical experience
- Strong background in Large Language Models, Reinforcement Learning, or preference learning
- Research interest in aligning AI systems with human feedback and utility
- Familiarity with experiment design and with analyzing large-scale user data
- Strong coding and communication skills
- Experience with RLHF or DPO
- Experience building or improving reward models and conducting human evaluation studies
- Proven track record of publications in top-tier conferences
- Experience with Chain-of-Thought reasoning research or process-based supervision
- Deep understanding of, and experience with, training models from scratch or using self-play/self-improvement techniques
Education
Work Experience
approx. 1 - 4 years
Tasks
- Design and validate novel post-training pipelines for frontier-class models
- Lead research into next-generation Reward Models
- Investigate new architectures for Reward Modeling
- Reduce reward hacking in preference data
- Improve signal-to-noise ratios in preference data
- Develop innovative methods to enhance internal reasoning capabilities
- Focus on correctness and logic in multi-step tasks
- Revamp and optimize RL prompts and feedback mechanisms
- Create robust mechanisms to convert user signals into training data
- Collaborate across teams to apply advanced recipes to various model sizes and modalities
Languages
English – Business Fluent
DeepMind | Full-time | On-site | Experienced | Zürich
About the Company
DeepMind
Industry
IT
Description
The company advances the state of the art in artificial intelligence for public benefit and scientific discovery.
- DeepMind
Research Engineer, Multimodal Reinforcement Learning (m/w/x)
Full-time | On-site | Experienced | Zürich
- Lakera
Senior Research Engineer - Security Foundation Models (m/w/x)
Full-time | On-site | Senior | Zürich
- Intrinsic
Research Scientist, Deep Learning (m/w/x)
Full-time | On-site | Experienced | Zürich
- Lakera
Research Internship (m/w/x)
Full-time | Internship | On-site | Zürich
- NVIDIA Switzerland AG
Research Scientist, ML Systems - PhD New College Grad (m/w/x)
Full-time | On-site | Experienced | Zürich