Applied Scientist I (m/f/x)
Description
In this role, you will focus on designing evaluation pipelines for large language models, collaborating with experts to enhance model performance, and championing best practices in AI evaluation to ensure reliability and ethical standards.
Requirements
- PhD in Computer Science, Artificial Intelligence, Machine Learning, or related field
- Research or hands-on experience with large language models, NLP evaluation, or agent-based AI systems
- Strong understanding of LLM performance measurement, prompt evaluation, and reliability testing
- Proficiency in Python and familiarity with ML libraries such as PyTorch, Transformers, and LangChain
- Comfort with experimental design, data analysis, and communicating technical findings clearly
- Experience with LLM evaluation frameworks (e.g., OpenAI Evals, HELM, LM Harness, or custom auto-eval tools)
- Familiarity with retrieval-augmented generation (RAG), tool-using agents, or agentic evaluation methodologies
- Experience in cloud-based ML development (AWS, Azure, or GCP)
- Record of publications or preprints in top-tier venues (e.g., NeurIPS, ACL, EMNLP, ICLR) or equivalent research contributions
- Interest in Responsible AI, fairness, and interpretability research
Work Experience
Approx. 1 to 4 years
Tasks
- Design and execute evaluation pipelines for LLMs and agentic systems
- Assess reasoning, factual accuracy, and alignment
- Build tools for automatic evaluation and synthetic dataset creation
- Implement LLM-as-a-judge workflows and continuous benchmarking systems
- Collaborate with applied scientists, ML engineers, and product managers
- Translate evaluation results into model improvements and product insights
- Prototype new evaluation metrics and contribute to internal reports
- Support publications and presentations on evaluation methods
- Promote reproducibility, transparency, and ethical AI evaluation
Languages
English – Business Fluent
Benefits
Flexible Working
- Flexible hybrid working environment
- Flexible work arrangements
Other Benefits
- Supportive workplace policies
- Comprehensive benefit plans
More Vacation Days
- Flexible vacation
- Paid volunteer days off
Mental Health Support
- Mental health days off
- Access to Headspace app
- Resources for mental wellbeing
Retirement Plans
- Retirement savings
Additional Allowances
- Tuition reimbursement
- Resources for financial wellbeing
Bonuses & Incentives
- Employee incentive programs
Healthcare & Fitness
- Resources for physical wellbeing
Social Impact
- Pro-bono consulting opportunities
Sustainability Focus
- Environmental, Social, and Governance initiatives
About the Company
Thomson Reuters
Industry
Media
Description
The company provides trusted content and technology for professionals in legal, tax, accounting, compliance, government, and media.
- Thomson Reuters
  Lead Applied Scientist I (m/f/x)
  Full-time · With home office · Senior · Zug
- Thomson Reuters
  Lead Applied Scientist - Legal Tech (m/f/x)
  Full-time · With home office · Senior · Zug
- Thomson Reuters
  Senior Applied Scientist, Knowledge Graphs and ML (m/f/x)
  Full-time · With home office · Senior · Zug
- Thomson Reuters
  Applied Scientist Intern (m/f/x)
  Full-time · Internship · With home office · Zug
- Thomson Reuters
  Principal Scientist (m/f/x)
  Full-time · With home office · Senior · Zug