Data Scientist (m/w/x)
Build LLM-as-a-Judge evaluation models that measure AI product quality at a fintech company serving small businesses. Experience with LLMs or NLP systems and with applying data science to real-world products is required. Benefits include a corporate pension with 20% matching and a 30-day sabbatical after 3 years.
Requirements
- 3+ years of experience applying data science to real-world products (ideally AI systems, customer support, quality measurement, or evaluation frameworks)
- Solid understanding of machine learning and statistics, including experimentation and metric design
- Experience with LLMs or NLP systems, and interest in evaluation methods (e.g., rubric-based scoring, prompt-based judges, calibration approaches, or human-in-the-loop evaluation)
- Strong ability to write clean, reliable code in Python (pandas, numpy, scikit-learn) and strong working knowledge of SQL
- Experience building or contributing to data workflows and pipelines, and familiarity with taking models into production environments
- Confidence communicating results and insights to technical and non-technical stakeholders, with a focus on practical business impact
- Proactive mindset, ownership of tasks, and close collaboration with the team in a fast-paced setting
Tasks
- Build and improve LLM-as-a-Judge evaluation models.
- Assess AI product quality, safety, and effectiveness.
- Design and implement automated QA and evaluation pipelines.
- Monitor and benchmark AI product performance at scale.
- Contribute to model development from problem framing to monitoring.
- Own forecasting models.
- Collaborate with WFM to improve model accuracy.
- Develop and maintain data pipelines for evaluations and insights.
- Support creation of qualitative and quantitative observability frameworks.
- Produce actionable insights to identify improvement opportunities.
- Communicate recommendations to product and operations stakeholders.
- Build and iterate on ML models for support efficiency.
- Collaborate with data and product teams.
- Deliver measurable improvements to customer support.
- Learn and share best practices.
- Contribute to a culture of continuous improvement.
Work Experience
- 3 years
Education
- Bachelor's degree OR
- Master's degree
Languages
- English – Business Fluent
Tools & Technologies
- AI systems
- machine learning
- statistics
- LLMs
- NLP systems
- Python
- pandas
- numpy
- scikit-learn
- SQL
Benefits
Informal Culture
- Inclusive work environment
Learning & Development
- Annual L&D budget
Retirement Plans
- Corporate pension scheme with 20% matching
Workation & Sabbatical
- 30-day sabbatical after 3 years
Bonuses & Incentives
- Referral rewards
About the Company
SumUp
Industry
Financial Services
Description
SumUp is a leading global fintech company committed to leveling the playing field for small businesses.