Senior Applied Scientist, NLP/GenAI (m/f/x)
This role at a legal tech provider involves building LLM-based knowledge graph (KG) pipelines for legal document understanding and enrichment. It requires 5+ years of experience building document understanding systems or KG pipelines. Work from anywhere for up to 8 weeks per year.
Requirements
- PhD in CS, AI, NLP, or related field, or Master's with equivalent experience
- 5+ years of experience building and deploying document understanding systems, information extraction (IE) pipelines, or KG construction
- Ability to translate complex document understanding problems into AI applications
- Professional experience scaling and leading in applied research
- Strong programming skills (Python) and experience with modern deep learning frameworks
- Publications at relevant venues (ACL, EMNLP, ICLR, NeurIPS, SIGIR, KDD)
- Deep understanding of document understanding fundamentals
- Expertise in knowledge extraction and knowledge graph construction
- Expertise in LLM-based information extraction, few-shot/multi-task learning, post-training/knowledge distillation
- Solid understanding of synthetic data generation for NLP
- Solid understanding of efficiency optimization (knowledge distillation, model compression, SLM solutions)
- Solid understanding of DL/ML approaches for NLP
- Experience designing annotation workflows, creating labeled datasets, and evaluation frameworks
- Prior work on legal document understanding, IE, knowledge representation, or legal AI
- Prior work handling complex legal document structures
- Experience building systems for analysis, Q&A, or retrieval across large document collections
- Experience with knowledge graph frameworks for legal/enterprise applications
- Understanding of RAG and agentic workflows
- Experience with AzureML or AWS SageMaker
Tasks
- Design, build, test, and deploy AI solutions for legal document understanding.
- Develop advanced models for semantic chunking of legal documents.
- Build document enrichment systems for classification and metadata extraction.
- Create LLM-based knowledge graph construction pipelines.
- Develop scalable synthetic data generation systems.
- Simulate complex legal research queries.
- Generate hallucination-free answers.
- Collaborate with engineering for software delivery.
- Ensure software reliability at scale.
- Develop comprehensive data and evaluation strategies.
- Leverage human annotation and synthetic data for evaluation.
- Apply robust training and evaluation methodologies.
- Balance model performance with latency requirements.
- Apply knowledge distillation techniques.
- Compress large models into efficient SLMs.
- Optimize and deploy efficient SLM solutions.
- Determine architectures for semantic chunking.
- Address diverse document formats and structures.
- Adapt chunking granularity.
- Determine architectures for document classification.
- Support varying legal taxonomies and customer schemas.
- Determine architectures for LLM-based knowledge extraction.
- Handle citation errors and contextual references.
- Determine architectures for multi-document reasoning.
- Generate synthetic multi-hop queries.
- Balance accuracy, efficiency, and scalability.
- Solve real-world document challenges.
- Handle diverse document formats and content types.
- Partner with Engineering and Product teams.
- Translate legal document challenges into solutions.
- Engage stakeholders across product lines.
- Understand use case requirements.
- Shape objectives for document understanding.
- Align capabilities with diverse business needs.
- Support next-generation search and legal research.
- Maintain scientific and technical expertise.
- Demonstrate expertise through product deliverables.
- Publish research at top venues.
- Contribute to intellectual property.
Work Experience
- 5+ years
Education
- Master's degree
Languages
- English – Business Fluent
Tools & Technologies
- Python
- PyTorch
- Hugging Face Transformers
- DeepSpeed
- LLMs
- SLM
- RAG
- AzureML
- AWS SageMaker
Benefits
Flexible Working
- Hybrid work model
- Flex My Way policies
- Flexible work arrangements
Workation & Sabbatical
- Work from anywhere (up to 8 weeks/year)
Learning & Development
- Continuous learning culture
- Skill development
- Grow My Way programming
Other Benefits
- Skills-first approach
More Vacation Days
- Flexible vacation
Mental Health Support
- Two company-wide Mental Health Days off
- Headspace app access
- Mental wellbeing resources
Retirement Plans
- Retirement savings
Additional Allowances
- Tuition reimbursement
- Financial wellbeing resources
Bonuses & Incentives
- Employee incentive programs
Healthcare & Fitness
- Physical wellbeing resources
Social Impact
- Social Impact Institute
- Two paid volunteer days off annually
- Pro-bono consulting project opportunities
Sustainability Focus
- ESG initiative involvement
About the Company
Thomson Reuters
Industry
Legal
Description
The company provides trusted content and technology for professionals in legal, tax, accounting, compliance, government, and media.