Applied Scientist, NLP/GenAI (m/f/x)
Developing AI pipelines for legal document understanding, knowledge extraction, and synthetic data generation at a legal, tax, and media content provider. 3+ years building/deploying deep learning/LLM-based document understanding systems required. Work from anywhere for up to 8 weeks per year.
Requirements
- PhD in Computer Science, AI, NLP, or related field, or Master's with equivalent research/industry experience
- 3+ years experience building/deploying document understanding, information extraction, or knowledge graph systems (deep learning, LLMs, NLP)
- Ability to translate complex document understanding problems into innovative AI applications
- Professional experience scaling and leading in applied research
- Strong programming skills (Python) and modern deep learning frameworks experience
- Publications at relevant venues (ACL, EMNLP, ICLR, NeurIPS, SIGIR, KDD)
- Deep understanding of document understanding fundamentals (layout analysis, semantic chunking, classification, taxonomies, multi-label, domain schemas)
- Expertise in knowledge extraction and knowledge graph construction (entity recognition, relation extraction, citation parsing, graph representations)
- Expertise in LLM-based information extraction, few-shot/multi-task learning, post-training, knowledge distillation
- Solid understanding of synthetic data generation techniques for NLP (query-answer generation, data augmentation)
- Solid understanding of efficiency optimization (knowledge distillation, model compression, SLM-based solutions)
- Solid understanding of DL/ML approaches for NLP tasks
- Experience designing annotation workflows, creating labeled datasets, and developing evaluation frameworks
- Prior work on legal document understanding, information extraction, knowledge representation (legal citations, domain concepts), or legal AI applications
- Prior work handling complex legal document structures (non-uniform formatting, nested hierarchies, cross-references, embedded elements)
- Experience building systems for analysis, question answering, or retrieval across large document collections
- Experience with knowledge graph frameworks/methodologies for legal or enterprise applications
- Understanding of RAG and agentic workflows for enterprise knowledge
- Publications at relevant venues (ACL, EMNLP, ICLR, NeurIPS, SIGIR, KDD)
- Experience with AzureML or AWS SageMaker
Responsibilities
- Develop and deploy AI solutions for legal document understanding.
- Develop advanced models for semantic chunking of lengthy legal documents.
- Build document enrichment systems for classification and rich metadata extraction.
- Create LLM-based pipelines for extracting and linking legal knowledge.
- Develop scalable synthetic data generation systems.
- Support model training with synthetic data.
- Simulate complex legal research queries.
- Generate hallucination-free answers.
- Collaborate with engineering for software delivery and reliability.
- Develop comprehensive data and evaluation strategies.
- Leverage human annotation and synthetic data for evaluation.
- Apply robust training and evaluation methodologies.
- Balance model performance with latency requirements for SLM solutions.
- Apply knowledge distillation to compress models into efficient SLMs.
- Determine appropriate architectures for challenging document understanding.
- Develop semantic chunking strategies for diverse documents.
- Design document classification approaches for legal taxonomies.
- Implement LLM-based knowledge extraction methods.
- Build multi-document reasoning architectures.
- Balance accuracy, efficiency, and scalability for real-world challenges.
- Partner with Engineering and Product to translate legal challenges into AI solutions.
- Engage stakeholders to understand use case requirements.
- Align document understanding capabilities with business needs.
- Maintain scientific and technical expertise in relevant areas.
Work experience
- 3 years
Education
- Master's degree
Languages
- English – business fluent
Tools & Technologies
- Python
- PyTorch
- Hugging Face Transformers
- DeepSpeed
- LLMs
- SLM
- RAG
- AzureML
- AWS SageMaker
Benefits
Flexible working
- Flexible hybrid work environment
- Flex My Way policies
- Flexible work arrangements
Workation & Sabbatical
- Work from anywhere for up to 8 weeks per year
Family-friendliness
- Work-life balance
Training and development
- Culture of continuous learning
- Skill development
- Grow My Way programming
Other benefits
- Skills-first approach
Additional vacation days
- Flexible vacation
Mental health support
- Two company-wide Mental Health Days off
- Access to Headspace app
- Resources for mental wellbeing
Company pension plan
- Retirement savings
Other allowances
- Tuition reimbursement
- Resources for financial wellbeing
Bonuses & incentives
- Employee incentive programs
Health & fitness
- Resources for physical wellbeing
Charitable engagement
- Two paid volunteer days off annually
- Pro-bono consulting project opportunities
Focus on sustainability
- ESG initiative involvement opportunities
About the company
Thomson Reuters
Industry
Legal
Description
The company provides trusted content and technology for professionals in legal, tax, accounting, compliance, government, and media.