You pre-train AI models and design innovative architectures, conducting experiments and refining methodologies to improve model performance and efficiency.
Requirements
- Degree in Computer Science or related field
- PhD in NLP or Machine Learning preferred
- Solid track record in AI R&D
- Hands-on experience with large-scale LLM training
- Familiarity with distributed training frameworks
- Deep knowledge of transformer modifications
- Strong expertise in PyTorch and Hugging Face
Your Tasks
- Conduct pre-training of AI models on large servers.
- Design and prototype innovative architectures.
- Execute experiments and analyze results independently.
- Refine methodologies for optimal model performance.
- Investigate and improve model efficiency and performance.
- Advance training systems for scalability and efficiency.
Original Description
## AI Research Engineer (Pre-training)
**About the job:**
As a member of the AI model team, you will drive innovation in architecture development for cutting-edge models of various scales, including small, large, and multi-modal systems. Your work will enhance intelligence, improve efficiency, and introduce new capabilities to advance the field.
You bring deep expertise in LLM architectures and a strong grasp of pre-training optimization, paired with a hands-on, research-driven approach. Your mission is to explore and implement novel techniques and algorithms that lead to groundbreaking advancements: curating data, strengthening baselines, and identifying and resolving pre-training bottlenecks to push the limits of AI performance.
**Responsibilities**:
* Conduct pre-training of AI models on large, distributed servers equipped with thousands of NVIDIA GPUs.
* Design, prototype, and scale innovative architectures to enhance model intelligence.
* Independently and collaboratively execute experiments, analyze results, and refine methodologies for optimal performance.
* Investigate, debug, and improve both model efficiency and computational performance.
* Contribute to the advancement of training systems to ensure seamless scalability and efficiency on target platforms.
## Requirements
* A degree in Computer Science or a related field, ideally a PhD in NLP, Machine Learning, or a related field, complemented by a solid track record in AI R&D (with strong publications at A\* conferences).
* Hands-on experience contributing to large-scale LLM training runs on large, distributed servers equipped with thousands of NVIDIA GPUs, with a demonstrated impact on training scalability and model performance.
* Familiarity and practical experience with large-scale, distributed training frameworks, libraries and tools.
* Deep knowledge of state-of-the-art transformer and non-transformer modifications aimed at enhancing intelligence, efficiency and scalability.
* Strong expertise in PyTorch and Hugging Face libraries with practical experience in model development, continual pretraining, and deployment.