AI Research Engineer (Model Compression & Quantization) (m/f/x)
Applying low-bit quantization to generative AI models for financial token integration. PhD in NLP/ML and A* publications required. Focus on model compression and efficient deployment.
Requirements
- Degree in Computer Science or related field
- PhD in NLP, Machine Learning, or related field
- Solid track record in AI R&D with A* publications
- Experience with PyTorch or equivalent frameworks
- Hands-on experience with model quantization (QAT and PTQ)
- Research and hands-on experience with knowledge distillation
- Research and hands-on experience with model pruning
- Solid understanding of neural network architectures and training
- Understanding of transformers (LLMs, VLMs), backpropagation, optimization, and fine-tuning
- Familiarity with C++ (advantageous)
Tasks
- Drive innovation in model compression and efficient deployment
- Reduce model footprint and computational cost
- Apply low-bit quantization to generative AI models
- Maintain accuracy and output quality during quantization
- Leverage knowledge distillation for efficient multimodal reasoning
- Implement pruning techniques to reduce computational overhead
- Analyze trade-offs between model efficiency and accuracy
- Propose improvements based on empirical findings
- Research mixed-precision quantization and advanced compression strategies
- Stay current with the latest research in model compression
- Document methodologies, experiments, and results clearly
- Support reproducibility and internal collaboration
- Communicate results to stakeholders
- Author technical papers and publish findings
- Advance the field of model compression for multimodal AI
Work Experience
- approx. 4 - 6 years
Education
- Bachelor's degree
Languages
- English – Business Fluent
Tools & Technologies
- PyTorch
- model quantization
- Quantization-Aware Training (QAT)
- Post-Training Quantization (PTQ)
- knowledge distillation
- model pruning
- neural network architectures
- transformers
- LLMs
- VLMs
- backpropagation
- optimization
- fine-tuning
- C++
About the Company
Tether Operations Limited
Industry
Financial Services
Description
The company pioneers a global financial revolution with blockchain solutions, enabling secure and instant digital token transactions.