Tether Operations Limited

Posted 18 days ago

AI Research Engineer - Kernel & Inference Optimization (m/f/x)

Zürich

Full-time, Remote, Senior

AI/ML

Nejo AI Summary

Optimizing AI model serving pipelines for low-latency, high-throughput transactions. PhD in NLP/ML and custom compute shader experience required. Focus on Metal Shading Language (MSL) implementation.

Requirements

Degree in Computer Science or related field
PhD in NLP, Machine Learning, or related field
Solid track record in AI R&D with publications
Knowledge of Metal Shading Language (MSL)
Comfortable writing custom compute shaders
Proven experience in low-level kernel optimizations
Proven experience in inference optimization on mobile devices
Contributions leading to measurable improvements in inference latency, throughput, and memory footprint
Deep understanding of modern model serving architectures
Deep understanding of inference optimization techniques
Strong expertise in writing GPU kernels for mobile devices
Deep understanding of model serving frameworks and engines
Practical experience in developing end-to-end inference pipelines
Ability to apply empirical research to model serving challenges
Proficient in designing robust evaluation frameworks
Experience designing and optimizing high-performance inference engines
Experience with Tensor Parallelism, Pipeline Parallelism, and Expert Parallelism
Deep understanding of the math and structure of diffusion models
Deep understanding of the math and structure of Vision Transformers
Understanding of Pruning
Understanding of Quantization
Understanding of FlashAttention
Understanding of KV caching
Understanding of speculative decoding (EAGLE)

Responsibilities

Drive innovation in model serving and inference architectures
Optimize model deployment and inference strategies
Design and deploy state-of-the-art model serving pipelines
Ensure high throughput and low latency in model serving
Optimize memory usage in model serving pipelines
Establish clear performance targets for latency and memory
Build and run controlled inference tests
Monitor key performance indicators in production
Document and validate performance across platforms
Identify and prepare high-quality test datasets
Set criteria for evaluating model performance
Analyze computational efficiency and diagnose bottlenecks
Address suboptimal batch processing and network delays
Optimize serving infrastructure for scalability and reliability
Integrate optimized serving frameworks into production
Define success metrics for real-world performance
Ensure continuous monitoring and iterative refinements

Work Experience

Approx. 4-6 years

Education

Bachelor's degree

Languages

English – business fluent

Tools & Technologies

Metal Shading Language (MSL)
Compute shaders
GPU kernels
Diffusion Models
Vision Transformers
Pruning
Quantization
FlashAttention
KV Cache
Speculative Decoding (EAGLE)
Tensor Parallelism
Pipeline Parallelism
Expert Parallelism

Nejo captured this job automatically from the website of Tether Operations Limited and prepared the information with the help of AI. Despite careful analysis, individual details may be incomplete or inaccurate, so always check all details against the original posting. The content and copyright of the original posting belong to the hiring company.
