NVIDIA

Software Engineering Intern, CUDA Core Libraries (m/f/x)

München
Full-time, Internship, On-site
AI/ML
Data Science

Develop core libraries in C++ and Python and optimize GPU algorithms for VR, AI, and AV solutions. Experience with parallel or heterogeneous programming and strong C++ or Python skills are required. The role involves direct collaboration with experienced CUDA engineers.

Requirements

  • Pursuing BS, MS, or PhD in Computer Science, Computer Engineering, or related field
  • Strong programming skills in C++, Python, or both
  • Familiarity with modern C++ and/or Python library development and packaging
  • Experience with parallel or heterogeneous programming (CUDA, OpenMP, GPU-accelerated Python, or similar)
  • Experience with software libraries or open-source projects
  • Ability to work independently and drive a project from exploration to completion
  • Clear written communication for design discussions and documentation
  • Knowledge of CPU/GPU architecture and algorithmic performance
  • Hands-on experience with CUDA C++, CUDA Python, PyTorch, JAX, Numba, CuPy, or related GPU-accelerated Python stacks
  • Familiarity with libraries such as Thrust, CUB, libcudacxx, or similar modern C++/GPU libraries
  • Familiarity with compiler infrastructure and tooling such as LLVM, Clang/LLVM tooling, or MLIR
  • Comfort navigating and debugging large, multi-language codebases (C++, Python, CMake, GitHub Actions CI systems)

Tasks

  • Contribute to the design and implementation of CUDA Core Libraries in C++ and Python
  • Design and optimize GPU algorithms and APIs
  • Tune performance involving memory, parallelism, and synchronization
  • Enhance developer experience through tests, benchmarks, CI, packaging, and documentation
  • Collaborate with experienced CUDA engineers
  • Participate in design reviews and code reviews
  • Engage in open-source-style workflows

Education

  • Currently in higher education

Languages

  • English: Business Fluent

Tools & Technologies

  • C++
  • Python
  • CUDA
  • OpenMP
  • PyTorch
  • JAX
  • Numba
  • CuPy
  • LLVM
  • Clang
  • CMake
  • GitHub Actions