Evolving observability platform with large-scale telemetry pipelines for a European savings leader. Deep expertise with Prometheus, OpenTelemetry, and Mimir architectures required. Flexible hybrid setup, relocation support.
Requirements
- 5+ years in observability, platform engineering, or SRE/infrastructure
- Senior to staff level experience
- Deep expertise with observability stack (Prometheus, OpenTelemetry, Grafana, or equivalent)
- Hands-on experience with Mimir, Loki, and Tempo architectures
- Experience designing and operating high-throughput telemetry pipelines
- Strong command of SLO-based reliability practices
- Hands-on experience with Kubernetes and Terraform
- Experience running production workloads on AWS, GCP, or Azure
- Turning observability best practices into adoptable standards
- Driving cross-team technical initiatives end-to-end
- Contribute to architectural decisions
- Communicate trade-offs to engineers and leadership
Tasks
- Build and evolve the observability platform
- Design and operate large-scale telemetry pipelines
- Improve core components with a focus on automation, reliability, and developer experience
- Architect high-throughput telemetry systems
- Implement sampling strategies, data tiering, and retention policies for telemetry systems
- Define and implement observability and reliability standards
- Define and implement SLOs, error budgets, and low-noise alerting
- Support engineering teams in adopting standards
- Participate in the on-call rotation for the observability platform
- Ensure end-to-end ownership of the systems you build and operate
- Define long-term observability direction
- Drive cross-team initiatives from kickoff to delivery
- Align observability strategy with engineering reliability goals and business goals
Work Experience
- 5 years
Education
- Bachelor's degree or Master's degree
Languages
- English – Business Fluent
Tools & Technologies
- Prometheus
- OpenTelemetry
- Grafana
- Mimir
- Loki
- Tempo
- Kubernetes
- Terraform
- AWS
- GCP
- Azure
Benefits
Flexible Working
- Flexible hybrid setup
Other Benefits
- Relocation support
About the Company
Trade Republic
Industry
Financial Services
Description
Trade Republic is the largest savings platform in Europe, empowering everyone to build wealth with easy access to financial systems.