GetYourGuide

15d ago

Staff Site Reliability Engineer (m/f/x)

Berlin

Full-time, With Home Office, Senior

AI/ML

Nejo AI Summary

Improve system reliability for a global travel booking platform, with a focus on observability tooling and incident prevention. The role requires a deep understanding of observability tooling and proven experience reducing MTTD, MTTR, and change failure rate. Benefits include working from anywhere for 40 days per year and flexible working arrangements.

Requirements

Deep understanding of observability tooling (Datadog)
Proven experience reducing MTTD, MTTR, change failure rate
Strong coding skills in Java
Comfortable reading/contributing in Go
Enough frontend context to collaborate with React/Vue teams
Experience with Kubernetes
Experience with AWS
Experience with service mesh technologies (Istio/Envoy)
Solid understanding of distributed systems
Solid understanding of networking
Solid understanding of container technology
Hands-on experience with CI/CD
Hands-on experience with automated testing strategies
Hands-on experience with build systems
Ability to influence engineers and teams
Excellent written communication skills in English
Excellent verbal communication skills in English
Positive, proactive team player
Passionate about operational excellence
Led company-wide initiatives to improve DORA metrics
Identified systemic gaps in automated testing
Driven improvements reducing change failure rate
Driven improvements reducing production incidents
Embedded operational excellence practices into culture
Driven cost-reduction outcomes through improvements

Tasks

Prevent incidents and enhance user trust
Enable faster incident resolution
Drive operational excellence and reliability
Partner with product teams to improve system reliability
Reduce incident frequency and resolution times
Lead post-incident reviews and implement improvements
Build diagnostic and resolution tooling
Promote blameless incident handling and continuous improvement
Participate in on-call infrastructure rotation
Advance Datadog observability practices
Ensure meaningful SLOs and actionable alerts
Enable efficient production debugging
Improve change failure rate with automated testing
Reduce deployment costs and risks
Design and maintain well-documented development paths
Collaborate with product teams on system design
Guide teams on infrastructure best practices
Identify and implement cost optimization
Leverage AI for incident response and workflow improvement

Work Experience

Approx. 4-6 years

Education

Vocational certification OR
Bachelor's degree OR
Master's degree

Languages

English – Business Fluent

Tools & Technologies

Datadog
Java
Go
React
Vue
Kubernetes
AWS
Istio
Envoy

Benefits

Additional Allowances

Annual personal growth budget

Mentorship & Coaching

Mentorship programs

Flexible Working

Work from anywhere (40 days/year)
Flexible working arrangements

Team Events

Quarterly team events
Yearly company-wide events

Public Transport Subsidies

Monthly transportation budget

Healthcare & Fitness

Monthly fitness budget
Health and wellness benefits

Corporate Discounts

Discounts on GetYourGuide activities

Learning & Development

Language reimbursement program

Nejo automatically captured this job from the website of GetYourGuide and processed the information with the help of AI. Despite careful analysis, some information may be incomplete or inaccurate. Please always verify all details in the original posting. Content and copyrights of the original posting belong to the advertising company.

