GetYourGuide

20 days ago

Staff Site Reliability Engineer (m/f/x)

Berlin

Full-time, with home office, Senior

AI/ML

Nejo AI summary

Improve system reliability for a global travel booking platform, with a focus on observability tooling and incident prevention. The role requires a deep understanding of observability tooling and proven experience reducing MTTD, MTTR, and change failure rate. Benefits include work from anywhere (40 days/year) and flexible working arrangements.

Requirements

Deep understanding of observability tooling (Datadog)
Proven experience reducing MTTD, MTTR, change failure rate
Strong coding skills in Java
Comfortable reading/contributing in Go
Enough frontend context to collaborate with React/Vue teams
Experience with Kubernetes
Experience with AWS
Experience with service mesh technologies (Istio/Envoy)
Solid understanding of distributed systems
Solid understanding of networking
Solid understanding of container technology
Hands-on experience with CI/CD
Hands-on experience with automated testing strategies
Hands-on experience with build systems
Ability to influence engineers and teams
Excellent written communication skills in English
Excellent verbal communication skills in English
Positive, proactive team player
Passionate about operational excellence
Led company-wide initiatives to improve DORA metrics
Identified systemic gaps in automated testing
Driven improvements reducing change failure rate
Driven improvements reducing production incidents
Embedded operational excellence practices into culture
Driven cost-reduction outcomes through improvements

Responsibilities

Prevent incidents and enhance user trust
Enable faster incident resolution
Drive operational excellence and reliability
Partner with product teams to improve system reliability
Reduce incident frequency and resolution times
Lead post-incident reviews and implement improvements
Build diagnostic and resolution tooling
Promote blameless incident handling and continuous improvement
Participate in on-call infrastructure rotation
Advance Datadog observability practices
Ensure meaningful SLOs and actionable alerts
Enable efficient production debugging
Improve change failure rate with automated testing
Reduce deployment costs and risks
Design and maintain well-documented development paths
Collaborate with product teams on system design
Guide teams on infrastructure best practices
Identify and implement cost optimization
Leverage AI for incident response and workflow improvement

Work experience

approx. 4-6 years

Education

Completed vocational training OR
Bachelor's degree OR
Master's degree

Languages

English (business fluent)

Tools & Technologies

Datadog
Java
Go
React
Vue
Kubernetes
AWS
Istio
Envoy

Benefits

Other allowances

Annual personal growth budget

Mentoring & Coaching

Mentorship programs

Flexible working

Work from anywhere (40 days/year)
Flexible working arrangements

Team events & outings

Quarterly team events
Yearly company-wide events

Public transport tickets

Monthly transportation budget

Health & fitness offerings

Monthly fitness budget
Health and wellness benefits

Employee discounts

Discounts on GetYourGuide activities

Training and development

Language reimbursement program

The latest version of the original posting is on the company's website. Nejo collected this job automatically from the GetYourGuide website and prepared the information using AI. Despite careful analysis, individual details may be incomplete or inaccurate. Always check all details against the original posting. Content and copyright of the original posting belong to the advertising company.