Staff/Senior AWS Cloud Platform Engineer (m/f/x)
Optimize incident management for a cloud-native IoT SuperNetwork on AWS. Hands-on experience with observability tools (Prometheus, Mimir, Grafana, Loki) in a SaaS or telecom environment is required. The focus is on mission-critical IoT use cases for a global platform.
Requirements
- Proven experience as a (Site) Reliability Engineer or in a similar role at a SaaS and/or telecom company
- Hands-on experience with observability tools (Prometheus, Mimir, Grafana, Loki, CloudWatch, Grafana IRM, Rootly)
- Experience in establishing and managing incident management processes
- Understanding of incident management frameworks and best practices
- Extensive experience with AWS cloud services (EC2, S3, RDS, Lambda, CloudWatch)
- Expert skills with modern infrastructure tooling (Kubernetes, Terraform, GitHub Actions, Jenkins)
- Good understanding of modern development tooling (microservices architecture, 12-factor applications, Docker)
- Advanced documentation skills
- Exceptional problem-solving and critical thinking
- Passion for enhancing development experiences
- Ability to work independently and as part of a team
- Knowledge of networking protocols and telecom systems
- Knowledge of secure software development
- Familiarity with programming languages (Python, Go, or Java)
- AWS Certification (e.g., AWS Certified DevOps Engineer, AWS Certified Solutions Architect)
Responsibilities
- Lead end-to-end incident management.
- Optimize incident management processes.
- Ensure timely incident detection and resolution.
- Document incidents thoroughly.
- Coordinate cross-functional incident teams.
- Conduct post-mortems and root cause analyses.
- Drive continuous workflow improvements.
- Design and implement observability frameworks.
- Continuously improve observability frameworks.
- Develop dashboards, alerts, and metrics strategies.
- Develop logging strategies.
- Monitor service health.
- Proactively detect anomalies.
- Support issue resolution.
- Ensure cost-optimized platform performance.
- Partner with cross-functional teams.
- Implement observability best practices.
- Provide training and guidance on tools.
- Leverage metrics data.
- Drive engineering priorities.
- Design resilient AWS cloud infrastructure.
- Build resilient AWS cloud infrastructure.
- Maintain resilient AWS cloud infrastructure.
- Implement security best practices.
- Implement scalability best practices.
- Implement cost optimization best practices.
- Ensure high availability and disaster recovery.
- Ensure robust platform pipelines.
- Ensure robust shared infrastructure.
- Ensure robust application services.
Work experience
- Approx. 4-6 years
Education
- Completed vocational training, OR
- Bachelor's degree, OR
- Master's degree
Languages
- English: business fluent
Tools & Technologies
- Prometheus
- Mimir
- Grafana
- Loki
- CloudWatch
- Grafana IRM
- Rootly
- AWS
- EC2
- S3
- RDS
- Lambda
- Kubernetes
- Terraform
- GitHub Actions
- Jenkins
- Docker
- Python
- Go
- Java
About the company
emnify
Industry
IT
Description
The company enhances innovative components, bridging telco languages and internet protocols.