Staff/Senior AWS Cloud Platform Engineer (m/f/x)
Optimizing incident management for a cloud-native IoT SuperNetwork on AWS. Hands-on experience with observability tools (Prometheus, Mimir, Grafana, Loki) in a SaaS or telecom environment is required. The focus is on mission-critical IoT use cases for a global platform.
Requirements
- Proven experience as a (Site) Reliability Engineer or in a similar role at a SaaS and/or telecom company
- Hands-on experience with observability tools (Prometheus, Mimir, Grafana, Loki, CloudWatch, Grafana IRM, Rootly)
- Experience in establishing and managing incident management processes
- Understanding of incident management frameworks and best practices
- Extensive experience with AWS cloud services (EC2, S3, RDS, Lambda, CloudWatch)
- Expert skills with modern infrastructure tooling (Kubernetes, Terraform, GitHub Actions, Jenkins)
- Good understanding of modern development tooling (microservices architecture, 12-factor applications, Docker)
- Advanced documentation skills
- Exceptional problem-solving and critical thinking
- Passion for enhancing development experiences
- Ability to work independently and as part of a team
- Knowledge of networking protocols and telecom systems
- Knowledge of secure software development
- Familiarity with programming languages (Python, Go, or Java)
- AWS Certification (e.g., AWS Certified DevOps Engineer, AWS Certified Solutions Architect)
Responsibilities
- Lead end-to-end incident management.
- Optimize incident management processes.
- Ensure timely incident detection and resolution.
- Document incidents thoroughly.
- Coordinate cross-functional incident teams.
- Conduct post-mortems and root cause analyses.
- Drive continuous workflow improvements.
- Design, implement, and continuously improve observability frameworks.
- Develop dashboards, alerts, and metrics strategies.
- Develop logging strategies.
- Monitor service health.
- Proactively detect anomalies.
- Support issue resolution.
- Ensure cost-optimized platform performance.
- Partner with cross-functional teams.
- Implement observability best practices.
- Provide training and guidance on tools.
- Leverage metrics data.
- Drive engineering priorities.
- Design, build, and maintain resilient AWS cloud infrastructure.
- Implement security, scalability, and cost optimization best practices.
- Ensure high availability and disaster recovery.
- Ensure robust platform pipelines, shared infrastructure, and application services.
Work experience
- approx. 4-6 years
Education
- Completed vocational training, OR
- Bachelor's degree, OR
- Master's degree
Languages
- English: business fluent
Tools & technologies
- Prometheus
- Mimir
- Grafana
- Loki
- CloudWatch
- Grafana IRM
- Rootly
- AWS
- EC2
- S3
- RDS
- Lambda
- Kubernetes
- Terraform
- GitHub Actions
- Jenkins
- Docker
- Python
- Go
- Java
About the company
emnify
Industry
IT
Description
The company enhances innovative components, bridging telco languages and internet protocols.