Sr Site Reliability Engineer
GE Aerospace
**Job Description Summary**
As a Sr. Site Reliability Engineer for the GE Aerospace Digital Workplace team, you will work closely with the broader Digital Workplace organization to lead the availability, scalability, performance, observability, and resiliency goals of key productivity, collaboration, and engagement tools, applications, and services across the organization. Working under the leadership of our Director of Infrastructure & Database Engineering, you will be instrumental in building and implementing reliability measures, guardrails, and solutions for our cloud infrastructure. This work enables our product portfolio of core Intranet Services, Productivity & Collaboration platforms, BPO tooling, Employee Experience & Engagement platforms, and Document Management solutions to achieve 99.9% availability. Your role also encompasses addressing production issues, improving application and service performance, conducting capacity benchmarking, optimizing infrastructure costs, designing and evolving resilient cloud and infrastructure architecture, and applying engineering solutions to operational problems. You will be part of a growing, focused team that continually develops this internal product portfolio, delivering increased value to 60,000+ users worldwide and reaching every employee in the company.
**Job Description**
**Essential Responsibilities:**
+ Understand business requirements and collaborate with Product & DevOps teams to implement highly available, scalable, resilient, cost-efficient solutions in Cloud environments.
+ Deploy observability tools (New Relic, Splunk, ELK, and other open-source observability tools) in our Cloud infrastructure and applications via Terraform, and be the SME for these tools.
+ Create and configure alerts, dashboards, and reports that map to the four golden signals: latency, errors, traffic, and saturation.
+ Pioneer the definitions of SLIs, SLOs, and error budgets for GE Aerospace Digital Workplace’s products and services, and champion their implementation for large-scale adoption.
+ Perform Root Cause Analysis (RCA) for SLO breaches, alerts, and incidents. Lead troubleshooting and debugging sessions.
+ Solve problems relating to critical products, applications, and services, and create solutions (automations, runbooks, etc.) to prevent recurrence.
+ Lead the Incident Management and Postmortem processes, and collaborate with the Operations team to develop templates for communications, runbooks, and documents.
+ Consistently share best practices for reliability, resiliency, performance, and improve processes within and across teams.
+ Apply a data-driven approach to decisions about capacity needs, Cloud cost optimization, and infrastructure stability.
+ Prioritize reducing MTTx (Mean Time to Recover/Resolve/Repair) for Production incidents to provide a better user experience.
+ Propose new designs and develop solutions to complex problems in application resiliency and availability.
+ Be a strong technical mentor for junior team members, helping them grow professionally and realize their full potential.
**Qualifications/ Requirements:**
+ Bachelor’s degree from a recognized university or college with a minimum of 4 years of professional experience, OR a Diploma with a minimum of 5 years of professional experience, OR a Higher Secondary Certificate with a minimum of 7 years of professional experience.
+ A minimum of 2 years of experience in Production Engineering or Site Reliability Engineering roles.
+ A minimum of 2 years of experience in Cloud environments (e.g., AWS, Azure) is required.
+ A minimum of 2 years of experience in the DevOps and infrastructure domains.
**Desired Characteristics:**
**Technical Expertise:**
+ Primary role in recent positions must be as an infrastructure engineer, software engineer, or SRE working with Cloud technologies, predominantly in Production-facing work.
+ Expertise in observability, i.e., configuring monitoring and logging tools (e.g., New Relic, Splunk, CloudWatch, ELK) and proficiency in using them.
+ Solid and extensive experience in Cloud environments, specifically AWS or Azure.
+ Good programming skills beyond bash/shell scripting (e.g., Python, Java).
+ A prior or current Azure or AWS certification is a strong plus.
+ Configuration management, Infrastructure as Code (IaC), and CI/CD experience (Jenkins, GitLab, Nexus, etc.).
+ Solid understanding of operating systems, especially Linux, and of containerization technologies (e.g., Docker and Kubernetes).
+ Ability to work in a DevOps culture and an Agile/Scrum model.
+ Influence and create new designs, architectures, standards, and methods for large-scale distributed systems.
+ Technical mindset focused on automating everything to reduce manual toil.
**Business and Leadership Expertise:**
+ Prior experience with Digital Workplace services is a huge plus.
+ Leads by example and adopts the SRE mindset of blamelessness.
+ Able to develop and write modular code to solve complex problems impacting application resiliency and availability.
+ Ability to determine how to effectively integrate disparate systems to optimize operational processes.
+ Demonstrated strong technical, problem-solving, and analytical skills.
+ Solid communication skills at all levels of the organization.
+ Able to influence peers and leadership cross-functionally toward data-driven solutions, hypotheses, and theories.
+ Driven by professional curiosity and a desire to develop a deep understanding of products, applications, services, and their dependencies.
+ Proactively identifies and removes project obstacles or barriers on behalf of the team.
Note:
To comply with US immigration and other legal requirements, it is necessary to specify the minimum number of years' experience required for any role based within the USA. For roles outside of the USA, to ensure compliance with applicable legislation, the JDs should focus on the substantive level of experience required for the role and a minimum number of years should NOT be used.
This Job Description is intended to provide a high-level guide to the role. However, it is not intended to amend or otherwise restrict or expand the duties required of each individual employee as set out in their respective employment contract and/or as otherwise agreed between an employee and their manager.
**Additional Information**
**Relocation Assistance Provided:** No
GE Aerospace is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, national or ethnic origin, sex, sexual orientation, gender identity or expression, age, disability, protected veteran status or other characteristics protected by law.