There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems.
As a Site Reliability Engineer III at JPMorgan Chase within the Employee Platforms (Content Creation & Collaboration group), you will solve complex and broad business problems with simple and straightforward solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure, independently decomposing and iteratively improving existing solutions. You will contribute significantly to your team by sharing your knowledge of the end-to-end operations, availability, reliability, and scalability of your application or platform.
Job responsibilities
Guide and assist in building design consensus among peers. Collaborate with software engineers to design and implement CI/CD deployment approaches. Design, develop, test, and implement solutions for availability, reliability, and scalability. Implement infrastructure, configuration, and network as code for applications and platforms. Collaborate with technical experts and stakeholders to resolve complex problems. Understand and utilize service level indicators and objectives to proactively address issues. Support the adoption of site reliability engineering best practices within the team.
Required qualifications, capabilities, and skills
Formal training or certification on software engineering concepts and 3+ years applied experience. Possess 7+ years of experience, ideally working with PowerShell/Java/Messaging applications in a production environment. Demonstrate expertise in site reliability principles, including reliability, scalability, performance, security, enterprise architecture, and toil reduction. Familiar with implementing SRE practices in applications or platforms. Proficient in at least one programming language, such as PowerShell or Java/Spring Boot. Skilled in messaging platforms like Microsoft Exchange 2019 or Exchange Online. Experience in observability, including monitoring, SLO alerting, and telemetry collection using tools like Grafana, Dynatrace, Prometheus, Datadog, Splunk, and Fluentd. Familiar with CI/CD tools such as Jenkins, GitLab, or Terraform. Proficient in writing Splunk and PromQL queries. Capable of implementing quick solutions to reduce MTTR, conducting RCA, and developing long-term improvement strategies. Strong ability to contribute to collaborative teams, presenting information logically and effectively with minimal supervision.
Preferred qualifications, capabilities, and skills
Familiarity with troubleshooting common networking technologies and issues related to Outlook, email, and Exchange environments. Working knowledge of software applications and technical processes within a given technical discipline (e.g., Cloud Foundry, Predictive Monitoring, Databricks, etc.).