Senior Site Reliability Engineer
Garmin International
Overview
We are seeking a full-time Senior Site Reliability Engineer in our Olathe, KS location. In this role, you will be responsible for ensuring that the integrity of Garmin's production environment is maintained and that all releases into the environment are well-organized, communicated, and managed.

Essential Functions
- Author and lead process improvements to the whole project lifecycle and release process
- Establish and provide training to software development teams on operational process and automations that promote software scale, integrity, and stability
- Lead design/definition activities for moderate- and high-complexity systems, features, and/or process
- Champion the shift-left culture of reliability and delivery performance within software development teams
- Monitor and support moderate- and high-complexity software releases
- Design and implement improvements to the software lifecycle and production pipeline through automated tools/systems that align with industry best practices
- Design and implement observability improvements to the software applications and infrastructure
- Coordinate and improve monitoring practices across software applications and infrastructure
- Build and/or maintain tools to generate reports
- Maintain accurate data to facilitate reporting on key reliability SLOs for multiple products/systems
- Improve the team’s incident response by nurturing incident playbooks
- Through post-incident activities, proactively identify and/or implement reliability improvements and automated mitigations of issue recurrence
- Cultivate a healthy culture of post-incident activities throughout the organization
- Cultivate engagement in the SRE community to nurture standards, best practices, and training across product owners, software engineers, and other SREs
- Participate in capacity planning to ensure software can scale sufficiently at peak times
- Exemplify Garmin’s Mission Statement, Vision, Values and Quality Policy and proactively work to improve Garmin’s image and culture
- May participate in or lead disaster recovery training
- May coordinate or engage in chaos testing activities on live systems
- May build new environments, both with legacy and cloud/container-based infrastructure
- May assist with moderate- and high-complexity problem resolution and debugging (including code-level debugging)
- May serve as a mentor to less experienced SREs
- May design and develop code improvements that improve the resiliency of web applications and services, such as circuit-breaker, caching, messaging, etc.
- May provide on-call support and incident response to troubleshoot and resolve major issues
- May engage in code and design reviews to provide scalability and reliability insights to other software developers
- Work collaboratively and professionally in a team environment with other Garmin associates to achieve goals
- Provide reliable solutions to a variety of customer problems using sound problem-solving techniques
- Communicate status of work clearly, providing visibility to supervisor or mentor
- Accept and act on constructive criticism
- Thoroughly document work in an organized manner

Basic Qualifications
- Bachelor of Science Degree in Computer Science, Electrical or Computer Engineering from a four-year college or university AND a minimum of 5 years relevant experience performing a substantially similar Applications Engineering role OR an equivalent combination of education and relevant experience
- Excellent academics (cumulative GPA greater than or equal to 3.0 as a general rule)
- Experience with moderately complex build and deployment automation
- Proficiency in application languages/frameworks such as Java, Spring Boot, C#, JavaScript, React, Angular
- Experience scaling cloud native applications in large, high-availability environments
- Experience with DevOps-style tools such as Jenkins, Maven, GitLab, Nexus, RunDeck
- Experience with Infrastructure as Code such as Ansible, Terraform, Salt, Chef, Puppet
- Experience with messaging technologies such as RabbitMQ, Kafka
- Experience with data storage technologies such as RDBMS, NoSQL
- Experience with Linux Administration
- Configuration of complex multi-tiered server applications
- Effective judgment, discretion, and decision-making abilities
- Strong and effective verbal, written, and interpersonal communication skills
- Team-oriented, possessing a positive attitude and working well with others

Desired Qualifications
- Familiarity with the Agile Manifesto and various Agile practices and frameworks
- Experience applying Lean principles to individual, team, and organizational process
- Experience with scripting languages such as Python, Groovy
- Experience deploying complex applications within cloud infrastructures
- Experience deploying containerized applications within Kubernetes
- Experience with OpenStack cloud computing infrastructure and related technologies
- Experience with APM monitoring tools such as Zabbix, AppDynamics, New Relic, Dynatrace
- Experience with CDN Providers such as Akamai/Cloudflare
- Experience with observability tools such as Uptrends, Splunk, Kibana
- Experience with automated testing tools and testing procedures

Garmin International is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, religion, color, national origin, citizenship, sex, sexual orientation, gender identity, veteran’s status, age or disability.

This position is eligible for Garmin's benefit program. Details can be found here: Garmin Benefits