Bangalore, India
22 hours ago
Site Reliability Engineer - Observability

We’re looking for problem solvers, innovators, and dreamers who are searching for anything but business as usual. Like us, you’re a high performer who’s an expert at your craft, constantly challenging the status quo. You value inclusivity and want to join a culture that empowers you to show up as your authentic self. You know that success hinges on commitment, that our differences make us stronger, and that the finish line is always sweeter when the whole team crosses together.

Site Reliability Engineering is a new discipline at Alteryx where the team deploys, maintains, and operates Alteryx’s Cloud SaaS products. The team works with Product Engineering, Infrastructure Engineering, SRE, and the Customer Service teams to ensure SaaS services are available and performant. This team will originate customer software/service fixes, contribute to various code bases, and ensure product availability, scalability, and resiliency. In addition, team members will automate responses to alerts and program alert remediation.

What you’ll do:
- Deploy, provision, and maintain Alteryx’s Cloud SaaS products.
- Promote our customer-centered approach and be a part of our customers’ solutions.
- Work with product teams on non-functional requirements for SaaS products, including resiliency, security, and availability.
- Improve and configure product observability/alerts and automate remediation through programming.
- Serve as an escalation point for customer SaaS instances for availability, performance, and application behavior issues.
- Demonstrate critical thinking and a growth mindset, with enthusiasm for learning new technologies quickly and applying that knowledge to address customer problems.
- Author code fixes for software/service issues and work with Product Engineering teams to merge the code.
- Participate in an on-call rotation for off-hours support on a periodic basis if required.

About you:
- 2-4 years of experience with modern observability stacks like Datadog, Prometheus, Grafana, Kibana, Thanos, Loki, or the TICK stack.
- 2-4 years of experience designing, programming, and/or operating distributed systems software.
- 2-4 years of experience programming in Python, Go, Node.js, Java, .NET, or another modern programming language.
- 1-2 years of experience with Kubernetes, OpenShift, k3s, or another container orchestration technology.
- Experience troubleshooting and solving problems related to containers or distributed systems.
- Experience with CI/CD technologies like ArgoCD, Jenkins, or another CD tool.
- Experience with AWS, GCP, or Azure is a plus.
- Experience debugging software issues and performing RCAs.
- Proficiency in Helm, Docker, and Jenkins.
- Ability to break down and discuss technical issues and solutions with non-technical team members.

Find yourself checking a lot of these boxes but doubting whether you should apply? At Alteryx, we support a growth mindset for our associates through all stages of their careers. If you meet some of the requirements and you share our values, we encourage you to apply. As part of our ongoing commitment to a diverse, equitable, and inclusive workplace, we’re invested in building teams with a wide variety of backgrounds, identities, and experiences.
