The Site Reliability Engineer (SRE) is key to our end-to-end Engineering approach, where your role is to support the operational aspects of our cloud application Twinfield through the development of software. You bridge the gap between development and IT operations, ensuring smooth system operation and reliability of our product.
This role is unique and one of the most impactful ones within the Engineering team, as it’s centered around incident recovery, telemetry improvements and fixing the omissions in our product development cycles through continuous feedback.
The day-to-day work will expose you to every aspect of our product: the different functional domains and the non-functional requirements such as performance, availability and reliability. It allows you to broaden your horizons beyond building software within a team, as you’re at the forefront of what is truly our customer experience.
Successful SREs in our team will have strong problem-solving, design, coding and debugging skills. We value passion, creativity, agility, accountability and the desire to learn new, complex technical areas. You will be an important part of a team of highly motivated and talented individuals. This is a great opportunity to challenge yourself, grow your career and influence the next generation of accounting software.
This is what makes working at Wolters Kluwer interesting
Wolters Kluwer Tax & Accounting delivers software products to finance professionals, including the online accountancy product Twinfield. Internationally, we are among the top vendors of accountancy software and related services. Thanks to our products, Tax Advisors and Entrepreneurs can rely on a simple and safe way of working. We make this possible mainly in the Netherlands, but we would not accomplish much without the help of more than 100 colleagues based all over the world, with India being an important location for us.
Key characteristics
As an SRE you practice comprehensive DevOps by prioritizing continuous improvement and operations. You proactively identify and implement enhancements in the software development and delivery process. By regularly reviewing processes, tools, and systems, you aim to boost efficiency, effectiveness, and software quality. Your approach involves monitoring, analysing root causes, experimenting, and providing feedback loops to your colleagues. You continuously monitor and manage the software application and infrastructure to ensure its availability, performance and security. You understand what it means to ensure our customers have a reliable experience with our product. Additionally, you advocate for improvements that reduce our Mean Time To Recovery and improve the other quality metrics that are part of our DORA measurements.
In your daily work, you support the Production environment of our product and the way we deliver to it. You will use best practices to improve our current alerting and monitoring setup. This means you examine our product and processes, suggest and implement changes, and expand our awareness by extending our telemetry, monitoring and alerting. Your development skills allow you to either implement the necessary improvements yourself or prepare the work so that a development team can easily take it over.
When an incident occurs, you support the resolution by debugging production issues across services and levels of the stack. For priority one incidents, you continue incident resolution outside business hours when required. You actively watch our systems monitoring ("eyes on glass") to prevent issues from becoming incidents where possible.
About you
When others are asked to describe you, they might describe you as someone who is:
Passionate about troubleshooting and improvement; an inquisitive mind who loves a puzzle. When you see something broken, you can’t help but fix it.
Great at automation; you aim to automate operational tasks to improve efficiency and remove toil.
Structured and detail-oriented, with your eye on customer experience and team improvement.
Talks about quality attributes of software such as availability, performance, security and reliability all the time.
Stays calm and focused under pressure.
Loves working across the organization, from Engineering to Support, to ensure we continuously deliver a high-quality product that our customers can trust.
Someone who doesn’t focus on layer-7 alone (virtual infrastructure experience with networking and products such as Cloudflare is beneficial).
Improves the deployment process to make it as much a non-event as possible.
Your profile
5+ years of experience in a Site Reliability Engineering or DevOps role.
Solid programming skills in C#.
Preferably experienced on the Microsoft Azure Cloud platform, specifically Application Insights, Azure SQL Database, and Azure Kubernetes Service.
Ideally has a strong background in relational database modelling and management.
Preferably experienced with ‘hands-free’ automated deployment by implementing Promotion Approval Gates, Operational Readiness Gates, Quality Gates, and/or Change Enablement Gates.
Fluent in English; understanding Dutch is a plus.
Our offer
A 40-hour working week with flexible working hours and the possibility to work from home;
A salary fitting your skill level;
25 vacation days (based on a 40-hour week);
A hybrid way of working, 8 days at the office per month;
An informal working environment with multiple events planned each year;
A daily lunch buffet to enjoy with your colleagues, free of charge.
Keywords
DevOps, SRE, Site Reliability Engineering, Development, Automation, Incident Management, SQL, SLA, SLO, Telemetry, Risk, Feedback, Continuous Improvement, RCA, Root Cause, Monitoring, Operations, Database, Failure Mode Analysis, Microsoft Azure.