In a world of disruption and increasingly complex business challenges, our professionals bring truth into focus with the Kroll Lens. Our sharp analytical skills, paired with the latest technology, allow us to give our clients clarity—not just answers—in all areas of business. We embrace diverse backgrounds and global perspectives, and we cultivate diversity by respecting, including, and valuing one another. As part of One team, One Kroll, you’ll contribute to a supportive and collaborative work environment that empowers you to excel.
This role sits within the Site Reliability Engineering team and is part of the wider Crisp Technical Services business unit. Our suite of SaaS, distributed systems, and product integrations helps our internal stakeholders run their critical business operations and, in turn, provides customers with industry-leading threat detection technology products. You’ll play a key role in the formation of a new area within Crisp that aims to drive operational excellence and customer focus into the operation of our SaaS-hosted application suite.
As a Technical Operations Engineer, you will use your cloud platform skills and expertise to maintain, support, and improve the availability of our industry-leading SaaS solution running on the Google Cloud platform. As part of the SRE team, you will be integral to ensuring our platforms are highly available and resilient, through continual monitoring and suggestions for improvement. You will work closely with engineering teams in Development and Delivery to uphold contracted Service Level Objectives (SLOs). You will be tasked with ensuring our internal and externally available systems have reliability and uptime appropriate to user needs.
At Kroll, your work will help deliver clarity to our clients’ most complex governance, risk, and transparency challenges. Apply now to join One team, One Kroll.
ROLE DUTIES AND REQUIREMENTS
Work in a team to provide third-line support for the infrastructure and application
Take responsibility for, own, and coordinate fault resolution
Work alongside a team of engineers where necessary to fix faults that are raised against the supported elements, networks, or applications
For service-impacting incidents, lead the investigation into the RCA, producing any reports and coordinating the delivery of any fixes to mitigate further occurrences
Use and maintain our monitoring platforms
Operate 24/7 to a response SLA
Adhere to ITIL practices
Respond to first-line alerts from our alerting platform/s
Tune alerting thresholds to reduce false positive alerts
Follow runbooks/SOPs for alert resolution
Create and update SOPs/runbooks to help with repeat resolutions
Write up postmortems/RCA documents for P1/P2 incidents
Close the loop on Service Cloud cases for incidents
Escalate if an issue was unable to be resolved
Contribute to problem management and root cause diagnosis following an incident
Manage escalation for any outstanding issues, working with the SDO
IAM / GCP permissions (not platform permissions)
BAU routine examples: Elastic index snapping, security alert reviews, daily volume alerts triage, persistent failures reprocessing
Essential Experience
Experience in working in a distributed cloud environment using Azure/AWS/GCP
Excellent fault-finding ability
Monitoring solutions
Incident management
Experience working in an ITIL Help Desk environment
Desirable Experience
In order to be considered for a position at Kroll, you must formally apply via careers.kroll.jobs
Kroll is committed to equal opportunity and diversity, and recruits people based on merit.