• Assess the impact and severity of critical and major incidents.
• Gather information and data to support incident analysis and decision-making.
• Act as the central point of contact during critical incidents, ensuring that all relevant teams are informed and engaged.
• Develop and implement an effective communication template to keep internal and external stakeholders informed.
• Ensure clear and timely communication with internal stakeholders, external partners, and relevant authorities.
• Maintain response SLAs.
• Coordinate with external vendors for additional support.
• Maintain detailed records of incident response activities, including actions taken, decisions made, and outcomes.
• Conduct a thorough post-incident analysis to identify lessons learned and areas for improvement.
• Implement changes to enhance incident response capabilities.
• Provide guidance and support to the incident response team during challenging and multifaceted incidents.
• Generate comprehensive incident reports, highlighting key findings and recommendations.
• Lead and facilitate service restoration for all business- and customer-impacting incidents in a 24x7x365 environment.
• Coordinate the triage, recovery, and communication during all major incidents.
• Lead the major incident technical bridge and drive all activities through to service restoration.
• Assign the related Problem record to the resolution team and coordinate root cause analysis (RCA) through to closure.
• Establish metrics and reporting to create visibility into all Major Incidents and the progress of open Problems.
• Ability to work in a 24x7x365 on-call rotation.
• Create and maintain an incident response plan that defines roles, responsibilities, and escalation procedures.
• Ensure that the organization's incident management framework remains current and effective.
• Develop contingency plans for various scenarios.
• Review and update incident response plans and procedures as needed.
• 7+ years of experience supporting IT operations in a large-scale environment.
• 5+ years of experience leading the resolution of major incidents in a large-scale environment.
• 5+ years of experience dedicated to Incident and Problem Management.
• Strong understanding of ITIL and Incident, Problem, and Change Management Processes.
• Hands-on experience with AWS, Azure, ServiceNow, and/or other technologies.
• Experience managing ITIL workflows in ServiceNow.