Gurgaon, India
1 day ago
Lead Engineer - IT
Job Description:

JOB SUMMARY: Observability Engineer with deep expertise in Zenoss

The Observability Engineer is a member of the IT team with a strong technical understanding of Zenoss. He/she needs to know the techniques used to monitor the health of a growing IT infrastructure. This position will administer, upgrade, maintain, and evolve our enterprise monitoring, alerting, reporting, and analysis suite of tools in the IT Enterprise division. The overall responsibility of the Observability Engineer is to design, configure, implement, and maintain the Enterprise Management (EMS/NMS) tool suite. The EMS tool suite provides compute, cloud, SaaS, network, voice, and security monitoring capabilities covering infrastructure components such as data center infrastructure, servers (Linux/Windows/AIX), network equipment, AWS services, appliances, storage, databases, and applications.

Qualifications:

ROLES AND RESPONSIBILITIES:

Zenoss Administration

- Administer and manage the Zenoss SaaS environment
- Configure and optimize Zenoss for performance monitoring, event correlation, and alerting
- Implement custom monitoring solutions, including SNMP, API-based, and agent-based monitoring
- Maintain ZenPacks, ensuring seamless integration with various IT systems
- Troubleshoot Zenoss-related issues and optimize event processing
- Manage user roles, permissions, and integrations with external tools
- Understand and write transforms for event enrichment and management
- Perform administration and life cycle management of other Enterprise Monitoring tools, including Splunk and AppDynamics
- Collaborate with various internal technical teams to deploy new monitoring and alerting conditions
- Respond to and manage trouble tickets related to the Enterprise Monitoring tools
- Set up logging and application performance monitoring using Zenoss and AppDynamics
- Analyze processes, identify weaknesses, and develop and implement improvements
- Attend daily operations team meetings; create, maintain, and update daily operations status reports
- Create, maintain, and update relevant documentation and runbooks
- Conduct independent assessments of technical challenges and perform architectural trade-off and other analyses

Monitoring & Observability Tools

- Work with Splunk for log monitoring, dashboarding, and alerting
- Collaborate on AppDynamics configurations for application performance monitoring
- Assist in integrating Zenoss with Splunk and AppDynamics for end-to-end observability

Incident & Problem Management

- Ensure proactive monitoring and incident response to minimize downtime
- Conduct root cause analysis (RCA) for performance issues
- Provide recommendations for system improvements based on monitoring insights

REQUIRED QUALIFICATIONS:

- Degree or higher, with preference for Computer Engineering/Technology
- Total of 10 years’ experience, with a minimum of 3 years as a Zenoss SME
- Strong knowledge of Zenoss event processing, ZenPacks, and integrations
- Experience with monitoring protocols (SNMP, WMI, API-based, etc.)
- Familiarity with Splunk (log analysis, dashboards, alerting)
- Experience with AppDynamics (basic administration and monitoring setup)
- Scripting skills in Python, Shell, or PowerShell for automation
- Strong troubleshooting and problem-solving skills
- Knowledge of ITIL processes (Incident, Problem, and Change Management) is a plus

PREFERRED QUALIFICATIONS:

- Experience working in enterprise IT environments with large-scale monitoring
- Exposure to cloud platforms (AWS, Azure) and Kubernetes monitoring

Location:

This position can be based in any of the following locations:

Gurgaon

Current Guardian Colleagues: Please apply through the internal Jobs Hub in Workday
