SpinsSys-Diné is looking for a Cloud Operations Monitoring Support specialist to join its growing team. This is an exciting full-time opportunity to work in a fast-paced team environment supporting production systems for a large federal agency.
Under minimal supervision, provide operations monitoring support for office or business unit users of proprietary or custom application software in a 24/7/365 environment supporting Cloud Operations. The position requires work during non-traditional business hours (after hours and weekends) to support a large-scale cloud platform that supports mission-critical applications. The individual will take point on end-to-end support and smooth operation of cloud-based infrastructure, incident response and resolution, and other scheduled maintenance activities, and will gain business and application knowledge through training and through reporting and resolving Production incidents and inquiries.
*This position will work 6 PM to 2 AM Monday - Friday*
Job Duties and Responsibilities:
Incident Management
Triage and resolve Production incidents related to the cloud platform and participate in root cause analysis and post-mortem discussions.
Analyze cloud platform related Production incidents and engage business teams to determine the impact of each incident.
Work with application support members and cloud support vendors to identify a workaround if a permanent solution cannot be reached in a timely manner.
Provide a collaborative conduit between application/support teams and Cloud vendor support such as AWS and Azure.
Escalate to team leads in a timely manner when resolution cannot be achieved.
Help recreate and test possible solutions and/or workarounds in lower environments prior to implementing in Production.
Work closely with the Cloud Engineering team and other support staff to identify and resolve incidents, and to create and implement long-term remediation techniques and fixes.
Identify and document known issues and work with Cloud Engineering partners and vendor support to address recurrence and the identified workaround activity.
Operations, Monitoring, and Capacity Planning
Monitor large-scale batch jobs.
Create processes designed to measure system effectiveness and identify areas for improvement.
Create processes intended to provide environment security, as well as automated processes to provide information on current specifications.
Oversee the selection of orchestration tooling, as well as compliance audits and reporting.
Identify, correct, and enhance important software tools.
Seek ways to enhance systems operations and monitoring with a focus on automation and minimizing cost.
Build effective monitoring, alerts, and metrics for production services.
Plan for adequate capacity of systems based on utilization metrics and planned projects to establish supply and demand forecasts.
Other duties as assigned.
Job Requirements (Education/Skills/Experience):
2-4 years of related experience in Production Support
1-2 years of related hands-on experience with AWS
Some experience with ticketing systems such as JIRA or ServiceNow
Experience supporting various phases of the SDLC (Waterfall or Agile)
Bachelor's degree or equivalent preferred
Area of Study: Computer Science or IS/IT preferred
Must be a US citizen able to pass a standard background check.
Specialized Knowledge & Skills:
Reasonable knowledge of the AWS platform and troubleshooting experience with its services.
Experience with Docker and container orchestration.
System health monitoring and performance optimization (CloudWatch, CloudTrail, Splunk, and VictorOps).
Broad experience with software-defined and traditional networking.
Some experience in supporting large-scale batch jobs preferred.
Good understanding of Linux, including experience with server administration, monitoring, and troubleshooting.
Any experience in building cloud infrastructure using infrastructure-as-code tools like AWS CloudFormation or Terraform preferred.
Reasonable troubleshooting and problem-solving experience.
Good communication and collaboration skills required to develop required security policies and share information with business and technology staff.
Some documentation experience with reasonable oral communication skills.
Desired Skills:
Good technical abilities, which include the following: cloud technologies, programming languages like Python, and orchestration.
Experience with APM technologies such as New Relic and VictorOps.
Experience with Splunk
AWS Certification
Diné Development Corporation (DDC) is a Navajo Nation-owned family of companies that delivers IT, professional, and environmental solutions to advance the missions of federal, state, and tribal government agencies. As thought leaders and innovators, our team of specialists builds client-centric solutions that solve critical challenges faced by defense, civilian, and healthcare organizations. Employing a mission-focused approach, we deliver value that not only enhances current operations, but also drives future change. Closely aligned with this approach is our commitment to advancing the Navajo Nation and its People. Through economic development and community empowerment, we elevate the Navajo Nation to provide lasting impact and sustainable growth for future generations. DDC’s ability to unite legacy-inspired technologies, industry best practices, and proven methodologies has contributed to our success for twenty years.
This contractor and subcontractor shall abide by the requirements of 41 CFR 60-1.4(a), 60-300.5(a) and 60-741.5(a). These regulations prohibit discrimination against qualified individuals based on their status as protected veterans or individuals with disabilities, and prohibit discrimination against all individuals based on their race, color, religion, sex, sexual orientation, gender identity, national origin, or for inquiring about, discussing, or disclosing information about compensation, or any other basis prohibited by law. We participate in E-Verify.