We create tooling, and deliver and operate customer environments both on-prem and in the cloud, using cloud-native technologies.
Job Description
Roles and Responsibilities
In this role, you will:
• Develop automated solutions to predict and address potential problems before they result in a service interruption
• Oversee and adapt monitoring and alerting systems
• Collaborate with all GE business units worldwide, providing a bastion of technical expertise
• Identify potential process improvements across the entire engineering organization
• Define and drive architectural enhancements into systems to mitigate potential failure points
• Provide impact assessment and mitigation plan for changes going into the production environment
• Investigate the root cause of severe and systemic outages and identify corrective actions
• Establish performance baselines and capacity thresholds, correlate events, and define monitoring/alerting criteria
• Provide technical coaching and direction to more junior teammates
Education Qualification
Bachelor's Degree in Computer Science or “STEM” Majors (Science, Technology, Engineering and Math) with advanced experience.
Desired Characteristics
Technical Expertise:
• Excellent knowledge of Linux system internals
• Excellent knowledge of Kubernetes for cluster management of containers
• Strong analytical and problem-solving skills
• Experience with all stages of an agile software development lifecycle (CI/CD)
• Familiar with large cluster deployment tools (Helm, Kustomize)
• Demonstrated ability to script around repeatable tasks (Go, Ruby, Python, Bash)
• Experience with developing cloud-native applications (High Availability)
• Able to dive into any level of a modern internet service (schedulers, containers, Linux kernel, caching, object storage, distributed filesystems, RDBMS, NoSQL, etc.)
• Comfortable with network troubleshooting (tcpdump, routing, proxies, firewalls, load balancers, etc.)
• Able to troubleshoot and debug applications (C, Java, Go)
• Proficient in configuration management systems (Chef, Terraform, Ansible, Puppet, Salt)
• Experience with configuring, customizing, and extending monitoring tools (Sensu, Grafana, Prometheus, Graphite, Splunk, etc.)
• Experience deploying and managing infrastructure on public clouds (AWS, GCP, or Azure)
• Comfortable using Git on the command line
Leadership:
• Influences through others; builds direct and "behind the scenes" support for ideas. Preemptively sees downstream consequences and effectively tailors influencing strategy to support a positive outcome.
• Able to verbalize what is behind decisions and downstream implications. Continuously reflects on successes and failures to improve performance and decision-making. Understands and encourages change when needed.
• Proactively identifies and removes project obstacles or barriers on behalf of the team.
• Able to navigate accountability in a matrixed organization.
• Self-starter; communicates and demonstrates a shared sense of purpose. Learns from failure.
Personal Attributes:
• Critical thinker; able to quickly adapt to changing environments
• A hacker or tinkerer at heart
• Risk taker, not afraid to think outside the box or challenge the status quo
• Emotionally intelligent; able to influence up and out and to work independently
• Must be a team player with a strong desire to win
• Passionate about continuously learning
• Highly organized and efficient; able to balance competing priorities and execute accordingly
• Strong oral and written communication skills
Relocation Assistance Provided: No