BENGALURU, KARNATAKA, India
9 days ago
Software Developer 2

Cloud Operations Engineer

Career level: IC2

Job Description:

Oracle is leading the digital revolution. We are empowering nearly half a million businesses across the globe to turn untapped potential into real business value. Oracle Cloud Infrastructure (OCI) is a deep and broad platform of public cloud services that enables customers to build and run a wide range of applications in a scalable, secure, highly available, and high-performance environment.

Kick-start your career with some real responsibility and an incredible learning experience working on cloud services in OCI. You will play an instrumental role in delivering the cloud experience that is changing lives across the globe. Your versatility will be your greatest asset as you turn your hand to deployment, operations and execution. You’ll have the opportunity to collaborate with the brightest minds in the industry and bring fresh insight to everything you do. Deliver fascinating, high-scale services and solutions and enjoy extraordinary career growth at a company that wants to see you thrive.

We are building a Global Operations team that gives you the opportunity to build and operate a suite of massive-scale, integrated cloud services in a broadly distributed, multi-tenant cloud environment. OCI is committed to providing the best in cloud services that meet the needs of our customers, who are tackling some of the world’s biggest challenges. We offer unique opportunities for smart, hands-on engineers with the expertise and passion to solve difficult problems in distributed, highly available services and virtualized infrastructure. At every level, our engineers have a significant technical and business impact operating and building innovative new systems to power our customers’ business-critical applications.

What You’ll Do:

Join a fun and flexible workplace where you’ll enhance your skills and build a solid professional foundation. As a Cloud Operations Engineer in our Global Production Services, you will contribute to an exciting team working on some of the hottest cloud services such as Ksplice, Oracle Linux YUM Service, OS Management Hub, and more. You will use your skills to learn how to constantly deliver and improve on these cloud services. Operations work will include troubleshooting production issues and handling change management requests for upgrades, patches or modifications.

When not working on operations, you will work on software engineering tasks such as reviewing incidents to drive improvements to services, tools or runbooks. The goal is to increase reliability and scalability and to reduce operational overhead through automation, training, documentation, service enhancement, or process improvement. This position offers the opportunity to learn the ins and outs of current cloud service architecture, deployment, monitoring and operational technologies. Many useful and desirable skills can be acquired on the job if not already present; see below for the many cool and current technologies in play. The ideal candidate has some of these skills, but the key is the motivation and ability to learn quickly, as well as a passion for an excellent customer experience.

Engineers will:

• Improve monitoring, notifications, configuration and deployment of our services.
• Perform proactive service checks and monitor/triage and address incoming system/application alerts to ensure appropriate priority and response.
• Triage and troubleshoot service impacting events from multiple signals including phone, email, service telemetry and alerting.
• Perform change management activities for services such as upgrades and patching.
• Identify and work with engineering to implement opportunities for automation, signal noise reduction, recurring issues and other actions to reduce time to mitigate service impacting events and increase the productivity of cloud operations and development resources.
• Coordinate, document and track critical incidents, ensuring rapid and complete issue resolution and an appropriate closed loop to customers and other key stakeholders.
• Improve the availability, scalability, latency, ease of use, and efficiency of service control planes and operational tooling.
• Modify/enhance monitoring infrastructure for the services.
• Participate in service capacity planning and demand forecasting, software performance analysis and system tuning.
• Potentially participate in regular rotations as a central part of the 24x7 operations team. We are hiring in multiple time zones, from India to the USA, to ensure 24x7 coverage.
• Need to be reliable in terms of working scheduled hours.
• Need to be motivated quick learners.

Desired skills include:

• BE/BTech or ME/MTech in Computer Science, or equivalent.
• 1+ years of work experience as a software, site reliability or customer support engineer
• Ability to work independently and across teams to guide other engineers through technical operations
• Good technical writing and communication skills. Engineers will need to be able to clearly write descriptions of operational issues and corrective actions for incidents.
• Comfortable using Slack and coordinating with others online.
• Basic Linux system administration knowledge and experience
• Shell scripting, at least the basics: recursive search, output redirection, etc.
• Very strong analytical skills to identify problem root causes.
• Systematic problem-solving approach, combined with a strong sense of ownership and drive in resolving operations issues.
• Candidates will have the opportunity to develop many of the following skills. Current possession of some of these skills is a bonus.
o Knowledge of Linux OS internals and administration including network services, TCP/IP, NFS, SSH, NTP, bonding, VLANs, tuning, systems diagnosis skills, systemd, kernel modules, user management, storage components.
o Experience working under pressure to mitigate customer issues affecting service reliability, data integrity, and overall customer experience.
o Monitoring, management, analysis and troubleshooting of large-scale, distributed systems.
o Experience with IaaS, PaaS and SaaS architectures
o Experience in building and managing virtualized and containerized systems (KVM, Containers/Docker/Kubernetes, Helm, Puppet, Chef).
o Understanding of and experience with microservices architecture, Oracle Database, MySQL, Oracle WebLogic servers.
o Experience with cloud, development and build technologies: Python, Bash, Ansible, Terraform, Hadoop, Kafka, Solr, Redis, Git, IntelliJ, Jenkins and Maven.
o Familiarity with identity, security and encryption technologies and following security best practices.


