BENGALURU, KARNATAKA, India
21 hours ago
Software Developer 2
Job Description

About Oracle Cloud Observability and Management Platform:

Oracle Cloud Observability and Management Platform is a comprehensive set of monitoring, management, diagnostic, and analytics services. It enables visibility and insight across cloud native and traditional technology, whether deployed in multi-cloud or on-premises environments, with broad, standards-based ecosystem support. It's designed to help enterprises better manage their increasingly diverse and distributed IT portfolios, while reducing troubleshooting time, preventing outages, and enabling IT to manage applications from a business perspective.

About The Job:

We're seeking a talented Development Engineer to join our Site Reliability Engineering team and contribute to the Oracle Observability and Management suite of OCI services.

As a member of the software engineering division, you will take an active role in the evolution of standard practices and procedures. You will be responsible for defining and developing software for tasks associated with developing, designing, and debugging software applications or operating systems.

As a Development Engineer, you will solve interesting technical challenges by designing, deploying, and troubleshooting key Cloud services, platforms, and infrastructure, always thinking about reliability, scalability, resilience, security, and performance. You will be surrounded by “willing to help” individuals representing some of the brightest and most innovative minds in the industry. You will be a part of an organization that prides itself on providing training, empowerment, and career progression. Our team provides 24x7x365, follow-the-sun coverage while pushing the boundaries of what can be accomplished in the cloud. Advancing cloud computing brings great growth opportunities and highly rewarding experience in the expanding computing environments that the SRE team is responsible for.

Work is highly enjoyable, diverse, and challenging, requiring the application of advanced skills in specialized areas.

What You Need to Have:

2+ years of experience in the field.
A BE/BTech or ME/MTech in Computer Science, or an equivalent educational background.
Highly motivated, capable of digging deep into problems, and able to work independently.
Strong collaboration skills with partner teams and stakeholders.
Good analytical and problem-solving skills with a strong customer service orientation.
Ability to work effectively in a multi-location team.
Excellent communication and interpersonal skills.
Ability to work as part of a 24x7x365 operations team.

Software Skills:

Proficient in programming and scripting with languages such as Java, Python, Shell, or equivalents.
Experienced with microservices architecture, Linux administration, and Oracle database management.
Familiar with CI/CD pipeline concepts and their implementation.
Knowledgeable in cloud technologies like Chef, Terraform, Docker, Kubernetes, Solr, etc.
Proficient in writing automation utilities to streamline workloads.
Adaptable, quick learner in dynamic environments.
Effective communicator, able to participate clearly in technical discussions.

Responsibilities:

Ensure continuous availability of our cloud services 24x7x365.
Use excellent communication, technical analysis, and problem-solving skills to methodically resolve issues.
Gain a comprehensive understanding of the full stack of the services you support (from Network to Application) and delve deep into the service to effectively mitigate customer impact.
Drive improvements by developing tools and collaborating with partner teams to decrease incident counts, reduce event severity, and minimize downtime.
Conduct proactive service checks and monitor and triage incoming system and application alerts.
Triage and troubleshoot service-impacting events efficiently.
Identify automation opportunities and collaborate with engineering teams to implement them, reduce signal noise, address recurring issues, and shorten the time to mitigate service-impacting events, thereby increasing the productivity of cloud operations and development resources.
Take on development responsibilities to enhance platform and infrastructure reliability.

Career Level - IC2
