Mumbai, Maharashtra, India
Site Reliability Engineer

Role: Site Reliability Engineer

Location: Mumbai / Bangalore / Chennai

Job Shift: 12PM IST – 9PM IST

Working Model: Hybrid

Intro:

Do you want to be noticed for your work? Make a difference every day? Be impactful? Work with cutting-edge technology? If so, you will fit in perfectly at CSC, and especially within the Regulatory Technology Team. The world’s leading provider of business, legal, tax, and digital brand services, CSC uses technology to make businesses run smoother and smarter. As a Site Reliability Engineer, that is what you will be doing, too.

The Regulatory Technology Development Team is among the highest-performing Agile teams at CSC and is responsible for leading CSC’s regulatory compliance technology implementations. You would be part of a globally distributed Scrum team (Wilmington, Europe and India) and collaborate daily to deliver mission-critical solutions to CSC’s customers. Application quality, code quality and test automation are always paramount, driven by the team’s strong dedication to technical quality and agile delivery. You would contribute to that goal every day, every sprint and every release by automating processes throughout the software life cycle and applying your skills as a Site Reliability Engineer to the features being created.

As Site Reliability Engineer for the Regulatory Technology Team, you will be responsible for managing and optimizing cloud solutions to ensure high availability, performance, and security, while also developing automation tools to streamline system management and incident response. 


Some of the things you'll be doing:

Infrastructure/Application Management: Design, implement, and manage highly available and scalable software solutions using cloud platforms (e.g., Azure). Monitor system performance, reliability, and security, proactively addressing issues to ensure optimal operation.

Automation and Tooling: Develop and maintain automation scripts and tools to streamline infrastructure management, deployment processes, and incident response. Implement Infrastructure as Code (IaC) practices using tools like Terraform, Ansible, or similar.

Incident Management: Respond to and resolve production incidents, conducting thorough post-incident reviews to prevent recurrence. Implement and maintain robust monitoring and alerting systems to detect and address issues promptly.

Performance Optimization: Analyze system performance and reliability metrics, identifying bottlenecks and optimizing system performance. Conduct capacity planning and load testing to ensure systems can handle current and future demands.

Collaboration and Support: Work closely with development teams to integrate reliability and performance best practices into the software development lifecycle. Provide support and guidance to other teams on infrastructure and operational best practices.

Security and Compliance: Ensure systems and applications adhere to security best practices and compliance requirements. Implement and maintain security measures such as firewalls, intrusion detection systems, and encryption.

Required Skills/Experience:

Technical Expertise: Strong experience in cloud software solution management. Proficiency in programming languages such as Python, Go, or Bash for automation and scripting.

Reliability and Performance: Proven experience in maintaining high availability and performance for large-scale, distributed systems. Knowledge of monitoring and logging tools (e.g., Azure Application Insights, Prometheus, Grafana, ELK stack).

Automation and DevOps: Extensive experience with automation tools and practices, including CI/CD pipelines. Proficiency in using Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation is a plus.

Incident Management: Experience in incident response, root cause analysis, and implementing corrective actions. Ability to stay calm and make informed decisions under pressure during incidents.

Collaboration and Communication: Strong communication skills, with the ability to work effectively with cross-functional teams. Excellent problem-solving skills and a proactive approach to identifying and resolving issues.

Security and Compliance: Knowledge of security best practices and compliance standards (e.g., GDPR, HIPAA, SOC 2). Experience implementing and managing security measures in cloud environments.

Education & Experience:

Bachelor’s degree in Computer Science, Engineering, or a related field.

Relevant certifications (e.g., Azure Solutions Architect, Certified Kubernetes Administrator) are a plus.

Minimum of 3-5 years of experience in system administration, DevOps, or site reliability engineering.

Proven track record of managing and optimizing large-scale, highly available systems.