Milpitas, CA, 95035, USA
11 hours ago
Site Reliability Engineer
The application window is expected to close on: 02/15/2025. The job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.

Meet the Team
Join Cisco's Digital Network Architecture team, transforming networking with AI, machine learning, IoT, and automation to deliver powerful analytics and insights. Our SRE team is dedicated to delivering innovative cloud solutions, thrives on collaboration and automation, and provides reliable, high-performance infrastructure for our customers.

Your Impact
As a Site Reliability Engineer (SRE), you'll play a key role in enhancing our cloud and hybrid infrastructure. Your DevOps/SRE expertise will be crucial in building, deploying, and operating scalable Software-as-a-Service (SaaS) products. You'll use cloud-native technologies to monitor, maintain, and manage cloud infrastructures within SLA times.
* Infrastructure Development: Create scalable, fault-tolerant infrastructures, with a focus on automation and comprehensive documentation.
* Expertise: Leverage your experience in cloud-native technologies to optimize distributed systems at scale.
* Skills: Apply strong hands-on skills to guide and contribute technically to the infrastructure platform team for cloud and hybrid customers.
* Tooling: Design tools for programmable infrastructure (infrastructure as code) to drive end-to-end microservices monitoring and management.
* Automation: Bring a passion for building and automating processes using innovative technologies and industry-standard methodologies.
* Collaboration: Partner with development teams to troubleshoot and improve infrastructure and to establish standard processes.

Responsibilities
Your contributions will drive our infrastructure's performance, reliability, and automation.
* Ensure infrastructure availability, scalability, and performance to maintain 99.9%+ uptime.
* Build, deploy, and manage Kubernetes EKS clusters on AWS, with automated provisioning for optimized deployments.
* Implement and maintain monitoring tools like CloudWatch, Prometheus, Grafana, and ELK for cloud infrastructure.
* Resolve infrastructure issues and convert them into product improvements.
* Build and maintain CI/CD pipelines for multi-region, multi-cloud SaaS applications.
* Lead incident response, root cause analysis, and on-call support.
* Conduct capacity planning, performance tuning, cost optimization, and infrastructure design.
* Enforce standard methodologies for cloud security, governance, and compliance.

Our Minimum Qualifications for this role:
* BS/MS in Computer Science or a related field, with 4+ years of relevant experience in SRE or DevOps.
* Hands-on experience with AWS EKS clusters and other AWS infrastructure, including EC2, VPC, S3, EBS, IAM, and other relevant services.
* Proficiency in Kubernetes, Helm, Docker, and other virtual infrastructure platforms for managing container and microservices architectures.
* Expertise with monitoring, alerting, and incident management tools such as Grafana, Prometheus, Alertmanager, Kibana, and PagerDuty.
* Hands-on experience with CI/CD pipelines and tooling such as GitHub/GitLab, Jenkins/Spinnaker, and ArgoCD/GoCD.

Our Preferred Qualifications for this role:
* Strong knowledge of core Enterprise Linux, with a focus on building, maintaining, securing, and performance tuning systems.
* In-depth knowledge of Kubernetes internals, including clustering, scheduling, controllers, and the API server.
* Excellent understanding of container networking and microservices architecture, and experience with virtual machine hosting in AWS.
* Strong hands-on experience in building, automating, and maintaining infrastructure on AWS.
* Advanced skills in Python, Go, Terraform, CloudFormation, AWS SDK, and Ansible.
* Operational experience with data ingestion and processing pipelines using Kafka, Kinesis, and Elasticsearch.
* Familiarity with SQL/NoSQL data stores and messaging systems such as PostgreSQL, MongoDB, RabbitMQ, and Redis.
* AWS or Kubernetes certification is highly preferred.
#WeAreCisco
#WeAreCisco, where every individual brings their unique skills and perspectives together to pursue our purpose of powering an inclusive future for all.

Our passion is connection: we celebrate our employees' diverse set of backgrounds and focus on unlocking potential. Cisconians often experience one company, many careers, where learning and development are encouraged and supported at every stage. Our technology, tools, and culture pioneered hybrid work trends, allowing all to not only give their best, but be their best.

We understand our outstanding opportunity to bring communities together, and at the heart of that is our people. One-third of Cisconians collaborate in our 30 employee resource organizations, called Inclusive Communities, to connect, foster belonging, learn to be informed allies, and make a difference. Dedicated paid time off to volunteer (80 hours each year) allows us to give back to causes we are passionate about, and nearly 86% do!

Our purpose, driven by our people, is what makes us the worldwide leader in technology that powers the internet. Helping our customers reimagine their applications, secure their enterprise, transform their infrastructure, and meet their sustainability goals is what we do best. We ensure that every step we take is a step towards a more inclusive future for all. Take your next step and be you, with us!

Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis. Cisco will consider for employment, on a case-by-case basis, qualified applicants with arrest and conviction records.