Remote, United Kingdom
9 days ago
Principal DevOps/SRE Engineer

The Company:

Marigold helps brands foster customer relationships through the science and art of connection. Marigold Relationship Marketing is a suite of world-class martech solutions that help marketers create long-term customer love and loyalty. Marigold provides the most comprehensive set of use cases for marketers at any level. Headquartered in Nashville, Tennessee, Marigold has offices globally across the United States, Europe, Australia, New Zealand, South America and Central America, as well as in Japan.

What You’ll Do:

Help build a Site Reliability Engineering culture by sharing your best practices, approaches, documentation, and code with other engineering teams

Apply automation and software to any tasks or parts of the system that would benefit from it or are performed manually

Troubleshoot complex OS, networking, and database issues in cloud-based SaaS and on-premises environments; handle live production incidents; debug application and infrastructure issues; and follow and implement SRE best practices

Monitor application performance, take steps to improve overall performance and stability, and follow through with implementation

Conduct system analysis and configuration management, and develop improvements for system software performance, availability, and reliability

Work closely with software and QA engineers to ensure the system is responding properly to non-functional requirements such as performance, security, and availability

Document your system knowledge as you acquire it over time, create runbooks, and ensure critical system information is readily available to those who need it

Maintain and monitor deployments, orchestration, databases, and general backend infrastructure

Keep up to date with security developments and proactively identify, diagnose, and solve complex security issues

Be part of an on-call rotation that supports the global platform and provides an excellent customer experience

Ideal Qualifications:

Degree in Computer Science or equivalent combination of education and experience

7+ years of experience in a DevOps or SRE role

7+ years of Linux experience

5+ years managing production environments in AWS

5+ years of experience with Kubernetes, preferably EKS

3+ years creating and maintaining infrastructure with Terraform

Experience using infrastructure-as-code principles to design, build, and maintain cloud platforms using Terraform/OpenTofu

Experience working with database and data store technologies such as RDS/MySQL, ElastiCache/Redis, or equivalent

Knowledge of core server-side concepts and experience working with cloud networking, load balancers, HTTP or gRPC protocols, and large-scale microservice environments

Experience with observability stacks, including instrumenting environments for logging and monitoring and designing and building dashboards and alerts

Knowledge of DevOps methodologies, basic programming and the tools involved in CI/CD automation

Nice to Have:

Experience managing high-scale web application platforms or SaaS platforms

Strong Kubernetes, EKS or ECS/Fargate experience 

Deep understanding of security principles

History of contributing to FOSS projects

Experience with AWS networking concepts such as VPC peering, Transit Gateway

Experience with multi-geography, multi-tenant applications 

Experience designing and performing disaster recovery

Experience programming with Go or Python

Experience with cost management

Experience with NoSQL databases such as ScyllaDB

Experience working with stream processing and big data technology stacks such as Kafka or Trino

What We Offer:

The competitive salary and benefits you’d expect!

Generous time off (we call it Open Time Away) as well as paid holidays and a birthday benefit day off.

Retirement contributions. 

Employee-centric and supportive remote work environment with flexibility.

Support for life events including paid parental leave.
