USA
1 day ago
AWS EKS Engineer
Our Client, a Mortgage Financing company, is looking for an AWS EKS Engineer for their Reston, VA location.

Responsibilities:

Kubernetes Cluster Management:
+ Design, deploy, and maintain Kubernetes clusters in production environments.
+ Ensure high availability, scalability, and security of Kubernetes infrastructure.
+ Implement monitoring, logging, and alerting solutions for cluster health and performance.

COTS Product Integration:
+ Integrate commercial off-the-shelf (COTS) software with Kubernetes clusters.
+ Manage and troubleshoot issues arising from COTS product deployments within the Kubernetes ecosystem.

ML Workload Orchestration:
+ Deploy and manage interactive and batch-based machine learning container workflows.
+ Integrate and optimize SageMaker container images for training and inference tasks.
+ Monitor resource usage and optimize ML workloads for cost and performance efficiency.

Infrastructure as Code (IaC):
+ Develop and maintain infrastructure automation using Terraform.
+ Write and maintain Terraform modules for Kubernetes, cloud infrastructure, and CI/CD pipelines.
+ Enforce best practices for IaC, including version control, modularity, and code reviews.

CI/CD Pipelines and Version Control:
+ Create and manage CI/CD pipelines for deploying applications and Kubernetes manifests using GitLab.
+ Ensure automation and testing in the software delivery process.
+ Maintain version control practices for all infrastructure and application code.

+ Work closely with ML engineers, data scientists, and software developers to meet workload requirements.
+ Document processes, configurations, and troubleshooting guides.
+ Conduct knowledge-sharing sessions and training for team members.

Requirements:
+ Technical Expertise: Proficiency in Kubernetes (CKA or CKAD certification is a plus). Strong experience with Terraform for IaC. Familiarity with SageMaker, Docker, and containerized workflows. Solid understanding of GitLab CI/CD and version control principles.
+ Hands-on experience with major cloud providers (AWS, Azure). Knowledge of Kubernetes networking, ingress controllers, and service meshes. Experience with cloud-native tools like Helm, Prometheus, Grafana, and Fluentd.
+ Machine Learning Workflow Support: Understanding of ML model deployment, training, and inference workflows. Knowledge of integrating SageMaker container images with Kubernetes.
+ Knowledge of Kubernetes RBAC, secrets management, and pod security policies. Experience with scanning tools for containers and IaC.
+ Additional Skills: Familiarity with other IaC tools (e.g., Pulumi, Ansible). Experience with advanced Kubernetes features like Operators and CRDs.

This role is a blend of DevOps, MLOps, and infrastructure engineering, making it essential for the candidate to have both technical breadth and domain-specific expertise. Certifications are a plus.

Why Should You Apply?
+ Health Benefits
+ Referral Program
+ Excellent growth and advancement opportunities

As an equal opportunity employer, ICONMA provides an employment environment that supports and encourages the abilities of all persons without regard to race, color, religion, gender, sexual orientation, gender identity or expression, ethnicity, national origin, age, disability status, political affiliation, genetics, marital status, protected veteran status, or any other characteristic protected by federal, state, or local laws.