Short description
The Applied Innovation of AI (AI2) team is an elite machine learning group located within the CTO office of JPMorgan Chase. AI2 tackles business-critical priorities using innovative machine learning techniques and technologies, with a focus on AI for Data, Software, Cybersecurity & Controls, and Technology Infrastructure. The team partners closely with all lines of business and engineering teams across the firm to execute long-term projects in these areas that require significant machine learning development to support JPMC businesses as they grow. We are looking for excellent infrastructure engineers to help us design, develop, deploy, deliver, and maintain AI products for our clients. In this role, you will work with other engineers and data scientists to build and maintain the software and infrastructure that support our team in developing and delivering disruptive AI products that serve our customers in production.
Responsibilities
Collaborate with data scientists and research/machine learning engineers to deliver products to production. Build and maintain scalable infrastructure as code in the cloud (private & public). Manage infrastructure for model training, serving, and governance. Manage data infrastructure supporting the inference pipelines. Contribute significantly to architecture and software management discussions & tasks. Prototype rapidly and shorten development cycles for our software and AI/ML products. Build infrastructure for our AI/ML data pipelines & workstreams, from data analysis, experimentation, model training, model evaluation, deployment, operationalization, and tuning to visualization. Improve and maintain our automated CI/CD pipeline while collaborating with our stakeholders, various testing partners, and model contributors. Increase our deployment velocity, including the process for deploying models and data pipelines into production.
Requirements
Minimum of a Bachelor of Science degree in Computer Science, Software Engineering, Electrical Engineering, Computer Engineering, or a related field. Experience with containerization (Docker, Kubernetes). 3+ years of experience with AWS cloud and services (S3, Lambda, Aurora, ECS, EKS, SageMaker, Bedrock, Athena, Secrets Manager, Certificate Manager, etc.). Proven DevOps/MLOps experience provisioning and maintaining infrastructure using some of the following: Terraform, Ansible, AWS CDK, CloudFormation. Experience with CI/CD pipelines, e.g. Jenkins or Spinnaker. Experience with monitoring tools such as Prometheus, Grafana, Splunk, and Datadog. Proven programming/scripting skills in a modern programming language such as Python. Solid software design, problem-solving, and debugging skills. Strong interpersonal skills; able to work independently as well as in a team.
Desirable
You have a strong commitment to development best practices and code reviews. You believe in continuous learning, sharing best practices, and encouraging and elevating less experienced colleagues as they learn. Experience with data labelling, validation, provenance, and versioning.