As a highly collaborative engineer, you enjoy solving complex infrastructure and security problems that directly enable business and product teams.
You have a strong sense of ownership and are comfortable with hands-on technical work across cloud, platform, and security domains. You will work at the intersection of platform engineering, DevOps, security, and MLOps, collaborating with Machine Learning Engineers (MLEs), Data Engineers (DEs), Data Scientists (DS), Product Managers, and InfoSec teams to build and operate secure, scalable, and production-grade cloud environments and reusable infrastructure assets.
You will be responsible for designing and maintaining secure multi-account cloud environments that power data platforms, ML workloads, and product applications.
Your responsibilities will include:
- Building and managing multi-account AWS environments (Dev, Staging, Prod) following security and governance best practices
- Provisioning and managing infrastructure using Terraform (Infrastructure as Code)
- Deploying, operating, and scaling Kubernetes (EKS) clusters for data and ML workloads
- Implementing autoscaling and cost optimization using tools such as Karpenter
- Designing and maintaining CI/CD pipelines using GitHub Actions and similar tools
- Implementing GitOps practices using ArgoCD
- Orchestrating ML/data workflows using Argo Workflows
- Working closely with InfoSec engineers to monitor, prioritize, and remediate vulnerabilities across cloud, containers, and pipelines
- Integrating security practices into CI/CD (container scanning, IaC scanning, dependency scanning)
- Managing IAM, networking, secrets, encryption, and cloud security baselines
- Implementing logging, monitoring, and alerting for infrastructure and platform reliability
- Supporting ML and data teams with scalable environments for model training, pipelines, and batch workloads
- Driving automation to improve reliability and reduce operational overhead
What you’ll learn:
- How secure, scalable cloud platforms are designed to support ML and data products in production
- Best practices in DevSecOps, cloud governance, and Kubernetes platform engineering
- How MLOps, CI/CD, and GitOps practices come together to enable rapid but safe delivery
- Operating production systems across multiple AWS accounts with strong security controls
- Working cross-functionally with product, data, and security stakeholders in a fast-paced environment
You will work on the platforms, frameworks, and automation tooling that Data Scientists, Data Engineers, and ML Engineers rely on to move from experimentation to reliable production impact.
Your real-world impact:
- Fusing platform, security, and scale: you will work with modern cloud-native technologies and security-first engineering practices.
- Multidisciplinary teamwork: you will collaborate with ML, data, product, and security experts.
- Innovative engineering culture: you will contribute to a culture with a strong focus on automation, reliability, and continuous improvement.
- Striving for excellence: you will build infrastructure that directly enables high-impact data and ML products.