You will also be part of an effort to build next-generation cloud data platforms that give our business stakeholders rapid data access and incubate emerging technologies. Additionally, you will mentor junior engineers and implement data governance practices to ensure security, compliance, and lifecycle management.
In this role, you will design and build data products. You will develop and maintain scalable, reusable data products that serve as foundational assets for analytics, reporting, and machine learning pipelines.
You will architect and optimize data pipelines. You will lead the design, development, and enhancement of ETL/ELT workflows using AWS Lambda, AWS Glue, Snowflake, and Databricks, and ensure high-performance, scalable, and cost-efficient data processing solutions.
You will manage and scale data infrastructure. You will oversee cloud-based data environments, configure optimized storage solutions (S3, Snowflake, Delta Lake), and manage compute resources for efficient data processing.
You will tune and optimize performance. You will implement advanced query tuning, indexing strategies, partitioning techniques, and caching to maximize efficiency in Snowflake and Databricks.
You will collaborate cross-functionally. You will partner with data scientists, engineers, and business teams to deliver well-structured, analytics-ready datasets that drive insights and machine learning initiatives.
You will oversee data governance, compliance, and security. You will establish and enforce best practices for access control, data lineage tracking, encryption, and compliance with industry standards (SOC 2, GDPR).
You will automate and observe. You will build resilient, automated data workflows leveraging tools such as AWS Step Functions and Databricks Workflows, and implement proactive monitoring, logging, and alerting to ensure system reliability and data quality.
You will drive innovation and best practices. You will mentor junior engineers, stay ahead of emerging technologies, participate in hackathons, contribute to internal knowledge sharing at team and company levels, and champion continuous improvement in data engineering methodologies.