As a Senior Data Engineer at QuantumBlack, you will collaborate with business stakeholders, data scientists, and engineering teams to develop, productionize, and scale advanced data, machine learning, and Generative AI solutions that deliver measurable impact for our clients.
You will work across the full analytics and AI lifecycle — from understanding client data landscapes and building robust, reusable data pipelines, to enabling production-grade ML, LLM, and GenAI systems through strong MLOps, LLMOps, and DevOps practices.
Key responsibilities include building and maintaining technical platforms for advanced analytics and AI, designing scalable and reproducible data and ML pipelines, enabling CI/CD for data, ML, and GenAI workflows, and ensuring information security and compliance across cloud environments.
As a Senior Data Engineer, you will collaborate with business stakeholders, data scientists, and internal teams to build and implement domain-focused, reusable data products and analytics platforms. You will design, build, and maintain robust, modular, scalable, and reproducible data pipelines supporting advanced analytics, machine learning, and GenAI use cases. You will understand client data landscapes, assess data quality, and work across structured, semi-structured, and unstructured data. You will map data fields to hypotheses and curate, wrangle, and prepare data for analytics, ML, and GenAI models. You will help build and maintain technical platforms that support end-to-end data and analytics engagements.
You will develop and deploy technology that enables productionization and deployment of ML and Generative AI solutions following industry best practices. You will establish and promote standards for software engineering, MLOps, LLMOps, and DevOps within multidisciplinary delivery teams.
You will design, build, and maintain modern, scalable, and secure CI/CD pipelines that automate development, testing, evaluation, and deployment of data pipelines, ML pipelines, and GenAI applications (including model, prompt, and inference workflows).
You will build and manage cloud-native infrastructure to support ML and GenAI lifecycle management, including experimentation, model tracking, deployment, monitoring, and governance. You will shape and support next-generation technology platforms that enable scaling of ML and GenAI products across teams and clients.
You will contribute to R&D projects and internal assetization efforts (e.g., GEMx / PMPx), building reusable frameworks, libraries, and platforms. You will participate in cross-functional problem-solving sessions with internal teams and clients, from data owners to senior leadership, to address business needs and deliver impactful solutions. You will guide global companies through data, ML, and AI solutions to transform their businesses and enhance performance across industries.
You will be based in Gurugram or Bengaluru as part of a global data engineering community, working in Agile, cross-functional teams alongside project managers, data scientists, machine learning engineers, software engineers, designers, and industry experts.
While we advocate for using the right tech for the right task, we often leverage the following technologies:
Cloud & Data Platforms: AWS, GCP, Azure, Databricks
Languages & Frameworks: Python, PySpark, SQL, NodeJS, React, TypeScript, FastAPI, NestJS
Data, ML & GenAI: Spark, Kedro, MLflow, LangChain
Vector Databases & Search: Pinecone, Weaviate, Milvus, FAISS, PGVector, OpenSearch, Elasticsearch
Workflow & Orchestration: Airflow, Argo, Kedro, Temporal, n8n, Dify
Containerization & Infrastructure: Docker, Kubernetes, Terraform
CI/CD & DevOps: GitHub Actions, Azure DevOps, GitLab, CircleCI
Observability: Grafana, Prometheus, OpenTelemetry, MLflow, Langfuse
Agentic AI: LangGraph, CrewAI, AutoGen