Key Qualifications
- Strong experience as a Senior Data Engineer with a solid MLOps background.
- Hands-on expertise with Azure Databricks (Workflows, Delta Lake, Unity Catalog).
- Proficiency in PySpark, SQL Warehouses, and production-grade pipeline development.
- Experience with CI/CD (GitHub Actions or Azure DevOps).
- Familiarity with monitoring tools (Lakeview, Grafana) and ML observability concepts.
- Knowledge of data quality, drift detection, and ML monitoring practices.
- Understanding of security, access management, and compliance in data/ML ecosystems.
- Experience with Terraform or similar IaC tools.
Key Responsibilities
- Design and maintain end-to-end data and ML pipelines on Azure Databricks (Workflows, Delta Lake, Unity Catalog).
- Build reproducible training and deployment workflows integrated with experiment tracking, model registry, and artifact management.
- Implement data quality frameworks and observability metrics; create and maintain dashboards (Lakeview, Grafana, or similar).
- Automate data ingestion and feature pipelines using PySpark, SQL Warehouses, and Databricks Asset Bundles (DAB) within CI/CD (GitHub Actions or Azure DevOps).
- Manage security, access control, and compliance in data and ML environments.
- Optimize compute performance and cost (autoscaling, spot instances, caching, partitioning).
- Develop automated evaluation and validation pipelines triggered by new telemetry data, ensuring reproducibility and traceability.
- Maintain continuous model and data monitoring (drift, feature stability, prediction quality); an illustrative drift-check sketch follows this list.
- Ensure environment consistency and dependency management using Infrastructure as Code (Terraform or similar) and containerization.
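
For illustration only, a minimal PySpark sketch of the kind of drift check referenced above: it computes the Population Stability Index (PSI) of selected features between a baseline and a current Delta table. The table names, feature columns, and the 0.2 alert threshold are assumptions made for the example, not requirements of the role.

```python
# Minimal feature-drift check (Population Stability Index) between two Delta tables.
# Table names, feature columns, and thresholds below are illustrative assumptions.
import math

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

BASELINE_TABLE = "ml.monitoring.features_baseline"  # assumed Unity Catalog table
CURRENT_TABLE = "ml.monitoring.features_current"    # assumed Unity Catalog table
PSI_ALERT_THRESHOLD = 0.2                           # common rule of thumb; tune per feature


def psi(baseline_df, current_df, column, bins=10):
    """PSI of `column` between baseline and current data, using baseline-derived bins."""
    # Bin edges come from the baseline distribution so both datasets share the same bins.
    quantiles = baseline_df.approxQuantile(column, [i / bins for i in range(1, bins)], 0.01)
    edges = [float("-inf")] + quantiles + [float("inf")]

    def bin_fractions(df):
        total = df.count()
        fractions = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            count = df.filter((F.col(column) > lo) & (F.col(column) <= hi)).count()
            fractions.append(max(count / total, 1e-6))  # floor to avoid log(0)
        return fractions

    base, curr = bin_fractions(baseline_df), bin_fractions(current_df)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, curr))


baseline = spark.table(BASELINE_TABLE)
current = spark.table(CURRENT_TABLE)

for feature in ["session_length", "avg_latency_ms"]:  # hypothetical feature columns
    score = psi(baseline, current, feature)
    status = "DRIFT" if score > PSI_ALERT_THRESHOLD else "ok"
    print(f"{feature}: PSI={score:.3f} ({status})")
```

In practice, a check like this would run as a scheduled Databricks Workflow task, with the resulting metrics written to a monitoring table and surfaced in a Lakeview or Grafana dashboard.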