Job Summary
We are seeking an experienced Data Engineer / Specialist to design, develop, and maintain robust data pipelines and solutions on Microsoft Azure. The ideal candidate will have strong technical expertise in Microsoft Azure services such as Azure Data Factory, Azure Synapse, Azure SQL, and Azure Databricks, and will work closely with cross-functional teams to ingest, transform, and optimize data for business intelligence, reporting, and advanced analytics use cases.
Key Responsibilities
• Design, develop, and maintain robust data pipelines using Azure Data Factory, Synapse Analytics, and other Azure services.
• Design and manage data models in Azure Synapse Analytics and Azure SQL Database.
• Build and optimize ETL/ELT pipelines using Azure Data Factory (ADF) for data ingestion and transformation.
• Develop scalable data processing and transformation workflows using Azure Databricks, PySpark, and Spark SQL.
• Develop and maintain data ingestion, transformation, and orchestration workflows.
• Build and maintain reusable, high-performance data models for analytics and reporting.
• Monitor, troubleshoot, and optimize performance of data pipelines.
• Implement best practices for version control, CI/CD, and automation.
• Document data architecture, processes, and best practices.
Skill Requirements
• Proficiency in Microsoft Azure services: Azure Data Factory (ADF) for data orchestration (building and managing pipelines), Azure Databricks for big data processing and analytics, Azure Synapse, and Azure SQL.
• Experience with BI tools such as Power BI, Tableau, or OBIEE.
• Proficiency in PySpark, Python, and Spark SQL for data manipulation and transformation.
• Familiarity with data modeling, ETL/ELT processes, and data warehousing concepts.
• Understanding of ETL/ELT design patterns and best practices.
• Solid understanding of data warehouse best practices, development standards, and methodologies.
• Strong understanding of data modeling and data governance, and of performance tuning and optimization of Spark jobs.
• Experience with CI/CD pipelines and DevOps practices in data engineering.
• Experience with ETL/ELT patterns and best practices.
Other Requirements
• DevOps & CI/CD: Azure DevOps, GitHub Actions, Jenkins
• Experience with data lakehouse architectures and Delta Lake
• Knowledge of real-time data streaming (Event Hubs, Kafka)