Job Summary
- Design data solutions on Databricks, including Delta Lake, data warehouses, and data marts, to support the organization's analytics needs.
- Apply best practices when designing data models (logical and physical) and ETL pipelines (streaming and batch) using cloud-based services, especially Python and PySpark.
- Design, develop, and manage data pipelines (collection, storage, access), data engineering (data quality, ETL, data modeling), and data understanding (documentation, exploration).
- Design the analytics/gold layer using dbt models and macros, and manage it following best practices. Implement data quality (DQ) checks and unit tests in dbt. Experience with dbt Cloud and Databricks SQL is preferred.
- Work with stakeholders to understand the data landscape, conduct discovery exercises, develop proofs of concept, and demonstrate them.
- Experience working with Collibra for DQ and data governance is a plus.
- Knowledge of dbt for modeling and building out layers in Databricks (dbx) is a plus.
- AWS experience with S3, Redshift, Glue, etc. is also required.
Key Responsibilities
2. Design, develop, and maintain data pipelines using Snowflake, Azure Data Factory (ADF), and Databricks to support business requirements.
3. Work closely with stakeholders to gather requirements, identify opportunities for data analytics, and propose data-driven solutions.
4. Monitor and troubleshoot data pipelines, ensuring data quality, reliability, and performance.
5. Stay updated with the latest trends and technologies in data management and analytics, incorporating best practices into projects.
6. Collaborate with cross-functional teams to integrate data solutions with existing systems and applications.
7. Provide technical expertise and mentorship to team members, promoting a culture of learning and innovation.
Skill Requirements
2. Strong experience in Azure Data Factory (ADF) for data integration and orchestration.
3. Hands-on knowledge of Databricks for data engineering, data processing, and machine learning.
4. Ability to design, develop, and optimize complex data pipelines for ETL processes.
5. Strong problem-solving skills and the ability to troubleshoot data pipeline issues.
6. Excellent communication skills to interact effectively with technical and non-technical stakeholders.
7. Strong leadership skills to guide and motivate technical teams towards project delivery and success.
8. Experience in agile methodologies and project management practices for efficient project execution.
Other Requirements
1. Relevant certifications in Snowflake, Azure Data Factory (ADF), and Databricks are a plus.
Mandatory Skills
- Databricks: PySpark, Python, SQL
- dbt Core/Cloud
- Unity Catalog
- Collibra DQ
- AWS services, e.g. Glue, S3, Redshift, Lambda