Job Summary
GCP Data Engineer – Job Description (5+ Years Experience)

We are looking for a Senior GCP Data Engineer with 5+ years of experience to design, build, and operate scalable data platforms on Google Cloud. The role requires strong hands-on engineering skills across batch/streaming pipelines, data modeling, orchestration, CI/CD, and production support, with a focus on security, reliability, and cost optimization.

Key Responsibilities
1. Design and implement end-to-end data pipelines on GCP using services such as BigQuery, Cloud Storage, Pub/Sub, Dataflow (Apache Beam), Dataproc (Spark), and Cloud Composer (Airflow).
2. Build and maintain ELT/ETL frameworks, reusable components, and standardized patterns for ingestion, transformation, and serving layers.
3. Develop optimized data models (dimensional, Data Vault, or domain-oriented) and manage BigQuery datasets, partitions, clustering, and performance tuning.
4. Implement streaming and near-real-time data processing (Pub/Sub → Dataflow/Dataproc → BigQuery) with strong data quality and monitoring controls.
5. Orchestrate workflows using Cloud Composer/Airflow, including dependencies, retries, SLAs, alerting, and backfills.
6. Establish CI/CD for data pipelines and infrastructure (Cloud Build/GitHub Actions/Jenkins) with automated testing, code reviews, and release governance.
7. Apply security best practices: IAM least privilege, service accounts, KMS, VPC Service Controls (where applicable), DLP considerations, and audit logging.
8. Drive observability and operations: logging/metrics, lineage/metadata, incident response, RCA, and performance/cost optimization.
9. Partner with architects, analysts, and stakeholders to translate business needs into technical solutions and delivery plans.
10. Mentor junior engineers and contribute to engineering standards, documentation, and knowledge sharing.

Must-Have Skills
1. 5+ years in data engineering (data warehousing, ETL/ELT, streaming), including 3+ years of strong hands-on experience on GCP.
2. Expertise in BigQuery (SQL, optimization, partitioning/clustering, materialized views, UDFs, scheduling, cost controls).
3. Hands-on pipeline development with Dataflow (Apache Beam) and/or Dataproc (Spark/PySpark/Scala) for batch and streaming workloads.
4. Strong Python and SQL skills; experience writing production-grade code, tests, and reusable libraries.
5. Orchestration experience with Cloud Composer (Airflow), including complex DAG design, SLAs, sensors, and backfill strategies.
6. Strong understanding of data modeling, slowly changing dimensions, and analytics-ready design.
7. Experience with Git-based workflows and CI/CD; familiarity with Infrastructure as Code (Terraform preferred).
8. Cloud security fundamentals: IAM, networking basics, encryption, secrets management, and compliance-minded delivery.
9. Production support experience: monitoring, alerting, troubleshooting, and root-cause analysis.

Good-to-Have Skills
1. Data governance and metadata: Data Catalog/Dataplex, data lineage, data quality frameworks, and master/reference data concepts.
2. Experience with dbt (including BigQuery), Dataform, or similar transformation frameworks.
3. Experience with GCP networking patterns (Shared VPC), Private Service Connect, and VPC Service Controls.
4. Containerization and platform knowledge: Docker, GKE, Cloud Run.
5. Experience integrating BI tools (Looker/Looker Studio/Tableau/Power BI) and semantic modeling.
6. Exposure to ML data pipelines/feature stores and MLOps on Vertex AI.
7. Kafka and other non-GCP messaging systems; hybrid/multi-cloud integration patterns.

Qualifications / Certifications
1. Bachelor's degree.
Key Responsibilities
1. Write complex SQL queries to extract and manipulate data efficiently.
2. Develop data transformation processes using Python for data cleaning and preprocessing.
3. Collaborate within the team to understand data requirements and deliver effective data solutions.
4. Optimize data storage, performance, and processing capabilities within Google Cloud Platform.
5. Troubleshoot and resolve data-related issues in a timely manner.
6. Implement data security and privacy measures in compliance with industry standards.
7. Stay updated on the latest trends and technologies in data engineering and analytics.
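Timely troubleshooting often starts with making transient failures survivable. As a minimal sketch of the retry-with-backoff idea mentioned under orchestration, here is a hypothetical stdlib-only helper; in practice, production pipelines would usually rely on the orchestrator's built-in retry settings (e.g. Airflow task retries) rather than hand-rolled code.

```python
import time

def retry(fn, attempts=3, base_delay=0.1):
    """Call fn(), retrying on exception with exponential backoff.

    Hypothetical helper for illustration: waits base_delay * 2**(attempt-1)
    between tries and re-raises after the final failed attempt.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
```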
Skill Requirements
1. Strong SQL skills for data extraction, transformation, and loading.
2. Advanced knowledge of Python programming for data manipulation and analysis.
3. Experience with Google Dataflow for real-time data processing and analytics.
4. Familiarity with Google Cloud Platform services and tools for data management.
5. Strong problem-solving skills and attention to detail.
6. Good communication and teamwork abilities.
7. Ability to work in a fast-paced and dynamic environment.
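The kind of extract-and-transform SQL referred to above can be sketched with the stdlib `sqlite3` module; the `orders` table and column names are invented for illustration, and BigQuery SQL is similar in spirit (aggregation, `HAVING`, parameterized filters).

```python
import sqlite3

def top_spenders(conn: sqlite3.Connection, min_total: float) -> list:
    """Aggregate order amounts per customer and keep those above a threshold."""
    sql = """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        HAVING total >= :min_total
        ORDER BY total DESC
    """
    # named-parameter binding avoids SQL injection and type mistakes
    return conn.execute(sql, {"min_total": min_total}).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("a", 50.0), ("a", 70.0), ("b", 30.0)])
```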