Job Summary
Enterprise Data & AI Strategy
- Define the enterprise data architecture, reference models, and technology roadmap.
- Establish strategy for enterprise adoption of LLMs, RAG architectures, LLMOps pipelines, and autonomous agent-based AI systems.
- Drive integration of structured, semi-structured, and unstructured data for generative AI use cases.
Data Platform & Pipeline Architecture
- Design and govern data lake, data warehouse, and lakehouse architectures.
- Lead ingestion, transformation, quality, metadata, and governance frameworks.
- Architect real-time, batch, and streaming pipelines across cloud platforms.
- Implement scalable vector databases, embedding pipelines, and semantic search workloads.
Key Responsibilities
2. Conduct data analysis to identify trends, patterns, and insights that will drive business decisions.
3. Write Python scripts and programs to automate processes and optimize data workflows.
4. Collaborate with cross-functional teams to understand technical requirements and provide solutions.
5. Implement data quality measures and ensure data integrity across different systems.
6. Stay current on industry trends and best practices in SQL, data analysis, and Python, and suggest improvements and optimizations.
Skill Requirements
Cloud Modernization & Data Engineering
- Drive cloud data modernization using AWS, Azure, or GCP native services.
- Lead data engineering using Spark, Databricks, Snowflake, BigQuery, or Synapse.
- Implement DataOps/MLOps pipelines using Airflow, ADF, Glue, or similar.
- Extend MLOps to LLMOps: prompt management, model registries for LLMs, evaluation frameworks, guardrails, and observability.
Governance, Quality & Compliance
- Ensure data governance maturity: cataloging, classification, lineage, ownership, and policy automation.
- Establish governance for generative AI: responsible AI controls, toxicity filtering, guardrails, hallucination evaluation, and bias mitigation.
- Ensure compliance with GDPR, DPDP, HIPAA, PCI, SOC 2, and emerging AI regulations.
AI, ML, and Agentic Workflows
Other Requirements
- 9+ years of experience in data engineering, architecture, or platform leadership.
- Deep experience with data lake, warehouse, and lakehouse designs.
- Strong expertise in AWS, Azure, or GCP data ecosystems.
- Hands-on experience with Spark, Databricks, Snowflake, Kafka, Flink, and Airflow.
- Advanced SQL, Python, and ETL/ELT design.
- Experience with data modeling, metadata, lineage, and governance frameworks.
- Knowledge of data security, IAM/RBAC/ABAC, and compliance requirements.
- Expertise in distributed compute tuning and cost governance.
- LLMOps experience, including prompt engineering, evaluation pipelines, vector search, embedding models, guardrail frameworks (such as Azure Prompt Shields and Bedrock Guardrails), and safety monitoring.