Job Summary
Role Overview
We are looking for a highly skilled Data Engineer to join our Machine Learning Model Development & Integration stream within Financial Crime Technology.
In this role, you will design, build, and optimise data pipelines, feature engineering workflows, and model integration components that enable scalable ML solutions for AML, Fraud, and Transaction Monitoring use cases.
You will work closely with Data Scientists, ML Engineers, Solution Architects, and Platform Engineers to deliver production-ready ML capabilities that are robust, compliant, and cloud-native.
Key Responsibilities
1. Data Engineering & Pipeline Development
- Design and implement scalable ETL/ELT pipelines for ML model training, inference, and monitoring.
- Build data ingestion frameworks using tools such as EMR, Kafka, Python, Spark, PySpark, and MongoDB.
- Develop feature engineering pipelines to support model experimentation and productionisation.
- Ensure data quality, lineage, versioning, and reproducibility across ML workflows.
2. ML Model Integration & Deployment
- Integrate ML models into real-time and batch applications using custom APIs or microservices.
- Build model inference pipelines, scoring engines, and real-time streaming integrations.
- Automate model deployment, CI/CD, and configuration using GitLab, AWS CodePipeline, Docker, and Terraform.
3. Cloud Architecture & Platform Engineering
- Design cloud-native architecture patterns aligned with enterprise standards and regulatory expectations.
- Use AWS services such as S3, Lambda, Step Functions, and AppConfig.
- Optimise cost, performance, and reliability across ML workloads.
4. Cross-functional Collaboration
- Partner with Data Scientists to understand model input needs, feature dependencies, and execution flows.
- Collaborate with Platform teams to onboard, scale, and monitor ML workloads.
- Work with Compliance, Security, and Risk teams to ensure regulatory alignment (e.g., PRA SS2/21, model governance).
5. Operational Excellence
- Build monitoring, alerting, and observability for data pipelines and model endpoints.
- Implement automated lineage, auditability, and compliance controls.
- Enable A/B testing, model comparison workflows, and shadow mode deployments.
Skill Requirements
Skills & Experience Required
Technical Skills
- Strong experience in Python, SQL, PySpark, MongoDB and distributed data processing.
- Hands-on expertise with AWS cloud services, particularly EMR, Lambda, Step Functions, and S3.
- Experience with Kafka or other event streaming technologies.
- Solid understanding of data modelling, feature stores, and ML pipeline orchestration.
- Understanding of ML lifecycle concepts: model training, evaluation, deployment, and monitoring.
- CI/CD experience with GitLab, IaC (Terraform/CloudFormation), and Docker.
Desirable Skills
- Experience working in Financial Crime / AML / Fraud analytics.
- Exposure to regulatory compliance around model risk management.
- Experience with model interpretability, data drift, and feature drift detection frameworks.
Soft Skills
- Strong problem-solving ability and analytical mindset.
- Excellent stakeholder communication—especially with Data Science and Product teams.
- Highly collaborative, comfortable with agile delivery and iterative development.