Job Summary
About the Role
We are looking for an experienced DevOps Engineer to design, automate, secure, and manage cloud-native infrastructure and deployment platforms. AWS experience is mandatory, along with hands-on exposure to GenAI service integrations and working knowledge of Azure, Google Cloud Platform, and other modern cloud platforms. The role requires strong skills in CI/CD automation, Infrastructure as Code, container orchestration, observability, DevSecOps, and production-grade cloud operations.
This role is suitable for candidates with 5 to 12 years of relevant experience in DevOps, cloud engineering, platform engineering, infrastructure automation, or site reliability engineering.
Key Responsibilities
- Design, implement, and manage scalable, secure, and highly available cloud infrastructure.
- Build and maintain CI/CD pipelines for application, infrastructure, container, and GenAI-enabled workloads.
- Automate infrastructure provisioning and configuration using Terraform, CloudFormation, CDK, Ansible, or equivalent tools.
- Deploy, manage, and optimize containerized workloads using Docker, Kubernetes, Amazon EKS, ECS, or equivalent platforms.
- Support integration of GenAI services such as Amazon Bedrock, SageMaker, model APIs, vector databases, prompt management, and AI application deployment workflows.
- Implement monitoring, logging, alerting, tracing, and observability using CloudWatch, Prometheus, Grafana, ELK, OpenTelemetry, or similar tools.
- Apply DevSecOps practices including IAM governance, secrets management, vulnerability scanning, policy enforcement, and secure deployment controls.
- Manage cloud networking, load balancing, DNS, VPN, private connectivity, firewalls, and environment segregation across cloud platforms.
- Drive reliability, scalability, performance tuning, cost optimization, backup, disaster recovery, and production support activities.
- Collaborate with development, QA, security, architecture, and operations teams to improve release velocity and platform stability.
Skill Requirements
Mandatory Technical Skills
- Hands-on experience with AWS cloud services, including compute, storage, networking, security, monitoring, and deployment services.
- Strong experience in AWS services such as EC2, S3, VPC, IAM, Lambda, API Gateway, RDS, CloudWatch, CloudTrail, ECS, EKS, CodePipeline, CodeBuild, and CodeDeploy.
- Hands-on exposure to GenAI service integrations using Amazon Bedrock, SageMaker, model endpoints, APIs, embedding services, vector databases, and AI application deployment pipelines.
- Strong knowledge of CI/CD tools such as Jenkins, GitHub Actions, GitLab CI/CD, Azure DevOps, AWS CodePipeline, or similar platforms.
- Strong experience with Infrastructure as Code using Terraform, AWS CloudFormation, AWS CDK, Ansible, or equivalent automation frameworks.
- Hands-on experience with Docker, Kubernetes, Helm, Amazon EKS, ECS, and container registry management.
- Working knowledge of multiple cloud platforms, including AWS, Microsoft Azure, Google Cloud Platform, and hybrid or multi-cloud deployment models.
- Strong scripting and automation skills using Python, Bash, PowerShell, or equivalent scripting languages.
- Strong understanding of Linux administration, networking fundamentals, DNS, load balancers, firewalls, SSL/TLS, and cloud security controls.
- Experience with monitoring, observability, log management, incident response, and production support for enterprise applications.
Preferred / Additional Skills
- Experience implementing GenAIOps practices such as prompt versioning, model configuration deployment, evaluation workflows, guardrails, and AI workload monitoring.
- Exposure to vector databases, RAG pipelines, API-based LLM integrations, AI gateways, and secure GenAI workload orchestration.
- Knowledge of Azure DevOps, Azure Kubernetes Service, Azure Monitor, Google Kubernetes Engine, Cloud Build, and Google Cloud Operations Suite.
- Experience with DevSecOps tools such as SonarQube, Snyk, Checkmarx, Trivy, Aqua, Prisma Cloud, or equivalent security platforms.
- Familiarity with service mesh, API management, event-driven architecture, serverless deployment, and microservices operations.
- Certifications in AWS, Azure, Google Cloud, Kubernetes, Terraform, DevOps, or security are preferred.
Other Requirements
Experience Criteria
- 5 to 12 years of relevant experience in DevOps engineering, cloud infrastructure, platform engineering, SRE, automation, or production operations.
- Candidates should have strong hands-on experience in AWS-based production environments, with practical knowledge of Azure, Google Cloud Platform, and multi-cloud architecture.
- Candidates should have experience supporting enterprise-scale applications, cloud migration, automation, deployment governance, security compliance, and production incident management.
Educational Qualifications
Mandatory Qualification:
- B.E. / B.Tech in Computer Science, Information Technology, Electronics, Software Engineering, or any other relevant engineering stream.
Equivalent qualifications may also be considered:
- BCA / MCA / M.Tech / M.Sc. in Computer Science, Information Technology, Cloud Computing, Cybersecurity, Software Engineering, or related disciplines from a recognized institution or university.