Job Summary
Validate the safety, reliability, and production readiness of agentic workflows in DBOps/SecOps before rollout, ensuring zero false actions and high operational stability.
As a Subject Matter Expert in Tools and Automation, you will ensure the timely resolution of escalated incidents while adhering to SLA and quality standards. Your expertise will improve customer satisfaction through clear communication, thorough documentation, and proactive engagement with business stakeholders.
Key Responsibilities
Validate use cases and end-to-end agent behavior. Test automation safety, false-action prevention, and escalation logic. Verify operational stability and production readiness. Build automated test suites for agent workflows. Partner with developers to remediate defects.
1. Ensure timely resolution of escalated tickets and incidents by employing advanced monitoring techniques and event analysis, ensuring adherence to SLA and quality standards.
2. Mentor team members and administrators by providing guidance on monitoring tools and best practices, preparing standard operating procedures (SOPs), and maintaining comprehensive documentation for operational efficiency.
3. Validate change order implementation plans using monitoring tools, ensuring compliance with human error protocols and actively participating in capacity planning discussions to optimize resource allocation.
4. Foster positive customer relationships by engaging in customer meetings, capturing feedback, and addressing issues to enhance service delivery and satisfaction.
5. Conduct in-depth analyses, including root cause analysis and trend analysis, leveraging monitoring data to generate reports and insights for presentation to key business stakeholders, driving continuous improvement initiatives.
Skill Requirements
Strong in Python test frameworks (PyTest, Selenium for UI). Experience with AI/Agent testing, prompt evaluation, regression suites. Understanding of SecOps/DBOps risk scenarios. Knowledge of CI/CD and test automation tools. Detail-oriented mindset for safety-critical validation.
1. Proficient in monitoring tools and technologies related to event monitoring and incident management.
2. Strong analytical skills with the ability to perform root cause analysis and trend analysis.
3. Excellent communication skills for effective stakeholder engagement and documentation.
4. Familiarity with ITIL processes and best practices in incident management.
Other Requirements
1. ITIL Foundation certification (optional but valuable).
2. Certification in relevant monitoring and automation tools (optional but valuable).