Job Summary
Role Requirements:
• Experience with network monitoring and observability solutions for CNFs/VNFs
  o Good knowledge of Prometheus, SNMP monitoring platforms, GCP, and K8s.
  o Able to assess the integration of these tools into alerting/on-call systems.
• Experience in integration/automation
  o REST-based integrations between monitoring/alerting tools and external systems.
  o Able to visualize the end-to-end workflow (alerting to ticketing).
  o Automation experience (Python/Go) for integrations and RESTful API requests for custom jobs.
• Experience in incident management workflows
  o Defining escalation policies.
  o Ensuring alert/incident prioritization and closure.
• Gap analysis of the monitoring infrastructure and tools
  o Identifying functionality missing from the current implementation.
  o Evaluating potential replacement tools.
  o Able to drive a proof of concept (POC) for a potential replacement tool.
• DevOps experience
  o Linux, Docker (CREs), networking.
• Documentation and communication
  o Able to document and maintain configurations, alert mappings, ways of working (WoW), and error handling.
  o Able to communicate with and persuade non-technical stakeholders.