Job Summary
Experienced L1/L2 Support Engineer with strong expertise in production support, incident management, and AIOps/event monitoring platforms. Skilled in Linux troubleshooting, log analysis, and SQL, with hands-on experience in identifying and resolving application, integration, and performance-related issues. Proficient in handling the end-to-end ticket lifecycle, performing root cause analysis, and ensuring SLA adherence. Familiar with REST APIs, messaging systems, and observability tools, with exposure to cloud and container technologies. A proactive team player with strong problem-solving skills and a customer-focused approach, capable of working effectively in a 24x7 support environment.
Key Responsibilities
Provide L1/L2 production support for the Moogsoft AIOps/event management platform
Monitor, triage, and resolve incidents across event ingestion, processing, and correlation layers
Perform UI validation, log analysis, and data ingestion verification for issue identification
Troubleshoot configuration, integration, database, and performance-related issues
Classify issues (configuration, defect, or enhancement) and escalate appropriately
Collaborate with Development, SRE, and DevOps teams for faster resolution and service improvement
Handle customer tickets, provide timely updates, and ensure SLA adherence
Participate in incident management, RCA preparation, and continuous improvement initiatives
Support release validation and deployment activities
Work in rotational shifts, including hours outside the IST time zone (24x7 support model)
Skill Requirements
Understanding of Moogsoft or similar event management/AIOps platforms (event ingestion, correlation, alerts, dashboards)
Strong Linux fundamentals and log analysis skills
Basic SQL knowledge (MySQL/MariaDB or similar databases)
Familiarity with REST APIs, SNMP, webhooks, and system integrations
Knowledge of messaging systems (Kafka/RabbitMQ) and event-driven architecture
Exposure to monitoring and observability tools (Elasticsearch/OpenSearch, Grafana, Prometheus)
Basic scripting knowledge (Bash, JSON handling)
Cloud & DevOps (Preferred)
Exposure to AWS services (EC2, S3, RDS)
Knowledge of Docker and containerized environments
Basic understanding of Kubernetes concepts
Support & Operational Skills
Experience in incident management, ticket triage, and escalation handling
Strong troubleshooting and root cause analysis (RCA) capability
Ability to differentiate between configuration issues, product defects, and known issues
Experience supporting releases and deployment activities
Other Requirements
Strong analytical and problem-solving skills
Good communication and stakeholder management skills
Customer-focused approach with structured thinking
Ability to work in a fast-paced, 24x7 support environment