Job Summary
Own the reliability, availability, and performance of production NAS and/or Object Storage services.
Apply SRE principles to storage platforms: define reliability goals, improve observability, and reduce manual operational work through automation.
Design and build automation and Infrastructure‑as‑Code to manage storage systems at scale.
Lead troubleshooting and resolution of complex storage incidents; participate in on‑call and incident response.
Perform capacity planning, forecasting, and demand modeling to support business growth.
Partner with engineering teams to support application onboarding, testing, and production readiness.
Contribute to global storage initiatives, including lab and infrastructure deployments.
Create and maintain runbooks, documentation, and operational best practices to improve team efficiency.
What We’re Looking For
8+ years of experience in SRE, infrastructure automation, or platform engineering, with strong storage exposure.
Hands‑on experience operating NAS and/or Object Storage platforms (cluster/Ceph) in production.
Strong proficiency with automation and IaC tools (e.g., Ansible, Terraform, Puppet, SaltStack).
Experience running highly available, scalable systems in 24×7 environments.
Familiarity with containers and orchestration (Docker, Kubernetes).
Experience with CI/CD pipelines, monitoring, logging, and version control systems (Git, Perforce).
Strong incident management, troubleshooting, and communication skills.
Bachelor’s degree in Computer Science, Engineering, or a related field.
Nice to Have
Experience with large‑scale distributed systems.
Strong understanding of SRE concepts such as SLIs, SLOs, error budgets, observability, and logging.
Ability to debug and optimize infrastructure and automate repetitive workflows.
Proven ability to work independently and deliver results as a contractor in a global team environment.
Why This Role
Apply SRE practices to storage systems at scale
Work on mission‑critical infrastructure
High impact, ownership‑driven role
Opportunity to influence reliability and operational maturity across teams
Job Description: CI/CD pipeline
Key Responsibilities
We are looking for a Systems Storage Site Reliability Engineer (SRE) to support and scale our global storage platforms. This is a contractor position focused on applying SRE principles to storage systems: improving reliability, reducing operational toil, and enabling sustainable growth through automation and observability. You will work at the intersection of storage engineering and reliability engineering, partnering closely with infrastructure and application teams to operate production systems at scale.
Automation tools, IaC tools
Skill Requirements
Storage, NetApp, Backup, SRE, Platform Engineer, Object Storage, Ceph/cluster, automation tools
Other Requirements
Containers and orchestration