As the Director of Application Support/SRE, you will be responsible for building our SRE capabilities and providing application support across our applications, including the website, order management system, microservices, middleware, inventory management, analytics platform, and vendor management. Reporting to the Senior Director of Technical Operations, you will work with all application development and engineering teams, as well as the DevOps and Infrastructure teams. In addition, you will:
- Identify and define yearly budgets and ensure spend and operational expenses are in line with approved budgets.
- Actively lead and engage in technical discussions, project execution, and incident management.
- Partner with Sephora engineering to learn from incidents through the RCA process and prevent their recurrence.
- Drive service reliability by developing tooling that enables metric visibility using SLIs, SLOs, and SLAs.
- Promote simplicity in solving complex problems across our infrastructure systems and teams as we scale.
- Set the strategy and roadmap for the team to reduce the operational overhead of keeping Sephora IT applications healthy, secure, and available for our customers and business.
- Build and develop the Site Reliability Engineering team.
- Work with key stakeholders across Sephora Engineering teams to take ownership of the operational health, security, growth, usability, and design of our production and developer infrastructure systems.
- Manage vendors, track Service Levels, manage cost model, negotiate/renegotiate contracts, and drive vendor performance.
- Advocate for and implement reliable design patterns and a VALET-based monitoring framework.
- Lead the team and focus it on root cause analysis, pattern identification, and continuous improvement to optimize application performance, resiliency, and reliability.
- Collaborate across engineering to drive ownership of production systems, enable faster decision making and transparent observability into system health.
Qualifications:
- Experience managing applications and ecommerce production systems
- Experience building SRE capabilities for large enterprises
- Vendor management experience
- Deep knowledge of production operations tooling for monitoring, investigation, and alerting, such as Splunk, Nagios, PagerDuty, Grafana, Loki, and Prometheus
- Expertise in problem solving and in analyzing global-scale distributed systems
- Minimum 8 years of experience in a management role within the Technology space
- Experience creating actionable JIRA stories for technical audiences
- Good knowledge of retail, ecommerce, and enterprise application technologies
- Strong oral and written communication skills
- Minimum 15 years of experience in a medium to large technology or retail organization running a large, complex application support team
Vacancy Type: Full Time
Job Location: San Francisco, CA, US
Application Deadline: N/A