VP of Engineering, Reliability (Remote)
About the position
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a VP of Engineering, Reliability. In this pivotal role, you'll define and execute the reliability engineering roadmap while managing a team responsible for ensuring system stability across cutting-edge infrastructure and AI-native architectures. Your impact will bridge the gap between engineering efficiency and operational excellence, paving the way for scalable growth and enhanced service delivery. This position demands a visionary leader with a track record of transforming reliability within innovative technology environments. You will leverage your extensive experience to create a forward-looking vision that meets organizational goals while ensuring compliance and security.
Responsibilities
• Define and execute the reliability engineering roadmap, aligning with enterprise growth.
• Balance centralized platform capabilities with distributed ownership for scalability.
• Establish SLO, SLI, and error budget frameworks that balance feature velocity with system stability.
• Lead infrastructure cost management and capacity planning to meet enterprise commitments.
• Develop and scale a multi-disciplinary team while fostering a culture of ownership.
• Drive continuous improvement through DORA metrics and incident trend analysis.
• Empower developers with self-service tooling and clear documentation.
• Act as the primary engineering interface for compliance and security requirements.
• Collaborate with executives to position reliability as a key enabler for success.
Requirements
• 15+ years of engineering experience, including 7+ years leading reliability or infrastructure teams.
• Proven track record managing organizations of 40+ engineers across multiple teams.
• Demonstrated experience evolving reliability operating models for scalable businesses.
• Expertise in regulated sectors where compliance and data sensitivity are critical.
• Strong understanding of SRE principles, including SLOs and incident management.
• Technical command of AWS, Terraform (IaC), and modern observability stacks.
• Experience owning cloud infrastructure budgets and cost management.
• Familiarity with AI/ML workloads and their reliability requirements.
• Executive presence for engaging with the C-suite on risk management.
Benefits
• A dynamic, rapidly growing organization focused on helping businesses thrive.
• Comprehensive medical, dental, and vision insurance for full-time employees.
• Competitive and fair pay commensurate with experience.
• Maternity and paternity leave policies for full-time employees.
• Short- and long-term disability coverage.
• Opportunities to learn from a dedicated leadership team.
• Top-of-the-line company swag for team members.