Site Reliability Engineering (SRE) Architect - REMOTE
About the position
Responsibilities
• Must have strong experience in infrastructure architecture
• Must have application architecture experience
• Build and mentor the SRE teams across
• Architectural experience performing SRE activities independently
• Define SRE roadmaps and strategies across the organization
• Ability to define and set SRE objectives: SLIs, SLOs, SLAs, and KPIs
• Excel at implementing core SRE principles and practices
• Work closely with the team on estimation and resource planning, and deliver solutions that meet business needs
• Understand management requirements and plan strategically from an SRE and resiliency perspective
• Triage production incidents and perform root cause analysis (RCA)
• Implement observability and monitoring with APM tools, including dashboards, alerts, and incident automation
• Leadership qualities such as cross-team collaboration and effective communication
Requirements
• Agile
• JIRA
• Budget and Resource Planning
• ITSM tools like ServiceNow
• Microsoft Project, Planner
• Documentation tools like Confluence, SharePoint
• SRE Chaos/Resiliency Engineering (Primer)
• SRE Monitoring & Observability (Primer)
• Cloud Concepts
• Kubernetes
• Docker
• Cloud native technologies
• Linux/Unix
• Any programming/scripting language
• Terraform
• Ansible
• Monitoring tools
• Grafana
• Splunk
• Databases
• APM
• AWS
• OpenShift
Benefits
• Medical/Dental/Vision/Life Insurance
• Paid holidays plus Paid Time Off
• 401(k) plan and contributions
• Long-term/Short-term Disability
• Paid Parental Leave
• Employee Stock Purchase Plan