Site Reliability Engineer - Remote
Job Description:
• Join a PagerDuty on-call rotation to respond to production incidents
• Maintain and develop monitoring and alerting solutions to improve the on-call experience
• Design, build and maintain scalable infrastructure for running our systems
• Assist product developers in debugging and triaging production issues
Requirements:
• 2+ years of experience working as a Site Reliability Engineer or in a related position
• Experience with AWS, Kubernetes, and Docker
• Familiarity with deployment/provisioning tools such as Terraform, Helm, and Ansible
• Strong knowledge of the Linux platform
• Comfortable working with Go and shell scripting
• Experience with observability and monitoring tools - Prometheus, Datadog, New Relic, Grafana, Loki, or similar
• Experience with MySQL (or a similar relational database) and GitLab is a plus
Benefits: