Unlock Your Potential as a Site Reliability Engineer
We are seeking an exceptional Site Reliability Engineer I to join our team. This role is ideal for individuals who are passionate about designing, building, and maintaining scalable and highly available systems.
This position offers the opportunity to work on cutting-edge technologies, collaborate with cross-functional teams, and contribute to the development of our internal platform. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our services.
Key Responsibilities:
1. Design and Implement Scalable Systems: Design, build, harden, and maintain key parts of our internal platform, from CI/CD to developer tools.
2. Migrate to Industry-Leading Tools: Migrate to industry-leading CI/CD tools such as GitHub Actions.
3. Automate Deployment Practices: Automate safe deployment practices using industry-leading tools such as ArgoCD, Argo Rollouts, and Helm charts.
4. Automate Infrastructure Provisioning: Automate infrastructure provisioning and other engineering processes by developing automations on top of an engineering platform written in GitHub Actions.
5. Coach and Upskill Team Members: Coach and upskill other engineering team members.
6. Solve Challenging Technical Problems: Solve challenging technical problems and see the immediate impact of your work.
Our Philosophy:
We believe in a DevOps philosophy in which every engineering team is responsible for the software it builds and deploys. SREs play a critical role in ensuring that teams have the tools, practices, and expertise to make that happen in a blame-free culture.
We are building our own internal PaaS using the latest technologies like Kubernetes, Prometheus, Kotlin, and others. This platform is an important pillar in our engineering effort and helps us deliver better, faster, and more reliable solutions for our customers.