Job Title: System Reliability Engineer
Our company is looking for an experienced System Reliability Engineer to join our team. In this role, you will be responsible for ensuring the reliability and availability of our systems.
Key Responsibilities:
* Provide support to technical and business teams, responding to incidents and service degradation;
* Monitor systems proactively to detect and respond to issues;
* Investigate integration issues, gather information, and collaborate with internal and external teams;
* Perform root cause analysis to prevent recurring issues;
* Prioritize multiple concurrent issues effectively with your team;
* Understand the business context and technical architecture of each system to better assess impact and urgency;
* Participate in on-call rotations to ensure platform stability;
* Contribute to the continuous improvement of monitoring, alerting, logging, and incident response processes;
* Act as a liaison between technical and non-technical stakeholders, adapting communication accordingly.
Required Qualifications:
* 3+ years of experience in Application Support or Site Reliability Engineering;
* Strong analytical mindset: ability to identify patterns and to differentiate between isolated errors and systemic issues;
* Experience operating microservices in production;
* Proficient with tools like ELK stack, Prometheus, and Grafana;
* Familiarity with cloud environments, especially AWS;
* Experience using collaboration platforms such as Jira, Confluence, and GitLab;
* Ability to understand complex systems architecture and how components interact within a broader ecosystem;
* Proactive approach to identifying risks through logs and metrics and to suggesting observability improvements;
* Excellent communication skills, especially when engaging with non-technical stakeholders;
* Willingness to participate in on-call duty as needed;
* Fluency in English, both written and spoken.
Nice to Have:
* Hands-on coding experience with .NET Core, Python, or similar;
* Background in Retail or Logistics domains;
* Familiarity with Transport Management Systems (TMS) and logistics processes;
* Experience working with transport carriers (operational or functional knowledge).
Technical Requirements:
The ideal candidate will have experience with the following technologies:
* ELK stack
* Prometheus
* Grafana
* AWS
* Jira
* Confluence
* GitLab