Havi Tech Hub | Observability | Senior SRE (Site Reliability Engineering) Engineer
Your new company
Havi, a global leader since 1974, employs over 10,000 people and serves customers in more than 100 countries. Specializing in the foodservice industry, Havi provides innovative supply chain and logistics solutions, including analytics, planning, distribution, and freight management.
Havi's diverse teams collaborate seamlessly across locations and functions, embodying a spirit of integrity and creativity to serve their customers in the best way possible.
Your new role
In this role, you will work as a Senior SRE (Site Reliability Engineering) Engineer within the Supply Chain Technology function. You will ensure the reliability, resilience, and observability of enterprise services in production, combining deep technical troubleshooting skills with proactive engineering to continuously improve system stability.
You will operate within a follow‐the‐sun engineering model between Portugal and Malaysia, focusing on maintaining service health, monitoring SLOs and error budgets, acting as an engineering-level first responder during major incidents, and driving systemic improvements across platforms and services.
The Senior SRE Engineer will:
* Monitor and manage SLOs, SLIs, and error budgets across services;
* Perform deep technical triage during high-impact incidents;
* Lead structured root cause analysis and ensure corrective actions are implemented;
* Design and implement resilience improvements (failover, retries and scaling strategies);
* Improve observability coverage and alert quality;
* Automate operational tasks to reduce manual intervention;
* Collaborate with cross‐functional teams to embed reliability into service design;
* Contribute to postmortems and continuous improvement initiatives.
A typical day will include monitoring reliability indicators, improving resilience mechanisms, analyzing incident patterns, refining observability practices, building automation, and working closely with Platform Engineering, Application Development, Service Operations, and Security.
This role will include day‐to‐day collaboration across global teams, ensuring stability and driving ongoing reliability enhancements.
What you will need to succeed
As a Senior SRE Engineer, you will need:
* Bachelor's Degree in Computer Science, Engineering, or equivalent experience;
* 5+ years of experience in Infrastructure, Cloud Engineering, or SRE roles;
* Experience supporting production systems at scale;
* Experience with distributed systems and Cloud platforms (Azure preferred);
* Knowledge of CI/CD pipeline reliability practices;
* Strong understanding of SRE principles (SLOs, SLIs and error budgets);
* Solid understanding of incident management processes;
* Advanced troubleshooting capability across application and infrastructure layers;
* Familiarity with Infrastructure as Code concepts;
* Ability to work effectively in a global, cross-regional engineering model;
* Strong observability expertise (metrics, logs, tracing and alert tuning);
* Automation and scripting skills (Python, Bash, PowerShell, etc.);
* Strong analytical and problem-solving skills;
* Excellent communication skills and fluency in English.
What the company can offer you
The opportunity to join a cross-functional team in an international company with a multicultural working environment.
Senior SRE Engineer (m/f/d) | Hays | Working for your tomorrow