Job Information
Date Opened: 20/10/2025
Job Type: Full time
Remote Job: Yes
Industry: IT Services
Job Description
This is a remote position.
We are looking for a Site Reliability Engineer with strong expertise in observability engineering to join our team. The ideal candidate will have hands-on experience with the Grafana Stack (Tempo, Loki, Mimir, Alloy), knowledge of Java development, a strong SRE mindset, and a passion for automation, scalability, and ownership.
Responsibilities
* Design, implement, and maintain observability solutions covering metrics, logs, traces, and RUM.
* Work with tools such as Grafana Cloud, Tempo, Loki, Mimir, Alloy, and OpenTelemetry.
* Build reliable alerting and monitoring pipelines based on SLOs/SLAs, focusing on low-maintenance automation.
* Ensure the health and integrity of observability data flows from instrumentation to dashboards.
* Collaborate with development and operations teams to embed observability by design into the software lifecycle.
* Define and promote best practices and standards for observability across the organization.
* Support the modernization of observability by replacing and evolving legacy monitoring and alerting solutions.
* Monitor observability-related costs and contribute to FinOps efforts by identifying optimization opportunities.
Must-have:
* 3+ years of experience as an SRE, Observability Engineer, or equivalent role.
* Practical experience with OpenTelemetry or similar instrumentation tools.
* Knowledge of Kubernetes, Helm, Terraform, and ArgoCD.
* Experience designing and managing telemetry pipelines (metrics/logs/traces), exporters, and sidecars.
* Expertise in performance monitoring, alerting, dashboarding, and root cause analysis.
* Knowledge of Java development and application instrumentation.
* Product-oriented mindset with a bias for automation and a "you build it, you run it" culture.
* Fluency in English.
Nice-to-have:
* Knowledge of APM and distributed tracing solutions.
* Experience with FinOps practices applied to observability.
* Hands-on involvement in replacing legacy monitoring stacks.
* Experience with cloud environments (Azure preferred).
* Contributions to open-source observability tools.
If this sounds like you, share your CV with us and let's talk.