Do you want to work 36 hours in 4 days?
If so, your future starts here!

360imprimir/BIZAY is a marketing products and services marketplace present in 24 countries and constantly growing due to its technological reinvention!

Our aim is to help and inspire small and medium-sized enterprises (SMEs) to have successful communications by changing the way they develop and implement their marketing strategy.

As a Site Reliability Engineer (SRE), you will be responsible for our monitoring and observability practice.
You will define and fine-tune Service Level Objectives (SLOs) and Service Level Indicators (SLIs) and work closely with product managers and engineering leads to mentor them on these practices.
You will also help ensure our platform is stable, reliable, and continuously improving by implementing industry-leading monitoring solutions.

Key Responsibilities:
- Lead the development of our monitoring and observability strategy.
- Define, implement, and maintain SLOs/SLIs for key services.
- Mentor Product Managers and Engineering Leads in the definition and optimization of SLOs/SLIs.
- Collaborate closely with the engineering and product teams, and the quality and monitoring team, to manage incidents and maintain system health.
- Set up monitoring tools to ensure visibility into the performance and reliability of our eCommerce platform.
- Continuously improve our incident management processes by identifying and resolving performance bottlenecks.
- Implement and manage Datadog for monitoring, along with other tools in the stack (e.g., Cloudflare and Azure Cloud).
- Continuously optimize CI/CD processes for performance, reliability, and incident prevention.
- Collaborate with QA teams to ensure that monitoring and observability practices are integrated into the testing process for early issue detection and prevention.
- Ensure best practices for high availability, performance, and security across the infrastructure.
- Help the team evolve our SRE practices and develop a culture of observability.

Qualifications:
- Mid-level to Senior experience in a Site Reliability Engineering, DevOps, or similar role.
- Strong experience with monitoring and observability tools and frameworks.
- Familiarity with monitoring tools, CI/CD, scripting tools, and cloud environments.
- Proven experience defining and implementing SLOs and SLIs for large-scale systems.
- Strong understanding of incident management and ability to collaborate closely with engineering teams.
- Monitoring-obsessed and passionate about improving system reliability and visibility.

Nice to Have:
- Experience working on a high-traffic, customer-facing platform. E-commerce experience is a plus.
- Previous experience in mentoring and guiding teams on best practices in observability.