About the Opportunity
We are seeking a skilled Data Engineer to join our team. In this role, you will design, build, and optimize batch and streaming data pipelines in Databricks (PySpark, Spark SQL).
Key Responsibilities:
* Design and develop scalable data transformations aligned with the Medallion Architecture (an illustrative sketch follows this list).
* Ensure data quality, reliability, and performance through testing and monitoring.
* Manage data infrastructure with Terraform, following GitOps principles.
* Operate workflows with Airflow on Azure Kubernetes Service (AKS).
* Collaborate with Data Architects, Project Managers, and stakeholders to align on solutions and delivery.
* Participate in code reviews and contribute to knowledge sharing within the engineering team.
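For candidates curious about the day-to-day work, below is a minimal, purely illustrative sketch of a Bronze-to-Silver transformation in PySpark on Databricks. The table names, columns, and cleanup rules are hypothetical assumptions, not our actual schema or codebase.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Read raw, append-only events from the Bronze layer (hypothetical table).
bronze = spark.read.table("bronze.events")

# Silver layer: deduplicate, enforce types, and drop malformed rows.
silver = (
    bronze.dropDuplicates(["event_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .filter(F.col("event_ts").isNotNull())
)

# Persist as a managed Delta table (the Databricks default table format).
silver.write.mode("overwrite").saveAsTable("silver.events")
```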
Requirements
To be successful in this role, you will need:
* Strong experience with Databricks, PySpark, and Spark SQL.
* Proven expertise in batch and streaming data processing.
* Hands-on experience with Azure Data Lake Storage Gen2 (ADLS Gen2).
* Solid knowledge of Airflow, preferably on Kubernetes (AKS); a short DAG sketch follows this list.
* Understanding of Medallion Architecture principles.
* Familiarity with Terraform and infrastructure-as-code practices.
* Awareness of data privacy, governance, and security standards.
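Orchestration in this role centers on Airflow, so here is a minimal sketch of a DAG that triggers an existing Databricks job, assuming the Databricks provider package is installed. The DAG id, schedule, connection id, and job id are all illustrative assumptions.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

# Illustrative only: dag_id, schedule, connection id, and job_id are assumptions.
with DAG(
    dag_id="daily_bronze_to_silver",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Trigger an existing Databricks job by its id.
    run_job = DatabricksRunNowOperator(
        task_id="run_bronze_to_silver",
        databricks_conn_id="databricks_default",
        job_id=123,
    )
```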
Nice to Have
* Experience with Talend and/or Fivetran.
* Knowledge of Databricks Asset Bundles.
* Familiarity with Vault, Helm charts, and Kafka monitoring tools.
Location: Remote, Portugal