Job Description
Plan and execute data infrastructure projects using Databricks, PySpark, and Spark SQL to deliver data-driven insights that drive business growth.
Develop scalable data pipelines and ensure data quality, reliability, and performance by implementing testing and monitoring practices.
Collaborate with cross-functional teams, including Data Architects, Project Managers, and stakeholders, to align on solutions and delivery.
Required Skills and Qualifications
* Technical Expertise:
  * Strong experience with Databricks, PySpark, and Spark SQL;
  * Proven expertise in batch and streaming data processing;
  * Hands-on experience with Azure Data Lake Storage Gen2 (ADLS Gen2);
  * Solid knowledge of Airflow, preferably running on Azure Kubernetes Service (AKS);
  * Understanding of Medallion Architecture principles;
  * Familiarity with Terraform and infrastructure-as-code practices;
  * Awareness of data privacy, governance, and security standards.
Benefits
* Opportunities for Growth:
  * Participate in code reviews and contribute to knowledge sharing within the engineering team;
  * Document workflows, processes, and deployment standards.
Other Requirements
* Desirable Skills:
  * Experience with Talend and/or Fivetran;
  * Knowledge of Databricks Asset Bundles;
  * Familiarity with Vault, Helm charts, and Kafka monitoring tools.