Educational Qualifications:
BSc or MSc in Computer Science, Electrical/Computer Engineering, or a related technical discipline.

Experience:
Minimum 5 years of production-level experience in big data manipulation using high-level programming languages such as Python (preferred), Java, or Scala. Proven ability to solve complex problems and deliver quality outcomes.

Technical Skills:
Experience building robust data pipelines with open-source distributed computing frameworks (Apache Spark, Apache Flink, Dask).

Data Infrastructure:
Experience designing, constructing, cataloging, and optimizing data lake infrastructures (e.g., MinIO / Amazon S3, Hive Metastore / Glue Data Catalog).

Cloud & Serverless Computing:
Experience with AWS and cloud technologies.

Containerization & Orchestration:
Familiarity with Docker for local development and with tuning applications on Kubernetes.

SQL & Data Warehousing:
Experience with SQL analytic workloads against cloud data warehouses (e.g., Amazon Redshift) or data lakes (e.g., Presto, Amazon Athena).

Software Development:
Strong understanding of software testing, agile development, and version control.

Big Data Formats:
Knowledge of Apache Parquet, Avro, and ORC, and of leveraging their metadata.

Language & Communication:
Fluency in English; excellent team collaboration skills; curiosity and willingness to learn new technologies.

Preferred additional experience includes building scalable data streaming applications (e.g., Spark Streaming, Apache Flink, Amazon Kinesis), workflow orchestration tools (e.g., Airflow, Luigi), and familiarity with message brokers such as SQS/SNS or Kafka. Knowledge of NoSQL databases such as Redis and MongoDB is a plus.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology
Industries: IT Services and IT Consulting