Summary: This role involves designing and developing scalable web crawling and data extraction systems that collect large volumes of web data to identify fraud patterns and analyze digital ecosystems.
Responsibilities:
* Design and implement scalable web crawlers and scraping solutions.
* Extract, process, and structure large datasets from multiple web sources.
* Optimize crawling performance and reliability.
* Work with engineering and data teams to integrate crawled data into data pipelines.
* Ensure compliance with web data collection standards and regulations.
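As a small illustration of the compliance responsibility above, one common standard is honoring a site's robots.txt before fetching any URL. The sketch below uses only Python's standard library; the rules and URLs are hypothetical, not tied to any particular site.

```python
from urllib import robotparser

# Hypothetical robots.txt rules, parsed locally (a real crawler would
# fetch them from https://<host>/robots.txt before crawling).
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

def allowed(url: str, agent: str = "*") -> bool:
    """Return True if the given agent may fetch the URL under the rules."""
    return rp.can_fetch(agent, url)

print(allowed("https://example.com/public/page"))   # True
print(allowed("https://example.com/private/page"))  # False
```

A production crawler would layer rate limiting and per-domain politeness on top of this check.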
Must Haves:
* Strong experience with web crawling or scraping technologies.
* Experience with React Native, Python, Scrapy, Puppeteer, Selenium, or similar tools.
* Knowledge of data pipelines and distributed systems.
* Understanding of web technologies (HTML, JavaScript, APIs).
* Experience handling large-scale data extraction.
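The extraction skills listed above boil down to turning raw HTML into structured data. As a minimal, standard-library sketch (production systems would typically use Scrapy or Puppeteer, as noted), the hypothetical parser below collects link targets, the basic primitive behind link-following crawlers:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href targets from anchor tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

parser = LinkExtractor()
parser.feed('<a href="/page1">one</a><p>text</p><a href="/page2">two</a>')
print(parser.links)  # ['/page1', '/page2']
```

At scale, the same extract step feeds a queue or data pipeline rather than an in-memory list.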