Lead Multimodal Data Expert
We are seeking an exceptional Lead Multimodal Data Expert to join our Generative AI team. The position is fully remote within Portugal, allowing for flexibility and work-life balance.
About the Role
This senior-level position applies cutting-edge tools and techniques to interpret diverse input types. The work spans multimodal extraction, prompt engineering, data quality and structuring, content filtering, human-in-the-loop feedback, scalability and optimization, research and prototyping, and collaboration with cross-functional teams.
Main Responsibilities
1. Multimodal Extraction: Utilize state-of-the-art tools (OCR, vision-language models, document understanding frameworks) to extract insights from various data sources;
2. Prompt Engineering: Develop and refine strategies for using LLMs to transform unstructured content into structured formats;
3. Data Quality & Structuring: Clean, validate, and transform messy data into well-defined schemas ready for use in analytics pipelines;
4. Content Filtering: Design systems for cleaning, validating, and filtering data to ensure accuracy and alignment with ethical guidelines;
5. Human-in-the-Loop Feedback: Implement feedback loops where experts validate or enrich data, improving LLM-based extraction reliability;
6. Scalability & Optimization: Architect cost-efficient, high-throughput data pipelines that are robust to noisy or incomplete sources;
7. Research & Prototyping: Experiment with emerging tools and methods in the LLM + multimodal space, exploring new ways to enhance information coverage and extraction reliability;
8. Collaboration: Partner with data engineers and other data scientists to integrate collected data into larger AI and analytics systems.
Requirements
* Master's degree (or PhD) in Computer Science, Data Science, Machine Learning, Statistics, or a related field;
* Proficiency in Python and experience with libraries for web scraping, OCR, and NLP;
* Deep understanding of LLM capabilities in multimodal and extraction contexts, including prompt engineering and few-shot learning;
* Strong background in unstructured data processing: APIs, web scraping, HTML parsing, OCR, image/document analysis;
* Strong analytical problem-solving skills, with a track record of turning noisy data into high-quality datasets for ML;
* Excellent communication and documentation skills, with the ability to influence across technical and product teams.
About the Opportunity
This role offers a unique chance to drive innovation and excellence in the field of Generative AI. As a Lead Multimodal Data Expert, you will have the opportunity to work on cutting-edge projects, collaborate with top talent, and shape the future of AI.
We offer a dynamic and supportive work environment, with opportunities for growth and professional development. If you are passionate about AI, data science, and innovation, we encourage you to apply for this exciting opportunity.