Job Title: MLOps Engineering Specialist
The Role:
We're seeking a skilled MLOps Engineer to join our team. As an MLOps Engineer, you will play a key role in designing and implementing Machine Learning (ML) workflows and infrastructure.
Key Responsibilities:
1. MLOps Strategy & Architecture:
o Develop and maintain comprehensive documentation for MLOps processes and infrastructure.
o Evaluate and select appropriate MLOps tools and technologies.
o Collaborate on the MLOps strategy for our hybrid cloud environment, ensuring alignment with business objectives, security requirements, and best practices.
2. Hybrid MLOps Loop Implementation:
o Design and implement Infrastructure as Code (IaC) solutions for provisioning and managing cloud and on-premises resources using Terraform.
o Implement robust security measures to protect sensitive data and ML models in both cloud and on-premises environments.
o Implement and/or integrate tools to deliver the full MLOps loop.
o Develop and implement monitoring dashboards and alerts to proactively identify and resolve MLOps platform issues.
3. Collaboration & Leadership:
o Work closely with data scientists, software engineers, and infrastructure teams to deliver a high-quality MLOps solution.
o Provide technical leadership and mentorship to junior engineers.
o Communicate effectively with stakeholders at all levels, including technical and non-technical audiences.
o Stay up to date with the latest MLOps trends and technologies.
o Participate in code reviews and contribute to the development of best practices.
Qualifications:
To be successful in this role, you'll need:
1. Education: A Master's degree or Ph.D. in Computer Science, Data Science, Machine Learning, or a related field.
2. Experience: At least 3 years of experience in MLOps, Data Engineering, or similar roles, with a strong focus on AI/ML use cases.
3. Technical Skills:
o Proven experience designing, implementing, integrating, and maintaining data infrastructure and MLOps solutions.
o Strong knowledge of cloud solutions, preferably Azure.
o Experience with on-premises infrastructure management, including server management, networking, and storage.
o Experience with containerization and orchestration tools such as Docker, Kubernetes, and Apache Airflow.
o Strong scripting and automation skills in languages such as Python and Bash.
o Strong understanding of DevOps principles and practices.
o Experience with CI/CD pipelines, preferably GitHub Actions.
o Proficiency in IaC tools, preferably Terraform.
4. Soft Skills:
o Excellent problem-solving skills.
o Effective communication skills and the ability to work as a team player and collaborate across functional areas.
o Proactive, customer- and results-oriented personality.
o Fluency in English, both written and spoken.
o Experience working in Agile/SAFe environments.