Unlock the Future of Technology with Our Team
We are seeking a highly skilled LLM Operations Engineer to join our cutting-edge team.
About the Role
This is an opportunity to work across the full lifecycle of large language models (LLMs), from installation and fine-tuning to deployment and monitoring. You will be responsible for the scalable, secure, and efficient deployment of LLMs that use enterprise data.
You will work at the intersection of MLOps, software engineering, and cloud infrastructure to maintain governance, observability, and performance standards for production LLMs.
You will also develop and optimize pipelines for training, deployment, and model management in support of business growth and innovation.
Required Skills and Qualifications
* Minimum of 2 years' experience in a similar role
* Strong programming skills in Python, Bash, and Go
* Hands-on experience with LLM frameworks like Hugging Face Transformers, PEFT, DeepSpeed, Megatron-LM, vLLM, or TensorRT-LLM
* Expertise in containerization, orchestration, and cloud infrastructure using Docker, Kubernetes, Helm, Terraform, or Ansible
* Knowledge of distributed training tools such as PyTorch Distributed, Ray, Horovod, or Hugging Face Accelerate
* Fluency in English and residence in Portugal
* CPLP nationality or a valid EU work permit
Benefits
* Become part of a top-notch team that values innovation, collaboration, and continuous learning
* Enjoy a dynamic and supportive work environment that fosters growth and development
* Work on challenging projects that make a real impact on the industry
Be Part of Our Journey
At our company, we believe in empowering our employees to become tech thinkers, solvers, minders, and innovators. Join us on this exciting journey and discover the possibilities.