# Agent / LLM Engineer - AI Agent Development
**Location:** Maia, Porto, Portugal (On-site)
**Experience Level:** Junior to Mid Level (1-4 years)
**Employment Type:** Full-time
## About the Role
We are seeking a passionate and innovative Agent/LLM Engineer to join our AI development team. You will refine and enhance our existing agents and develop the next generation of agents, including an MCP (Model Context Protocol) server implementation. This is an exciting opportunity to work at the forefront of AI agent technology and contribute to cutting-edge customer support solutions.
## Key Projects
### Current: Backoffice Agent Refinement
- Enhance existing agent capabilities and performance
- Optimize agent workflows using LangGraph for complex multi-step reasoning
- Implement advanced hallucination detection and mitigation strategies
- Improve agent reliability, error handling, and output validation mechanisms
### Upcoming: Frontend Agent & MCP Server
- Design and develop customer-facing agents with real-time capabilities
- Implement an MCP (Model Context Protocol) server architecture that exposes tools and data to models through a standard interface
- Create scalable agent infrastructure supporting concurrent user interactions
- Develop robust conversation management and context preservation systems
## Key Responsibilities
### AI Agent Development & Architecture
- Design and implement complex agent workflows using **LangGraph** for state management
- Build robust **LangChain** pipelines for document processing and retrieval
- Develop **hallucination detection and control** mechanisms to ensure response accuracy
- Create **scalable agent architectures** supporting high-concurrency user loads
- Implement **RAG (Retrieval-Augmented Generation)** systems with vector databases
### Advanced Agent Capabilities
- Develop **multi-agent orchestration** and coordination systems
- Implement **tool calling and function execution** frameworks
- Create **memory management systems** for long-term conversation context
- Build **agent evaluation and testing** frameworks for quality assurance
- Design **prompt optimization** and dynamic prompt generation systems
### Production & Scalability
- Implement **horizontal scaling** strategies for agent workloads
- Develop **caching mechanisms** for frequently accessed information
- Create **load balancing** solutions for distributed agent processing
- Monitor and optimize **token usage and cost management**
- Implement **rate limiting and abuse prevention** mechanisms
### Quality & Safety
- Build **output validation pipelines** to catch and correct hallucinations
- Implement **safety filters** and content moderation systems
- Develop **confidence scoring** and uncertainty quantification
- Create **human-in-the-loop** workflows for critical decisions
- Design **audit trails** and conversation logging systems
## Required Technical Skills
### AI/ML Frameworks & Tools
- **LangChain:** Advanced experience building complex agent pipelines and chains
- **LangGraph:** Proficiency in creating stateful, cyclical agent workflows
- **Vector Databases:** Experience with PostgreSQL with pgvector
- **LLM APIs:** Integration with OpenAI GPT-4, Anthropic Claude, and other LLM providers
- **Embeddings:** Working with text embeddings for semantic search and retrieval
### Hallucination Control & Validation
- **Fact-checking mechanisms:** Building verification systems against knowledge bases
- **Confidence scoring:** Implementing uncertainty quantification for LLM outputs
- **Output validation:** Creating structured validation pipelines
- **Guardrails:** Implementing safety rails and content filtering
- **Ground truth verification:** Comparing outputs against authoritative sources
### Scalability & Performance
- **Async programming:** Python asyncio for concurrent request handling
- **Queue systems:** Redis, Celery, or RQ for background task processing
- **Caching strategies:** Redis/Memcached for response and embedding caching
- **Database optimization:** Query optimization and connection pooling
- **Load testing:** Performance testing for high-concurrency scenarios
### Backend Development
- **Python:** Strong proficiency (2+ years) with modern async frameworks
- **FastAPI/Flask:** Building robust APIs with proper error handling
- **Database Integration:** PostgreSQL, vector databases, and ORM usage
- **Authentication:** JWT, OAuth, and secure API design
- **Docker/Containerization:** Containerized deployment and orchestration
## Advanced Skills (Preferred)
### Agent Architecture Patterns
- **Multi-agent systems:** Coordinating multiple specialized agents
- **Tool use frameworks:** ReAct, Plan-and-Execute, and custom reasoning patterns
- **Context window management:** Efficient handling of large conversation contexts
- **Streaming responses:** Real-time response generation and WebSocket integration
- **Agent memory architectures:** Short-term, long-term, and episodic memory systems
### ML/AI Operations
- **Model monitoring:** Tracking agent performance and behavior drift
- **A/B testing:** Experiment frameworks for agent improvements
- **Data pipeline management:** ETL for training data and knowledge base updates
- **Model fine-tuning:** Custom model adaptation for specific use cases
- **MLOps practices:** Version control for prompts, models, and agent configurations
### Integration & Protocols
- **MCP (Model Context Protocol):** Implementation and server development
- **WebSocket/SSE:** Real-time communication (bidirectional over WebSocket, server-to-client streaming over SSE)
- **Microservices:** Agent deployment in distributed architectures
- **API Gateway patterns:** Request routing and transformation
- **Event-driven architecture:** Pub/sub patterns for agent communication
## Soft Skills & Attributes
- **Problem-solving mindset:** Creative approaches to complex AI challenges
- **Attention to detail:** Precision in prompt engineering and output validation
- **User empathy:** Understanding customer needs in conversational interfaces
- **Analytical thinking:** Data-driven approach to agent performance optimization
- **Communication skills:** Excellent English and Portuguese for team collaboration
- **Adaptability:** Comfort with rapidly evolving AI technologies and best practices
- **Quality focus:** Commitment to building reliable, production-ready AI systems
## What We Offer
- Competitive salary commensurate with experience level
- Professional development budget for AI/ML courses and certifications
- Access to premium LLM APIs and latest AI development tools
- Modern office environment in Maia, Porto with powerful development hardware
- Flexible working hours within core business hours
- Opportunity to work with cutting-edge AI agent technologies
- Coffee and snacks
## Work Environment
This is an **on-site position** based in our Maia office. We foster a collaborative environment where AI engineers can experiment, share discoveries, and iterate quickly on agent improvements. Our team values continuous learning and staying at the forefront of AI agent development.
## Technical Stack
- **AI/ML:** LangChain, LangGraph, OpenAI GPT-4, Anthropic Claude, Hugging Face
- **Languages:** Python (primary), TypeScript/JavaScript (frontend integration)
- **Databases:** PostgreSQL with pgvector, Redis for caching
- **Infrastructure:** Docker, Kubernetes, cloud deployment (AWS/GCP/Azure)
- **Monitoring:** Custom agent analytics, OpenTelemetry, Prometheus
- **Development:** FastAPI, Pydantic, pytest, black, mypy
## Technical Interview Topics
Candidates should be prepared to discuss:
### Agent Architecture & Development
- LangGraph workflow design for complex multi-step reasoning
- LangChain pipeline optimization and best practices
- RAG system architecture and vector database selection
- Multi-agent coordination and communication patterns
### Quality & Safety
- Hallucination detection and mitigation strategies
- Output validation and fact-checking mechanisms
- Confidence scoring and uncertainty quantification
- Safety filters and content moderation approaches
### Scalability & Performance
- Horizontal scaling strategies for agent workloads
- Caching mechanisms for embeddings and responses
- Async programming patterns for concurrent users
- Token optimization and cost management strategies
### Integration & Production
- MCP implementation approaches
- Real-time conversation management systems
- Error handling and graceful degradation patterns
- Monitoring and observability for AI agents
## Sample Technical Challenges
During interviews, you may be asked to:
- Design a LangGraph workflow for a multi-step customer support scenario
- Implement a hallucination detection mechanism for agent responses
- Architect a scalable system supporting 1000+ concurrent conversations
- Debug an agent producing inconsistent or incorrect responses
- Optimize a RAG system for better retrieval accuracy and speed
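
To give a sense of scale for the hallucination detection challenge, one possible starting point is sketched below. It is illustrative only and not the expected solution: it flags response sentences that share few words with the retrieved context. The function names (`tokenize`, `sentence_support`, `flag_unsupported`) and the `0.5` threshold are hypothetical, and a production system would likely rely on stronger signals than word overlap, such as entailment models or LLM-based verification.

```python
import re


def tokenize(text: str) -> set[str]:
    """Return lowercase alphanumeric tokens, skipping words of 1-2 characters."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 2}


def sentence_support(sentence: str, context: str) -> float:
    """Fraction of the sentence's tokens that also appear in the context."""
    tokens = tokenize(sentence)
    if not tokens:
        return 1.0  # nothing to verify, treat as supported
    return len(tokens & tokenize(context)) / len(tokens)


def flag_unsupported(response: str, context: str, threshold: float = 0.5) -> list[str]:
    """Return response sentences whose support score is below the threshold.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]
    return [s for s in sentences if sentence_support(s, context) < threshold]


if __name__ == "__main__":
    context = (
        "Refunds are processed within 5 business days. "
        "Orders can be cancelled before shipping."
    )
    response = (
        "Refunds are processed within 5 business days. "
        "We also offer free lifetime warranty on all products."
    )
    # Only the second sentence has no overlap with the context, so it is flagged.
    print(flag_unsupported(response, context))
```

A check like this is cheap enough to run on every response, which is why it can serve as a first filter before more expensive verification; discussing its failure modes (paraphrases, negations, numeric claims) is a natural follow-up in an interview setting.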