Overview
We are looking for a hands-on Senior ML Engineer to lead the development and operation of production-grade AI systems across LLMs, OCR, and voice. This role combines deep technical ownership, leadership of a high-performing team of junior engineers, and direct engagement with stakeholders to shape AI solutions.
Responsibilities:
Lead and mentor a team of highly talented junior ML engineers through:
Code reviews, design reviews, and technical direction
Enforcement of strong software engineering and ML best practices
Design, deploy, and operate scalable AI systems with a focus on reliability and performance
Lead production deployment of LLMs and multimodal systems (RAG, OCR, voice)
Own model performance end-to-end, combining evaluation, observability, and hardware optimization:
Build evaluation pipelines (benchmarks, regression testing, LLM-as-judge)
Implement deep observability (tracing, latency monitoring, error tracking)
Optimize GPU utilization (multi-GPU serving, batching, quantization, memory tuning)
Continuously improve throughput, latency, and cost efficiency
Architect and manage GPU infrastructure:
Model serving, load balancing, and scaling strategies
Hardware-aware deployment and performance tuning
Build and maintain robust MLOps pipelines:
Model/version management, CI/CD, automated testing, and rollback strategies
Monitoring and feedback loops for continuous improvement
Engage directly with clients and stakeholders to:
Gather and clarify business requirements
Translate non-technical needs into well-defined technical problems
Communicate solutions, trade-offs, and progress through clear documentation, reports, and proposals
Contribute hands-on to system design, implementation, debugging, and production incident resolution
Company:
PhazeRo
Qualifications:
Requirements:
Proven experience deploying LLMs in production
Strong experience with GPU-based inference and optimization
Solid backend engineering skills (Python, APIs, distributed systems)
Experience with MLOps and production ML systems
Experience with OCR/document AI and/or voice systems (STT/TTS)
Experience with Docker and Kubernetes
Strong understanding of modern AI architectures (RAG, vector DBs, agent workflows)
Experience mentoring or leading engineers
Strong communication skills with the ability to bridge business and technical domains
Nice to Have:
Experience with open-weight models (Qwen, Llama, DeepSeek, Gemma)
Experience with on-prem / sovereign AI deployments
Experience with LoRA / fine-tuning
Multilingual or Arabic NLP experience
About PhazeRo
PhazeRo provides advanced ML & AI consulting services to corporate clients across the globe.