Building a recommendation engine on Azure
Why a New Platform Was Needed
The organization operated across multiple digital brands, each relying on personalized content to drive engagement. While recommendation models were already in use, the setup had grown fragmented:
- Separate configurations per brand
- Limited scalability
- Manual deployment processes
- Insufficient monitoring and feedback integration
- Increasing pressure to deliver real-time relevance
The business challenge was clear: personalization was becoming critical to performance, but the underlying infrastructure wasn’t built for long-term scale.
Rather than optimizing individual models, the decision was made to rethink the foundation.
The Core Idea: Platform First
The initiative focused on one principle: build a reusable, scalable platform that separates experimentation from production stability.
This meant designing an architecture that could:
- Combine batch training with real-time serving
- Standardize deployment and versioning
- Scale across brands without duplicating infrastructure
- Ensure reliability in live environments
- Create feedback loops to continuously improve model output
By shifting from “a model project” to “a platform strategy,” the organization ensured that recommendation capabilities could evolve without rebuilding core infrastructure every time.
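One way to picture "scale across brands without duplicating infrastructure" is a single set of platform defaults with small per-brand overrides. The sketch below is illustrative only: the brand names, the `ServingConfig` fields and the `config_for` helper are assumptions made for this example, not details of the actual platform.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ServingConfig:
    """Settings shared by every brand unless explicitly overridden (hypothetical fields)."""
    model_name: str = "recommender"
    top_k: int = 10
    request_timeout_ms: int = 150


# Platform-wide defaults live in exactly one place.
DEFAULTS = ServingConfig()

# Hypothetical brands: each entry lists only what differs from the defaults.
BRAND_OVERRIDES = {
    "brand_a": {"top_k": 20},
    "brand_b": {"model_name": "recommender_video", "request_timeout_ms": 100},
}


def config_for(brand: str) -> ServingConfig:
    """Return the shared defaults with the brand's overrides applied."""
    return replace(DEFAULTS, **BRAND_OVERRIDES.get(brand, {}))


print(config_for("brand_a"))
print(config_for("brand_b"))
```

Adding a brand in this scheme means adding one dictionary entry rather than copying a deployment, which is the kind of reuse the platform-first principle aims for.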
The Architecture in Practice
The solution combined two complementary layers.
A real-time layer processed user interactions and served recommendations instantly via APIs. Containerized services running on Azure Kubernetes ensured resilience and scalability, while monitoring tools provided visibility into performance and stability.
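To make the reliability goal of the real-time layer concrete, here is a minimal sketch of a serving function that degrades gracefully when the model is unavailable. The fallback-to-popular-items strategy, the `recommend` function and its parameters are assumptions for illustration; the source does not describe how the platform handled failures. In a setup like the one described, logic of this kind would typically sit behind an API route.

```python
from typing import Callable, List, Sequence


def recommend(
    user_id: str,
    model: Callable[[str, int], Sequence[str]],
    fallback_items: Sequence[str],
    k: int = 5,
) -> List[str]:
    """Return up to k item ids, topping up from fallback items if the model fails or returns too few."""
    try:
        items = list(model(user_id, k))[:k]
    except Exception:
        # A failing model should not fail the request.
        items = []
    # Top up with fallback items, skipping duplicates.
    for item in fallback_items:
        if len(items) >= k:
            break
        if item not in items:
            items.append(item)
    return items


def broken_model(user_id: str, k: int) -> Sequence[str]:
    """Hypothetical model call that simulates an outage."""
    raise TimeoutError("model backend unavailable")


popular = ["p1", "p2", "p3"]
print(recommend("u42", broken_model, popular, k=2))  # ['p1', 'p2']
```

The caller always gets a response, so an outage in the model backend shows up as lower relevance rather than as an error page.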
Alongside this, a batch layer handled offline model training and retraining. Data was processed and stored in Azure, with MLflow managing experiments and model versioning. This allowed the data science team to iterate and improve models without disrupting live systems.
The clear separation between online serving and offline training proved essential. Real-time systems were built for reliability; training pipelines were built for flexibility.
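The separation can be sketched as a registry with a promotion gate: training registers as many versions as it likes, while serving only ever reads the version marked as production. The code below is an in-memory stand-in written for this article, not the MLflow API, and the `min_score` threshold and the scores are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ModelRegistry:
    """In-memory stand-in for a model registry (illustrative, not MLflow)."""
    min_score: float  # hypothetical validation threshold for going live
    versions: Dict[int, float] = field(default_factory=dict)  # version -> validation score
    production: Optional[int] = None  # the only version the serving layer reads

    def register(self, score: float) -> int:
        """Batch side: record a newly trained model. Serving is unaffected."""
        version = len(self.versions) + 1
        self.versions[version] = score
        return version

    def promote(self, version: int) -> bool:
        """Release step: a version goes live only if it passes the gate."""
        if self.versions[version] < self.min_score:
            return False
        self.production = version
        return True


registry = ModelRegistry(min_score=0.80)
v1 = registry.register(score=0.82)
registry.promote(v1)                 # passes the gate; production is now v1
v2 = registry.register(score=0.75)   # an experiment; production is still v1
promoted = registry.promote(v2)      # rejected by the gate
print(registry.production, promoted)  # 1 False
```

Experiments can fail freely on the batch side because nothing reaches live traffic without passing through `promote`.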
Challenges Along the Way
Like most platform transformations, the complexity wasn’t in the algorithms; it was in integration and coordination.
Key challenges included:
- Aligning data engineering and data science workflows
- Standardizing processes across multiple brands
- Balancing speed of experimentation with production stability
- Ensuring observability and measurable impact
Establishing clear ownership between platform engineering and data science teams was crucial. Once responsibilities were defined and workflows standardized, iteration speed increased significantly.
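For the observability challenge listed above, "measurable impact" usually starts with attributing feedback events to the model version that produced the recommendation. The sketch below computes a click-through rate per version from a list of events. The event format, the version labels and the choice of click-through rate as the metric are assumptions for illustration, not figures or methods from the project.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def ctr_by_version(events: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute clicks / impressions per model version.

    Each event is (model_version, kind), where kind is "impression" or "click".
    Versions without impressions are omitted from the result.
    """
    impressions: Dict[str, int] = defaultdict(int)
    clicks: Dict[str, int] = defaultdict(int)
    for version, kind in events:
        if kind == "impression":
            impressions[version] += 1
        elif kind == "click":
            clicks[version] += 1
    return {v: clicks[v] / n for v, n in impressions.items() if n > 0}


# Made-up events purely to exercise the function.
events = [
    ("v1", "impression"), ("v1", "impression"), ("v1", "click"),
    ("v2", "impression"), ("v2", "impression"),
    ("v2", "impression"), ("v2", "impression"), ("v2", "click"),
]
print(ctr_by_version(events))  # {'v1': 0.5, 'v2': 0.25}
```

Keying the metric on model version is what lets the feedback loop inform the next round of training and promotion decisions.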
Technologies Behind the Platform
The platform was built on Azure, using a modern, cloud-native stack including:
- Python-based services
- MLflow for experiment tracking and model versioning
- FastAPI for model serving
- Docker for containerization and Kubernetes for orchestration
- Distributed data processing for batch workloads
- Monitoring and dashboarding tools for observability
However, the real differentiator wasn’t the tooling; it was the architectural coherence and lifecycle management around machine learning.
Results Achieved
By the end of the project, the organization had:
- A unified recommendation backbone across brands
- Standardized deployment and version control
- Improved stability of real-time recommendations
- Faster iteration cycles for data science teams
- A foundation ready for further AI expansion
Most importantly, recommendation systems evolved from isolated experiments into core digital infrastructure.
The strengthened platform now enables the organization to expand its AI capabilities confidently, including plans to further grow the data science function.
What This Means for the Market
Many companies today face a similar inflection point. They have promising models but lack scalable infrastructure. As AI moves from experimentation to operational necessity, the competitive advantage shifts toward organizations that invest in robust ML platforms.
The lesson from this project is simple:
Sustainable AI impact requires engineering maturity as much as modeling expertise.
Personalization is no longer just about building smarter algorithms. It’s about building the systems that make those algorithms reliable, measurable, and scalable.
And that shift, from model to platform, is where real long-term value is created.
FAQ
The core is a clear separation between a real-time layer and a batch layer. The real-time layer is built for speed and stability: it serves recommendations through APIs, and data scientists’ experiments never disrupt that environment. The batch layer handles offline training, retraining and experiments, separate from production, so that only validated models go live via a standardized process.
The right time usually comes earlier than organizations think, ideally once multiple models are in production or multiple teams depend on the same data pipeline. Waiting until deployments are too slow, models are difficult to maintain, or production is unstable only makes the transformation more expensive.