From whiteboard to production: A demand forecasting system for an online grocery shop
Data-driven software is revolutionizing the world and enabling the intelligent services we interact with daily. Robert Pesch and Robin Senge outline the development process, statistical modeling, data-driven decision making, and components needed to productionize a fully automated and highly scalable demand forecasting system for an online grocery shop run by a billion-dollar retail group in Europe.
|Talk Title||From whiteboard to production: A demand forecasting system for an online grocery shop|
|Speakers||Robert Pesch (inovex), Robin Senge (inovex)|
|Conference||Strata Data Conference|
|Conf Tag||Make Data Work|
|Location||New York, New York|
|Date||September 24-26, 2019|
Data-driven software products employ statistical and machine learning models inferred from data to drive business goals. In doing so, they are revolutionizing the world and enabling the intelligent services we interact with daily. Developing such products requires many short iterations and close collaboration among software developers, data engineers, data scientists, and DevOps engineers to translate an idea from the whiteboard into a fully fledged software system integrated into a complex enterprise IT environment. Robert Pesch and Robin Senge outline the development process, statistical modeling, data-driven decision making, and components needed to productionize a fully automated demand forecasting system for an online grocery shop run by a billion-dollar retail group in Europe.
On the whiteboard: For egrocery stores, accurate stock planning is the key ingredient for the success of the whole business case. Compared to nongrocery stores, supply chain optimization is an even harder task because perishable items are combined with a broad variety of goods whose sales follow a long-tail distribution. For an egrocery business to be sustainable, many challenges along its supply chain have to be addressed. Accurate stock planning sits at the center of all these efforts: unavailable items result in unsatisfied and potentially lost customers, whereas overstock results in food spoilage, which in turn means higher costs for the retailer. Balancing these two factors is not trivial. Predicting just a single value (a point prediction) for the demand for each article at a certain time in the future (for example, 100 grape units for next week) is often not sufficient, because such a prediction ignores the stochastic part of the demand, that is, its random fluctuations. Predicting the entire probability distribution is therefore often required to obtain the cost-optimal forecast.
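
To make the step from a probability distribution to a cost-optimal forecast concrete, here is a minimal sketch based on the classic newsvendor rule: order the smallest quantity whose cumulative probability reaches the ratio of understock cost to total cost. The talk abstract does not say which rule or distribution the speakers use; the Poisson demand model, the function names, and the cost parameters below are all assumptions chosen for illustration.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k units of demand under an assumed Poisson(mean) model.

    Computed in log space so that large means do not overflow.
    """
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def cost_optimal_order(mean_demand: float,
                       understock_cost: float,
                       overstock_cost: float) -> int:
    """Return the smallest quantity q with P(demand <= q) >= cu / (cu + co).

    understock_cost (cu): hypothetical cost per unit of unmet demand.
    overstock_cost (co): hypothetical cost per unit of spoiled stock.
    """
    if mean_demand <= 0 or understock_cost <= 0 or overstock_cost <= 0:
        raise ValueError("mean demand and both costs must be positive")
    critical_ratio = understock_cost / (understock_cost + overstock_cost)
    cumulative = 0.0
    quantity = 0
    while True:
        cumulative += poisson_pmf(quantity, mean_demand)
        if cumulative >= critical_ratio:
            return quantity
        quantity += 1


if __name__ == "__main__":
    # Same expected demand, different cost structures, different orders.
    print(cost_optimal_order(100.0, understock_cost=1.0, overstock_cost=1.0))
    print(cost_optimal_order(100.0, understock_cost=9.0, overstock_cost=1.0))
```

With symmetric costs the rule returns the median of the distribution; when a lost sale is assumed to cost more than a spoiled unit, the optimal order moves into the upper tail. A point prediction alone cannot express this shift.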
Many different models and approaches exist for obtaining such a prediction, ranging from simple to complex. Traditional forecasting models such as exponential smoothing and ARIMA, as well as the more recent Facebook Prophet library, can be applied to get point predictions for arbitrary forecasting horizons. Modeling the problem as a regression problem extends the prediction capability, allows for the integration of features, and enables you to choose from different models like linear regression models, support vector machines, or regression trees. Combining such predictions with parametric probability distributions provides a good starting point for the desired probability function. More tailored models like gamlss (generalized additive models for location, scale, and shape) allow the direct prediction of probability distributions.
Toward the production system: Building a demand forecasting system for an egrocery business is a complex task. Hence, it makes sense to approach it in an agile fashion, starting with a very simple, almost naive, solution. This provides two important things: a fast first version of the data product that can serve as a proof of concept for further development, and a simple baseline model to compare against when experimenting with more complex predictive models. The system is developed further in agile iterations and constantly improved, so every new predictive model, which is likely to be more complex than its predecessor, has to prove that its increased complexity also significantly increases accuracy. In line with Occam’s razor and with machine learning theory, this is a way to avoid overcomplicating things. Complex models like deep neural nets have the potential to learn extraordinarily complex relations. However, this ability comes with a high risk of overfitting, instability, and outliers. When building a production-grade, fully automated demand forecasting system that produces approximately one billion predictions per day, you had better guard against these risks.
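
The baseline-first approach described above can be sketched as a small acceptance gate: a simple exponential smoothing baseline is scored with walk-forward evaluation, and a more complex candidate is accepted only if it beats the baseline error by a margin. This is an illustrative sketch, not the speakers' implementation. The function names, the smoothing factor, the mean absolute error metric, and the 5% relative margin are all assumptions; a real system might use a statistical significance test instead.

```python
from statistics import mean


def exponential_smoothing_forecast(history, alpha=0.3):
    """One-step-ahead forecast from simple exponential smoothing (assumed baseline)."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def mean_absolute_error(actual, predicted):
    """Average absolute deviation between actual and predicted demand."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def walk_forward_error(series, forecaster, warmup=3):
    """Score a forecaster by predicting each point from the data before it."""
    actual, predicted = [], []
    for t in range(warmup, len(series)):
        predicted.append(forecaster(series[:t]))
        actual.append(series[t])
    return mean_absolute_error(actual, predicted)


def accept_candidate(baseline_error, candidate_error, min_relative_gain=0.05):
    """Accept the more complex model only if it beats the baseline by the margin."""
    return candidate_error <= baseline_error * (1 - min_relative_gain)


if __name__ == "__main__":
    # Hypothetical weekly sales of one article.
    sales = [100, 104, 98, 110, 107, 101, 115, 109, 103, 112]
    baseline_error = walk_forward_error(sales, exponential_smoothing_forecast)
    # Stand-in for a more complex model: here simply "repeat the last value".
    candidate_error = walk_forward_error(sales, lambda history: history[-1])
    print(accept_candidate(baseline_error, candidate_error))
```

The gate makes the extra complexity pay for itself: a candidate that is only marginally better than the baseline is rejected, which keeps the production system as simple as the data allows.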
You’ll leave with an understanding of a variety of software features that distinguish a prototypical notebook model from a fully fledged, highly scalable production system. These include extracting the original customer request, outlier and fraud detection mechanisms for the input data, sanity checks for the output data, a fallback solution, risk-sensitive model selection, and model monitoring.
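
Two of the features listed above, sanity checks for the output data and a fallback solution, can be illustrated with a short sketch. The abstract does not describe how the speakers implement them, so the check criteria (finite, non-negative, at most a fixed multiple of recent peak sales), the mean-of-recent-sales fallback, and all names below are hypothetical.

```python
import math


def is_sane(forecast, recent_sales, max_ratio=10.0):
    """Hypothetical output check: finite, non-negative, not far above recent sales."""
    if forecast is None or not math.isfinite(forecast) or forecast < 0:
        return False
    # Allow at most max_ratio times the recent peak (with a floor of one unit).
    ceiling = max_ratio * max(max(recent_sales), 1.0)
    return forecast <= ceiling


def forecast_with_fallback(model_forecast, recent_sales):
    """Return the model forecast if it passes the check, else a naive fallback.

    The second element of the result records which path was taken, so that
    monitoring can track how often the fallback is used.
    """
    if is_sane(model_forecast, recent_sales):
        return model_forecast, "model"
    fallback = sum(recent_sales) / len(recent_sales)
    return fallback, "fallback"


if __name__ == "__main__":
    recent = [10, 11, 9]
    print(forecast_with_fallback(12.0, recent))
    print(forecast_with_fallback(float("nan"), recent))
    print(forecast_with_fallback(1_000_000.0, recent))
```

Returning the source label alongside the value is a deliberate choice in this sketch: a rising share of fallback results is itself a useful monitoring signal that the primary model or its input data has degraded.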