The retail sector is adopting Data Science techniques to reduce costs, improve decision making and explore ways to do more with data. Use cases include:

  • Forecasting demand and footprint to optimise operational efforts
  • Better understanding of consumer buying patterns to enhance sales
  • Insight into what aspects of a demographic area or store format contribute to the best sales

In this article we will provide an overview of:

  1. Applications of Data Science in Retail
  2. Essential Data Science techniques analysts need to know
  3. Case study: A training plan for Sainsbury’s Argos Ltd

The training courses that we ran with Argos were “Core Data Science” and “Linear Models and Time Series Forecasting”.

Get in touch with us to learn more about these courses!

Applications of Data Science in Retail


“What aspects of a store mean we will achieve the best sales? What is the best store to implement a given project? For example, we put Argos stores inside Sainsbury’s stores now, and need to work out what variables are important to successfully grow sales. These are some of the problems that are becoming more relevant,” said Richard Pegler, Operational Strategy Manager at Sainsbury’s Argos Ltd. “Now that Argos is owned by Sainsbury’s there are so many different store formats we need to understand individually and to do so, we need to apply more advanced analytical techniques to solve the more complex and less well-defined business problems.”

“There are a variety of regression and correlation type questions that we are working on,” said Richard Pegler. “One of the business decisions we are trying to get behind is understanding what are the variables that drive the success of an Argos store. Another area of focus is doing a lot more forecasting to avoid misspending operational costs.”

Core Data Science and Machine Learning Skills


To keep their skills up to date, analysts need to move beyond Excel and apply predictive analytics and Machine Learning techniques in Python. Core Data Science techniques include:

Data Science Foundations

  • Working with the Jupyter notebook
  • Working with widely used Python libraries such as NumPy for array manipulation, pandas for data manipulation, and Matplotlib and Seaborn for data visualisation
  • Principal Component Analysis (PCA)
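As an illustration of how these foundations fit together, here is a minimal sketch that builds a small pandas table of store attributes and reduces it with PCA. The column names and the synthetic numbers are hypothetical and are not Argos data; the only assumption about the data is that one underlying "store size" factor drives all three columns.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_stores = 200

# Hypothetical latent "store size" factor that drives all three columns.
size = rng.normal(size=n_stores)
stores = pd.DataFrame({
    "floor_area_m2": 800 + 250 * size + rng.normal(scale=75, size=n_stores),
    "staff_count": 30 + 8 * size + rng.normal(scale=2.4, size=n_stores),
    "weekly_sales_k": 120 + 40 * size + rng.normal(scale=12, size=n_stores),
})

# Standardise first so that no column dominates purely because of its units.
scaled = StandardScaler().fit_transform(stores)

# Keep the two directions that capture the most variance.
pca = PCA(n_components=2).fit(scaled)
components = pca.transform(scaled)

print(stores.describe().round(1))
print("explained variance ratio:", pca.explained_variance_ratio_.round(3))
```

Because the three columns are strongly correlated by construction, the first component should carry most of the variance. On real store data the split is usually less clear-cut.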

Unsupervised and Supervised Learning Techniques

  • K-means Clustering, Hierarchical Cluster Analysis, Density-Based Clustering (DBSCAN)
  • The k-Nearest Neighbour algorithm
  • Overfitting, underfitting, bias-variance tradeoff
  • Cross-Validation and hyperparameter tuning
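To make the techniques listed above more concrete, the sketch below segments stores with K-means and then tunes the number of neighbours for a k-Nearest Neighbour classifier using 5-fold cross-validation. The store features, the three "formats" and all numbers are synthetic and hypothetical, chosen only so that the example runs on its own.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical store features: [floor area (m^2), average weekly sales (GBP thousands)].
# Three synthetic store formats with 40 stores each.
centres = np.array([[200.0, 30.0], [800.0, 90.0], [1500.0, 200.0]])
spread = np.array([30.0, 8.0])
stores = np.vstack([c + rng.normal(size=(40, 2)) * spread for c in centres])
true_format = np.repeat([0, 1, 2], 40)

# Standardise so that both features contribute comparably to distances.
X = StandardScaler().fit_transform(stores)

# Unsupervised: segment the stores into three clusters without using the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
segment = kmeans.labels_

# Supervised: choose k for k-NN by 5-fold cross-validated accuracy.
search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": [1, 3, 5, 7, 9]},
    cv=5,
)
search.fit(X, true_format)

print("stores per segment:", np.bincount(segment))
print("best k:", search.best_params_["n_neighbors"])
print("cross-validated accuracy:", round(search.best_score_, 3))
```

The synthetic formats are deliberately well separated, so both steps look easy here. With real stores, the choice of features and the number of clusters typically need more care.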

Machine Learning Techniques

  • Decision Trees
  • Bagging and bootstrapping algorithms
  • Random Forests in scikit-learn
  • Boosting classifiers and boosting methods in scikit-learn
  • AdaBoost, XGBoost, LightGBM
  • Stacking, Stacking with cross validation, Stacking in scikit-learn
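As a rough sketch of how some of these methods can be compared in scikit-learn, the example below scores a single decision tree, a random forest (bagged trees) and a gradient boosting classifier with 5-fold cross-validation. The dataset is generated by `make_classification` and is purely illustrative; it stands in for whatever labelled business data is available.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification problem (an assumption, not real retail data).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Mean accuracy over 5 folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Ensembles often score higher than a single tree on data like this, but that is not guaranteed, which is why the comparison is done with cross-validation rather than assumed.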

Regression Analysis

  • Linear regression and Multivariate regression using scikit-learn
  • Logistic regression
  • Regularisation: Ridge and LASSO
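The difference between plain linear regression, Ridge and LASSO can be sketched as follows. The feature names are hypothetical and the data is synthetic: by construction only the first two features influence the target, so the example shows how LASSO tends to drop the irrelevant ones while Ridge only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

rng = np.random.default_rng(1)
n_stores = 500

# Hypothetical, already-standardised store features.
feature_names = ["footfall", "floor_area", "parking_spaces", "distance_to_depot", "store_age"]
X = rng.normal(size=(n_stores, len(feature_names)))

# Assumption: only the first two features drive sales; the rest are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n_stores)

models = {
    "ols": LinearRegression(),
    "ridge": Ridge(alpha=10.0),   # L2 penalty: shrinks all coefficients
    "lasso": Lasso(alpha=0.2),    # L1 penalty: can set coefficients exactly to zero
}

coefs = {name: model.fit(X, y).coef_ for name, model in models.items()}

print("features:", feature_names)
for name, coef in coefs.items():
    print(name.ljust(6), np.round(coef, 2))
```

The penalty strengths (`alpha`) are arbitrary here; in practice they would be chosen by cross-validation, for example with `RidgeCV` or `LassoCV`.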

Time Series Analysis

  • Working with Time Series: ordering, trends, seasonality, noise
  • Windowing, rolling average, differencing, autocorrelation
  • Time Series exploration with Pandas
  • Time Series modelling: Autoregression model, Autoregression with Moving average, Cross-validation
  • Fitting ARMA models in Python (for example with statsmodels, as scikit-learn does not provide an ARMA estimator)
  • Facebook’s Prophet scalable time series forecasting toolbox
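The exploration steps listed above (rolling average, differencing, autocorrelation) can be sketched with pandas alone. The daily sales series below is synthetic, with an assumed linear trend and weekly seasonality, and is not real store data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# 26 weeks of hypothetical daily sales: trend + weekly seasonality + noise.
days = pd.date_range("2023-01-02", periods=182, freq="D")
t = np.arange(len(days))
sales = pd.Series(
    100 + 0.5 * t + 20 * np.sin(2 * np.pi * t / 7) + rng.normal(scale=3, size=len(days)),
    index=days,
    name="daily_sales",
)

# Windowing: a 7-day rolling average smooths out the weekly pattern.
rolling_mean = sales.rolling(window=7).mean()

# Differencing: a lag-1 difference removes the trend,
# a lag-7 difference removes the weekly seasonality.
daily_change = sales.diff()
weekly_diff = sales.diff(7)

# Autocorrelation of the detrended series: a strong value at lag 7
# indicates a weekly cycle.
lag7 = daily_change.autocorr(lag=7)
lag3 = daily_change.autocorr(lag=3)

print(rolling_mean.tail(3).round(1))
print("autocorrelation at lag 7:", round(lag7, 2))
print("autocorrelation at lag 3:", round(lag3, 2))
```

A series explored this way would then typically be passed to a forecasting model such as an autoregressive model or Prophet, evaluated on a time-ordered hold-out period rather than a random split.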

Case Study



To quickly develop the skill set of Argos employees, Cambridge Spark delivered an intensive Data Science conversion training course for analysts. The 5-day, in-person course covered Data Science and Machine Learning using Python, introducing Argos analysts to a core set of techniques they can apply to internal projects.

“The programme was really flexible and beneficial,” said Richard Pegler, Operational Strategy Manager at Sainsbury’s Argos Ltd. “Cambridge Spark presented content that our supply team have already started using for location strategy optimisation and short horizon store forecasting.”

“The tutors catered to the various levels of Python knowledge in the group and it was really good to have a mix of theory and applied practicals using Jupyter Notebooks,” said Richard Pegler. “Individuals who weren’t too confident on the coding side could focus on the theory then tackle more of the practical exercises in their own time. It was a good formula.”

“For the supply team, their main objectives are demand forecasting and segmentation across stores. We found the Time Series Analysis and Clustering content particularly useful and individuals have already started applying the knowledge and theory covered. Overall, we are really pleased with the content and make-up of the course.”

- Richard Pegler, Operational Strategy Manager at Sainsbury’s Argos Ltd.

Interested in training for your teams?

Whether you're looking to train 5 people or 100 people, we have a variety of scalable training solutions to help you address a wide spectrum of training needs within the fields of Data Science, Artificial Intelligence, or Software Engineering.

Please contact us with your details and any known requirements. We'll then get in touch and guide you through every step of the way.