GSK Data Analyst apprentices spent two days in a hackathon working on a product formulation stability dataset from the Consumer Health team. Using the Python-based tools and techniques they had learned in just a few months, they derived valuable insights that could lead to real business impact.

A key feature of our Level 4 Data Analyst Apprenticeship is a hackathon midway through the programme. Apprentices have two days to work in teams, conduct exploratory data analysis on a new dataset and present their findings. The hackathon is an opportunity for apprentices to showcase how far they have come: they start the apprenticeship with no programming background and are now expected to use Python to produce high-quality analysis and insights. As part of GSK’s Data Academy, which we deliver in partnership with GSK, one cohort of Level 4 Data Analyst apprentices recently completed their hackathon on a real business problem for GSK.

Real dataset, real business opportunity 

Apprentices were given a real dataset from the R&D team pertaining to one of GSK’s product lines, including where the products are manufactured, their active ingredients and the stability of those active ingredients over time. The R&D team had run laboratory experiments to measure how much the active ingredients in the product line decay over five timeframes, and the resulting data needed to be analysed in order to recommend optimal formulations that meet stability requirements.
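To give a flavour of the kind of check this involves, here is a minimal sketch in pandas. It is not the apprentices' actual analysis: the formulations, column names, readings and the 90% threshold are all invented for illustration, since the real dataset and stability requirements are not described here.

```python
import pandas as pd

# Hypothetical readings: percentage of active ingredient remaining at
# five timepoints (in months) for two made-up formulations.
readings = pd.DataFrame({
    "formulation": ["A"] * 5 + ["B"] * 5,
    "month": [0, 3, 6, 9, 12] * 2,
    "active_pct": [100.0, 99.0, 97.5, 96.0, 95.0,
                   100.0, 96.0, 92.0, 88.0, 85.0],
})

# Assumed stability requirement, chosen only for this example.
MIN_ACTIVE_PCT = 90.0

# For each formulation, find the lowest and the final reading, then flag
# whether every reading stayed at or above the assumed threshold.
summary = (
    readings.sort_values(["formulation", "month"])
    .groupby("formulation")["active_pct"]
    .agg(lowest="min", final="last")
    .assign(meets_requirement=lambda df: df["lowest"] >= MIN_ACTIVE_PCT)
)
print(summary)
```

In this made-up example, formulation A stays above the threshold at every timepoint while formulation B falls below it, so only A would be a candidate to recommend.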

Working in teams of three, the apprentices had two days to conduct their exploratory data analysis and produce a solution to this problem, along with recommendations and suggestions for further work. Each team presented their findings to senior leaders in the GSK Data Science team, including Wade Munsie (Chief Data Officer), Vidhu Dev (VP of Digital Transformation, R&D) and Emma Duckworth (Director of Data Science). A key output was a Jupyter Notebook that the R&D team could use to replicate the work done.

Faster, more effective data manipulation and visualisation with Python

At this point in the programme, our apprentices are proficient in Python programming, can analyse data with the Pandas library and are comfortable visualising data using Bokeh and Seaborn. They have also been introduced to working with databases in SQL, APIs and web scraping. Each team was given an initial list of questions to start their data analysis, but it was then up to each team to decide how to proceed.

Every team employed the full range of skills they had learnt in the apprenticeship to tackle the dataset. To cleanse and manipulate their datasets, the teams used Python with the Pandas and NumPy libraries.

“Normally I would have used a spreadsheet and would have put some filters on it or done some searches on the pivot tables, but with the use of the coding we learnt it is easy to manipulate the data and fine tune it a bit more and quicker with a large volume of data. When you're dealing with a large volume of data in Excel after a while it will tell you that the data is too large and you'd have to do it in pieces and then patch it together, but I think doing it this way we're able to manipulate it to get what we want to from it.”

Eva Kane, Data Analyst Apprentice at GSK
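As an illustration of the kind of cleansing Eva describes, the sketch below tidies a small, invented table with pandas. The column names, site labels and values are hypothetical and are not taken from the GSK dataset.

```python
import pandas as pd

# Hypothetical raw extract with inconsistent site labels and a
# non-numeric entry in the measurement column.
raw = pd.DataFrame({
    "site": [" Site A", "site a", "Site B", "Site B "],
    "active_pct": ["98.5", "n/a", "97.0", "96.2"],
})

clean = raw.copy()

# Normalise the site labels: trim whitespace and use consistent casing.
clean["site"] = clean["site"].str.strip().str.title()

# Convert measurements to numbers; entries that cannot be parsed become NaN.
clean["active_pct"] = pd.to_numeric(clean["active_pct"], errors="coerce")

# Drop the rows without a usable measurement and renumber the index.
clean = clean.dropna(subset=["active_pct"]).reset_index(drop=True)
print(clean)
```

The same few lines apply unchanged whether the table has four rows or several million, which is the advantage over filtering a spreadsheet by hand.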

GSK works with the Microsoft suite of tools, so many of the teams chose Power BI to create dashboards from their analysis; one team, for example, created an overview of the stability trends in the target data. Multiple teams also highlighted Seaborn and Matplotlib as useful tools for producing their visualisations.

“Our data visualisation module was also really useful when we were using matplotlib to create our plots.”

Lucy Meadows, Data Analyst Apprentice at GSK

For every team, a key factor was using their skills to produce reports and visualisations which allowed them to easily and clearly communicate their findings to a range of stakeholders.

From no experience, to competent Python Analysts in 4 months

Each team presented their findings to seven senior data leaders from across GSK. Not only was this an opportunity for the apprentices to showcase the impact they could have on a live business query, but also an opportunity for them to receive direct feedback and coaching from senior leaders on how to present their insights and recommendations.

“Using visualisations in your presentations to get the message through, that has been really really powerful in getting that story across”

Wade Munsie, CDO at GSK

The apprentices came from non-programming backgrounds and have had four months to learn Python programming and how to carry out impactful analysis. It is an amazing achievement for them to have completed the hackathon and received very positive feedback from the senior leaders at GSK.

“They did a really great job - great attempts and great confidence by all of the teams and unity within the teams when presenting the data. There are some learnings here of how some teams presented it differently from others and how each of us took that information and digested it. Thank you for all of the effort”

Vidhu Dev, VP of Digital Transformation, R&D

Actionable insights driving business benefit

By the end of the hackathon, the senior leaders were able to identify commercial considerations from the presentations. For instance, one analysis found that the ultimate cost of an ingredient varied between sites in the manufacturing process. This finding supports commercial discussions on possible cost savings, to be taken forward by another team.
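A comparison of this kind can be sketched in a few lines of pandas. The ingredients, sites and unit costs below are invented for illustration only; they do not reflect the figures the apprentices worked with.

```python
import pandas as pd

# Hypothetical unit costs for two made-up ingredients across sites.
costs = pd.DataFrame({
    "ingredient": ["X", "X", "X", "Y", "Y"],
    "site": ["Site A", "Site B", "Site C", "Site A", "Site B"],
    "unit_cost": [10.0, 12.5, 11.0, 4.0, 4.2],
})

# For each ingredient, compare the cheapest and most expensive site and
# rank ingredients by the size of that gap.
cost_spread = (
    costs.groupby("ingredient")["unit_cost"]
    .agg(cheapest="min", dearest="max")
    .assign(spread=lambda df: df["dearest"] - df["cheapest"])
    .sort_values("spread", ascending=False)
)
print(cost_spread)
```

Ingredients at the top of the resulting table show the largest cost difference between sites and are the natural starting point for a cost-saving conversation.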

Business impacts such as these should be a key outcome of investing in employees through a Data Analyst apprenticeship. As was concluded at the end, the hackathon was more than a part of the apprenticeship programme: it showed how workforce upskilling programmes can generate direct business impact.

“There will be follow up conversations so that we can continue exploring the value from this data. The hackathon was not just a useful exercise for this course, it was evidence in the ways agile teams can get to insights (and value) quickly”

Emma Duckworth, Director of Data Science at GSK

To find out how your organisation could benefit from apprenticeships in data science, analytics and AI, get in touch.