The benefits of 'learning by doing' in data science training

Many companies are choosing to build their data science capabilities from within, by upskilling their existing staff. But with so many data science training programmes available, how do you choose the right one for your team?

When choosing a data science training programme, it's important to consider how much emphasis is placed on enabling learners to develop their skills in a real-world context.

As with learning any new skill, practice is paramount to success. And it’s especially important when learning complex technical skills such as coding, programming and machine learning.

In this blog post, we take a closer look at some of the theory behind the concept of 'learning by doing' and why it should be an essential component of any data science training programme.

'Learning by doing'

The concept of learning by doing goes all the way back to ancient times. As Aristotle put it as early as 350 BC:

“For the things we have to learn before we can do them, we learn by doing them.”    

Many different terms fall under this broad heading, such as experiential learning and adventure learning. It’s also the concept on which apprenticeships were built - they are one particular approach to learning by doing.

The concept of learning by doing is used across many subjects and levels of education. But it is particularly powerful for learning complex, technical skills such as those required in the field of data science.

But why is it so powerful?

According to Barbara Seels, author of “Instructional Technology: The Definition and Domains of the Field”:

“Practice is the most important ingredient of effective instruction; it speeds up learning, aids long-term retention, and facilitates recall. Instruction is less effective when there is no opportunity to perform the task or when practice is delayed.”

The more immediate the opportunity for practice, the more likely it is that people will get to grips with the new skills. Plus, as an active process, it makes learning more engaging for students. It also helps boost students' confidence when they come to apply the new skills back in the workplace - because they'll have hands-on experience of using them.

Real-world scenarios

As we’ve seen, practice is essential. But its value is limited if students aren’t able to practise the new skills in the context of their own work.

Of course, real-world experiential learning is often too impractical or expensive to arrange. This is where online learning comes in, as it can be used to simulate real conditions. Some data science training providers also use technologies such as gamification and AI to support students on their learning journey.

Ultimately, the more opportunities learners are given to practise their code in a real-world context, the better equipped they will be to apply it within your business.

The value of feedback

Feedback is a crucial part of the learning process. If students are practising their code but not getting it right, or doing things in a less effective way than they could be, then feedback helps them to get back on track and improve their performance. Feedback also helps motivate students, giving them a boost of confidence when they get things right, as well as guidance on how they can improve.

What this means for choosing a data science training programme

Given the importance of real-world practice and feedback, a traditional classroom approach alone isn’t enough when training people in data science.

Your employees may take part in a series of lectures or workshops to learn about data science - but without the facilities to practise, they may recall the basics yet not know how to apply those skills in the workplace. In fact, research shows that people will remember just 20% of what they learn in a classroom one week later.

This is why many data science training programmes take a blended learning approach: a combination of classroom training, online training modules and practical exercises. This is typically accompanied by feedback from trainers and mentors, who can provide guidance on where learners are getting things right and where they can improve.

So which tools are data science training providers currently using to give students that vital real-world practice?

Jupyter Notebooks: from practical exercises to the real world

Jupyter Notebooks (previously known as IPython notebooks) are among the most popular tools used by data scientists to create code and visualise data. They are used for data cleaning, statistical modelling, building and implementing machine learning models, and much more.

Some training courses have students complete weekly exercises in Jupyter Notebooks throughout the programme, which makes it easier for them to apply their new skills and knowledge back in the workplace. The notebooks act as a workbook that they can refer back to throughout their training - and beyond.
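
To give a flavour of what such an exercise might look like, here is a minimal sketch of a notebook-style data cleaning cell. It is purely illustrative: the data, the variable names and the clean_sales function are all invented for this example and are not taken from any particular course.

```python
from statistics import mean

# Hypothetical raw data for the exercise: monthly sales figures stored as
# text, with some missing or malformed entries (invented for illustration).
raw_sales = ["1200", " 950 ", "", "n/a", "1,100", "1300"]


def clean_sales(values):
    """Convert raw text entries to floats, skipping anything unparseable."""
    cleaned = []
    for value in values:
        # Remove surrounding whitespace and thousands separators.
        text = value.strip().replace(",", "")
        try:
            cleaned.append(float(text))
        except ValueError:
            # Empty strings and placeholders such as "n/a" are dropped.
            continue
    return cleaned


sales = clean_sales(raw_sales)
average_sales = mean(sales)
print(f"{len(sales)} valid entries, average {average_sales:.2f}")
```

A learner could rerun a cell like this with data from their own team, which is what makes the notebook useful as a reference after the course.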

EDUKATE.AI®: accelerating learning with instant feedback

EDUKATE.AI® is an AI-powered learning experience platform for data science that allows students to practise code in a real-world context and receive personalised feedback and learning recommendations.

At Cambridge Spark, EDUKATE.AI® plays an instrumental role in all our data science and AI training programmes. Learners have access to the platform throughout the training programme so they can practise their code through real-world projects.

How EDUKATE.AI® works:

  • Learn - students begin by taking a module and working through Jupyter Notebooks, interactive content and videos
  • Choose a project - students begin a real-world project covering specific techniques
  • Write code - students write code where they feel comfortable and submit via Git, our Web IDE or an integrated Jupyter Notebook
  • Submit code - students submit their code to KATE, the smart code engine behind EDUKATE.AI®
  • Receive feedback - students receive instant personalised feedback: code quality, code performance, correctness and further reading materials
  • Iterate and improve - students learn by doing and build a portfolio of real-world projects
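
As a rough illustration of the submit-and-receive-feedback loop described above, the sketch below runs a learner's function against a few checks and reports what went wrong. It is an assumption-laden toy, not a description of how KATE works: the give_feedback function, the example submission and the checks are all hypothetical, and it covers correctness and timing only.

```python
import time


def give_feedback(submission, test_cases):
    """Run a submitted function against (args, expected) pairs.

    Returns the number of passed checks and a list of feedback messages.
    Hypothetical helper for illustration only.
    """
    feedback = []
    passed = 0
    start = time.perf_counter()
    for args, expected in test_cases:
        try:
            result = submission(*args)
        except Exception as error:
            # Report crashes as feedback instead of stopping the run.
            feedback.append(f"Input {args}: raised {type(error).__name__}: {error}")
            continue
        if result == expected:
            passed += 1
        else:
            feedback.append(f"Input {args}: expected {expected}, got {result}")
    elapsed = time.perf_counter() - start
    feedback.append(f"{passed}/{len(test_cases)} checks passed in {elapsed:.4f}s")
    return passed, feedback


# A hypothetical learner submission with a bug: it fails on an empty list.
def average(values):
    return sum(values) / len(values)


checks = [(([1, 2, 3],), 2.0), (([10],), 10.0), (([],), 0.0)]
passed, messages = give_feedback(average, checks)
for message in messages:
    print(message)
```

Here the learner would see that two checks pass and that the empty-list case raises an error, which points them to the exact case to fix before resubmitting.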

How EDUKATE.AI® accelerates learning:

  • Real-life applications - projects apply skills to real-world situations, simulating a data science environment in a range of sectors from finance to media
  • Instant feedback - submit code and KATE provides feedback instantly, helping learners understand how to develop their skills and improve the quality of their code
  • Fully personalised - KATE offers personalised exercises and reading recommendations based on the code students submit, allowing them to learn more effectively
  • Adaptive learning - as students submit their work, KATE gets to know their learning needs and can identify key skills for them to practise and develop


It’s simply impossible to successfully equip your employees with the data science skills that will benefit your business if they don’t have the means to practise those new skills over and over - and in the context of their role within your business.

Completing practical exercises in Jupyter Notebooks not only gives learners the opportunity to develop and hone their skillset, but also sets them up for applying those skills to their work back in the office. Similarly, EDUKATE.AI® accelerates learning by creating a real-world data science environment where students can practise their code in the context of their work and get instant feedback to allow constant development and improvement.
