Interested in joining our next programme?

Enrolment deadline

23rd August 2024

One data apprentice can create real business impact

£1.4m revenue

identified through data-driven insights

£120,000 saved

by creating efficiencies

90% shorter project times

achieved through automations

5x faster ML model training

achieved through automations

Build capability to create and maintain key data infrastructure

Want to train new talent and reskill existing employees with one of the most in-demand technical skillsets? Develop key internal capabilities to raise the usability of critical datasets in your organisation. Cambridge Spark's Level 5 Data Engineer Apprenticeship equips learners with core technical and leadership skills.

In turn, learners are able to support business functions in creating and maintaining data analytics pipelines. They build the skillset to access data in their organisation and gain an understanding of the data engineering lifecycle, data modelling and more to help organisations maximise the value of their data.

Learners will also have the opportunity to join guest talks on technical updates from leading technology providers such as Google Cloud Platform and Databricks.

Hear from Jonathan Wagstaff, Group Head of Business Intelligence at Exertis

Jonathan Wagstaff
Data apprenticeships enable myself and my team to keep up-to-date with the latest.


Suitability of role

  • Looking to develop skills in Python, SQL, data modelling approaches, software testing, Git and CI/CD, along with a DevOps mindset
  • Pursuing a junior data engineering role

Eligibility for funding

  • No prior equivalent data training or related experience
  • Employed in England and resident in the UK or EEA for the last 3 years
  • Working at least 30 hours a week (part-time employees can be considered for a minimum cohort size)
  • Able to commit to the minimum 6 hours a week of off-the-job learning for the duration of the programme (14 months of training)

What makes our programme special

We deliver all of our programmes online, helping our clients offer flexible and inclusive programmes open to all of their staff. EDUKATE.AI, our online learning platform, gives learners a sandbox environment to practise their skills and provides immediate feedback on industry-simulated assignments. We believe that the gold standard for online delivery is a mix of experiential learning, coaching, technical mentorship and peer support.

Real-World Practice for Accelerated Impact

EDUKATE.AI provides a sandbox environment where learners can practise new skills on real assignments. Learners can apply what they've learned immediately, which accelerates the impact they make in their workplace.


Our online learning platform offers apprentices a seamless learning experience with in-browser access to their slides, workshop recordings, quizzes and practical assignments. Immediate feedback enables apprentices to gauge their progress effectively.

Expert Curriculum

Our curriculum develops the skills to thrive in a data-driven organisation. The programme teaches the latest concepts and tools essential to build and manage critical data infrastructure.

Personalised Learner Support

We provide each learner with a dedicated Data Mentor and Learner Success Coach to support their technical and personal development. This personalised support structure helps learners to succeed and overcome the obstacles they encounter.

Flexible Fully Online Learning

Our programme is fully online, providing maximum flexibility for learners and employers alike. Learners can access their content from anywhere, and EDUKATE.AI requires no setup or installation.


Joining our programme means becoming part of a thriving community of thousands of data professionals. Learners have the opportunity to tap into this rich network of peers and alumni and benefit from the expertise and experience of others in the field.

A real-world learning experience

EDUKATE.AI is our learning experience platform. It delivers a seamless experience in one place and accelerates learning and impact through real practice on real projects, with immediate personalised feedback on code.

The Curriculum

Our curriculum is developed by our faculty, composed of data scientists in leading industry positions and academics from some of the top universities in the world. It is continuously updated and refined to incorporate the latest skills.

We take a modular approach to how we offer our curriculum. The full Level 5 Data Engineer Apprenticeship includes all of the below modules. We also offer curated shorter tracks and can offer a fully tailored pathway based on a skills gap analysis.

Core Modules

Understand Python syntax and data structures, and gain familiarity with programming in Python and with data processing and cleaning in Pandas. Understand version control with Git, from command-line basics to handling conflicts, merge requests and code reviews. Get hands-on experience with software testing using unit tests in Python and the pytest library.

Learn more about what is meant by data engineering and how it is used in organisations.

Gain insights into the diverse roles that interact with data engineers and understand their collaborative interactions.

Learn the fundamentals of SQL, from connecting to SQLite databases and performing basic queries to advanced topics like subqueries, joins, and optimising queries with indexes.

Explore NoSQL databases, understand their pros and cons, and work with real-world examples, gaining practical experience with tools like DBeaver, SQLAlchemy, and BigQuery to connect and manipulate data in diverse SQL environments.

This module covers the reasons why data modelling is important and the various techniques that can be used to model your data efficiently.

Explore the Software Development Lifecycle and Continuous Integration/Continuous Deployment processes, gaining an understanding of containerisation with an introduction to Docker.

Gain an understanding of deploying container-based applications using Kubernetes and learn Infrastructure as Code (IaC) principles, implementing them with Terraform for efficient infrastructure management.

This module offers a comprehensive exploration of data quality, encompassing aspects such as accuracy, completeness, consistency, and timeliness.

It also addresses critical topics in data governance, including compliance with privacy and security regulations, ethical considerations, and the implementation of best practices to ensure data quality and ethical data handling while minimising environmental impact.

This module introduces the essential concepts of data pipelines and workflow orchestration, followed by hands-on experience in building, monitoring, and scaling data pipelines using Python and tools like Airflow and Luigi.

It also covers configuring data access, managing permissions, incident management, and optimisation techniques to ensure efficient and reliable data processing within pipelines.

Learn how to analyse user and business requirements for data products, design scalable and secure solutions, and effectively document your technical processes.

Explore the lifecycle of data product implementation, covering prototyping and implementation using Python, rigorous testing and debugging processes, and various approaches to deploying data products effectively in real-world scenarios.

Understand real-time data streaming and advanced integration techniques, learning best practices for data security and access control.

Explore strategies for optimising performance and scalability in data engineering within a cloud computing environment while considering vendor-agnostic principles and evaluating various data storage and computing options.

Explore the latest trends and emerging technologies in data engineering, focusing on optimising data products and leveraging advancements in data science.

Learn strategies for ensuring business continuity through robust data provision, while emphasising the importance of continuous improvement to stay abreast of rapid technological developments.


What delivery options do you offer?

We tailor our delivery to your workforce needs. Options range from independent, immersive e-learning supported by EDUKATE.AI, through tailored bootcamps, to our structured apprenticeship programmes. The Level 5 Data Engineer Apprenticeship is available to learners based in England.

Are you able to tailor the programme to the organisation and sector?

Yes. We work with our clients to contextualise our programmes to their organisation and the sectors they operate in. We do this through tailored hackathons, bespoke assignments and guest lectures from industry experts. We also work with a range of partners to create bespoke programmes for specific sectors, such as health and journalism.

What is an apprenticeship?

An apprenticeship is a long-term training commitment that supports people entering the workforce and upskills existing UK-based employees within an organisation, enabling employers to build a highly skilled and highly engaged workforce.

The Cambridge Spark Data Engineer Apprenticeship runs for 14 months, followed by a 3-month end-point assessment. It includes a minimum of 6 hours per week of off-the-job training, blending theory with practical learning.

What is the Apprenticeship Levy?

The UK government introduced the Apprenticeship Levy scheme in April 2017 as a way to drive investment in strengthening the country’s skills base.

All organisations with annual staff costs of over £3m have to pay 0.5% of their salary bill into a ring-fenced apprenticeship levy pot. The money is collected monthly via PAYE and can only be used for training on approved apprenticeship schemes (such as the Level 5 Data Engineer Apprenticeship that we offer). Organisations must forfeit any levy funding left unspent for 24 months or more.
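As a rough illustration of the arithmetic above, the sketch below estimates an annual levy bill in Python. The 0.5% rate comes from the text; the £15,000 annual allowance is an assumption that is not stated on this page (it would explain why only pay bills above £3m attract a charge) and should be checked against current HMRC guidance. The function name and the example pay bills are hypothetical.

```python
from decimal import Decimal

LEVY_RATE = Decimal("0.005")         # 0.5% of the annual pay bill (from the text above)
ANNUAL_ALLOWANCE = Decimal("15000")  # ASSUMPTION: annual levy allowance, not stated on this page


def estimate_annual_levy(pay_bill: Decimal) -> Decimal:
    """Estimate the annual levy for a given pay bill; the result is never negative."""
    return max(LEVY_RATE * pay_bill - ANNUAL_ALLOWANCE, Decimal("0"))


# Hypothetical pay bills, for illustration only.
for pay_bill in (Decimal("3000000"), Decimal("5000000")):
    levy = estimate_annual_levy(pay_bill)
    print(f"Pay bill £{pay_bill:,.0f}: estimated levy £{levy:,.2f}")
```

Under these assumptions, a pay bill of exactly £3m produces no levy, which matches the threshold described above.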

What if my organisation doesn't pay into the UK Apprenticeship Levy?

An organisation that doesn't pay into the levy can still qualify for government-funded apprenticeships for its staff. The UK government will fund 95% of the cost of the apprenticeship programme, leaving the organisation to invest the remaining 5%, provided that learners meet the other eligibility criteria.
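To make the 95%/5% split concrete, here is a minimal Python sketch. The programme cost used is purely hypothetical and is not a quoted price; the function name is also illustrative.

```python
from decimal import Decimal

GOVERNMENT_SHARE = Decimal("0.95")  # 95% government contribution (from the text above)


def split_programme_cost(programme_cost: Decimal) -> tuple[Decimal, Decimal]:
    """Return (government contribution, employer contribution) for a programme cost."""
    government = programme_cost * GOVERNMENT_SHARE
    employer = programme_cost - government
    return government, employer


# HYPOTHETICAL programme cost of £10,000, for illustration only.
government, employer = split_programme_cost(Decimal("10000"))
print(f"Government pays £{government:,.2f}; employer pays £{employer:,.2f}")
```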

What does "off-the-job training" mean?

Off-the-job training is defined as learning undertaken outside of the day-to-day work duties and during the apprentice’s normal working hours.

Our off-the-job training is delivered on a flexible basis and can be carried out at the apprentice’s place of work or home.

The minimum of 6 hours per week of off-the-job training gives learners the time to focus on and develop the skills, knowledge and behaviours required to complete the programme.

How much do managers need to be involved?

Managers will need to ensure apprentices achieve their planned off-the-job training hours and work on their project portfolio.

We also encourage managers to hold regular one-to-one meetings with apprentices to review their progress, and to join the apprentice and their coach for 30 minutes every 3-4 months for a general catch-up about the programme.

Enquire now

Fill out the following form and we’ll contact you within one business day to discuss and answer any questions you have about the programme. We look forward to speaking with you.