The Accelerate Programme for Scientific Discovery equips researchers with the skills to advance the frontiers of science through the application of AI. Dr Ema Bauzyte, a research associate in archaeology, recently benefited from one of its training programmes, delivered with Cambridge Spark.

Supported by a donation from Schmidt Futures, a philanthropic initiative founded by Eric and Wendy Schmidt, the Accelerate Programme provides young researchers with specialised training in AI techniques. Cambridge Spark are delighted to have worked with the Accelerate Programme on an initiative spearheaded by Professor Neil Lawrence at the University of Cambridge, equipping scientific researchers with the skills they need to use machine learning and AI to power their research.

The Accelerate-Cambridge Spark data science for science residency is open to scientific researchers at the University of Cambridge. Dr Ema Bauzyte, a research associate at the McDonald Institute for Archaeological Research, recently completed the residency and shares her experience below.

What motivated you to join the Accelerate-Spark course?

The field of archaeology is moving towards ML, and this programme provided an excellent opportunity to learn how to process datasets efficiently and automatically while reducing errors. The Programme also seemed like a great way to keep up with the literature. Last year, for example, there was a major article on strontium isotopes used for provenancing studies. Generally, baselines for these studies have to be established through painstaking and expensive empirical analysis of local plants and animals. However, modelling has increasingly been used to predict isotopic ratios. By pooling the available data and using random-forest regression, this paper predicted bioavailable strontium isotope ratios on a global scale. We can then use this information to trace the origins of animals and people in the archaeological record. Other applications I have found inspiring relate to the automated identification of structures in images, such as the use of Region-based Convolutional Neural Networks to identify archaeological features (barrows, fields) in aerial images.

What opportunities did you see to use data science in your work?

In my research, I work with chemical data with lots of variables, so reducing the number of dimensions with the t-SNE algorithm has proved incredibly helpful in identifying clusters in my data. More broadly, the Accelerate Programme and working with Cambridge Spark gave me the confidence and curiosity to learn how to use different algorithms and solve problems with ML.

How have you used the insights from the Accelerate-Spark course in your work?

I had actually completed my PhD a while ago, but I decided to return to it and applied the t-SNE algorithm to my research. I also discovered new methods, using hierarchical clustering on the seven archaeological sites I was working on. This approach has helped to answer questions relating to the connectivity of different areas and to visualise exactly which sites are related.
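As an illustration of the hierarchical clustering mentioned in this answer (not Dr Bauzyte's actual analysis or data), a minimal sketch in plain Python might look like the following. The site names, the measurements, the choice of Euclidean distance with average linkage, and the cut at three clusters are all assumptions made for the example.

```python
import math
from itertools import combinations

# Hypothetical, made-up mean compositions for seven sites (arbitrary units).
# These values are illustrative only and do not come from the research described.
SITES = {
    "site_A": [1.0, 0.2, 5.1],
    "site_B": [1.1, 0.3, 5.0],
    "site_C": [4.0, 2.1, 1.0],
    "site_D": [4.2, 2.0, 1.2],
    "site_E": [3.9, 2.2, 0.9],
    "site_F": [8.0, 0.1, 3.0],
    "site_G": [8.1, 0.2, 3.1],
}


def euclidean(a, b):
    """Straight-line distance between two measurement vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, data):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(data[i], data[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def agglomerate(data, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [frozenset([name]) for name in data]
    merges = []  # record of (left members, right members, linkage distance)
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], data),
        )
        merges.append((sorted(a), sorted(b), average_linkage(a, b, data)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters, merges


# Assumed cut at three clusters, chosen only for this toy example.
clusters, merges = agglomerate(SITES, n_clusters=3)
for members in sorted(sorted(c) for c in clusters):
    print(members)
```

With these made-up values, the sketch groups site_A with site_B, site_C with site_D and site_E, and site_F with site_G. The recorded merge distances are the information a dendrogram would display; in practice a library such as SciPy would normally be used instead of a hand-written loop.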

What did the course help you to do in your research, that you couldn’t have done previously?

Using PCA and t-SNE helped to unlock insights into clusters and relationships between the analysed objects that a human could not have seen unaided.

How do you see your work developing in future? Do you think you’ll continue to use data science and AI?

Cleaning the data with code has become part of my day-to-day work, so I use these skills all the time. I have moved away from Excel as a result! I am conducting a lot of analysis using t-SNE and hierarchical clustering, and I am now able to engage a lot more with the literature on ML and archaeology and see how its conclusions can be applied in my own research. I can see that these methods will change how landscape studies are conducted in archaeology. Archaeology has traditionally looked at small datasets from a single site. With ML, it is possible to mine legacy data and carry out more macro-scale analysis, enabling more global studies over time. I also believe that the application of ML to archaeology can shift the perception of the domain from being purely a humanities subject to a more interdisciplinary one, with ways of working with data that can lead to more innovation and could help make a stronger case for funding.
