Python vs Excel for Data Analysis
Excel has been a firm favourite for working professionals for many years and for good reason. Its wide capabilities and ease of use have made it critical in all manner of business, education, finance and research.
Enter Python. This programming language has gained traction over recent years. One report states that the demand for Python skills, as a requisite in job postings, has increased by 27.6 percent in the last year and shows no signs of slowing down. Initially built as a way to write scripts that ‘automate the boring stuff’, Python has become a leader in web development, data analysis and infrastructure management.
Demand for Excel
Microsoft Excel skills are still in high demand. After 34 years in this fast-changing tech world, the spreadsheet software is still going strong. The seasoned data analysis tool is still used often in the financial sector to organise and present large amounts of data. Excel has been developed and updated recently, which means it boasts more user-friendly features and more effective functionalities for all businesses.
According to Microsoft, 1.2 billion people own Excel, of whom 800 million currently use it. By comparison, an estimated 8.2 million people use Python. The odds are that if someone you work with sends you a report, it will be in Excel, so it’s useful to know how to use it.
The limitations of Excel
However, consultants and IT experts have voiced concerns over how fragile the spreadsheet software can be. Excel's developers are working to overcome challenges such as:
- Data Volume: Companies, small and large, have most likely used Excel at some point in their development. However, as organisations continue to generate data, they find themselves dealing with an increasing number of spreadsheets, resulting in complex analytical issues.
- Syntax Errors: Excel is notorious for errors introduced when data is copied and pasted into specific cell ranges, and formulas typed in manually are easy to get wrong.
- Security Risks: Companies have to be cautious about the kind of information that is stored in Excel sheets, in case of misuse and cyber attacks. Excel's own security settings also need attention.
Python says, (“Hello, World!”).
First released in 1991, Python has become one of the most ubiquitous programming languages out there. Although Python and Excel technically have different functionalities, Python has developed a strong following as people have realised its capabilities and potential. It’s been deemed a better data analysis tool by many developers and the wider data science community.
While Python requires basic programming skills, it is increasingly seen as a prerequisite for many quantitative roles. Companies are looking to hire candidates with at least beginner-level proficiency in Python.
Its avid practitioners, known as Pythonistas, have uploaded 145,000 custom-built software packages to an online repository. These cover everything from game development, to astronomy and can be installed and inserted into a Python program in a matter of seconds. This versatility explains why the Central Intelligence Agency has used it for hacking, Google for crawling web pages, Pixar for producing movies and Spotify for recommending songs. Some of the most popular packages harness “machine learning”, by crunching large quantities of data to pick out patterns that would otherwise be imperceptible.
How popular is Python?
According to the ‘Popularity of Programming Language’ index, Python is the world’s most popular programming language. It has grown 11.4 percent in the last five years. With a popularity share of 28 percent, Python beats its closest competitor, Java, by 10 percentage points. Whilst these numbers might not be an accurate measure of value, consider that Uber, PayPal, Google, Facebook, Instagram, Netflix, Dropbox and Reddit all use Python in their development and testing. Moreover, Python is also used extensively in robotics and embedded systems.
In 2012, questions relating to Python accounted for less than 4 percent of all questions on Stack Overflow, the largest and most trusted online community for developers. According to Stack Overflow's latest Developer Survey, Python is now the third most popular language amongst the survey's 80,000 respondents.
Higher incomes for jobs with Python skills
Not only can learning Python increase your productivity, it can also grow your personal income. According to IT jobs website CWJobs, the average salary in the UK for jobs requiring Python skills is £67,500 compared to just £37,500 for jobs requiring Excel skills. Aside from growing your salary by learning Python, it’s also a great way to future-proof your career by keeping your skillset up-to-date and relevant.
Who can benefit from learning Python?
Python is such a versatile tool that it can be used for many applications across plenty of jobs. Some of the most interesting things that you can do with Python are:
- Automate the boring stuff: Updating spreadsheets, renaming files, gathering and formatting data, checking spelling, automating Excel reports using Python, fixing grammar mistakes and compiling reports. These are just a few examples.
- Build a Bitcoin notification service to see when might be a good time to purchase the highly talked about cryptocurrency. If Ethereum is more your thing, the code can be replicated for other currencies.
- Mine data from Twitter to build a sentiment analysis tool. This project would lead nicely into learning more about text processing and speech recognition.
- Build a Blockchain to use for almost any financial transaction.
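As a rough illustration of the "automate the boring stuff" idea in the list above, here is a minimal sketch of one such chore: renaming a batch of files. It uses only Python's standard library. The function name `add_prefix`, the example prefix and the `*.csv` pattern are assumptions made up for this example, not part of any particular tool.

```python
from pathlib import Path


def add_prefix(folder: Path, prefix: str, pattern: str = "*.csv") -> list[Path]:
    """Rename every file in `folder` matching `pattern` so it starts with `prefix`.

    Files that already carry the prefix are skipped, so the function can be
    run more than once without stacking prefixes. Returns the new paths.
    """
    renamed = []
    # sorted() builds the full list first, so renaming does not disturb the loop.
    for path in sorted(folder.glob(pattern)):
        if path.name.startswith(prefix):
            continue  # already renamed on an earlier run
        target = path.with_name(f"{prefix}{path.name}")
        path.rename(target)
        renamed.append(target)
    return renamed
```

Calling something like `add_prefix(Path("reports"), "2024-01_")` would rename `sales.csv` to `2024-01_sales.csv` and leave non-CSV files alone. It is worth trying a script like this on a copy of the folder first.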
Learn and apply Python skills at work with one of our government-funded apprenticeships. Find out more here.
Jobs that Python can benefit:
Account managers, accountants and anyone working with large datasets can benefit from learning and using Python. Programming knowledge will allow you to extract and manipulate data from multiple reports to then filter and detect any inconsistencies in the data on a very large scale, which would take you a long time if using Excel.
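To make the idea of checking several reports against each other more concrete, here is a small sketch using only the standard library's `csv` module. The report names, the column names `invoice_id` and `amount`, and the figures are all hypothetical, invented for this illustration. With real files you would more likely reach for a library such as Pandas.

```python
import csv
import io
from collections import defaultdict


def find_inconsistencies(reports, key, field):
    """Return the keys whose `field` value differs between reports.

    `reports` maps a report name to its CSV text. The result maps each
    conflicting key to the value found in each report.
    """
    seen = defaultdict(dict)  # key value -> {report name: field value}
    for name, text in reports.items():
        for row in csv.DictReader(io.StringIO(text)):
            seen[row[key]][name] = row[field]
    # Keep only the keys where the reports disagree.
    return {k: v for k, v in seen.items() if len(set(v.values())) > 1}


# Hypothetical sample data, invented for this illustration.
reports = {
    "finance": "invoice_id,amount\nA1,100.00\nA2,250.00\n",
    "sales": "invoice_id,amount\nA1,100.00\nA2,205.00\n",
}
print(find_inconsistencies(reports, key="invoice_id", field="amount"))
```

Note that this sketch compares values as text, so `100.00` and `100.0` would be flagged as different. A real check would probably convert amounts to numbers first.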
Data analysts can benefit from learning Python, as the majority of their work involves trawling through data, and Python can help automate that process, saving time and effort. As Zhivotov, Data Analyst at TransferWise, says: "with Python, you can do so much more because of its general-purpose. It gives you the freedom to build tools for yourself, and you can easily cover the entire pipeline of data analytics work from start to finish."
Good news if you work in marketing: Python can help you too. It can automate data collection (SEO indexation, email and SMS responses and trend information), automate SEO processes, monitor campaigns more effectively and run customised error checks automatically. The jobs you would generally go to Excel for, you can automate by writing simple Python code.
Journalism: Python is particularly relevant within journalism that uses data to tell stories. Those who know Python are in demand, as they can rapidly sort through information, making them much more efficient when it comes to writing to meet deadlines. Learn about Cambridge Spark's Data-Driven Journalism Programme.
What makes Python a better option than Excel?
There are many things that Excel can do. And it's a great tool for basic data analysis. But Python allows you to do more in terms of analysis. Here are a few reasons:
Python for data analysis
Python can handle much larger volumes of data than Excel, whose worksheets are capped at 1,048,576 rows, and can therefore support more extensive analysis. It is also a basic requirement for most data science teams. It can easily replace mundane tasks with automation. Python also offers greater efficiency and scalability. It's faster than Excel for data pipelines, automation and calculating complex equations and algorithms.
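One reason Python copes well with large files is that it can read them a row at a time instead of loading everything into a grid at once. The sketch below shows that pattern with the standard library's `csv` module. The function name `total_by_category` and the `region` and `revenue` columns are assumptions made up for this example.

```python
import csv
import io


def total_by_category(lines, category_field, value_field):
    """Sum `value_field` per `category_field`, reading one row at a time."""
    totals = {}
    # DictReader pulls rows lazily, so memory use stays small
    # however many rows the input contains.
    for row in csv.DictReader(lines):
        category = row[category_field]
        totals[category] = totals.get(category, 0.0) + float(row[value_field])
    return totals


# Hypothetical sample; in practice `lines` would be an open file object.
sample = io.StringIO("region,revenue\nNorth,10.5\nSouth,4.0\nNorth,2.5\n")
print(total_by_category(sample, "region", "revenue"))
```

Passing an open file (for example `open("sales.csv", newline="")`, where the file name is again hypothetical) in place of `sample` would apply the same logic to a file of any length.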
Python is free!
Although most programming languages cost nothing to use, Python is free in another sense: it’s open-source. This means the code can be inspected and modified by anyone. Python is a progressive language that is constantly being developed, collaboratively, by a group of volunteers. Microsoft Excel, by contrast, costs around £150 to download for one license. The cost for businesses (dependent on the number of employees) could be in the thousands, and Excel is developed solely by Microsoft employees.
Leveraging the latest research
Excel has a large user base that offers a wide variety of tips and tricks in an open forum. But the Python community does the same and more. With its strong ethos of collaboration, academics and data scientists often publish and share their code. This means that the latest techniques developed in Python are available for free to the community.
A Python library is a collection of functions and methods that allows you to perform many actions without writing your code from scratch. This makes a data analyst's work more efficient because they don't have to spend time rewriting code that already exists. Instead, they can just import a library. Different libraries have different functionalities. For example, TensorFlow (developed by Google) is used for machine learning projects, and scikit-learn is a library of machine learning tools often used for modelling complex datasets.
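As a small illustration of importing a library rather than writing code from scratch, the sketch below uses `statistics`, a module that ships with Python. The sales figures are hypothetical, invented for this example.

```python
import statistics

# Hypothetical monthly sales figures, invented for this illustration.
monthly_sales = [120, 135, 128, 160, 142, 155]

# Each of these would take several lines to write by hand.
mean_sales = statistics.mean(monthly_sales)
median_sales = statistics.median(monthly_sales)
spread = statistics.stdev(monthly_sales)  # sample standard deviation

print(mean_sales, median_sales, round(spread, 2))
```

Third-party libraries such as TensorFlow are used the same way once installed: one `import` line gives access to the functions they provide.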
Python is referred to as a ‘glue’ language, which means that it's particularly useful for connecting different scripts together and interacting with different systems, including different forms of databases (e.g. SQL and NoSQL databases), data formats (JSON, Parquet, etc.) and web services. The Python community also contributes to many packages that allow you to interact with a range of public APIs. This is often useful for data scientists given their need to read data from different places and process it.
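The "glue" role can be sketched in a few lines: read data in one format, load it into a database and query it back out. The example below uses the standard library's `json` and `sqlite3` modules. The payload, the `sales` table and its columns are hypothetical, standing in for whatever a real web service or database would provide.

```python
import json
import sqlite3

# Hypothetical JSON payload, standing in for a response from a web service.
payload = (
    '[{"city": "London", "orders": 42},'
    ' {"city": "Leeds", "orders": 17},'
    ' {"city": "London", "orders": 8}]'
)
records = json.loads(payload)  # JSON text -> list of Python dicts

# An in-memory SQLite database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, orders INTEGER)")
conn.executemany("INSERT INTO sales (city, orders) VALUES (:city, :orders)", records)

# Use SQL to aggregate what arrived as JSON.
totals = conn.execute(
    "SELECT city, SUM(orders) FROM sales GROUP BY city ORDER BY city"
).fetchall()
conn.close()

print(totals)
```

Swapping the in-memory database for a file path, or the hard-coded payload for a real API response, would not change the overall shape of the script.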
Deep learning and machine learning
Python is the de facto language of machine learning. Researchers and academics widely use Python for deep learning, building predictive and simulation models that find new insights in their data. Most notably, Google’s TensorFlow works mostly with Python.
Python is widely supported
Python is backed by a large community of developers (8.2 million), and therefore, has a strong support system. There are many tutorials covering Python concepts all over the web. Even Python programming experts can find guidance, if necessary, when working on complex problems. As mentioned earlier, Excel does have many more people using the software, and it's well supported online in tutorials and guides.
Python is not only supported online but offline, too, at conferences, meetups, hackathons and events around the world. For example, PyCon is an international set of conferences, held at multiple locations around the world. Organisers aim to unite developers and data science enthusiasts to discuss and promote the Python programming language.
Another leading conference is PyData, which focuses on the community of users and developers of data analysis tools to share and learn together with chapters in cities all over the world.
In the business world, Python proficiency has been on the rise. Cambridge Spark CEO Dr Raoul-Gabriel Urma participated in an interview with eFinancialCareers about the future for traders if they don’t learn Python. He said: "If you want to get an edge today, you need to create new strategies with Python. This is why all the traders at trading companies are increasing their Python proficiency."
Excel vs Python: Who wins?
The evidence suggests that both Excel and Python have their place with certain applications. Excel is a great entry-level tool and is a quick-and-easy way to analyse a dataset.
But for the modern era, with large datasets and more complex analytics and automation, Python provides the tools, techniques and processing power that Excel, in many instances, lacks. After all, Python is more powerful, faster, capable of better data analysis and it benefits from a more inclusive, collaborative support system.
Python is a must-have skill for data analysts, and now is the time to learn. According to Zhivotov: “You can be a good data analyst without knowing Python. But if you want to stand out above the rest, be a star data analyst and progress, then you need to learn Python.”
If you want to reap the benefits, such as a higher salary, better career opportunities and keeping your skills relevant for the fourth industrial revolution, then learn Python.
Learn Python with Cambridge Spark
At Cambridge Spark, we offer a Level 4 Data Analyst Apprenticeship. If you're working full time, you could join the L4 apprenticeship where you'll learn:
- Advanced Python programming
- Data analysis with Numpy and Pandas
- Processing big data
- Building and implementing machine learning models; and
- Working with different types of databases, such as SQL
You'll learn advanced data analytics, whilst remaining in full-time work. And England-based apprentices are eligible to have their course fully paid for by the UK Apprenticeship Levy.
Don't have the time to commit to a long-term apprenticeship?
We also offer commercial training courses including our Data Analysis Foundations Certificate where you'll learn how to build data analytics capabilities using Python and Pandas over 8 half-days of live interactive expert-led workshops.
Contact us to learn more about Cambridge Spark's apprenticeships and courses.