Analysis: What, How and Why
Data analysis is about breathing life into data. The data has been collected, tidied up, and structured. But what does it mean? This is where we put two and two together to extract insight and get answers to our questions.
What trends and patterns become clear to us when we compare income sources and budgets from the past few years? And how can historical weather data and sound recordings of glaciers give us answers about the climate crisis?
An analysis is essentially a good explanation. It provides the basis for a concrete assessment of what has happened, why it happened, what one should have done differently, and even what might happen in the future. What did we do wrong and why did it turn out that way? What steps should we take going forward to prevent this from happening again?
However complex the preparation has been, the conclusion of the analysis should be clear and concise, even for people who don’t know the difference between rows and columns or entities and attributes. The whole thing should result in clear summaries. Here we can get help from visualisations and so-called “dashboards”, which we will look at more closely later.
How is data analysed?
In analysis, we break down a problem into its elements, study each systematically, and then piece together what the sum of the parts reveals. From there, we typically search for patterns and correlations in the data.
As with so much we have talked about in the Data Journey, this is not one single process or operation. There are a variety of techniques, tools, and procedures. Therefore, how one analyses data depends on the data being analysed, and for what purpose.
When we collect and structure data, it is often to prepare an analysis for a specific purpose. As you remember: we work “backwards” from the goal to figure out what data we need and what we should do with it. And we work iteratively, as well, so that we’re able to return to the finish line.
Which methods and tools we have available also depends on the context. Looking at customer behaviour on a website with a user-friendly tool that does the heavy lifting for you is very different from analysing decades of data on a glacier, where you need to structure the data yourself with the help of statistics and algorithms.
We’ll soon delve deeper into topics like statistics, machine learning, and data mining, which would be useful in our glacier example. But first, let’s shift our perspective. Instead of focusing on how we analyse, let’s think about why we do it and the insights we aim to gain.
What can we find out with the analysis?
What, how and why: This sounds like the start of a quiz book! But trust us, once this chapter is over, you will be left with more answers than questions. These are not random interrogative words, but the recipe for a very specific approach to finding solutions to a problem.
Let us analyse a hypothetical game of chess. Magnus Carlsen has done the impossible. The world’s chess experts are scratching their heads, and even the artificially intelligent chess robots failed to keep up. The game was so revolutionary that the basic principles of chess need to be reconsidered, and everyone is asking the same question: How did he do it?
First, we need to see what exactly happened. We’ll break it down: Which opening moves did he use, and how did the pawn structure evolve in relation to the opponent’s attack? Step by step, we jot down all the moves, from the first opening, through castling and queen sacrifice to a crushing checkmate. This is a descriptive analysis.
Fact
Descriptive analysis
Descriptive analysis is about collecting, examining and analysing data to provide a picture of something that has already happened. A descriptive analysis is often what is presented in a report, where we summarise actual conditions to provide a basis for further analysis.
How were the finances in the last quarter? How many people commuted from Gjøvik to Lillehammer last year? Did the viewing figures for Maskorama increase from 2021 to 2022? These are all examples of questions that descriptive analysis answers.
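A descriptive question like the one about viewing figures can be answered with a handful of summary numbers. The sketch below is a minimal illustration in Python. The figures are invented for the example and are not real viewing data, and the names `describe`, `viewers_2021` and `viewers_2022` are our own.

```python
from statistics import mean

# Hypothetical weekly viewing figures in thousands (invented for illustration).
viewers_2021 = [820, 790, 805, 840]
viewers_2022 = [760, 745, 770, 730]


def describe(values):
    """Summarise a list of numbers: what happened, not why."""
    return {
        "total": sum(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


summary_2021 = describe(viewers_2021)
summary_2022 = describe(viewers_2022)

# Percentage change in average viewers from one year to the next.
change_pct = (
    (summary_2022["mean"] - summary_2021["mean"]) / summary_2021["mean"] * 100
)

print("2021:", summary_2021)
print("2022:", summary_2022)
print(f"Change in average viewers: {change_pct:.1f}%")
```

Note that the summary only tells us that the average went down. It says nothing about the reason, which is where the next type of analysis comes in.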
Now that we know the details of what happened in the chess game, we can look more at why it unfolded as it did. Has Carlsen played like this in past games? Did he spot a weak point in his opponent, leading to that unique opening? Can we use other sources to help explain the unpredictable queen sacrifice in the midgame?
This type of analysis, where we create an overall picture and identify causes, is called diagnostic analysis. For instance, why did Carlsen sacrifice the queen? Our study suggests it allowed two knights and a bishop to better defend and attack the opposing king.
Fact
Diagnostic analysis
So let’s say that the premiere of The Masked Singer lost a bunch of viewers in 2022, compared to the year before. Why? Where the descriptive analysis showed what happened—the viewing figures went down—the diagnostic analysis is about finding the underlying causes of this. We aim to uncover problems, understand trends and prepare for informed, data-driven decisions.
To get a complete understanding, we need to look beyond just the immediate data, like the show’s viewing figures. Other factors can come into play. For example, if another popular show like Strictly Come Dancing aired at the same time on a different channel, that could be a reason for the drop.
It’s important to know whether things merely happen at the same time (correlation) or whether one thing causes another (causality). For instance, data from 2020 might show that high TV viewership happened alongside a spike in hospital visits. It might be tempting to say that one caused the other, but that would be wrong: the two are correlated, while the pandemic was the common cause that influenced both.
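The difference between correlation and causality can be made concrete with a small simulation. The sketch below assumes a made-up "pandemic intensity" that drives both TV hours and hospital visits. All numbers are simulated, and the variable names are hypothetical. It first measures the correlation between TV and hospital visits, and then the partial correlation once the shared cause is accounted for.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


random.seed(0)  # fixed seed so the simulation is repeatable

# Simulated common cause: pandemic intensity between 0 and 1 (an assumption).
pandemic = [random.uniform(0, 1) for _ in range(500)]
# Both quantities depend on the pandemic, but not on each other.
tv_hours = [2 + 3 * p + random.gauss(0, 0.3) for p in pandemic]
hospital_visits = [100 + 80 * p + random.gauss(0, 8) for p in pandemic]

r_tv_hospital = pearson(tv_hours, hospital_visits)
r_tv_pandemic = pearson(tv_hours, pandemic)
r_hospital_pandemic = pearson(hospital_visits, pandemic)

# Partial correlation: the link between TV and hospital visits that remains
# after the linear effect of the pandemic has been removed from both.
partial = (r_tv_hospital - r_tv_pandemic * r_hospital_pandemic) / sqrt(
    (1 - r_tv_pandemic ** 2) * (1 - r_hospital_pandemic ** 2)
)

print(f"Correlation TV vs hospital visits: {r_tv_hospital:.2f}")
print(f"Same, controlling for the pandemic: {partial:.2f}")
```

In this simulation the raw correlation is strong, while the partial correlation ends up close to zero. That is what we would expect when a third factor explains both trends.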
By identifying real causes, like why sales drop on some days or why a species is vanishing, we can make smarter decisions going forward.
Awesome move! That queen sacrifice was brilliant, and there were many other great moves in the game to learn from. But can we use this game to learn something about the future?
In fact, yes we can. By comparing the game with thousands of other games, we can—with a high degree of precision—predict how different positions can develop, and with the help of analysis tools predict the opponent’s best move.
This method of using data to anticipate future events is known as predictive analysis.
Fact
Predictive analysis
In the past, only fortune tellers and sci-fi could really claim to speculate about the future. But with new technology, we can actually, with a high degree of precision and accuracy, see glimpses of what can happen next week, next month and even several years into the future.
Who will win tomorrow’s football match? When should the components of the wind turbine be replaced? Will the interest rate rise? Making predictions and probability calculations for future events is called predictive analysis.
While we can’t be 100% certain about the future, advanced statistics and machine learning help us make sense of loads of data. This isn’t random guessing. Since actions and events have patterns (like A leading to B), predictive analysis is way more reliable than just making wild guesses.
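To show the idea of "A leading to B" in its simplest form, here is a minimal sketch of a prediction based on a straight-line trend. It assumes hypothetical wear measurements for a wind turbine component and a made-up replacement limit. Real predictive models are usually far more elaborate, so treat this as an illustration only.

```python
# Hypothetical monthly wear measurements in millimetres (invented numbers).
months = [1, 2, 3, 4, 5, 6]
wear_mm = [0.5, 1.1, 1.4, 2.1, 2.4, 3.0]

# Assumed limit at which the component should be replaced.
REPLACEMENT_LIMIT_MM = 5.0


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(months, wear_mm)

# Extend the trend into the future: expected wear in month 9 ...
predicted_wear_month_9 = slope * 9 + intercept
# ... and the month in which the trend line reaches the limit.
month_at_limit = (REPLACEMENT_LIMIT_MM - intercept) / slope

print(f"Estimated wear per month: {slope:.2f} mm")
print(f"Predicted wear in month 9: {predicted_wear_month_9:.2f} mm")
print(f"Trend reaches the limit around month {month_at_limit:.1f}")
```

The prediction is only as good as the assumption behind it, here that wear keeps growing at the same rate. That is why predictions come with uncertainty rather than guarantees.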
Now we know what happened, why Carlsen chose his moves, and we have some sensible assumptions about how these entirely new strategies can influence future situations.
The world’s chess experts have analysed the game up and down. They’ve looked at every detail and the overall strategy. The question now is as follows: What should we do with all this new information?
Well, we can use this information to make recommendations on how chess should be played. In fact, this is how some chess programs work: they learn from previous games to come up with the mathematically best moves.
Using data to find out what we should do about what is happening is called prescriptive analysis.
Fact
Prescriptive analysis
So you have analysed what has happened, why it happened, and even what can happen in the future. What are you going to do with all this information?
Prescriptive analysis is about taking the insights from descriptive, diagnostic and predictive analysis and turning them into action plans, in order to give recommendations for specific situations. This is done by combining data with mathematical models and algorithms to explore possible scenarios and determine the most effective solutions.
At its core, the goal is the same as in all analyses: To help make decision-making processes more informed and efficient.
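A very small version of "explore possible scenarios and determine the most effective solution" could look like the sketch below. It compares a few hypothetical maintenance options by expected cost and recommends the cheapest one. The options, costs and probabilities are all invented for illustration, and expected cost is just one of many possible decision rules.

```python
# Assumed cost if the component fails before it is replaced (invented).
FAILURE_COST = 200_000

# Hypothetical options: the planned cost of each, and the estimated probability
# of a failure before the replacement happens (for example from a prediction).
options = {
    "replace now": {"cost": 40_000, "p_failure": 0.01},
    "replace in 3 months": {"cost": 35_000, "p_failure": 0.10},
    "replace in 6 months": {"cost": 30_000, "p_failure": 0.35},
}


def expected_cost(option):
    """Planned cost plus the failure cost weighted by its probability."""
    return option["cost"] + option["p_failure"] * FAILURE_COST


expected = {name: expected_cost(option) for name, option in options.items()}

# The recommendation is the option with the lowest expected cost.
recommendation = min(expected, key=expected.get)

for name, cost in expected.items():
    print(f"{name}: expected cost {cost:,.0f}")
print("Recommended action:", recommendation)
```

With these made-up numbers, the cheapest option on paper (waiting six months) turns out to be the most expensive once the risk of failure is included.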
This chess story might be made up, but the way of thinking can be applied in many real-life situations. These types of analyses are related and often overlap.
Let’s summarise them quickly, once more:
- Descriptive analysis: What happened?
- Diagnostic analysis: Why did it happen?
- Predictive analysis: What is going to happen, when will it happen and why will it happen?
- Prescriptive analysis: What should we do about what is happening?
But conducting an analysis, whether of a chess game or a budget, is rarely something we do entirely on our own, without help from various tools. Let’s take a look at the most important tools we have for analysing data.