Sunday, 16 April 2023

Do Not Confuse Data With Information

There are two terms that are often erroneously interchanged - data and information. To the layperson, they may sound like the same thing, and indeed a thesaurus lists them as synonyms; however, I submit that they should not be confused with each other, even by a layperson. Especially by a layperson.

What is the main difference? In a word, context.

Information is made up of different pieces of data that form a coherent whole. Data, therefore, is information that has no context. It is simply text, numbers, dates and even images and audio clips, few of which make any sense on their own. For data to become useful information, often it has to be sorted, arranged and recalculated, and even paired with other data.

Data center.

This is why a data center is not known as an information center. The contents of a data center's servers are data, not information. They have not yet been processed; they have merely been stored.

This is also why an information counter is not known as a data counter. Data counters are vastly different things; consider the data counter on your mobile phone that shows how much data you have used this month. An information counter is, literally, a counter at which one obtains information.

Adding significance to data

Let's say I have some data. "My wife's birthday is on 16th April." When I see this, my first response would be to say - so what? Why is this a big deal? Pfft, that's just a day and a month; I even know the year she was born.

Birthday girl.

But... what if the next piece of data was "Today's date is 11th April"? Again, that's just today's date. On its own, it is hardly significant.

However, when you put these two pieces of data together, there's context. It has become information. And the information is that my wife's birthday is in five days! The provided context now has important implications. Now the data is significant.
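For the code-minded, here is a minimal sketch of that pairing in Python. The function name `days_until` and the year 2023 are my own assumptions for illustration; the example above only gives a day and a month.

```python
from datetime import date


def days_until(birthday: date, today: date) -> int:
    """Pair two dates and return the number of days from today to the birthday."""
    return (birthday - today).days


# Two pieces of data, each unremarkable on its own.
# The year 2023 is assumed; the example only states day and month.
birthday = date(2023, 4, 16)
today = date(2023, 4, 11)

# Put together, they become information: the birthday is this many days away.
print(days_until(birthday, today))  # 5
```

Neither date says anything about urgency by itself. The subtraction is where the context, and the panic, comes from.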

Editor's Note: Yes, I'm writing a blogpost about Data Analytics on my wife's birthday. Talk about having a death wish.

Adding depth to data

Let us use a soccer example this time. Say, "Mohamed Salah has scored a total of 44 goals in the 2018 / 2019 season." OK, that sounds impressive already on its own. But this has no context.

What if I were to add in a few more statements?

"The last player to score 44 goals in one season was Cristiano Ronaldo, in the 2008 / 2009 season."
"The next highest scorer this season was Harry Kane, with 22 goals."

Goal!

Now we have context. Firstly, the last time this tally was reached was ten years earlier. Secondly, no other player in the 2018 / 2019 season even came close; the next highest scorer managed only half as many! This helps to explain exactly why 44 goals in that season was such a great feat. We establish that it is not something that gets achieved frequently.
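The same comparison can be sketched in a few lines of Python. The names `SeasonTally` and `add_context` are hypothetical, made up for this illustration, and the numbers are the illustrative figures from the statements above rather than real statistics.

```python
from dataclasses import dataclass


@dataclass
class SeasonTally:
    player: str
    season_start: int  # first year of the season, e.g. 2018 for 2018 / 2019
    goals: int


def add_context(tally: SeasonTally, last_equal: SeasonTally, runner_up: SeasonTally) -> dict:
    """Pair one tally with two others to show how rare and how dominant it is."""
    return {
        "years_since_last_equal": tally.season_start - last_equal.season_start,
        "times_runner_up": tally.goals / runner_up.goals,
    }


# Illustrative figures from the example, not actual statistics.
salah = SeasonTally("Mohamed Salah", 2018, 44)
ronaldo = SeasonTally("Cristiano Ronaldo", 2008, 44)
kane = SeasonTally("Harry Kane", 2018, 22)

context = add_context(salah, ronaldo, kane)
print(context)
```

With these figures, the result says the tally was last matched 10 seasons earlier and is 2.0 times the runner-up's. The number 44 never changed; only the data around it did.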

Editor's Note: These are not actual statistics.

Possibly changing a conclusion

My dearly departed grandmother once said that I am "her favorite grandson". Statistically, that means I rank Numero Uno out of all the grandsons she had. All well and good, right?

Grandmas are great.

So let's add in another bit of data. How many grandsons did my grandmother have?

One. Me, obviously.

Well, shit, of course that would make me Number One in the popularity rankings by default. Because there is no Number Two! This could turn into a conversation on how a decent sample size is required for meaningful statistical analysis, but my point is - context was added, and this changed the conclusion that had been inferred from a single piece of data.
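As a rough sketch, assuming a hypothetical helper called `describe_rank`, this is what it looks like when a ranking is always reported together with the size of the group it came from:

```python
def describe_rank(rank: int, group_size: int) -> str:
    """Report a rank together with the size of the group it was drawn from."""
    if not 1 <= rank <= group_size:
        raise ValueError("rank must be between 1 and group_size")
    if group_size == 1:
        # A group of one has nobody to be compared against.
        return f"ranked {rank} of {group_size} (first by default, nobody to compare against)"
    return f"ranked {rank} of {group_size}"


print(describe_rank(1, 1))  # my actual situation
print(describe_rank(1, 7))  # 7 is a made-up group size, purely for contrast
```

The rank is 1 in both calls. Only the second piece of data, the group size, tells you whether that rank means anything.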

Finally

Data can be misleading or incomplete if presented on its own. Often, other data is required to provide context and convert data into actual, actionable information. The examples presented were pretty simplistic. They were meant as simple illustrations, but I trust they proved my point adequately!

For your data information,
T___T
