Friday 10 June 2022

Different Number Types in Datasets

Data encountered in data analytics is often numerical, and it is important to understand what kind of numerical data it is. Sometimes a value just happens to be written as a number, but actually represents something other than a numerical quantity.

Today, I will attempt to provide an understanding of the different kinds of numerical data.

Quantifiable

These are actual numerical values. They represent quantities and measures: the number of apples collected per season from an orchard, weekend ticket sales for newly premiering movies, or the number of people incarcerated in 2021.

Quantifiable numbers.

Such values can be further categorized into discrete and continuous values, which I have already covered in a previous blogpost. Suffice it to say, these values have quantifiable meaning. They can have calculations performed on them, such as sums and averages, to derive more data.
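To make that concrete, here is a minimal sketch in Python using only the standard library. The apple counts are made-up numbers for illustration, and the variable names are my own.

```python
from statistics import mean

# Hypothetical apple counts per season for one orchard (made-up numbers).
apples_per_season = [1200, 1350, 980, 1410]

# Both results are meaningful because the inputs are real quantities.
total_apples = sum(apples_per_season)     # apples collected over all seasons
average_apples = mean(apples_per_season)  # typical harvest per season

print(total_apples)
print(average_apples)
```

The total and the average both answer a real question about the orchard, which is what makes these values quantifiable.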

Categorical

Categorical values are numbers that don't represent numerical values. They could have numerical functions performed on them, but the end results would have no discernible meaning. The numbers are actually labels standing in for other values, such as serial numbers, barcodes or foreign key fields.

Just a label, like a serial number.

Take, for instance, a 1 or 0 representing true or false, or male or female. Or, for a more extensive example, let's take a table of different sports. If 1 represents basketball, 2 represents football, 3 represents baseball and 4 represents hockey, getting the mean of all these numbers in the table would net you a result that makes no sense whatsoever.
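The sports example can be sketched in a few lines of Python. The code-to-sport mapping comes from the paragraph above; the column of codes is made-up data, and the names are my own. Counting how often each label appears is one sensible alternative to averaging, though not the only one.

```python
from collections import Counter
from statistics import mean

# Code-to-sport mapping from the example above.
SPORTS = {1: "basketball", 2: "football", 3: "baseball", 4: "hockey"}

# Hypothetical column of sport codes from a table (made-up data).
sport_codes = [1, 2, 2, 4, 3, 2, 1, 4]

# The arithmetic runs without complaint, but no sport has this code.
meaningless_mean = mean(sport_codes)

# Counting the labels is something that does make sense for categories.
counts = Counter(SPORTS[code] for code in sport_codes)
most_common_sport, most_common_count = counts.most_common(1)[0]

print(meaningless_mean)
print(most_common_sport, most_common_count)
```

The mean lands between football and baseball, which tells you nothing, while the count tells you which sport shows up most often.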

Ordinal

If you combined the best parts of Categorical and Quantifiable values, the result would look a lot like Ordinal values. Ordinal values are numbers that represent other values, as categorical values do, but that also behave as quantities and measures because their order is meaningful.

Approval ratings can be ordinal.

A good example would be ratings on a form. Let's say, for instance, that the question on the poll was "On a scale of 1 to 10, with 10 being excellent and 1 being dreadful, how would you rate our service?" The numbers would represent different ratings, but we could still derive a mean or median to provide an impression of the general approval rating.
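Here is a rough sketch of that poll in Python. The responses are made-up data and the names are my own. The median relies only on the order of the ratings, while the mean additionally assumes the steps between ratings are roughly even, so it is worth looking at both.

```python
from statistics import mean, median

# Hypothetical poll responses on the 1-to-10 scale (made-up data).
ratings = [7, 8, 6, 9, 10, 4, 8, 7, 8]

# The median only needs the ratings to be ordered.
median_rating = median(ratings)

# The mean also treats the gaps between ratings as equal.
mean_rating = mean(ratings)

print(median_rating)
print(round(mean_rating, 2))
```

Either figure gives an impression of the general approval rating, which is more than the sport codes could ever offer.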

In conclusion

It is important, when dealing with numerical values, or even just values that look like numbers, to understand just what these numbers represent. They may not mean what you think they mean, and this could lead to some horrific data crunching.

Much numerous regards!
T___T
