Wednesday, 23 February 2022

Mean, Median and Mode in Python (Part 3/3)

The Mode is the value that appears most often in a dataset. If more than one value shares the highest number of appearances, this tutorial takes the smallest of them as the Mode.

The Mode

We begin by declaring tt_frequency(), a function that counts how many times a value appears in a list. It accepts a list and a value as parameters.
def tt_frequency (vals, val):


We declare freq, and set it to 0.
def tt_frequency (vals, val):
    freq = 0


Then we use a For loop to iterate through the list, and increment freq each time the current element matches the value.
def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1


Finally, return freq.
def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1
            
    return freq
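
As a quick sanity check, the sketch below repeats tt_frequency() and compares it with Python's built-in list.count(), which performs the same tally. The sample list is made up for illustration and does not come from this series.

```python
def tt_frequency(vals, val):
    # Count how many elements of vals are equal to val.
    freq = 0
    for v in vals:
        if v == val:
            freq = freq + 1
    return freq

# Hypothetical sample data, chosen only for this example.
sample = [4, 2, 4, 7, 2, 4]

print(tt_frequency(sample, 4))  # 3
print(sample.count(4))          # 3, the built-in equivalent
print(tt_frequency(sample, 9))  # 0, because 9 is not in the list
```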


Next, we declare tt_mode(), which accepts a list as a parameter. The code you are about to read is extremely inefficient, since it recounts the whole list for every element, but it does the job!
def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1
            
    return freq

def tt_mode (vals):


In this function, we declare val and freq. Then we use a For loop on the list.
def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1
            
    return freq

def tt_mode (vals):
    val = 0
    freq = 0

    for v in vals:


We run the current element through the tt_frequency() function, and assign the result to temp_freq. If temp_freq is greater than freq, we store it in freq and set val to the current element.
def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1
            
    return freq

def tt_mode (vals):
    val = 0
    freq = 0
    
    for v in vals:
        temp_freq = tt_frequency (vals, v)

        if (temp_freq > freq):
            freq = temp_freq
            val = v


If not, we check whether temp_freq is equal to freq. If it is, we then check whether the current element is smaller than val.
def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1
            
    return freq

def tt_mode (vals):
    val = 0
    freq = 0
    
    for v in vals:
        temp_freq = tt_frequency (vals, v)

        if (temp_freq > freq):
            freq = temp_freq
            val = v
        else:
            if (temp_freq == freq):
                if (v < val):


If so, we set val to the smaller value. freq does not need to be changed because it is already the same value. Finally, return val.
def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1
            
    return freq

def tt_mode (vals):
    val = 0
    freq = 0
    
    for v in vals:
        temp_freq = tt_frequency (vals, v)

        if (temp_freq > freq):
            freq = temp_freq
            val = v
        else:
            if (temp_freq == freq):
                if (v < val):
                    val = v
            
    return val
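
The inefficiency comes from calling tt_frequency() once per element, which rescans the whole list each time, so the work grows with the square of the list length. Here is a minimal sketch of one alternative, assuming the same tie-breaking rule (the smallest value wins): count everything in a single pass with collections.Counter, then pick the smallest of the most frequent values. The name tt_mode_fast and the sample list are hypothetical and not part of the original code.

```python
from collections import Counter

def tt_mode_fast(vals):
    # Hypothetical helper, not from the original post.
    # One pass over the list builds a value -> frequency table.
    counts = Counter(vals)
    if not counts:
        # Design choice for this sketch: an empty list has no mode.
        raise ValueError("no mode for empty data")
    # Highest frequency found in the table.
    top = max(counts.values())
    # Among the values that reach that frequency, keep the smallest.
    return min(v for v, c in counts.items() if c == top)

# Made-up sample: 5 and 3 both appear twice, so the smaller one wins.
sample = [5, 3, 5, 3, 8]
print(tt_mode_fast(sample))  # 3
```

Unlike tt_mode(), which returns its initial 0 for an empty list, this sketch raises an error in that case.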


Now let's test tt_mode()'s output with two different lists, test and test2, against the mode() function of Python's statistics library (referred to in the code as stat).
    return val

print(tt_mode(test))
print(tt_mode(test2))

print(stat.mode(test))
print(stat.mode(test2))



Correct output.
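
One caveat worth checking, assuming Python 3.8 or later: when several values tie for the highest frequency, statistics.mode() returns the first one it encounters in the data rather than the smallest, so its result can differ from tt_mode() on such lists. The sketch below uses a made-up list to show the difference. statistics.multimode() lists every tied value, so taking its minimum reproduces the smallest-value rule used here.

```python
import statistics as stat

# Hypothetical list in which 7 and 2 both appear twice.
tied = [7, 7, 2, 2, 9]

print(stat.mode(tied))            # 7, the first tied value encountered
print(stat.multimode(tied))       # [7, 2], every tied value in order of appearance
print(min(stat.multimode(tied)))  # 2, the smallest tied value
```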


In summation

What I have presented above is probably not an exact replica of how the NumPy and Statistics libraries perform their calculations, but it is a pretty close match to how the average human brain would calculate these numbers. Of course, the methods could always be better optimized.

In the meantime, seeya!
T___T
