Now, as a programmer and, by extension, a numbers and statistics nerd, I could not let this one go. So once it was over and the results were confirmed, I was up hitting my Python console. I simply had to check this for myself.
Five percent, yo.
And this is how I began.
I was going to need some random functionality and some statistics crunching. Thus these two libraries were imported.
import random
import statistics
I followed up by declaring a list, pe_results. And then I populated the list with actual results. A 0 was a vote for Tharman Shanmugaratnam, a 1 was for Ng Kok Song, and a 2 was for Tan Kin Lian.
import random
import statistics
pe_results = []
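# range() excludes its end value, so 2480761 covers votes 1 to 2480760
for vote in range(1, 2480761):
    if (vote >= 1 and vote <= 1746427): pe_results.append(0)
    # > rather than >=, so vote 1746427 is not counted twice
    if (vote > 1746427 and vote <= 1746427 + 390041): pe_results.append(1)
    if (vote >= 1746427 + 390041 + 1): pe_results.append(2)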
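The same list can also be built without a loop. This is a minimal sketch, assuming 2,480,760 valid votes in total (the figure the loop bound points at), with the third candidate's tally derived as the remainder. The constant names are mine and not part of the original script.

```python
# Assumed totals: the first two tallies come from the loop above; the
# third is derived as the remainder of an assumed 2,480,760 valid votes.
TOTAL_VOTES = 2480760
TS_VOTES = 1746427
NKS_VOTES = 390041
TKL_VOTES = TOTAL_VOTES - TS_VOTES - NKS_VOTES

# List repetition produces the same 0/1/2 encoding in one expression.
pe_results = [0] * TS_VOTES + [1] * NKS_VOTES + [2] * TKL_VOTES

print(len(pe_results))
```

Building the list this way sidesteps the boundary checks entirely, so there is no risk of a vote being counted twice or left out.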
Next, another list was needed, diff.
import random
import statistics
pe_results = []
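# range() excludes its end value, so 2480761 covers votes 1 to 2480760
for vote in range(1, 2480761):
    if (vote >= 1 and vote <= 1746427): pe_results.append(0)
    # > rather than >=, so vote 1746427 is not counted twice
    if (vote > 1746427 and vote <= 1746427 + 390041): pe_results.append(1)
    if (vote >= 1746427 + 390041 + 1): pe_results.append(2)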
diff = []
Then I used the shuffle() function from the random library to randomly reorder the list pe_results. And then I declared another list, sample, comprising the first 100 values of pe_results.
diff = []
random.shuffle(pe_results)
sample = pe_results[:100]
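Shuffling the whole list and slicing it is one way to draw votes without replacement. Another option in the same library is random.sample(), sketched here on a small stand-in list (the population values and the sample size of 10 are made up for illustration):

```python
import random

# Small stand-in population so the example runs on its own:
# 70 votes for candidate 0, 16 for candidate 1, 14 for candidate 2.
population = [0] * 70 + [1] * 16 + [2] * 14

# random.sample() picks k items without replacement and leaves the
# population list in its original order.
sample = random.sample(population, 10)

print(sample)
```

The effect is the same as shuffle-then-slice, but only the requested number of items is drawn instead of reordering every vote.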
Next I declared counts for the three candidates, all starting at 0, and tallied them up using a for loop.
sample = pe_results[:100]
ts_count = 0
nks_count = 0
tkl_count = 0
for vote in sample:
    if (vote == 0): ts_count+=1
    if (vote == 1): nks_count+=1
    if (vote == 2): tkl_count+=1
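The tally could also be done with collections.Counter from the standard library. A minimal sketch, using a short hypothetical sample rather than real data:

```python
from collections import Counter

# Hypothetical 10-vote sample, using the same 0/1/2 encoding as above.
sample = [0, 0, 1, 0, 2, 0, 1, 0, 0, 2]

# Counter maps each distinct value to the number of times it appears;
# a missing key simply returns 0.
counts = Counter(sample)
ts_count = counts[0]
nks_count = counts[1]
tkl_count = counts[2]

print(ts_count, nks_count, tkl_count)
```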
After that, I needed the actual final results for each candidate, taken from the link at the start of this blogpost.
ts_count = 0
nks_count = 0
tkl_count = 0
for vote in sample:
    if (vote == 0): ts_count+=1
    if (vote == 1): nks_count+=1
    if (vote == 2): tkl_count+=1
final_ts = 70.4
final_nks = 15.72
final_tkl = 13.88
Next, I used the diff list, populating it with the differences between the sample percentages and the final percentages. Since the sample holds exactly 100 votes, each count is already a percentage. I used the abs() function because a variance is a variance, regardless of which direction it is.
final_ts = 70.4
final_nks = 15.72
final_tkl = 13.88
diff.append(abs(ts_count - final_ts))
diff.append(abs(nks_count - final_nks))
diff.append(abs(tkl_count - final_tkl))
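For a sample size other than 100, the counts would need converting to percentages before the subtraction. A minimal sketch, where percentage_diffs() and the 200-vote counts are hypothetical and made up for illustration:

```python
def percentage_diffs(counts, finals, sample_size):
    """Return the absolute gap, in percentage points, for each candidate."""
    return [abs(count * 100 / sample_size - final)
            for count, final in zip(counts, finals)]

# Final percentages as used in the script above.
finals = [70.4, 15.72, 13.88]

# Hypothetical tallies from a 200-vote sample.
diffs = percentage_diffs([140, 35, 25], finals, 200)

print(diffs)
```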
And then finally, I printed the Mean and Median of the diff list, using the statistics library, as well as the highest and lowest values.
diff.append(abs(ts_count - final_ts))
diff.append(abs(nks_count - final_nks))
diff.append(abs(tkl_count - final_tkl))
print ("Mean: " + str(statistics.mean(diff)))
print ("Median: " + str(statistics.median(diff)))
print ("Highest: " + str(max(diff)))
print ("Lowest: " + str(min(diff)))
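Just to be sure, I ran this multiple times in a loop so there would be more data. (Strictly speaking, range(1, 10) makes nine passes rather than ten, which gives diff 27 values.)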
for x in range(1, 10):
    random.shuffle(pe_results)
    sample = pe_results[:100]
    ts_count = 0
    nks_count = 0
    tkl_count = 0
    for vote in sample:
        if (vote == 0): ts_count+=1
        if (vote == 1): nks_count+=1
        if (vote == 2): tkl_count+=1
    final_ts = 70.4
    final_nks = 15.72
    final_tkl = 13.88
    diff.append(abs(ts_count - final_ts))
    diff.append(abs(nks_count - final_nks))
    diff.append(abs(tkl_count - final_tkl))
print ("Mean: " + str(statistics.mean(diff)))
print ("Median: " + str(statistics.median(diff)))
print ("Highest: " + str(max(diff)))
print ("Lowest: " + str(min(diff)))
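The whole experiment can also be wrapped in a function so the number of runs is explicit. This is a sketch rather than the script I actually ran: run_experiment(), its parameters, the fixed seed and the scaled-down 10,000-vote population are all assumptions made to keep the example small and repeatable.

```python
import random
import statistics

def run_experiment(population, finals, runs, sample_size=100, seed=None):
    """Collect absolute percentage-point gaps over several sample counts."""
    rng = random.Random(seed)  # private generator, so a seed repeats a run
    diff = []
    for _ in range(runs):
        # Draw sample_size votes without replacement.
        sample = rng.sample(population, sample_size)
        for candidate, final in enumerate(finals):
            share = sample.count(candidate) * 100 / sample_size
            diff.append(abs(share - final))
    return diff

# Scaled-down stand-in population with the same proportions (10,000 votes).
population = [0] * 7040 + [1] * 1572 + [2] * 1388

diff = run_experiment(population, [70.4, 15.72, 13.88], runs=10, seed=1)
print("Mean: " + str(statistics.mean(diff)))
print("Median: " + str(statistics.median(diff)))
print("Highest: " + str(max(diff)))
print("Lowest: " + str(min(diff)))
```

Passing runs=10 gives exactly ten passes and 30 values in diff, which avoids any off-by-one surprise from the range() bounds.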
And these were the results! Both the Mean and Median came to about 3 and change. No matter how many times I ran the code, it didn't vary much.
Mean: 3.5851851851851855
Median: 3.2799999999999994
Highest: 9.4
Lowest: 0.12
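As a rough cross-check on the size of these numbers, the textbook standard error of a sampled proportion is sqrt(p * (1 - p) / n). The sketch below assumes simple random sampling and ignores the finite-population correction; standard_error_pct() is a hypothetical helper name and not part of the script above.

```python
import math

def standard_error_pct(share_pct, sample_size):
    """Standard error of a sampled percentage, in percentage points."""
    p = share_pct / 100
    return 100 * math.sqrt(p * (1 - p) / sample_size)

# Final percentages as used in the script, with a sample of 100 votes.
for name, final in [("final_ts", 70.4), ("final_nks", 15.72), ("final_tkl", 13.88)]:
    print(name, round(standard_error_pct(final, 100), 2))
```

This only describes a single 100-vote sample like the one simulated here, so it says nothing about how the official sample count is put together.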
The Final Conclusion
It looks like the variance is generally lower than the five percent reported, though individual results ranged from 0.12 all the way up to 9.4. But it was a fun experiment, nevertheless.
You can count on me, Singapore.
T___T