Teochew Thunder: February 2022

Monday, 28 February 2022

The old-fashioned way to skirt the OCBC Phishing Scam

A month has passed since a new shockwave hit Singapore - not the COVID-19 pandemic, but a plague of an entirely different kind: a wave of phishing scams which caused customers of Oversea-Chinese Banking Corporation (OCBC) to lose their entire life savings. The perpetrators spoofed the OCBC number used for sending text messages to customers, and put in a link to a page that looked eerily similar to OCBC's official site, where they tricked these customers into entering their details.

This was sadly not even a new trick. As scams go, it was actually one of the oldest in the book. It bypassed technological safeguards and used good old-fashioned social engineering to gain access to the accounts of the victims.

No to Internet Banking

Despite having multiple OCBC accounts, I was immune to this little trick. Not because I'm smarter than average, but because I don't have Internet Banking. I don't have an OTP token, and any SMS sent to me in this vein would merely raise an eyebrow. (My wife, too, is immune to this little trick, but for a different reason - she doesn't speak English, and any SMS she receives that she can't read, she simply deletes.)

Getting a Phishing message.

That also means that any banking I do is on-site - at an ATM or an actual bank. Yes, it is a far more laborious activity. Yet, that is what I do. Many people have questioned why I, as a software developer, do not engage in the convenience of Internet Banking. After news of the islandwide scams broke, they are no longer questioning.

This does not just apply to iBanking. It is an established pattern for many aspects of my life.

Tech people being technophobic

Smart refrigerators, Internet of Things, Alexa - nope. My career as a software developer stops at my home. I still write code at home, but any compromise of security at home would merely affect the laptop and mobile phone.

No smart home for me.

Is this technophobia? Perhaps. But I seriously doubt I'm the only software developer who lives life like that.

The reason is simple. It is precisely because I am a software developer that I've seen how things work. Mistakes happen. Negligence happens. Software developers are human and sometimes they get tired. Sometimes they're just sloppy, or irresponsible. Sometimes they just don't know better. They might cut corners or do whatever they can get away with, as opposed to doing a proper job. And I am acutely aware of just how quickly things could go to shit if any of the loopholes we sometimes leave open are exploited. A layperson has the luxury of being blind to all this, and just assuming that shit will work. I don't.

I've seen enterprise level applications fail a simple Cross-site Scripting check. Experienced developers who use unsanitized data in Stored Procedures. Essential software that doesn't even pass basic security tests.

And I'm supposed to just trust software?

In all fairness...

The recent scam was not due to security flaws on OCBC's side. Their systems were not compromised. The breaches were due to mistakes made on the part of the users. If OCBC is to be blamed for anything, it would be the lack of user education. And that is as much the responsibility of the individual as that of the organization.

If you can't stand the cheat, get out of the kitchen!
T___T

Wednesday, 23 February 2022

Mean, Median and Mode in Python (Part 3/3)

The Mode is a measure of frequency of appearance in a dataset. The value that appears the most number of times is the Mode. In the case that more than one value has the same number of appearances, we will take the smallest value as the Mode.

The Mode

We begin by declaring a function to determine the frequency. It is tt_frequency(), and it accepts a list and a value as parameters.

def tt_frequency (vals, val):

We declare freq, and set it to 0.

def tt_frequency (vals, val):
    freq = 0

Then we use a For loop to iterate through the list, and increment freq each time the current element matches the value.

def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1

Finally, return freq.

def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1

    return freq

Next, we declare tt_mode(), which accepts a list as a parameter. The code you are about to read is extremely inefficient, but it does the job!

def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1

    return freq

def tt_mode (vals):

In this function, we declare val and freq. Then we use a For loop on the list.

def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1

    return freq

def tt_mode (vals):
    val = 0
    freq = 0

    for v in vals:

We run the current element through the tt_frequency() function, and assign the value to temp_freq. If temp_freq is greater than freq, then we replace that value in freq, and set val to the current element.

def tt_frequency (vals, val):
    freq = 0
    for v in vals:
        if (v == val) :
            freq = freq + 1

    return freq

def tt_mode (vals):
    val = 0
    freq = 0

    for v in vals:
        temp_freq = tt_frequency (vals, v)

        if (temp_freq > freq):
            freq = temp_freq
            val = v

If not, then we check if it is equal. In that case, we next check if the current element is smaller than val.

If so, we set val to the smaller value. freq does not need to be changed because it is already the same value. Finally, return val.
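Putting these last steps together, the finished function might look like the sketch below. It is reconstructed from the description above rather than copied from a listing, so the elif branch in particular is an assumption about how the tie-break was written.

```python
def tt_frequency(vals, val):
    # Count how many times val appears in vals.
    freq = 0
    for v in vals:
        if v == val:
            freq = freq + 1

    return freq


def tt_mode(vals):
    val = 0
    freq = 0

    for v in vals:
        temp_freq = tt_frequency(vals, v)

        if temp_freq > freq:
            # New highest frequency: remember it and the value that has it.
            freq = temp_freq
            val = v
        elif temp_freq == freq:
            # Tie (assumed branch): keep the smaller of the two values.
            if v < val:
                val = v

    return val


# Illustrative call: 8 and 10 both appear three times, so the smaller one wins.
print(tt_mode([1, 3, 10, 45, 7, 8, 8, 10, 10, 8]))
```

Because tt_frequency() rescans the whole list for every element, the work grows with the square of the list length, which is where the inefficiency comes from.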

Now let's test tt_mode()'s output with two different lists, against the Statistics library's mode() function.

    return val

print(tt_mode(test))
print(tt_mode(test2))

print(stat.mode(test))
print(stat.mode(test2))

Correct output.

In summation

Whatever I have presented above is probably not an exact replica of how the NumPy and Statistics libraries perform their calculations, but it is a pretty close match to how the average human brain would calculate these numbers. Of course, the methods could always be better optimized.
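On the point about optimization, here is one possible sketch for the Mode. It tallies every value once using Counter from the standard collections module, then picks the smallest of the values tied for the highest count. The name tt_mode_counted() is made up for this illustration, and an empty list is not handled.

```python
from collections import Counter


def tt_mode_counted(vals):
    # Tally every value in a single pass instead of rescanning the list.
    counts = Counter(vals)
    highest = max(counts.values())

    # Among the values tied for the highest count, return the smallest.
    return min(v for v, c in counts.items() if c == highest)


print(tt_mode_counted([1, 3, 10, 45, 7, 8, 8, 10, 10, 8]))
```

One thing to watch for when comparing: from Python 3.8 onwards, the Statistics library's mode() returns the first mode it encounters in the data rather than the smallest, so the two can disagree on a list with ties.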

In the meantime, seeya!
T___T

Monday, 21 February 2022

Mean, Median and Mode in Python (Part 2/3)

Let us now examine the Median of a dataset. We use this to discover the middle of a dataset.

The Median

The Median is derived by sorting the values, then taking the value that lies right in the middle of the sorted dataset. If there are two values in the middle of the dataset, we take the Mean of these values as the Median.

So here's our function, tt_median().

def tt_median (vals):

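We sort the list using the built-in sorted() function, which returns a new sorted list and leaves the original untouched. (The list's own sort() method sorts in place and returns None, so its result cannot be assigned to a variable.)

def tt_median (vals):
    vals_sorted = sorted(vals)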

Declare med. Check for the length of the list. If the length is even, make med the average of the values just before and after the halfway mark. We can use the tt_mean() function we've already written, for this.

def tt_median (vals):
    vals_sorted = sorted(vals)
    med = 0

    if (len(vals_sorted) % 2 == 0):
        med = tt_mean([vals_sorted[len(vals_sorted) // 2 - 1], vals_sorted[len(vals_sorted) // 2]])
    else:

If the length of the list is odd, then just set med to the value of the element right at the midpoint of the list. Finally, return med.

def tt_median (vals):
    vals_sorted = sorted(vals)
    med = 0

    if (len(vals_sorted) % 2 == 0):
        med = tt_mean([vals_sorted[len(vals_sorted) // 2 - 1], vals_sorted[len(vals_sorted) // 2]])
    else:
        med = vals_sorted[len(vals_sorted) // 2]

    return med

Let's try this with two different lists. test will be the list we defined while testing tt_mean().

Compare the results against NumPy's median() function.

print(tt_median(test))
print(tt_median(test2))

print(np.median(test))
print(np.median(test2))

Close enough!
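For anyone who wants to run the comparison without NumPy, here is a self-contained sketch. The function tt_median_sketch() and the list odd_list are made up for this illustration, and the check uses the median() function from the standard Statistics library instead of np.median().

```python
import statistics


def tt_median_sketch(vals):
    # Work on a sorted copy so the caller's list is left untouched.
    vals_sorted = sorted(vals)
    mid = len(vals_sorted) // 2

    if len(vals_sorted) % 2 == 0:
        # Even length: average the two values around the halfway mark.
        return (vals_sorted[mid - 1] + vals_sorted[mid]) / 2

    # Odd length: the single middle value.
    return vals_sorted[mid]


even_list = [1, 3, 10, 45, 7, 8, 8, 10, 10, 8]  # 10 values
odd_list = [4, 1, 9, 7, 2]                      # hypothetical, 5 values

print(tt_median_sketch(even_list), statistics.median(even_list))
print(tt_median_sketch(odd_list), statistics.median(odd_list))
```

Trying one list of even length and one of odd length exercises both branches of the If block.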

Last but definitely not least, the Mode.

Saturday, 19 February 2022

Mean, Median and Mode in Python (Part 1/3)

The basic ways to analyze numeric data come in the form of formulae that help describe the numbers. These are the Mean, Median and Mode of a dataset. Before computer programming came along and changed the world forever, these were the calculations that were used in statistics. Now, we have languages such as Python, which automate these tasks for us. However, it is useful to know how to derive these numbers ourselves.

To do this, I am going to explain each method, and use Python to implement these methods. At the same time, I am going to use Python's NumPy library to check if these implementations are correct. For the purpose of this exercise, we will be disregarding edge cases such as negative numbers and zero values.

The Mean

This basically is the average value of an entire dataset. To achieve this, we add up all the values and divide this total by the total number of values.

Let's write a function to do this. We'll call it tt_mean(). In it, we accept a parameter, vals, which is a list.

import numpy as np
import statistics as stat

def tt_mean(vals):

We use a For loop to iterate through the list, totalling them up. This value is stored in a variable, total.

import numpy as np
import statistics as stat

def tt_mean(vals):
    total = 0

    for v in vals:
        total = total + v

And then we divide that total by the number of values in that list. And we return the final result.

import numpy as np
import statistics as stat

def tt_mean(vals):
    total = 0

    for v in vals:
        total = total + v

    return total / len(vals)

Compare the results against that of NumPy's mean() function. Here, the test dataset will be a list of 10 numeric values.

import numpy as np
import statistics as stat

def tt_mean(vals):
    total = 0

    for v in vals:
        total = total + v

    return total / len(vals)

test = [1, 3, 10, 45, 7, 8, 8, 10, 10, 8]

print(tt_mean(test))

print(np.mean(test))

An exact match!
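For comparison, here is a shorter sketch of the same calculation using the built-in sum() function. The name tt_mean_guarded() is made up for this illustration, and the empty-list check is an extra that the exercise itself does not require. It is checked against mean() from the standard Statistics library, so it runs even without NumPy installed.

```python
import statistics


def tt_mean_guarded(vals):
    # An empty list has no mean; fail with a clear message instead of
    # a ZeroDivisionError.
    if len(vals) == 0:
        raise ValueError("cannot take the mean of an empty list")

    # sum() performs the same running total as the For loop.
    return sum(vals) / len(vals)


test = [1, 3, 10, 45, 7, 8, 8, 10, 10, 8]

print(tt_mean_guarded(test))
print(statistics.mean(test))
```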

This one was easy. The next one is the Median, and that one will be slightly more complicated.

Tuesday, 15 February 2022

Tech Wizardry in the Beijing Winter Olympics

The Beijing Winter Olympics opened on the 4th of this month. It would have escaped my notice completely, quite ironically, if some well-meaning Human Rights activist had not come into a room I was in on my Clubhouse app, and insisted on urging us to boycott the upcoming Beijing Winter Olympics as protest against the alleged Uyghur genocide.

These activists are adorable. Gotta love 'em, eh? Like, do activist-led boycotts even work on something like China?

This did not make me suddenly want to watch the Beijing Winter Olympics, but it did pique my interest enough to do a search on Google and YouTube. And that led me down quite a rabbit hole, especially where the tech was concerned. Also, my wife, who is a citizen of the People's Republic of China, after noting my interest, started sending me clips of the technology used during the event.

Transport

First off, there are the self-driving cars that are ferrying people all around the area of the sporting venues. The first time I saw one of those, it struck me that it looked way better than the ones used in the movie Total Recall. That's certainly cool, but the one that really caught my attention was the high-speed rail connecting Beijing to Zhangjiakou.

An amazing 5G-fitted train.

It was also driverless. Visually, the train wasn't all that impressive, but the various little amenities inside it - due to 5G capabilities augmented by repeater towers installed all along the route so that the signal remained strong even when the train was traveling through tunnels - were mind-blowing.

Robots

The next really cool thing about the tech was robots. Robotics was used in multiple places. In the kitchens, orders were placed via a tablet, and robotic arms tossed vegetables in woks and prepared dishes such as claypot rice, burgers and fries. Completed dishes were delivered by a ceiling-mounted conveyor belt. And check out their barista robot with the super-long arms!

Also of note were the robots that helped with safe distancing. Robots warned visitors to mask up, checked temperatures and sprayed disinfectant on a frequent basis. Can you say awesome?!

Venues

I've saved what is arguably the coolest thing for last. Beijing needed a venue for curling. Curling being an exclusively winter sport, it would not make sense to build a venue just for that, and leave it unused for the rest of the year. So they came up with a process for converting a swimming complex into an ice rink! The video below will explain it far better than I ever could, but the tech involved, while not exclusively software tech, was nevertheless a very impressive achievement.

Next to that, however, a 360 degree experience was made possible by an orbiting camera, coupled with 5G technology that concurrently uploaded footage of the venue from all possible angles. Trackers sewn into the suits of athletes provided feedback on their positions. The net result was that spectators not on-site viewing the event got something like a virtual experience which was both high-quality and low-latency.

Final thoughts

The sheer amount of work involved and the ingenuity of the implementation was staggering. It also holds some implications for the future. What if all this could be implemented on a larger scale than just for the Winter Olympics? Now that's exciting.

Seeya later, sport!
T___T

Friday, 11 February 2022

Web Tutorial: The Valentine Letter

Happy upcoming Valentine's Day, ya crazy lovebirds!

It's my pleasure, as always, to present the annual Valentine's Day web tutorial. This one is in standard HTML, CSS and JavaScript, and it will be a simple animation complete with graphics and a sappy love poem. We also want this animation to appear differently according to the time of day.

Ready? Here we go...

The images

For this, we will be using two pairs of backgrounds and one background for the love poem.

These are the sky backgrounds. They have been given a color adjustment in Photoshop, but are otherwise identical.

topbg_day.jpg

topbg_night.jpg

These are the earth backgrounds. They are PNG files with transparency.

middlebg_day.png

middlebg_night.png

This is the background image for the letter.

letterbg.jpg

The starting HTML is as follows.

<!DOCTYPE html>
<html>
   <head>
       <title>Valentine Letter</title>

       <style>

       </style>

       <script>

       </script>
   </head>

   <body>

   </body>
</html>

We set divs to have a translucent red background because there is going to be a fair amount of nesting.

Here, we define the function getTimeBg(). Leave it blank, for now.

The body needs to call this function upon loading.

Now for the divs! The first div has a CSS class of container.

This is the styling for container. We want a specific width and height, and it will be set in the middle of the screen via the margin property. We will also give it a nice pink outline.

Looking a little basic, right?

Now, add in two divs, with CSS classes top and middle.

Here's the styling. They have 100% widths, but different heights. Also, middle has background properties that will fill the entire div. For now, we will leave the background image itself unspecified.

You can see where they line up.

Within the div styled using CSS class top, we have a div styled using CSS class sky.

This is the styling for sky. It takes up 400% of its parent's width. The background-repeat property is set to repeat-x, because we want the sky background to stretch on forever horizontally.

.top
{
   width: 100%;
   height: 200px;
}

.sky
{
   width: 400%;
   height: 100%;
   background-position: left top;
   background-repeat: repeat-x;
}

See how this works? Don't worry about the content overflowing for the time being; I want you to have a clear visual of what is going on.

Now within the div styled by middle, we have another div styled using letter.

Here are the CSS styles. letter has a specific width and height, and is set to the middle of its parent using the margin property. Its background is letterbg.jpg, and we set other properties to ensure it fills up the entire thing.

.sky
{
   width: 400%;
   height: 100%;
   background-position: left top;
   background-repeat: repeat-x;
   margin-left: 0px;
}

.middle
{
   width: 100%;
   height: 850px;
   margin-top: -200px;
   background-size: cover;
   background-position: center top;
   background-repeat: no-repeat;
}

.letter
{
   width: 400px;
   height: 500px;
   margin: 0 auto 0 auto;
   background: url("letterbg.jpg");
   background-size: cover;
   background-position: center top;
   background-repeat: no-repeat;
}

So far so good. We may need to make more adjustments later.

Yet another div!

text is for containing the poem, so we will make it even less wide than letter, set it in the middle using the margin property, and set font sizes.

.letter
{
   width: 400px;
   height: 500px;
   margin: 0 auto 0 auto;
   background: url("letterbg.jpg");
   background-size: cover;
   background-position: center top;
   background-repeat: no-repeat;
}

.text
{
   width: 80%;
   margin: 0 auto 0 auto;
   font-family: georgia;
   font-size: 12px;
   font-weight: bold;
   line-height: 2.5em;
}

Now we will add two paragraph tags within the div styled using text. They will be styled using poem and author, respectively.

Fill up the first p tag with the poem, and the second p tag with the author's name.

<div class="middle">
   <div class="letter">
       <div class="text">
           <p class="poem">
               <br />
               How do I love thee? Let me count the ways.<br />
               I love thee to the depth and breadth and height<br />
               My soul can reach, when feeling out of sight<br />
               For the ends of being and ideal grace.<br />
               I love thee to the level of every day's<br />
               Most quiet need, by sun and candle-light.<br />
               I love thee freely, as men strive for right.<br />
               I love thee purely, as they turn from praise.<br />
               I love thee with the passion put to use<br />
               In my old griefs, and with my childhood's faith.<br />
               I love thee with a love I seemed to lose<br />
               With my lost saints. I love thee with the breath,<br />
               Smiles, tears, of all my life; and, if God choose,<br />
               I shall but love thee better after death.<br />
           </p>
           <p class="author">
               - Elizabeth Barrett Browning
           </p>
       </div>
   </div>
</div>

Let's style those p tags. This is just for aesthetics.

.text
{
   width: 80%;
   margin: 0 auto 0 auto;
   font-family: georgia;
   font-size: 12px;
   font-weight: bold;
   line-height: 2.5em;
}

.poem
{
   font-style: italic;
}

.author
{
   text-align: right;
}

Here we go!

The backgrounds

First, we add ids to the divs that we need to set backgrounds for.

Next, we add in some script. We first get sky and earth using the getElementById() method. Then we get the date using new Date() and assign it to the variable d.

function getTimeBg()
{
   var sky = document.getElementById("sky");
   var earth = document.getElementById("earth");

   var d = new Date();
}

Set an If block that examines the output of the getHours() method. If the time of the day is between 6 AM and 6 PM (i.e., d.getHours() is between 6 and 18 inclusive), we use the day versions of the backgrounds. If not, we use the night versions.

It's now 11 PM where I am, so the night versions are on!

At this point, we should be turning off the background color of the divs.

div {background: rgba(255, 0, 0, 0); }

Also, add breaks here to adjust the content. It's a really lazy way, I know.

Animating the sky

For this, we will use CSS3 animations. This will go right into the CSS class sky. We use an animation name, skymove, which we will expand on later, and set a duration. Make this fast, maybe 1 second, so we can check on the smoothness of the animation. The iteration count is set to infinite so that it will go on forever, and we'll set the timing function to linear so that there won't be any acceleration or deceleration of the animation. It will be at a constant speed.

.sky
{
   width: 400%;
   height: 100%;
   background-position: left top;
   background-repeat: repeat-x;
   margin-left: 0px;
   animation-name: skymove;
   animation-duration: 1s;
   animation-iteration-count: infinite;
   animation-timing-function: linear;
}

Now we will define skymove. At 100%, the margin-left property is set to minus twice the width of the background, which is minus 1600 pixels.

Now the sky div moves left! You can see that the animation appears infinite due to the background-repeat property being set to repeat-x.

Set the overflow property of these classes to hidden.

.container
{
   width: 800px;
   height: 700px;
   margin: 0 auto 0 auto;
   outline: 1px solid rgba(255, 100, 100, 1);
   overflow: hidden;
}

.top
{
   width: 100%;
   height: 200px;
   overflow: hidden;
}

Ah, great stuff! Now that the overflow property is set to hidden, it looks like the sky is animated! You may want to make the animation slower, maybe set the duration to 200 seconds or something.

Final loving words

I know I know, this Valentine's day web tutorial is kind of corny. But hey, they can't all be winners. And let's face it, Valentine's Day can be pretty corny.

Did the ~~earth~~ sky move for you? Heh heh.
T___T

Sunday, 6 February 2022

Spot The Bug: Runaway Stock Code

Good day to you, bug-hunters. And welcome to another edition of Spot The Bug!

Let's get busy,
bug-hunters.

Today's problem revolves around PHP. My task was to read from a vendor database that stored data about inventory operations. The frustrating thing was that they didn't have transaction data such as stock code and quantities, but rather a table of comments generated whenever a transaction took place.

Stock (XLL2323) removed. Qty: -2
Stock (XLL2323) removed. Qty: -22
Stock (FRU551100) removed. Qty: -10
Stock (FRU551100) added. Qty: 25
Stock (SMQ010080) added. Qty: 9

So basically I had to retrieve a list of comments generated during a certain date range for a certain stock code, and deduce the quantity and direction (in or out) of each transaction. To that end, I wrote some test code using a nested Foreach loop. If the Stock Code was inside the comment, then I would obtain the quantity from the comment with a strategic use of some string functions.

$skus = ["XMB2311"];

foreach ($skus as $sku)
{
   echo $sku;
   echo "<br />";
   $inv = 0;

   foreach ($logs as $log)
   {
       if (strpos($log, $sku))
       {
           $split = explode("Qty: ", $log);
           $qty = $split[1];
           $inv += $qty;
           echo $inv;
           echo "<hr />";
       }
   }
}

What went wrong

This was the result. For the Stock Code XMB2311, which I was investigating, the quantity of the product was increasing over time. This was despite the fact that a cursory examination of the database showed that there were very few transactions for XMB2311, and none of them involved amounts that would push the inventory up to 88, assuming we started off with a base amount of 0.

Why it went wrong

I added to the test code by displaying the log statements.

foreach ($skus as $sku)
{
   echo $sku;
   echo "<br />";
   $inv = 0;

   foreach ($logs as $log)
   {
       echo $sku . ":" . $log;
       echo "<br />";

       if (strpos($log, $sku))
       {
           $split = explode("Qty: ", $log);
           $qty = $split[1];
           $inv += $qty;
           echo $inv;
           echo "<hr />";
       }
   }
}

And here, I discovered that not only were the log statements for XMB2311 matching, so were the ones for XMB231109!

The problem was that some of the Stock Codes were substrings of each other. For instance, "XMB2311" was a substring of "XMB231109". So if I was doing a search for "XMB2311" in a comment that featured Stock Code XMB231109, the result would be true, and the code in the If block would be fired off erroneously!

How I fixed it

The fix was to append a ")" to the argument being searched for. This way, the strpos() function would search specifically for the Stock Code and only that Stock Code, because all the Stock Codes in the comments ended in a ")"!

if (strpos($log, $sku . ")"))

There it was!
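A stricter alternative would be to pull the Stock Code out of the comment and compare it exactly, instead of searching for it. Here is a rough sketch of that idea, written in Python rather than PHP. It assumes every comment follows the "Stock (CODE) added/removed. Qty: N" shape shown earlier; the function tally_stock() and the sample comments are made up for this illustration.

```python
import re

# Pattern for comments shaped like "Stock (XLL2323) removed. Qty: -2".
LOG_PATTERN = re.compile(r"Stock \(([^)]+)\) (?:added|removed)\. Qty: (-?\d+)")


def tally_stock(logs, sku):
    # Sum the quantities of every comment whose Stock Code equals sku exactly.
    total = 0
    for log in logs:
        match = LOG_PATTERN.search(log)
        if match and match.group(1) == sku:
            total += int(match.group(2))
    return total


# Hypothetical comments, in the format shown earlier.
logs = [
    "Stock (XMB2311) added. Qty: 5",
    "Stock (XMB231109) added. Qty: 40",
    "Stock (XMB2311) removed. Qty: -2",
]

print(tally_stock(logs, "XMB2311"))
```

Because the comparison is made on the whole extracted code, the XMB231109 comment is never counted towards XMB2311.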

Moral of the story

Sometimes it's not so much that your code is completely wrong, but that it is not exact enough for that particular use case. In this instance, a closer look had to be taken at the data in order to see exactly what went wrong.

Warm_regards . ")",
T___T

Thursday, 3 February 2022

How I Learned Data Visualization

Recent years have seen me delve into Data Analytics. But even before that, I was exploring Data Visualization.

It all began when I ran some experiments with HTML, CSS and JavaScript to produce bar charts. That was straightforward enough, though line charts and pie charts took a lot more creativity.

HTML/CSS/JavaScript projects.

During one of my job interviews, I came across something in the list of required skills - D3 or Highcharts. I did not pass that interview, but after seeing that requirement in other job descriptions, I got curious enough to look it up. What I found blew my mind.

Exploring D3

D3 was the first thing I explored on the official website. Half of the library functions did not make sense to me and I have never been great at reading documentation. Thus, there was one thing left to do. I copied the example code and pasted it into a script, then ran it. Then I started changing the code, finding out what would happen when I commented out certain lines, or changed certain numbers. With my findings, I now knew what to look up on the documentation.

D3 projects

Within a week of practice, I was able to replicate the results of my earlier experiments, but now using D3. And then I wrote web tutorials pertaining to creating that code, because teaching code is a great way to cement that knowledge in your brain.

My experimentation was stalled somewhat due to the fact that exploring D3 naturally brought me to explore SVG as well. This was fun as well, but only peripherally related to Data Visualization.

Exploring Highcharts

This portion of my exploration was delayed due to several reasons - struggles with the chaos caused in the wake of the COVID-19 pandemic, experimentation in other areas and generally, bigger fish to fry. Yet, get to it I did eventually. Again, it began with the documentation in the official website, example code and a lot of trial and error before I produced my first web tutorial.

Highcharts project

At the time of this writing, I have yet to produce as extensively as I have done with D3. The only thing I have produced is a column chart in the same vein as what I had produced earlier with D3 and old-fashioned HTML, CSS and JavaScript... yet it was so easy that I suspect that even if I were to produce as many projects, it would not take me a lot of time.

I will sing the praises of Highcharts some other time.

Exploring other Data Visualization tools

Harkening back to 2021, I spent almost the whole of that year on a course in Data Analytics, an experience of which I will speak another day. Suffice to say, part of the course covered Data Visualization. To wit, I was introduced to Data Visualization software which I had never encountered prior to this, such as Tableau, Spotfire and Power BI.

Data Visualization tools

It opened my eyes. Now I was no longer writing code to display the data; it was already being handled by these tools. Instead, I was doing the work prior to the Data Visualization. The cleaning. The reorganization. Making things make sense.

In other words, I was working on another level of the Data Visualization. I won't pretend that it wasn't sometimes mind-numbingly boring work, but it did give me a greater appreciation of the process.

All in all...

My little dive into Data Visualization was mostly brought on by professional curiosity. A curiosity that was in turn triggered by my almost obsessive need to know what things my industry wants of its professionals. But all the knowledge I have gained would not have been possible had I not followed up with the work required. This sounds like a lot of patting myself on the back, but what I really want to say is - stay curious, stay committed. It'll go a long way.

Serious as a chart attack,
T___T

Disclaimer

All opinions expressed on this blog are mine, and mine alone. Deal with it.

All technical information and code are written at my current level of understanding and may be incomplete. Some of it has been deliberately kept simple for clarity. Don't hesitate to correct me.

My experiences may be limited, and as such there may be blind spots in my reasoning. Feel free to disagree.

I'm not being paid to write this blog. This is purely a leisure activity. Keep your expectations low.

This being the Internet, someone is bound to take offense. The Web's a big place, and you're all big boys and girls. If so, seek life elsewhere.
