I got quite a lot of spam on my Twitter account last week. While it’s annoying to be spammed, I did pay a little attention to the spammers and noticed what they have in common: a large number of tweets, 0 following, and 0 followers.
These characteristics reminded me of how the Java virtual machine garbage collects unused objects. In Java, you don’t need to explicitly de-allocate memory as you do in C++. The Java garbage collector (GC) takes care of this for you. It checks whether an object can still be reached through any reference from the running program. If not, it reclaims the object.
In the Twitter case, a follower is a reference to an account. If an account has no followers, it’s a candidate for garbage collection. Of course, a Twitter account is not a Java object. There is a valid, though unusual, case in which an account is created just to follow others. As long as it does not tweet, I think it is OK to leave it there.
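The rule in this paragraph can be written down as a tiny check. This is only a sketch of my own reading of it: the class name `FollowerCheck`, the method `isSpamCandidate`, and the idea of using raw follower and tweet counts are all assumptions for illustration, not anything Twitter actually exposes or does.

```java
public class FollowerCheck {
    // Hypothetical rule: an account that nobody follows (no "references")
    // is a candidate only if it also tweets. A silent account that merely
    // follows others is left alone.
    public static boolean isSpamCandidate(int followerCount, int tweetCount) {
        return followerCount == 0 && tweetCount > 0;
    }

    public static void main(String[] args) {
        // Typical spammer profile: many tweets, nobody following it.
        System.out.println(isSpamCandidate(0, 5000));
        // Silent account created just to follow others.
        System.out.println(isSpamCandidate(0, 0));
        // Ordinary account with followers.
        System.out.println(isSpamCandidate(120, 5000));
    }
}
```

Only the first call should report a candidate; the other two accounts are kept.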
Now, what if spam accounts follow each other to fake their follower counts? It’s a little tricky, but Java GC has already solved this problem. It’s pretty common for a group of Java objects to reference each other but not be referenced by anything else. Java GC does not simply count references; it starts from a set of known live roots and follows references outward, so an isolated group that only points at itself is never reached and gets deleted. The same idea could be ported to identify spam accounts.
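Here is a minimal sketch of that idea applied to a follow graph. It is a simplified mark phase, not what a real JVM collector does internally, and several things are my own assumptions: the class `FollowGraphMarker`, the `follows` map (account name to the accounts it follows), and above all the notion of a set of trusted "root" accounts to start from.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FollowGraphMarker {
    // follows.get(a) lists the accounts that a follows. An edge a -> b plays
    // the role of a reference to b. Roots are assumed-trusted accounts.
    public static Set<String> mark(Map<String, List<String>> follows, Set<String> roots) {
        Set<String> marked = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String followed : follows.getOrDefault(current, List.of())) {
                // add() returns true only the first time we see an account.
                if (marked.add(followed)) {
                    pending.push(followed);
                }
            }
        }
        return marked;
    }

    // Every account in the graph that the mark phase never reached.
    public static Set<String> unmarked(Map<String, List<String>> follows, Set<String> roots) {
        Set<String> all = new HashSet<>(follows.keySet());
        follows.values().forEach(all::addAll);
        all.removeAll(mark(follows, roots));
        return all;
    }

    public static void main(String[] args) {
        Map<String, List<String>> follows = Map.of(
            "alice", List.of("bob"),
            "bob", List.of("alice"),
            "spam1", List.of("spam2", "alice"), // following a real user does not help
            "spam2", List.of("spam1"));         // the two spam accounts follow each other
        System.out.println(unmarked(follows, Set.of("alice")));
    }
}
```

With `alice` as the only root, `bob` is reached through her, while `spam1` and `spam2` are left unmarked even though each has a follower. Whether an unmarked account really is spam would still need the tweet check; this only finds accounts that no trusted account leads to.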
It may not be good enough yet. Smart spammers (I mean real people) can probably follow their spam accounts from a legitimate account to bypass the GC algorithm. Well, nothing is perfect. I will leave it to Twitter to improve the algorithm. With hundreds of millions of Twitter users, it may be challenging to run the Java GC algorithm as is. It would need to be adapted anyway.