Blog spambots are programs that automatically post comments to blogs. They are basically stupid programs written by people whose own mothers would rather not acknowledge their existence. At any rate, they are a hassle for bloggers who allow comments. There are a variety of techniques to automatically delete the spam posts, but each has its own advantages and disadvantages.

One approach to the problem is to require people to log in to post a comment. The drawback to this approach is that many people don't want to go through the hassle, so you lose comments that may well be valuable.

Another approach is to use blacklists. These can be lists of words, web sites, or other things. Blacklisting web sites is a job that is never done, because the web sites change constantly. Blacklisting words can block legitimate comments dealing with subjects we might discuss.
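To make the word-blacklist problem concrete, here is a minimal Python sketch. The word list and the is_blocked function are made up for illustration and are not taken from any real filter. The idea is simply that a security blog's own readers tend to use the very words a naive filter would block.

```python
import re

# Hypothetical blacklist; these words are illustrative only.
BLACKLISTED_WORDS = {"virus", "trojan", "pharmacy"}


def is_blocked(comment: str) -> bool:
    """Return True if the comment contains any blacklisted word."""
    # Lowercase the comment and split it into alphanumeric words.
    words = set(re.findall(r"[a-z0-9]+", comment.lower()))
    return not words.isdisjoint(BLACKLISTED_WORDS)


spam = "Cheap pharmacy deals, click here"
legit = "Does NOD32 detect this trojan if I scan in safe mode?"

print(is_blocked(spam))   # True: the spam is caught
print(is_blocked(legit))  # True: a legitimate question is blocked too
```

The second result is the false positive described above: on a blog about malware, a perfectly reasonable comment trips the filter.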

For quite a while "captchas" worked pretty well. A captcha is the box with letters and numbers that you have to type in. The clowns writing the spambot software figured out how to programmatically defeat these. We are testing a product that asks a question which you must answer. Even with this, some spam was coming through, so I decided to do a little testing.

I asked the question "Is Mickey Mouse a dog, a cat, or a mouse?" The spambots got the answer. Next I tried "What company produces NOD32? (Hint: it is in the logo at the top left)." Guess what, they guessed ESET :)

So, I think, perhaps the issue is whether the word appears on the web page... so I asked "The Disney character Mickey is a mouse. What kind of bird is Donald?" Well, the spambots seemed to have guessed it was a duck. But what if they didn't actually guess it? One of my favorite quotes of all time came from a robotics book. It said "If you see only one solution you probably don't understand the problem." Perhaps the problem is not the challenge question and response. Perhaps there is a vulnerability somewhere that the spambots are exploiting.

So how can I determine whether the spambot is really guessing the answers or not? Well, I'll start with an easy test. The question will have an answer that is wrong and relatively hard to guess. Any question will do, but the answer will be something like 378jf4#%wmf9#f9o@d9edem’;:+`?44$. Essentially this is a strong password. If spam still gets through, the spambots are not answering the question at all.
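The logic of this test can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are hypothetical, and it says nothing about how the product we are testing actually stores or checks answers.

```python
import hmac
import secrets
import string


def make_unguessable_answer(length: int = 32) -> str:
    """Build a random answer, much like generating a strong password."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


def answer_matches(submitted: str, expected: str) -> bool:
    """Compare the submitted answer with the expected one."""
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))


def interpret(spam_got_through: bool) -> str:
    """Say what the result means once the answer cannot be guessed."""
    if spam_got_through:
        return "The spambots are bypassing the question, so look for a hole elsewhere."
    return "The spambots were answering the questions, so the questions were too easy."


expected = make_unguessable_answer()
print(answer_matches("duck", expected))   # False: a sensible guess no longer works
print(interpret(spam_got_through=True))
```

The point of the design is that a random answer removes guessing from the picture. Whatever spam shows up afterward must have arrived some other way.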

The unfortunate effect is that you won't know the answer, so comments will effectively be disabled for a short time. I'll let you know when they are back and how my little test went. I could probably Google the issue and find that the answer already exists, but where is the fun in that?

Randy Abrams
Director of Technical Education