Previously I wrote a post about how to list-poison email harvesters. Today I discovered that an unknown harvester/scraper bot has stumbled into one of my traps. Here is the description of the bot:
User agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; InfoPath.1)
From the log snapshot (image), you can see that the bot recursively crawled through 14,464 pages, harvested between 5 and 20 fake email addresses per page (at an average of about 12 per page, that’s roughly 12 * 14464 = 173,568 emails harvested), and wasted nearly 10 minutes on my site before deciding that it was done. You can see that the last link the bot visited was something that looks like this:
This is an infinite loop at its finest :). Not only did the bot waste its time spidering my site, it has probably added these 170,000 fake email addresses to a master spam list somewhere on the net. Spam bots working from that master list at the other end will now waste even more time and resources spamming these fake addresses.
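For readers curious how a trap like this can keep a crawler busy, here is a minimal Python sketch of the general idea. It is not the code behind my trap, just an illustration. The names `trap_page`, `fake_email`, and `estimated_harvest`, the three links per page, and the reserved `.example` domain are all assumptions made for the example. Each page holds between 5 and 20 fake addresses plus links that lead one level deeper, so a recursive crawler never runs out of pages.

```python
import random
import string

EMAILS_PER_PAGE = (5, 20)  # range of fake addresses per page, as seen in the log
LINKS_PER_PAGE = 3         # hypothetical value, chosen only for this sketch


def _token(rng, lo=5, hi=10):
    """Return a random lowercase string of length lo..hi."""
    return "".join(rng.choice(string.ascii_lowercase)
                   for _ in range(rng.randint(lo, hi)))


def fake_email(rng):
    """Build a fake address on the reserved .example TLD (an assumption)."""
    return f"{_token(rng)}@{_token(rng)}.example"


def trap_page(path):
    """Render the trap page for a URL path.

    Seeding the generator with the path makes the page stable across
    requests, while every link points one level deeper into the trap.
    """
    rng = random.Random(path)
    emails = [fake_email(rng) for _ in range(rng.randint(*EMAILS_PER_PAGE))]
    links = [f"{path.rstrip('/')}/{_token(rng)}" for _ in range(LINKS_PER_PAGE)]
    body = [f'<a href="mailto:{e}">{e}</a><br>' for e in emails]
    body += [f'<a href="{link}">more contacts</a><br>' for link in links]
    return "<html><body>\n" + "\n".join(body) + "\n</body></html>"


def estimated_harvest(pages, avg_per_page):
    """Rough count of addresses collected: pages times average per page."""
    return pages * avg_per_page


if __name__ == "__main__":
    print(trap_page("/trap"))
    print(estimated_harvest(14464, 12))  # the estimate used in this post
```

Calling `estimated_harvest(14464, 12)` reproduces the 173,568 figure above. Because every generated link resolves to yet another generated page, the crawl only ends when the bot itself gives up.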
If you like the results, please join me in sticking it to these harvesters/spammers by installing my bot trap. Please also let me know when you’ve caught a live one like I did. It’s entertaining to hear these stories.