List Poisoning Email Harvesters

Thaya Kareeson


You may not know it, but your site is probably being harvested for email addresses on a regular basis. In this post I will show you how to help fight email spam using a Lojack technique called List Poisoning (see previous post for more Lojack anti-spam philosophy). This is not a new technique, but it is worth implementing and spreading the word about.

The goal here is to pollute the harvester’s email list with fake email addresses and fake recursive links. The harvester then wastes time and resources harvesting and spamming addresses that do not exist. (see this in action)

In the demo below, you will notice that the first three links are recursive links that simply redirect back to the same index.php. The links after those are fake email addresses generated for harvesters.

Demo: http://omninoggin.com/suspicious

Download: List Poisoning Package
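
To make the idea concrete, here is a minimal Python sketch of what a trap page generator could look like. It is not the code from the package (which is PHP); the function names, the reserved example domains, and the page layout are all my own assumptions for illustration.

```python
import random
import string

# Assumption: reserved example domains are used so no real mailbox is hit.
# A real trap would more likely use domains that are known to be unused.
FAKE_DOMAINS = ["example.com", "example.net", "example.org"]


def random_word(rng, min_len=5, max_len=10):
    """Return a random lowercase word of min_len to max_len letters."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def fake_email(rng):
    """Return a made-up address such as 'qhtkzs@example.net'."""
    return f"{random_word(rng)}@{rng.choice(FAKE_DOMAINS)}"


def trap_page(trap_dir, rng, n_links=3, n_emails=10):
    """Build an HTML page of recursive links followed by fake mailto links.

    Every recursive link points back into trap_dir, so a crawler that
    follows it just receives another freshly generated trap page.
    """
    links = [
        f'<a href="/{trap_dir}/{random_word(rng)}">{random_word(rng)}</a>'
        for _ in range(n_links)
    ]
    mailtos = []
    for _ in range(n_emails):
        address = fake_email(rng)
        mailtos.append(f'<a href="mailto:{address}">{address}</a>')
    body = "<br>\n".join(links + mailtos)
    return f"<html><body>\n{body}\n</body></html>"


if __name__ == "__main__":
    print(trap_page("suspicious", random.Random()))
```

Because every request produces new random link targets and new addresses, the harvester never sees the same page twice and has no obvious reason to stop crawling.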

Installation

  1. Unpack list-poisoning.zip into a directory on your web server. Make sure to name the directory something unique (e.g. don’t use “spamtrap/” or anything else that describes its functionality).
  2. Open .htaccess and change the line

    RewriteRule . /suspicious/index.php [L]

    to

    RewriteRule . /youruniquedirectory/index.php [L]

  3. Place a link on your site that links to this new directory.
  4. You may make this link invisible to visitors by applying the following CSS style on the link:
    .your_unique_class {
      display:block;
      visibility:hidden;
      height:0px;
    }
  5. Verify that it works by visiting your trap directory in a web browser and clicking on a few of the recursive links to make sure that each one leads back to the trap. For example, my trap is located at http://omninoggin.com/suspicious.
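
If you would rather script the check in step 5 than click through by hand, something along these lines could work. This is only a sketch: the function and class names are hypothetical, and it inspects HTML that you have already fetched (for example with urllib.request) rather than crawling the site itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def check_trap_page(html, page_url, trap_dir):
    """Sort the links on a trap page into three groups.

    recursive: links that stay inside the trap directory
    emails:    addresses found in mailto: links
    escaping:  links that lead out of the trap (ideally empty)
    """
    collector = LinkCollector()
    collector.feed(html)
    prefix = "/" + trap_dir.strip("/") + "/"
    page_host = urlparse(page_url).netloc
    result = {"recursive": [], "emails": [], "escaping": []}
    for href in collector.hrefs:
        if href.lower().startswith("mailto:"):
            result["emails"].append(href[len("mailto:"):])
            continue
        target = urlparse(urljoin(page_url, href))
        inside = (target.path + "/").startswith(prefix)
        if target.netloc == page_host and inside:
            result["recursive"].append(href)
        else:
            result["escaping"].append(href)
    return result
```

A trap page with no recursive links usually means the .htaccess rewrite from step 2 is missing or still points at the old directory name.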

That should be all! As always, if you have a better way of doing this, or have tips/tricks on List Poisoning, then please share in comments.

14 Responses to “List Poisoning Email Harvesters”

  1. Rosyidi

    To protect my email, I usually use an image, so it does not get captured by email harvesters.

  2. Thaya Kareeson

    After my run-in with a harvester, I decided to add a sleep delay of 3-5 seconds to each visit to the bot trap. This will waste even more of the harvester’s time. I’ve updated the List Poisoning Package to include this change.
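
The delay itself is a one-liner in most languages. As a rough Python illustration (the package is PHP, so this is not its actual code, and the function name and parameters are made up for the example):

```python
import random
import time


def delay_response(min_seconds=3.0, max_seconds=5.0, sleep=time.sleep, rng=random):
    """Pause for a random 3-5 seconds before the trap page is served.

    sleep and rng are parameters only so the function can be exercised
    without actually waiting; by default it really sleeps.
    """
    pause = rng.uniform(min_seconds, max_seconds)
    sleep(pause)
    return pause
```

A random pause rather than a fixed one makes the delay a little harder for a bot to recognize. Keep in mind that each sleeping request also holds a worker on your own server for those few seconds.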

  3. miCRoSCoPiC^eaRthLinG

    Hey Thaya,
    Thanks for the article and the code. I’ve been a long-time user of Project Honeypot and always wondered how the service operates. Never really bothered to check on the code. Seeing your demonstration I’ve a much clearer idea on how this whole spam-bot catcher / list poisoning system works.

    Cheers,
    m^e

  4. Thaya Kareeson

    @miCRoSCoPiC^eaRthLinG
    Thank you for visiting. I’m glad that the article was helpful to you. I have a question though. Do you use the Http:BL PHP API they supply or do you use the Apache mod_httpbl? It seems that the only way to use Project Honey Pot on a shared host is via the Http:BL PHP API they supply.

  5. miCRoSCoPiC^eaRthLinG

    Hey Thaya,
    Sorry about the late reply. I guess I’m using the Http:BL PHP API… I haven’t really looked into the various options they offer, nor did I research the terminology they use for those options. I simply followed their instructions, installed the script in a subfolder of my site, and added a link in the WP_loop to publish a spambot-trapper link after every post in my blog. I’m definitely not using the Apache mod_httpbl… so this has to be the other one, i.e. the PHP API. If you require further details, I’ll be more than happy to fill you in.

    Cheers,
    m^e

  6. Thaya Kareeson

    @miCRoSCoPiC^eaRthLinG
    Thank you for your reply. That was the only detail I was looking for. I’m glad that there is still need for the PHP API because I am working on packaging a nice plugin for this. I guess for shared-hosting, the PHP API is the only way to go. Thank you!

  7. Sarah

    What a great idea! I uploaded it to my site, but the recursive links don’t seem to be working. Any ideas? It’s at
    http://www.sarahbohr.com/poohbear

  8. Thaya Kareeson

    @Sarah
    Thank you for visiting and thank you for your kind words! Let’s troubleshoot.

    Can you check what your .htaccess file says? By default, the 6th line shows:

    RewriteRule . /suspicious/index.php [L]

    You’d have to change this to

    RewriteRule . /poohbear/index.php [L]

    I apologize for forgetting to mention this; I have updated the instructions. Please let me know if this works out for you.

  9. Sarah

    It’s working now! I edited the .htaccess file, but I think the bigger problem was that it never uploaded in the first place. I used Dreamweaver the first go, and it uploaded the other two files but not the .htaccess file. After I edited it and uploaded using FireFTP, it started working.

    Thanks again for your help, I can’t wait to get my first ‘catch’!

  10. Thaya Kareeson

    @Sarah
    Great to hear this! Please let me know when you catch one for yourself!

  11. Bazilicum

    Hi there,

    Thanks for the code.
    I’m interested in building a trap that will verify the action of the bot.
    With your trap, you assume that the bot is harvesting emails, but there is no real indication of that, is there? It could be a downloading bot, an indexing bot, etc.
    If I’m right, are you familiar with methods to actually verify that an email was harvested?

  12. Thaya Kareeson

    @Bazilicum
    I can’t see any way of verifying if a bot is harvesting emails or not. Even if that’s the case, this trap also fights against downloading bots and indexing bots since there are recursive links and time-delays between each script call. Bots will waste valuable time doing useless things like crawling a recursive page structure.

    Project Honey Pot is a centralized database that tracks bots’ IP addresses and activities to determine what kind of bots they are (I’m not sure how they do the “determining” part). Regardless, you should check them out. I also made a plugin that you can use to easily integrate them into your WordPress blog (if you have one).
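
For reference, here is a rough Python sketch of how a site can read Project Honey Pot’s classification of a visitor through Http:BL, as I understand their documentation: you make a DNS lookup for your access key, followed by the visitor’s IP with its octets reversed, under dnsbl.httpbl.org, and the answer encodes the result in its four octets. The helper names and the access key are placeholders, and you should check the current Http:BL documentation before relying on the field meanings.

```python
import ipaddress

HTTPBL_ZONE = "dnsbl.httpbl.org"

# Assumption (from my reading of the Http:BL docs): the last octet of the
# answer is a bit set describing the visitor type; 0 means search engine.
VISITOR_TYPES = {1: "suspicious", 2: "harvester", 4: "comment spammer"}


def httpbl_query_name(access_key, visitor_ip):
    """Build the DNS name to look up for an IPv4 visitor."""
    ip = ipaddress.IPv4Address(visitor_ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(ip).split(".")))
    return f"{access_key}.{reversed_octets}.{HTTPBL_ZONE}"


def decode_httpbl_answer(answer):
    """Decode an answer such as '127.3.5.2' into its fields."""
    first, days, threat, type_bits = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError("not a valid Http:BL answer: " + answer)
    if type_bits == 0:
        kinds = ["search engine"]
    else:
        kinds = [name for bit, name in VISITOR_TYPES.items() if type_bits & bit]
    return {"days_since_seen": days, "threat_score": threat, "types": kinds}
```

The lookup itself could be done with socket.gethostbyname(httpbl_query_name(key, ip)); a failed lookup means the IP is not listed.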

  13. Bazilicum

    Thanks Thaya,

    I’m into creating this kind of DB for research purposes.
    I actually found a way to figure out an email harvesting bot.
    Still checking what will be the way to figure out a downloading bot.
    I need to collect this info myself, but thanks again for the pointer.

  14. Mack

    The link CSS style tip is great. Thanks a lot.
