Results 1 to 18 of 18
  1. #1
    Unique Vintage Steel cuda2k's Avatar
    Join Date
    May 2005
    Location
    Allen, TX
    My Bikes
    Kirk Frameworks JKS-C, Serotta Nova, Gazelle AB-Frame, Fuji Team Issue, Schwinn Crosscut, All-City Space Horse
    Posts
    11,493
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)

    Screwing with spam bots

    Anyone who has created any sort of website that isn't 100% static HTML in the last decade or longer has probably encountered some sort of malicious activity from an automated spam bot crawling across the web looking for websites to fill with all sorts of crap. Years ago I added a simple guestbook page to my website that allowed visitors to write a message which was then displayed in a list. Within weeks I was constantly adding to the list of 'filtered' words to keep the spam out.

    With my current website, a user is required to create a user account, sign in and all that before being allowed to post comments, and additional checks are in place to keep the vast majority of the data safe from such pests. Earlier this week I (finally) added some custom error handling code that sends me an email any time a visitor encounters an error on the website. Suddenly, within minutes of deploying these changes, my inbox was starting to fill up with mail. I was worried at first that users were actually hitting errors all over the site, till I looked closer at the request addresses. Spam bots. 99.5% of them. Attempts at submitting page requests with URLs embedded in the query strings and the like. Of course my code was blowing up on that sort of thing, since it isn't expecting an 'http://' string there. I wasn't so worried that these spammers were getting the error page, but that my inbox was getting full, and fast.

    So I've spent the last few evenings making some changes to the request handling and error handling of the website. Since I have confirmed that no valid request string on the website will contain 'http://' within it, I am now redirecting those to the definition of 'BLARG' at UrbanDictionary.com. They no longer even get to the error page. For bad requests that are less obviously spam, I'm redirecting to Google and a few other destinations. Haven't got a new error email all day long, other than 2 that I created myself just to verify that valid errors were still being caught.
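A minimal sketch of the redirect rule described above, in Python. The post does not say what language or framework the site uses, so everything here is an assumption: the function name `redirect_for`, the `is_otherwise_bad` flag, and both redirect URLs are placeholders. It checks only the query string, since an absolute request URL legitimately starts with a scheme.

```python
from urllib.parse import urlsplit, unquote

# Placeholder targets (assumptions). The post names the 'BLARG' entry at
# UrbanDictionary.com and Google, but gives no exact URLs.
OBVIOUS_SPAM_REDIRECT = "https://example.com/blarg"
SUSPICIOUS_REDIRECT = "https://www.google.com/"


def redirect_for(request_url, is_otherwise_bad=False):
    """Return a redirect URL for a bad request, or None to handle it normally.

    is_otherwise_bad is a hypothetical flag set by whatever other validation
    the site performs; the post does not describe those checks.
    """
    # Decode percent-escapes so 'http%3A%2F%2F' is caught as well.
    query = unquote(urlsplit(request_url).query).lower()
    if "http://" in query:
        return OBVIOUS_SPAM_REDIRECT
    if is_otherwise_bad:
        return SUSPICIOUS_REDIRECT
    return None
```

The caller would issue the redirect before any page code runs, so the request never reaches the error handler and never triggers an email.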

  2. #2
    You Know!? For Kids! jsharr's Avatar
    Join Date
    Apr 2005
    Location
    Just NW of Richardson Bike Mart
    My Bikes
    '05 Trek 1200 / '90 Trek 8000 / '? Falcon Europa
    Posts
    6,026
    Mentioned
    10 Post(s)
    Tagged
    3 Thread(s)
    i like pie.
    Quote Originally Posted by colorider View Post
    Phobias are for irrational fears. Fear of junk ripping badgers is perfectly rational. Those things are nasty.

  3. #3
    Chepooka StupidlyBrave's Avatar
    Join Date
    Sep 2006
    Location
    South Central PA
    My Bikes
    1990 Trek 1400 7spd; 2001 Litespeed Arenberg 10 speed
    Posts
    1,155
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Even if it is straight HTML... Check your web logs... CodeRed-like attacks and misconfigured IIS cgi directories are still being requested. At least they were the last time I lit up my Apache server.

    Did you consider a captcha?

  4. #4
    Opus PATH's Avatar
    Join Date
    Jan 2007
    Location
    North of City
    My Bikes
    Bianchi Cross Concept, Bianchi Axis, Specialized Crosstrail, Specialized Roubaix, Miyata Sportsrunner, Trek T1 Track Bike, Specialized CrossTrail Expert, Specialized Tricross Comp Triple
    Posts
    221
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Doesn't everybody love SPAM? So what if Bots make it!



    Go raibh an chóir ghaoithe i gcónaí liom!

    2007 Specialized Tricross Comp Triple, 2007 Trek T1, 2006 Specialized Roubaix
    2006 Bianchi Cross Concept, 1989 Miyata Sportrunner, 2006 Bianchi Axis, 2008 Specialized Crosstrail Expert







    Lullaby Of Foo

    Now I lay me down to sleep
    Keep my bike safe from the bicycle thief
    Keep my tootsies toasty warm
    keep my carbon from any harm

    Good Night Road Bike
    Good Night Mountain Bike
    Good night all you Foosters
    And good night Moon

  5. #5
    Senior Member hos13's Avatar
    Join Date
    Mar 2007
    Location
    552 LATA
    My Bikes
    2007 Mercier Sperns (I'm a shill) and a 99 Diamondback Invert
    Posts
    778
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    We process around 2 million emails in a 24-hour period, and less than 1 percent of it is legit email. Dan Bernstein has written some interesting material on a solution to spam, and so far I think he may be correct. However, it will never happen: it would work like the PSTN, with email exchanges billing other exchanges for routing mail through them.
    "Don't give up, don't ever give up" jimmyv

  6. #6
    Senior Member
    Join Date
    Aug 2006
    Posts
    998
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    On my websites, I use a script called wpoison, which gives web bots that ignore the robots.txt file oodles of seemingly valid, but absolutely bogus E-mail addresses. Some spam bots will happily run down random links the wpoison CGI URL gives for hours and hours, slurping up hundreds of thousands of bogus E-mail addresses.

    This makes the spammer E-mail databases get full of nonworking, useless addresses which they can't really filter out. Combine this with some form of tarpit Apache module (which slows down HTTP requests exponentially after they hit a certain threshold), and this can occupy a spam harvester bot for a long while.
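A rough Python sketch of the two ideas in this post. It is not wpoison or any actual Apache module; the function names, the threshold, and the delay values are all assumptions made for illustration. The fake addresses use the reserved example.com, example.net, and example.org domains so that nothing here can point at a real mailbox.

```python
import random
import string

# Reserved documentation domains, so no real mailbox is ever generated.
RESERVED_DOMAINS = ("example.com", "example.net", "example.org")


def bogus_addresses(count, seed=None):
    """Generate plausible-looking but useless e-mail addresses."""
    rng = random.Random(seed)
    addresses = []
    for _ in range(count):
        length = rng.randint(5, 10)
        local = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        addresses.append(f"{local}@{rng.choice(RESERVED_DOMAINS)}")
    return addresses


def tarpit_delay(request_count, threshold=20, base_delay=0.5, max_delay=60.0):
    """Seconds to stall before answering a client's Nth request.

    No delay up to the threshold; after that the delay doubles with each
    further request, capped at max_delay.
    """
    if request_count <= threshold:
        return 0.0
    # Cap the exponent so very chatty clients do not produce enormous powers.
    exponent = min(request_count - threshold - 1, 32)
    return min(base_delay * 2 ** exponent, max_delay)
```

With these defaults the 21st request waits 0.5 seconds, the 22nd waits 1 second, and so on until the 60-second cap. A server would call `tarpit_delay` with a per-client request counter and sleep for the result before replying.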

  7. #7
    Blasted Weeds Tude's Avatar
    Join Date
    Aug 2006
    Location
    Rochester, NY
    My Bikes
    Trek 1200C, Specialized Rockhopper, Giant Yukon FX, Giant Acapulco
    Posts
    1,182
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by mlts22 View Post
    On my websites, I use a script called wpoison, which gives web bots that ignore the robots.txt file oodles of seemingly valid, but absolutely bogus E-mail addresses. Some spam bots will happily run down random links the wpoison CGI URL gives for hours and hours, slurping up hundreds of thousands of bogus E-mail addresses.

    This makes the spammer E-mail databases get full of nonworking, useless addresses which they can't really filter out. Combine this with some form of tarpit Apache module (which slows down HTTP requests exponentially after they hit a certain threshold), and this can occupy a spam harvester bot for a long while.
    oooo I likes that one!

  8. #8
    Destroyer of Wheels Air's Avatar
    Join Date
    May 2006
    Location
    Creating some FA-Qs
    My Bikes
    Nishiki Sport, Downtube IXNS, 1950's MMB3 Russian Folding Bike, MTB
    Posts
    3,564
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Me too!!

  9. #9
    Unique Vintage Steel cuda2k's Avatar
    Join Date
    May 2005
    Location
    Allen, TX
    My Bikes
    Kirk Frameworks JKS-C, Serotta Nova, Gazelle AB-Frame, Fuji Team Issue, Schwinn Crosscut, All-City Space Horse
    Posts
    11,493
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    I had a coworker who worked for Match.com before joining the team. He had all sorts of stories about how he'd screw with scammers. One involved programs designed to automatically download the content from Match.com to set up look-alike scam sites. Once those were identified, the site instead fed them a single profile, hundreds of thousands to millions of times. And let's just say this profile wouldn't bring in a lot of interested men.

  10. #10
    Elite Rep
    Join Date
    Aug 2004
    Location
    Melbourne - Australia
    Posts
    2,097
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Nerds.

  11. #11
    Crushing souls Hickeydog's Avatar
    Join Date
    Jun 2007
    Location
    Sagamore Hills, Ohio.
    My Bikes
    Trek 1500
    Posts
    1,591
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by blue_neon View Post
    Nerds.
    Incorrect. The correct term would be geek. Specifically, an Internet geek. A nerd is one who attempts to know everything possible. A geek is one who specializes in a specific field, and is a self-trained expert in that field.
    Quote Originally Posted by Wordbiker View Post

    What's frightening is how coherent Hickey was in posting that.

  12. #12
    J E R S E Y S B E S T Jerseysbest's Avatar
    Join Date
    Apr 2005
    Location
    DC
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Hickeydog View Post
    Incorrect. The correct term would be geek. Specifically, an Internet geek. A nerd is one who attempts to know everything possible. A geek is one who specializes in a specific field, and is a self-trained expert in that field.


    A Nerd would know the difference between a Nerd and a Geek.
    Quote Originally Posted by SingingSabre View Post
    Cheating: a symptom of the problem.

  13. #13
    Elite Rep
    Join Date
    Aug 2004
    Location
    Melbourne - Australia
    Posts
    2,097
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Lol

  14. #14
    Lanky Lass East Hill's Avatar
    Join Date
    Sep 2005
    Location
    Take a deep breath, and ask--What would Sheldon do?
    My Bikes
    Nishiki Nut! International, Pro, Olympic 12, Sport mixte, and others too numerous to mention.
    Posts
    21,575
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by mlts22 View Post
    On my websites, I use a script called wpoison, which gives web bots that ignore the robots.txt file oodles of seemingly valid, but absolutely bogus E-mail addresses. Some spam bots will happily run down random links the wpoison CGI URL gives for hours and hours, slurping up hundreds of thousands of bogus E-mail addresses.

    This makes the spammer E-mail databases get full of nonworking, useless addresses which they can't really filter out. Combine this with some form of tarpit Apache module (which slows down HTTP requests exponentially after they hit a certain threshold), and this can occupy a spam harvester bot for a long while.
    Oooh, sneaky, clever, and devilish!

    East Hill
    ___________________________________________________
    TRY EMPATHY & HAVE LOVE IN YOUR HEART, PERHAPS I'LL SEE YOU ON THE ROAD...

  15. #15
    So say we all.
    Join Date
    Nov 2004
    Location
    Austin, TX
    My Bikes
    Gary Fisher Wahoo, upgraded Specialized Allez
    Posts
    728
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I've always liked the teergrube: a host configured to be

    v-e-r-y s-l-o-w

    It works much better with a mailserver, but it works for anything that waits for and parses your reply. Wastes their time and keeps them from trashing other sites. (Well, slows down one thread, but you can only do what you can.)
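As a small illustration of the slow-drip idea, here is a Python sketch. It is an assumption-laden toy, not a real teergrube: the name `slow_drip` and its parameters are made up for this example. It hands out a reply a few bytes at a time with a pause before each piece, so a client that waits for the whole reply is held up for the full duration.

```python
import time


def slow_drip(payload, delay_seconds=1.0, chunk_size=1, sleep=time.sleep):
    """Yield payload in tiny chunks, pausing before each one.

    The sleep parameter is injectable so the pacing can be tested without
    actually waiting.
    """
    for start in range(0, len(payload), chunk_size):
        sleep(delay_seconds)
        yield payload[start:start + chunk_size]
```

A server loop would write each yielded chunk to the socket as it arrives. Each stalled connection also ties up a thread or socket on the server side, so the delay and the number of concurrent victims need a limit.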

    Of course, the real solution is to extend the death penalty to spammers.

  16. #16
    1/2 man,1/2 bear,1/2 pig ManBearPig's Avatar
    Join Date
    May 2004
    Location
    .
    My Bikes
    .
    Posts
    1,126
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by jsharr View Post
    i like pie.
    I agree, I like pie too.

    (You lost me at "website.")
    ...

  17. #17
    Senior Member
    Join Date
    Aug 2006
    Posts
    998
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by RedHairedScot View Post
    I've always liked the teergrube: a host configured to be

    v-e-r-y s-l-o-w

    It works much better with a mailserver, but it works for anything that waits for and parses your reply. Wastes their time and keeps them from trashing other sites. (Well, slows down one thread, but you can only do what you can.)

    Of course, the real solution is to extend the death penalty to spammers.
    I like the teergrube idea, but have been too lazy to implement it. Instead, I used SpamCannibal, which did a great job of reducing one domain's spam from 20,000 messages a day to a couple hundred. Eventually, I just got tired of the bandwidth drag, and moved the domain to a hosting ISP.

    One pleasant surprise of Exchange 2007 -- it does tarpitting automatically.

  18. #18
    Footballus vita est iamlucky13's Avatar
    Join Date
    Jun 2002
    Location
    Portland, OR
    My Bikes
    Trek 4500, Kona Dawg
    Posts
    2,118
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    My site went 3 years after I added user commenting before spam really got to be a big problem.

    Last month I finally buckled down and wrote a script to filter the spam. I'm stopping probably 99% of it now just by scoring posts based on their usage of certain words, and for the time being, storing the rest for analysis and to reaffirm how awesome I am at stopping mindless robots. In 4 weeks I've collected 614 of the little twerps.

    As a tip, about half of them send no http_referer. Checking for that is a really easy way to separate a big chunk of the bots from the humans.
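A sketch of what word-based scoring plus a referer check might look like, in Python. The post does not share the actual script, so the word list, the weights, the threshold, and the function names are all invented for illustration. The missing referer is treated as a penalty rather than an automatic block, because some legitimate browsers and privacy tools also omit that header.

```python
import re

# Hypothetical weights; the real script's word list is not given in the post.
SPAM_WORD_WEIGHTS = {"viagra": 5, "casino": 4, "mortgage": 3, "free": 1}
MISSING_REFERER_PENALTY = 3
SPAM_THRESHOLD = 5


def spam_score(text, referer):
    """Sum the weights of known spam words, plus a penalty for no referer."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(SPAM_WORD_WEIGHTS.get(word, 0) for word in words)
    if not referer:
        score += MISSING_REFERER_PENALTY
    return score


def is_spam(text, referer):
    """Flag a post once its score reaches the threshold."""
    return spam_score(text, referer) >= SPAM_THRESHOLD
```

For example, "FREE casino bonus" with no referer scores 1 + 4 + 3 = 8 and is flagged, while "free shipping on tires" with no referer scores 4 and gets through.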

    "I receive a ton of spam every day. Much of it offers to help me get out of debt or get rich quick." ~Bill Gates
    "The internet is a place where absolutely nothing happens. You need to take advantage of that." ~ Strong Bad
