Originally Posted by IBkatherine
Tech added a 5-second delay over the weekend (which did not appear to do much, as a lot of you were still experiencing the error). It was changed back to a one-second delay yesterday. An IP was banned, but the errors still persisted. Nothing stands out that points to why we were seeing the db errors, but it seems to have died down today; I at least have not come across any errors...
The server logs should give you an idea of which domains are hitting the server the hardest; a quick tally per IP and per user agent, as sketched below, usually surfaces the culprits. Also, keep in mind that robots.txt is advisory: well-behaved clients agree to follow its directives, but there's no enforcement. It's possible someone is ignoring the crawl delay or hitting one or more of the disallowed URLs, and either would warrant a block. If banning a single IP helped, it may have just been a poorly behaved homegrown crawler. A larger crawler would likely hit you from multiple IP addresses, so you might have to block an entire address range.
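Here's a minimal sketch of that kind of tally in Python, assuming an access log in the common "combined" format; the log path and the regex are assumptions you'd adjust for your actual server:

```python
#!/usr/bin/env python3
# Minimal sketch: tally requests per client IP and per user agent from
# an access log in the common "combined" format. The log path and the
# regex are assumptions; adjust them to match your server's setup.
import re
from collections import Counter

LOG_PATH = "/var/log/apache2/access.log"  # hypothetical path

# combined format: ip ident user [time] "request" status bytes "referer" "agent"
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d+ \S+ "[^"]*" "([^"]*)"')

ip_counts = Counter()
agent_counts = Counter()

with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        match = LINE_RE.match(line)
        if match:
            ip, agent = match.groups()
            ip_counts[ip] += 1
            agent_counts[agent] += 1

print("Top 10 client IPs:")
for ip, count in ip_counts.most_common(10):
    print(f"{count:8d}  {ip}")

print("\nTop 10 user agents:")
for agent, count in agent_counts.most_common(10):
    print(f"{count:8d}  {agent[:80]}")
```

Run that against a day or two of logs; whatever dominates both lists is your first candidate for a block.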
Also, sticking a modern web server like nginx in front of your actual web server (if you haven't already) could give you more knobs to turn for blocks, rate limits, and blacklists; a sketch follows.
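For instance, a hedged sketch of what that nginx front end might look like; the zone name, rate, blocked range, and backend port are all illustrative, not taken from this site's actual configuration:

```nginx
# Minimal sketch of the extra knobs an nginx front end gives you; the
# zone name, rate, blocked range, and backend port are placeholders.

# Inside the http {} block: track request rate per client IP
# (10 MB of state, roughly one request per second per address).
limit_req_zone $binary_remote_addr zone=perip:10m rate=1r/s;

server {
    listen 80;

    # Drop a misbehaving address range outright.
    deny 203.0.113.0/24;   # example range only
    allow all;

    location / {
        # Allow short bursts, then throttle instead of letting a
        # crawler hammer the database behind the app server.
        limit_req zone=perip burst=5;
        proxy_pass http://127.0.0.1:8080;  # your actual web server
    }
}
```

With limit_req in place, a crawler that ignores the robots.txt delay gets 503 responses by default instead of queueing up database queries.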