
Robots.txt

A while back, I was having some major bandwidth issues: hundreds of MBs were being eaten up every day. It was really scary, and I had no idea why.

So I looked through my Webalizer stats (Webalizer is a stats analyzer that comes with your domain package in cPanel).

I noticed that one of my hostees, who had only one page on her entire website, was eating over 3 gigs of bandwidth a month. Yikes.

Then I checked the latest visitors to her site. It turns out Google.com/images was visiting her site at LEAST three times an hour, tripling her traffic and displaying her images by hotlinking them in Google Image Search.

Don't get me wrong, Google Image Search is God's gift to images, but I really can't have Google sucking up the bandwidth that I paid for, so I had to stop letting Google get through, or at least slow it down.

I did this by creating a file called robots.txt and pasting in the following code:


That's all there is to it. Save the file as robots.txt and put it in the root of your domain. This limits how often Google can visit your site each day, which slows down how often the image cache in Google Images gets refreshed.
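
If you want to sanity-check a robots.txt before uploading it, here is a minimal sketch in Python using the standard library's urllib.robotparser. The rules in ROBOTS_TXT are an assumption for illustration, not necessarily the exact lines from my file: they shut out Google's image crawler (Googlebot-Image) completely and leave every other crawler alone. The helper name is_allowed and the example.com URL are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules (an assumption, not the original file from this page):
# block Google's image crawler everywhere, allow every other crawler.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(is_allowed("Googlebot-Image", "http://example.com/blue.jpg"))
    print(is_allowed("Googlebot", "http://example.com/blue.jpg"))
```

With these rules, the first check should come back False (the image crawler is blocked) and the second True (the regular crawler still gets in). This only tells you how a well-behaved crawler would read the file; it can't make a crawler obey it.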

Not that you care, but an image cache is the copy of an image that Google keeps on record. Say that you uploaded a picture called blue.jpg, and Google added blue.jpg to its Image Search. Then, about twenty minutes later, you change blue.jpg, editing it in some way.

Google refreshes this image every 10 minutes or so (eats up a hell of a lot of bandwidth, I might add), to keep up with the changes you're making to blue.jpg.

Robots.txt tells Google, "hey, get out, don't refresh my images". And guess what? Google listens (and so does Yahoo).

Hopefully that cuts down the bandwidth use on your site. Or helps, at least.


copyright 2008 Maggie N