Tuesday, 17 July 2007

Blogger now uses a robots.txt file to exclude /search pages from the Google index

This morning, I was checking the Google Webmaster Tools statistics for my blog, davids-pics.blogspot.com, when I noticed an odd message: "URLs restricted by robots.txt - 14". Blogger doesn't allow you to edit the robots.txt file, so I was confused as to how these URLs came to be restricted. I certainly didn't want Google missing out on some of my site's content.

A quick look at the "robots.txt analysis" page revealed the cause of this disturbance. It turns out that Google (Blogger) has created a default robots.txt file on all blogspot blogs, with the following lines included:

User-agent: *
Disallow: /search

This change has resulted in all of the label search pages in any blogspot blog being excluded from the Google index, as well as from the index of any other search engines that obey the robots.txt exclusion protocol.
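
If you want to see which of your own URLs these two lines affect, here is a minimal sketch using Python's standard urllib.robotparser module. It parses only the two lines quoted above (the real file may contain more), and the example paths are my assumptions: the post URL is made up, and /search/label/flora is assumed to be the address of the flora label page.

```python
from urllib.robotparser import RobotFileParser

# Only the two lines quoted in this post; the live robots.txt may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

BLOG = "http://davids-pics.blogspot.com"


def is_crawlable(url, user_agent="*"):
    """Return True if the rules above let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical paths, used purely for illustration.
for path in ("/", "/2007/07/example-post.html", "/search/label/flora"):
    status = "allowed" if is_crawlable(BLOG + path, "Googlebot") else "blocked"
    print(path, status)
```

Disallow rules match by path prefix, so everything whose path begins with /search is blocked, while the home page and ordinary post pages remain open to crawlers.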

I don't know why the Blogger developers have decided to do this, but I certainly don't enjoy the consequences. A large portion of my traffic came from search engines to my blog's label search pages, and I will now lose all of that traffic. Also, it makes sense that people who search for "flora nature photography gallery" would want to land on the flora search page of my nature photography blog.

Regardless of my traffic loss, the robots.txt addition is here to stay. What are your thoughts about this recent change to blogspot.com blogs, and most importantly: Do you think that this exclusion will have good long-term effects on the blogosphere?

Related Sites:
Blogger
David's Nature Photo Gallery
robotstxt.org

4 comments:

acca said...

Yeah, that drove me crazy, too, when I saw it. But I still see my search pages in the Google index. Maybe it will cause problems with new labels?

David said...

Google certainly won't index any new label search pages, but I think for the moment at least they are leaving the existing search pages in their index. They generally take a while to get rid of content from their index.
David

Phatrick said...

I think their reasoning must be that labels are essentially open to the same abuses that metatags were.

However, I'm not sure that this is going to be a permanent thing.
