Googlebot: Tips On How To Direct The Crawler

Matt Cutts offers a few tips on how to control Googlebot's access to your site. These tips will help ensure your site is indexed correctly.

From Matt Cutts's blog:

  • At a site or directory level, I recommend an .htaccess file to add password protection to part of a domain.
  • At a site or directory level, I also recommend a robots.txt file.
  • At a page level, use meta robots tags in the head of your HTML page.
  • At a link level, you can add a nofollow attribute to individual links to prevent Googlebot from following them (you could also make the link redirect through a page that is forbidden by robots.txt).
  • If the content has already been crawled, you can use our URL removal tool.
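As a rough sketch of the directory-, page-, and link-level options above (the paths and URLs here are placeholders, not from the original post):

```
# robots.txt — directory level: ask all crawlers to skip a directory
User-agent: *
Disallow: /private/

<!-- page level: meta robots tag inside the page's <head> -->
<meta name="robots" content="noindex, nofollow">

<!-- link level: nofollow on an individual link -->
<a href="http://example.com/some-page" rel="nofollow">example link</a>
```

Note that robots.txt tells well-behaved crawlers not to fetch a URL, while the meta noindex tag tells them not to index a page they have fetched; .htaccess password protection is the only option of these that actually blocks access.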

There are also a few curious pieces of information in the comments, namely that Google has “gotten better” at crawling JavaScript links. I’ve also noticed that Google has “gotten better” at crawling scripted CGI links, which unfortunately can earn you a duplicate-content penalty.
