Auto-Detection Of Sitemaps Via Robots.txt

Google, Yahoo, MSN, and Ask have jointly announced a new robots.txt feature: sitemap auto-discovery.

The new open-format auto-discovery allows webmasters to specify the location of their sitemaps within their robots.txt file, eliminating the need to submit sitemaps to each search engine separately.

What are sitemaps?

A sitemap is an XML file that lists the URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is relative to other URLs on the site) so that search engines can crawl the site more intelligently. More information and formatting guidelines are available at sitemaps.org.
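For reference, a minimal sitemap describing a single URL looks something like the following; the domain, date, and metadata values are placeholders:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/</loc>
        <lastmod>2007-04-11</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>

Only the <loc> tag is required; <lastmod>, <changefreq>, and <priority> are optional hints to crawlers.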

What is the robots.txt specification for a sitemap?

Sitemap: <sitemap_location>
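So, for example, a robots.txt that advertises a sitemap might look like this (example.com and the file name are placeholders):

    User-agent: *
    Disallow:

    Sitemap: http://www.example.com/sitemap.xml

The Sitemap line is independent of any User-agent block, can appear anywhere in the file, and must give the full URL of the sitemap file.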

  1. ben 04-11-2007

    This is definitely a better idea than having to register at each service separately. I still wonder what the verdict is on sitemaps. There is one downside, though: it lets anyone see every URL on a site (theoretically, if not likely) and thus might reveal a previously hidden attack vector in a page with an application vulnerability. That isn't much of a drawback, because you can probably safely assume that if a site has a vulnerable page, it is already in the Google index.

  2. Peter Da Vanzo 04-12-2007

    They could, but I guess they have the means already, if they’re determined enough.
