Avoid Duplicate Content Penalties

Bill has a great post (again) on duplicate content, and looks at what conditions may cause a search engine not to list pages:

1. Product descriptions from manufacturers, publishers, and producers reproduced by a number of different distributors in large ecommerce sites
2. Alternative print pages
3. Pages that reproduce syndicated RSS feeds through a server side script
4. Canonicalization issues, where a search engine may see the same page as different pages with different URLs
5. Pages that serve session IDs to search engines, so that they try to crawl and index the same page under different URLs
6. Pages that serve multiple data variables through URLs, so that they crawl and index the same page under different URLs
7. Pages that share too many common elements, or whose elements are very similar from one page to another, including titles, meta descriptions, headings, navigation, and globally shared text
8. Copyright infringement
9. Use of the same or very similar pages on different subdomains or different country top level domains (TLDs)
10. Article syndication
11. Mirrored sites

What a great checklist! Bill goes into a lot more detail, so be sure to read his post.
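
Items 4 to 6 on the checklist share one cause: the same page is reachable under several URLs. As a rough illustration only, here is a minimal Python sketch of collapsing such variants into a single canonical form. The parameter names in `IGNORED_PARAMS` and the decision to strip `www.` are assumptions made for the example, not rules from Bill's post or from any search engine.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of query parameters that do not change page content.
# A real site would need its own list.
IGNORED_PARAMS = {"sid", "sessionid", "phpsessid"}


def canonicalize(url: str) -> str:
    """Return one normalized form for URLs that point at the same page."""
    parts = urlsplit(url)

    # Host names are case-insensitive; stripping "www." is an assumption
    # that only holds when both hosts serve identical content.
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    if parts.port:
        host = f"{host}:{parts.port}"

    # Drop session-style parameters and sort the rest so that
    # ?a=1&b=2 and ?b=2&a=1 collapse to the same string.
    pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    pairs.sort()

    # Treat /shoes/ and /shoes as the same path (again, an assumption).
    path = parts.path or "/"
    if path != "/" and path.endswith("/"):
        path = path.rstrip("/")

    # The fragment is discarded because it is never sent to the server.
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(pairs), ""))


variants = [
    "http://WWW.Example.com/shoes/?color=red&sid=abc123",
    "http://example.com/shoes?sid=zzz999&color=red",
]
# Both variants reduce to the same canonical string.
print({canonicalize(u) for u in variants})
```

If a site applies a rule like this consistently, for instance by redirecting every variant to the canonical form, a crawler has only one URL per page to index.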

Bill also looks at some of the papers on duplicate content issues. I’m reading through one of the Microsoft papers, which is hard going yet interesting, and it occurs to me that keyword-based SEO has a fundamental problem:

all things being equal, if you choose the same keyword phrase that a lot of other people are using, you may be more likely to be taken out by duplicate content filters

The probability that two unique pages share the same text phrase at a higher-than-average density is low, so when it does happen it should raise duplicate content flags. That is not necessarily because the pages are an exact match, but because they are too similar in content to be shown in the same SERP.
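
To make "too similar" concrete, here is a minimal Python sketch of one widely known near-duplicate measure: word shingles compared with Jaccard similarity. It is an assumption chosen for illustration, not the method described in the Microsoft paper or used by any particular search engine, and the shingle size of 3 is arbitrary.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Shared shingles divided by all distinct shingles, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


page_a = "Buy cheap blue widgets online. Cheap blue widgets shipped today."
page_b = "Buy cheap blue widgets online. Cheap blue widgets in stock now."
page_c = "A short history of widget manufacturing in northern England."

# Pages built around the same repeated phrase score far higher against
# each other than against a page that merely covers the same topic.
print(jaccard(page_a, page_b))
print(jaccard(page_a, page_c))
```

A filter built this way would need a threshold above which two pages count as near duplicates. Where that threshold sits is exactly the kind of detail the search engines do not publish.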

  1. Chris, 06-12-2006

    So what are our options in terms of targeting specific key phrases and key words? Also, how does this compare to sites such as Dell and Gateway that don’t focus on those keywords but still obviously get top rankings for ‘computer’.

    This is an even larger step in the direction that anchor text is ‘most’ important, at least I would think so.

  2. Peter Da Vanzo, 06-12-2006

    >>Also, how does this compare to sites such as Dell and Gateway that don’t focus on those keywords but still obviously get top rankings for ‘computer’.

    Indeed. Obviously, external quality scores count for a lot more than on-page text.

    >>So what are our options in terms of targeting specific key phrases and key words?

    That’s a good question.

    I found it interesting that the scientists are describing duplicate content in a different way than many SEOs do. To them, duplicate means too similar in terms of topic. If this is so, then pages featuring popular, tightly optimised keyword phrases are more likely to appear aberrant when the duplicate content filters are applied across the data set. The winner will be the site with the most quality indicators, the rest will be de-emphasised in order to promote SERP variety.

    So, yes – links are important 🙂 Use semantic variation. Produce unique documents about a topic, rather than about keyword terms.

  3. Chris, 06-12-2006

    Thanks for the reply, Vanzo. Variation is definitely going to be key to successful SEO; it’s just a changing market. I somewhat agree with Google’s methods of changing up the algorithm… but it makes it very hard on Webmasters… but possibly will eventually make it easier on the common search user.

    Unique documents…. so limit your quotations. 😉

  4. Peter Da Vanzo, 06-12-2006

    Peter, actually 😉

    >>so limit your quotations.

    Heh heh. Yep – or be careful about the context.

    >>but it makes it very hard on Webmasters

    Sure. It isn’t Google’s job to make life easy for anyone competing with Adwords, or those who don’t provide the end user with content Google considers valuable.

    Then again, the webmasters who know how to work the new systems don’t have half the competition they used to. The bar has been raised.

  5. vrejen, 08-15-2006

    Hey Peter,

    You seem to really know your stuff! Any chance you may be interested in sharing your expertise with the Online Casino Industry at a Conference in Las Vegas?

    Jen
