Canonical Issues & Duplicate Content November 12, 2008

Posted by Sarah Bernier in Things.

In reviewing/auditing websites for my current employer, I continue to run across a very common issue: duplicate content due to canonical issues. Now, I’m sure any SEO knows that – over time – search engines will naturally canonicalize URLs (pick what they think is the preferred URL). But in the meantime, if you’re acquiring any 3rd-party links, odds are they’re spread out amongst the various ‘versions’ of the URLs.

The issue here isn’t dupe content, since you don’t incur “penalties,” per se, for having duplicate content. Rather, as mentioned above, over time the search engines just pick what they see as the original source of the content and display only that version.

For the sake of argument, let’s review some possible versions of a URL, such as http://mysite.com, http://www.mysite.com and http://www.mysite.com/index.html. Keep in mind that in spite of rendering the same content, search engines still see these as separate URLs.

In instances where the site is set up incorrectly – and the preferred domain isn’t denoted – sites can have multiple versions (four or more) of their content. The longer the site is online, the greater the chances the “link love” will be spread out between different URLs. This is especially true since, in the majority of sites where I see this happening, the internal links to the home page of the site don’t point to the root-level domain (www.mysite.com), but instead to some other version (most often some variation of http://www.mysite.com/index.html).
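To make the "multiple versions" point concrete, here is a minimal Python sketch that collapses several variants of a home page URL into one form. The host name, the choice of www as the preferred version, and the list of default document names are all assumptions for illustration, not something any search engine is known to apply.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these file names are served as a directory's default document.
DEFAULT_DOCUMENTS = ("index.html", "index.htm", "index.php")


def canonicalize(url: str) -> str:
    """Map a URL variant onto one preferred form (www host, no index file)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumption: the www version is the preferred domain.
    if not host.startswith("www."):
        host = "www." + host
    path = parts.path or "/"
    for name in DEFAULT_DOCUMENTS:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the trailing slash
            break
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


variants = [
    "http://mysite.com",
    "http://www.mysite.com",
    "http://mysite.com/index.html",
    "http://www.mysite.com/index.html",
]

# Four different URLs as far as a crawler is concerned, one page in practice.
canonical_forms = {canonicalize(u) for u in variants}
print(canonical_forms)
```

Until the site itself enforces one form, inbound links can land on any of the four variants.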

Great, so we’ve diagnosed this…now what?

The first thing to do is decide which URL is the canonical version. Most webmasters use http://www.mysite.com, but there could be reasons why they’d choose mysite.com. Whichever version is chosen, you must be sure to remain consistent throughout the site.
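One way to check that consistency is to scan a page's internal links and flag any that point at a non-preferred version. The sketch below uses only the Python standard library; the host names, the `inconsistent_links` helper and the rule that flags `/index.html` links are hypothetical choices for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

CANONICAL_HOST = "www.mysite.com"            # assumption: www is preferred
SITE_HOSTS = {"mysite.com", "www.mysite.com"}  # hosts that belong to the site


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def inconsistent_links(html: str) -> list:
    """Return absolute internal links that do not use the preferred form."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        host = parts.netloc.lower()
        if host not in SITE_HOSTS:
            continue  # relative or external link, not checked here
        if host != CANONICAL_HOST or parts.path.endswith("/index.html"):
            flagged.append(href)
    return flagged


page = (
    '<a href="http://www.mysite.com/">Home</a>'
    '<a href="http://mysite.com/about.html">About</a>'
    '<a href="http://www.mysite.com/index.html">Home again</a>'
    '<a href="http://example.com/">Elsewhere</a>'
)
print(inconsistent_links(page))
```

Relative links are skipped on purpose, since they inherit whatever host the visitor arrived on.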

I’m not a technical person, so I’m deferring to the “code nerds” on this one, but I do know you need to 301 redirect the non-www version to the www version (or the other way around, if you chose the non-www version as canonical). Every content management system (CMS) should have a relatively easy way to do this. For those of you not using a CMS, read about avoiding duplicate content by using .htaccess files in more techy detail.

Keep in mind that these 301s need to be implemented at the page level, so that each non-www URL redirects to its own www counterpart. Recently I gave an audit to a company with an in-house tech team. They didn’t quite understand how to go about fixing the problem, and in fact made it worse by redirecting all non-www versions of URLs across the site back to the root-level domain! Did I mention they did this with 302 (temporary) redirects? Oops.
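For the code nerds, the page-level rule can be sketched as a small Python function. This is an illustration under assumed names (`CANONICAL_HOST`, `redirect_for`), not a drop-in for any particular server or CMS. The point is that the path and query string carry over unchanged, and the status is 301 (permanent) rather than 302.

```python
from typing import Optional, Tuple

CANONICAL_HOST = "www.mysite.com"  # assumption: www is the preferred domain


def redirect_for(host: str, path: str, query: str = "") -> Optional[Tuple[int, str]]:
    """Return (status, location) if the request needs a redirect, else None."""
    if host.lower() == CANONICAL_HOST:
        return None  # already on the preferred host, serve the page
    # Keep the requested path and query so each page maps to its own twin,
    # instead of sending everything to the home page.
    location = "http://" + CANONICAL_HOST + (path or "/")
    if query:
        location += "?" + query
    return 301, location


print(redirect_for("mysite.com", "/products/widgets.html"))
print(redirect_for("www.mysite.com", "/products/widgets.html"))
```

The mistake described above corresponds to always returning `"http://www.mysite.com/"` as the location, with a status of 302.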

Long story short: Make sure you don’t have more than one “site” floating about online, and when you try to fix something that’s broken, make sure you completely understand the inner workings of a website first 🙂


