Duplicate content is just what it says it is: content that appears in more than one place. When duplicate content occurs on the internet, search engines can’t always be relied upon to pick the most relevant version to show when someone makes a search.
Search engines try to display all the relevant results without repeating the same content. That means Google and the other search engines have to choose which version is the original one, or which is the best one, and leave the duplicates out.
What does this mean for your website? Your SEO can be compromised. Links that could all point to one page are split among the duplicates, so your ranking potential for a given keyword is reduced.
Most of the causes for duplicate content are technical and baffle those of us who remain un-techified. The fixes, however, are often less daunting.
Here are the most common duplicate content causes and some easy fixes.
URL Parameters That Track Data
An SEO-friendly URL won’t have parameters, and if it does, no more than two should be used. Parameters are the part of the URL after the question mark. They pass data to the server, for example to retrieve the correct records or to track where a visitor came from.
For example, these two sample links load the same page, but the second one adds a tracking parameter:
http://www.example.com/keyword-x/
http://www.example.com/keyword-x/?source=rss
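The tracking parameter doesn’t change the page, so search engines see two URLs with the same content. The same thing happens when the same parameters appear in a different order. Be sure to tell your programmer to always build your parameters in the same order. See, that’s easy to do!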
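If you or your programmer want to see the idea in code, here is a minimal Python sketch. It assumes that parameters such as source (from the sample link) and the utm_ family are used only for tracking on your site, so they can be dropped, and that everything else can be sorted alphabetically. The function name normalize_url and the TRACKING_PARAMS list are made up for this illustration, so adjust them to your own site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these parameters only track visitors and never change the page.
TRACKING_PARAMS = {"source", "utm_source", "utm_medium", "utm_campaign"}

def normalize_url(url):
    """Drop tracking parameters and sort the rest so equivalent URLs match."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    params.sort()  # always build the parameters in the same order
    # The fragment (the part after "#") is dropped as well.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(params), ""))

# Both sample links collapse to the same address.
print(normalize_url("http://www.example.com/keyword-x/"))
print(normalize_url("http://www.example.com/keyword-x/?source=rss"))
```

A helper like this could be used when generating internal links, or when auditing a list of URLs to see which ones are really the same page.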
Link Back to Original Content
Your content can also be “scraped,” or copied to someone else’s site without your consent. This leaves the search engine with another version to rank, and that version competes with your original.
Because search engines can’t always tell the original content from the scraped copy, website owners should include links to their own sites within their content. This way scraped content will point back to the original content and allow for some potential traffic.
If the scrapers don’t remove the links, they can also help search engines determine which content is the original, provided enough links point to your site.
Session IDs
Some websites store a different session ID in the URL for every visitor. Ecommerce sites often do this to keep track of visitors and make storing items in a shopping cart possible.
A unique ID number is added to the URL for every visitor to your site, and for every page of your site. For example:
http://site.com/product?id=1234567890
http://site.com/product?id=1234567891
http://site.com/product?id=1234567892
Search engines see each of these URLs as a separate page with the same content. One fix is to disable session IDs in URLs in your system’s settings and instead use cookies to keep track of each visitor’s products.
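To show what moving the ID into a cookie can look like, here is a small Python sketch that uses only the standard library. The cookie name session_id and the function name session_cookie_header are assumptions made for this example. Your shopping cart system will most likely have its own setting for this, so treat it as an illustration rather than something to paste in.

```python
from http.cookies import SimpleCookie
import secrets

COOKIE_NAME = "session_id"  # hypothetical cookie name

def session_cookie_header(session_id=None):
    """Build the value of a Set-Cookie header that carries the session ID."""
    cookie = SimpleCookie()
    # Generate a random ID when the visitor does not have one yet.
    cookie[COOKIE_NAME] = session_id or secrets.token_hex(16)
    cookie[COOKIE_NAME]["path"] = "/"       # valid for every page of the site
    cookie[COOKIE_NAME]["httponly"] = True  # not readable from page scripts
    return cookie[COOKIE_NAME].OutputString()

header = session_cookie_header("1234567890")
print("Set-Cookie:", header)
```

With the ID carried in a cookie, every visitor sees the same product URL, so there is only one version of the page for search engines to find.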
Printer Friendly Pages
If you have a link on your website that reads something like “click here for a printer-friendly version,” you have duplicate content.
Every time a visitor follows this link, a separate document containing duplicate content is loaded. Search engines will find these documents and decide which version to show.
Because you want the version that contains all your additional site information, not just the stripped-down, printable version, you should use a print style sheet. A print style sheet lets the same page print cleanly, so you don’t need a separate printer-friendly page at all.
Since this is a rather techie solution, ask your web developer to help you. Or you can go to the WordPress Styling for Print page to see for yourself.
Pick Either WWW or non-WWW
When your site can be reached both with and without the WWW, search engines can still treat the two versions as separate sites. Solving this problem means choosing your preferred domain and telling Google which one should be shown in search engine results pages: the one with the WWW or the one without it.
Take your pick:
http://www.example.com/page.html
http://example.com/page.html
Get your easy fix for this split-identity issue by following my instructions in Improve Your WordPress Site’s SEO With a Single URL. Remember, if I can do it, so can you.
Remember to stay consistent when you link within your website and always stick with your preferred domain.
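If you want to check your internal links for consistency, something like the following Python sketch could help. It assumes you picked example.com (without the WWW) as your preferred domain. PREFERRED_HOST and to_preferred_domain are names invented for this example.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "example.com"  # assumption: the non-WWW version was chosen

def _strip_www(host):
    """Return the host name without a leading 'www.'."""
    return host[4:] if host.startswith("www.") else host

def to_preferred_domain(url, preferred_host=PREFERRED_HOST):
    """Rewrite a link to the preferred host if it points at our own site."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Only touch links to our own site; links to other sites stay as they are.
    if _strip_www(host) == _strip_www(preferred_host):
        host = preferred_host
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))

print(to_preferred_domain("http://www.example.com/page.html"))
print(to_preferred_domain("http://example.com/page.html"))
```

Both calls give the same address, which is the point: every internal link ends up on the one domain you chose.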
Two More Duplicate Content Solutions
Sometimes you want to have multiple versions of a page available for users, or you simply can’t get rid of them. You can manage this duplicate content in two ways.
1. Adding a Canonical URL Link
Once you’ve chosen your preferred domain, you might need to add a rel="canonical" link to the <head> section of your pages. It tells search engines which URL is the original one. It will look like this:
<link rel="canonical" href="http://example.com/keyword-x/"/>
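To check whether a page actually carries a canonical link, a short script can read the HTML and report what it finds. The Python sketch below uses only the standard library. The names CanonicalFinder and find_canonical are made up for this example, and the sample page is a stand-in for your own HTML.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")

def find_canonical(html):
    """Return the canonical URL declared in the page, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

sample_page = (
    "<html><head>"
    '<link rel="canonical" href="http://example.com/keyword-x/"/>'
    "</head><body>Hello</body></html>"
)
print(find_canonical(sample_page))
```

If the function returns None for a page that has duplicates elsewhere, that page is a candidate for a canonical link.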
2. Do a 301 Redirect
Another option for dealing with wrong URLs for content is to redirect them. A 301 redirect is a permanent redirect that sends visitors and search engines from the duplicate page to the original page, passing the duplicate’s ranking power on to the original.
See Improve Your WordPress Site’s SEO With a Single URL for instructions.
For the non-techies, this could prove to be confusing, so ask for help.
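For the curious, here is roughly what a 301 redirect does behind the scenes, written as a tiny Python web application (a WSGI app). It is only a sketch under assumptions: the preferred domain is taken to be example.com, and the names PREFERRED_HOST and redirect_app are invented for this example. On a real site the redirect is usually set up in the web server or in WordPress itself, not in hand-written code.

```python
PREFERRED_HOST = "example.com"  # assumption: the non-WWW version was chosen

def redirect_app(environ, start_response):
    """Send a 301 to the preferred domain, or serve the page if already there."""
    host = environ.get("HTTP_HOST", "")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    if host != PREFERRED_HOST:
        location = "http://" + PREFERRED_HOST + path
        if query:
            location += "?" + query
        # 301 tells browsers and search engines the move is permanent.
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"original page"]

# Try it without a real server by faking one request to the WWW version.
seen = {}

def fake_start_response(status, headers):
    seen["status"] = status
    seen["headers"] = dict(headers)

redirect_app(
    {"HTTP_HOST": "www.example.com", "PATH_INFO": "/keyword-x/", "QUERY_STRING": ""},
    fake_start_response,
)
print(seen["status"], seen["headers"]["Location"])
```

The request for the WWW address is answered with a 301 and the address of the preferred version, so the visitor and the search engine both end up on a single URL.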
For a great tool to check for duplicate content, use Google Webmaster Tools. Go to Search Appearance, and then HTML Improvements to see if there are any concerns you should know about.