Ultimate Guide to WordPress SEO – Indexing Control

OK, so you’ve got yourself an XML Sitemap, and you’ve told Google where to find your sitemap using Google Webmaster Tools.  So far, you’re well on your way to optimizing your WordPress site.

The next thing in your journey to the ultimate in WordPress SEO is telling the Search Engines what they need to index.

The first thing you need to understand is the fact that there are certain things Search Engines like, and certain things Search Engines DO NOT like.

As a sub-point, we need to understand that Search Engines determine the content of your site by “crawling” it.

So what is “crawling”?  In a nutshell, “crawling” is when a Search Engine arrives at your page and follows every single link on your site.  When it arrives at the next page, it does the same thing, and so on, and so on.  When it reaches a new page, it “indexes” that page in its (very large) database of indexed sites.  So, if you have a WordPress site with 5 Pages and 50 Posts, search engines will find and index those Pages and Posts based on the links that point to them.
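To make the idea concrete, here is a minimal sketch of link-following in PHP. It does not fetch real pages; the `$links` array is a made-up link graph and `crawl()` is a hypothetical helper, so treat it as an illustration of the principle rather than how any search engine actually works.

```php
<?php
// Hypothetical link graph: each key is a URL, each value lists the URLs that page links to.
$links = [
    '/'                  => ['/about/', '/2009/hello-world/', '/category/news/'],
    '/about/'            => ['/'],
    '/2009/hello-world/' => ['/', '/category/news/'],
    '/category/news/'    => ['/2009/hello-world/'],
    '/orphan/'           => ['/'], // no page links here, so the crawl never reaches it
];

// Follow links breadth-first from $start and return every page that was reached.
function crawl(array $links, string $start): array
{
    $index = [];               // pages "indexed" so far, in discovery order
    $queue = [$start];         // pages waiting to be visited
    $seen  = [$start => true]; // guards against visiting a page twice

    while ($queue) {
        $url = array_shift($queue);
        $index[] = $url;
        foreach ($links[$url] ?? [] as $target) {
            if (!isset($seen[$target])) {
                $seen[$target] = true;
                $queue[] = $target;
            }
        }
    }
    return $index;
}

$indexed = crawl($links, '/');
print_r($indexed);
```

Notice that `/orphan/` never shows up in the result: a page that nothing links to cannot be discovered this way, which is why the links pointing at your Pages and Posts matter.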

For a more in-depth explanation of crawling, see this Wikipedia explanation.

The second thing you need to understand is that you have the ability to control what content on your site gets indexed by the crawlers.

And with WordPress, the process is very simple.

But we first need to understand why certain content shouldn’t be indexed by search engines.

Duplicate Content Penalty

Google has been very public about penalizing what it calls “duplicate content”.  When two or more pages on either the same or different domains or subdomains contain substantial blocks of content that “either completely match other content or are appreciably similar”, a penalty of some sort (usually a penalty in rankings for the duplicate content) will come into effect.

If you’re wanting to rank high for your targeted keywords, then being penalized by Google is the last thing you want, even if the penalty is minor.

You may be thinking, “But I don’t have any duplicate content on my site”.

Oh Really?

Little do most people know, WordPress creates what are considered “Archives” of posts.  These “Archives” are generated automatically based on things such as categories, dates, tags, authors, etc.  And each one of these archives is considered by search engines as being another page of content.  And since many posts fall into multiple archives, a single post excerpt could be showing up in multiple locations throughout your archives.
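As a rough illustration of how quickly this adds up, the sketch below lists the archive URLs on which a single post's excerpt could appear. The `archive_urls_for_post()` helper and the sample post are hypothetical, and the URL patterns assume pretty permalinks with the default category, tag, and author bases; your own site may differ.

```php
<?php
// Hypothetical helper: list the archive URLs that would include a given post.
// The URL patterns are assumptions based on default WordPress permalink bases.
function archive_urls_for_post(array $post): array
{
    $urls = [];
    foreach ($post['categories'] as $slug) {
        $urls[] = "/category/$slug/";          // one archive per category
    }
    foreach ($post['tags'] as $slug) {
        $urls[] = "/tag/$slug/";               // one archive per tag
    }
    $urls[] = "/author/{$post['author']}/";    // author archive
    $urls[] = "/{$post['year']}/";             // yearly date archive
    $urls[] = "/{$post['year']}/{$post['month']}/"; // monthly date archive
    return $urls;
}

// Made-up sample post, for illustration only.
$post = [
    'categories' => ['seo', 'wordpress'],
    'tags'       => ['indexing'],
    'author'     => 'admin',
    'year'       => '2009',
    'month'      => '06',
];

$urls = archive_urls_for_post($post);
print_r($urls);
```

Even this modest post, with two categories and one tag, lands on six archive pages in addition to its own permalink.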

You have to stop this! Thankfully, this is not hard to do.

There is a META tag that you can place in the header.php file for your theme that will tell the search engines not to index the content found on these archive pages.  And with a little conditional tag magic, we can tell Google to index only the content we want indexed.

<?php
// Index the homepage, single posts, and Pages; keep everything else out of the index.
if(is_home() || is_single() || is_page()) {
    echo '<meta name="robots" content="index,follow" />';
} else {
    echo '<meta name="robots" content="noindex,follow" />';
}
?>

What this little bit of code does is tell the search engines, “If this is the homepage, a single post, or a Page, then you are allowed to index it. If not, then do NOT index it.”

And if you, for whatever reason, want the search engines to index your homepage, Posts, Pages, and your Category Archives, then all you have to do is this:

<?php
// Same as before, but Category Archives are allowed into the index too.
if(is_home() || is_single() || is_page() || is_category()) {
    echo '<meta name="robots" content="index,follow" />';
} else {
    echo '<meta name="robots" content="noindex,follow" />';
}
?>
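If you would rather keep the decision separate from the output, something along these lines could work. This is only a sketch: `example_robots_content()` and `example_robots_meta_tag()` are hypothetical names, not WordPress functions, and the `function_exists()` guard is there so the file can also be run outside WordPress for a quick check.

```php
<?php
// Hypothetical helper: choose the robots value from plain booleans,
// so the rule can be checked without loading WordPress.
function example_robots_content(
    bool $is_home,
    bool $is_single,
    bool $is_page,
    bool $is_category = false
): string {
    return ($is_home || $is_single || $is_page || $is_category)
        ? 'index,follow'
        : 'noindex,follow';
}

// Hypothetical helper: build the META tag for a given robots value.
function example_robots_meta_tag(string $content): string
{
    return '<meta name="robots" content="'
        . htmlspecialchars($content, ENT_QUOTES)
        . '" />' . "\n";
}

// Register the output only when WordPress is loaded (add_action is defined).
if (function_exists('add_action')) {
    add_action('wp_head', function () {
        echo example_robots_meta_tag(
            example_robots_content(is_home(), is_single(), is_page())
        );
    });
}

// Standalone check of the rule for an archive page (none of the conditions are true).
echo example_robots_meta_tag(example_robots_content(false, false, false));
```

To allow Category Archives as well, you would pass `is_category()` as the fourth argument inside the hook.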

So now, you are safe from the duplicate content penalty! Your site is one step closer to being Search Engine Optimized!

Next time, we’ll cover how to optimize your Permalinks and Permalink structure for keyword targeting. Why not go ahead and Subscribe to this blog and get this series delivered to you daily? I promise, you’ll be glad you did!

Comments

  1. says

    This is great information Nathan. It seems to me that any theme developers would want to include this bit of code in their theme header.php files and then call that out in the theme’s feature list, but I haven’t ever seen anyone do that.

  2. says

    @Peter A. Mello:
    Not exactly … but it does need to be between the <head> and </head> tags.

    In most themes, you could probably just add the following code to your functions.php file between the opening and closing PHP tags:

    function index_the_right_stuff() {
        if(is_home() || is_single() || is_page() || is_category()) {
            echo '<meta name="robots" content="index,follow" />';
        } else {
            echo '<meta name="robots" content="noindex,follow" />';
        }
    }
    add_action('wp_head','index_the_right_stuff');
    

    This adds the code using a function that will output the code in your header.php through the wp_head hook.

  3. says

    @Chris,
    One small note, after visiting your site, I’m noticing that my syntax highlighter tends to add a space between the < and ?php when I post PHP code. Be sure to remove that space or the code won’t work.

    Nathan

  4. says

    Can I suggest the All In One SEO plugin? It does the same thing without having to go into the code (which will be wiped out if you update your theme), and you have a nice GUI for the options.

  5. says

    @Guy,
    Sure you can, but I disagree with you. These kinds of changes are not things that plugins should be responsible for doing. Code changes that will change the way your site interfaces with search engines should be handled by the theme, not a plugin.

    And you mentioned upgrading … I agree, it’s a risk. But think about what happens if the All in One SEO pack becomes incompatible with a future WordPress release. At least now you know how to do this kind of thing manually, so you’re not 100% dependent on a plugin for your SEO.

  6. says

    While viewing your source I noticed that you don’t seem to have any plugin-generated meta information. It seems as if you manually entered the meta tag information on every single page you have on your site. Also, because I use a plugin (Platinum SEO), my meta information is in the head section, but it’s toward the lower end of the head tag, while yours is in the upper end of the head section, right next to the title tag. How can I move my meta information up in the head area, and how can I build my meta on individual posts and pages without relying on a plugin? Thanks.

  7. Rich Staats says

    Is this the same as using a robots.txt file at the root? And if you use this, is it okay to still use the robots.txt file?

  8. says

    Hi! Your code is great but I have come across a few problems…

    First off, I have Pages for the Archive and Tag Cloud that will be indexed using this. I guess that is bad SEO? Is there any way to exclude certain Pages like these?

    Also, Pages made using Page Templates are “empty” and thus will lack description.

    Are there any good solutions? Sorry if this is really simple stuff, but I don’t know PHP so well.

  9. says

    Hello,

    If I want search engines to index only one category (in my case the category “musique”, for all playlists), what do I need to change in the code? Thanks.