The robots meta tag is an HTML tag that goes in the <head> of a page and provides instructions to bots. Like the robots.txt file, it tells search engine crawlers what they may do with your content, in this case whether or not they are allowed to index a page. You can also use it to tell search engines whether or not to follow links on the page. Unlike the robots.txt file, the robots meta tag only applies to the page it’s on, so you can’t use it to disallow entire directories or file types.
The robots meta tag matters because it adds an extra layer of protection on top of the robots.txt file. Robots.txt only blocks crawling: when a search engine finds an external link to one of your blocked pages, it can still index that URL without ever fetching the page. A noindex robots meta tag keeps the page out of the index, as long as crawlers are allowed to fetch the page and see the tag.
Using the robots meta tag to stop a page from being indexed and links from being followed looks like this:
<meta name="robots" content="noindex, nofollow">
While NoIndex and NoFollow are probably the two most used values for the meta robots tag, there are other directives you can use to pass instructions to search engine crawlers:
Index - Allows robots to crawl and index the page. This is the default setting for every page if you don’t use the meta robots tag, so the index directive isn’t necessary.
NoImageIndex - Tells search engines not to index images on the page. Note that search engines can still index images that are linked to directly from elsewhere, so it’s a good idea to use the X-Robots-Tag HTTP header on the image files themselves instead.
None - NoIndex and NoFollow in one command. This tells crawlers to do nothing with the page.
NoArchive - Stops search engines from displaying the cached version of the page.
NoCache - The same as NoArchive, but used only by MSN/Live.
NoSnippet - Prevents search engines from caching the page and from displaying the page’s snippet in search results.
NoYDir - This directive is only used by Yahoo!. It prevents the search engine from using the Yahoo! Directory description in the page’s search snippet. Since no other search engine uses the directory description for this, they don’t recognize this directive.
NoODP - Prevents search engines from using the description of the page from DMOZ in the search snippet. ODP stands for Open Directory Project, the community-edited directory also known as DMOZ.
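To make the directive list concrete, here is a minimal Python sketch of how a tool might normalize the content value of a robots meta tag. The function names parse_robots_content and is_indexable are hypothetical, chosen for illustration; the sketch only assumes what the list above states, namely that directives are comma-separated, case-insensitive in practice, and that None is shorthand for NoIndex plus NoFollow.

```python
from __future__ import annotations


def parse_robots_content(content: str) -> set[str]:
    """Normalize a robots meta `content` value into a set of lowercase directives."""
    directives = {part.strip().lower() for part in content.split(",") if part.strip()}
    # "none" is shorthand for noindex + nofollow, so expand it.
    if "none" in directives:
        directives.discard("none")
        directives.update({"noindex", "nofollow"})
    return directives


def is_indexable(content: str) -> bool:
    """Indexing is the default; it is only blocked by noindex (or none)."""
    return "noindex" not in parse_robots_content(content)
```

For example, is_indexable("noarchive") is true because indexing is the default, while is_indexable("None") is false.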
If the whole point of SEO is getting pages into search results, how on Earth does using the robots meta tag to keep a page out of them help SEO?
It prevents private pages from being indexed and displayed in search results. It’s generally advisable not to publish this content to your site at all, or to password protect it. However, if for some reason you have to put it on your site, the robots meta tag will keep it out of Google.
It helps search engines to crawl your site more efficiently. Search robots have limited crawl budgets, so they could theoretically spend all their time crawling pages you don’t really care about ranking while ignoring your most important ones. Blocking indexing of these unimportant files will help guide crawlers to your more valuable pages.
If you’ve got a page that’s acquired a lot of link juice, but you don’t want it indexed, combine noindex with the follow directive (content="noindex, follow") to pass that link juice to other pages of your site.
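As a rough illustration of the last point, the sketch below builds the tag for a page that should stay out of the index while still letting crawlers follow its links. The helper name robots_meta and its two boolean parameters are assumptions made for this example, not part of any library.

```python
def robots_meta(index: bool = True, follow: bool = True) -> str:
    """Build a robots meta tag from two independent choices (hypothetical helper)."""
    index_part = "index" if index else "noindex"
    follow_part = "follow" if follow else "nofollow"
    return f'<meta name="robots" content="{index_part}, {follow_part}">'


# Keep the page out of search results but let its links be followed.
print(robots_meta(index=False, follow=True))
# prints: <meta name="robots" content="noindex, follow">
```

Keeping the two choices separate makes it harder to reach for "noindex, nofollow" out of habit when only noindex is wanted.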
The most important part of using the robots meta tag is to make sure you’re using it correctly. It’s not unheard of for an entire site to get deindexed because someone accidentally added the robots noindex tag to the entire site. So understanding how the robots meta tag works is absolutely vital for SEO.
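One way to guard against an accidental sitewide noindex is to scan your page templates before they go live. The following is a minimal sketch using Python’s standard html.parser module; the names RobotsMetaFinder and blocks_indexing are hypothetical, and the check only looks at the generic robots meta tag, not at crawler-specific tags or the X-Robots-Tag header.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when the attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def blocks_indexing(html: str) -> bool:
    """Return True if the page carries a noindex (or none) robots directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return bool(finder.directives & {"noindex", "none"})
```

Running blocks_indexing over the rendered HTML of a few key pages, such as the homepage and top landing pages, would flag a stray noindex before search engines see it.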