Google Sends Alerts to Sites Using NoIndex in Robots.txt
If you’ve been using the "NoIndex" directive in your website’s robots.txt file, there’s a good chance you logged in to your Google Search Console account this morning and saw an alert from Google.
You probably also received an email from Google with the same alert.
Not sure what this notification means or why you’re getting it? Concerned about how it will impact your rankings?
This isn’t a notice that you’ve done something wrong or incurred a penalty from Google. But it is still important that you address the issue they’ve highlighted.
Why Did I Receive This Alert?
If you got this alert when you logged in to your Google Search Console account or got an email from Google, it’s because you were using the noindex directive in your robots.txt file.
As scary as it can sometimes be to receive alerts from Search Console, don’t be too alarmed if you see this one. Google isn’t telling you that you’re being penalized, that your site is inaccessible, or that there’s any other issue that will hurt your site’s performance.
What to do if you receive an alert
To address this issue, just dig into your robots.txt file, find where you’ve used the noindex directive, and remove it. You might think it’s better to just leave it and let Google ignore it (which they will start doing September 1, 2019), but resist the temptation! It’s entirely possible you’ve noindexed a page you actually want to appear in search results, which is one of the main reasons Google is ending support for this directive in the first place.
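If your robots.txt file is long, a short script can point out the lines to look at. The sketch below is only an illustration: it assumes the rule was written as "Noindex: /some-path/" (the form most sites used, though Google never documented it), and the function names and sample file are made up for this example.

```python
import re

# Matches a "noindex" rule line such as "Noindex: /private/" (case-insensitive).
# The path stops at a "#" so trailing comments are not captured.
NOINDEX_RULE = re.compile(r"^\s*noindex\s*:\s*(?P<path>[^#]*)", re.IGNORECASE)


def find_noindex_rules(robots_txt):
    """Return (line_number, path) pairs for every noindex rule found."""
    hits = []
    for number, line in enumerate(robots_txt.splitlines(), start=1):
        match = NOINDEX_RULE.match(line)
        if match:
            hits.append((number, match.group("path").strip()))
    return hits


def remove_noindex_rules(robots_txt):
    """Return the file text with every noindex rule line dropped."""
    kept = [
        line for line in robots_txt.splitlines()
        if not NOINDEX_RULE.match(line)
    ]
    return "\n".join(kept) + "\n"


# Hypothetical robots.txt content, used only for illustration.
SAMPLE = """\
User-agent: *
Disallow: /cart/
Noindex: /thank-you/
noindex: /internal-search/  # old rule
"""

if __name__ == "__main__":
    for number, path in find_noindex_rules(SAMPLE):
        print(f"line {number}: noindex rule for {path}")
```

Review the list of paths before deleting anything. Each one is a page you either want back in search results or need to keep out of them another way.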
It’s good practice to review the pages you’re blocking via robots.txt anyway to avoid these problems.
If you truly don’t want a page you’ve noindexed via robots.txt to appear in search results, you should change the way you do this in one of two ways:
Add a disallow line for this URL or path to your robots.txt file. This will prevent Google from crawling these pages when it accesses your website.
Add noindex to the page with a robots meta tag. Since Google might still index a URL that’s been disallowed by robots.txt if it finds a link to it elsewhere, the meta robots tag is the most effective way to tell Google not to index a page. When Google crawls the page, it will see the robots tag and know not to index it. Google has to be able to crawl the page to see the tag, so don’t also disallow that URL in robots.txt.
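Once you’ve added the meta robots tag, it’s worth confirming it actually made it into the page’s HTML. Here is a minimal sketch that uses Python’s built-in html.parser module to look for a noindex value in a robots (or googlebot) meta tag. The class name, function name, and sample page are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in robots meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attributes = {name: (value or "") for name, value in attrs}
        # "robots" addresses all crawlers; "googlebot" addresses Google only.
        if attributes.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attributes.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Return True if the page carries a noindex robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page markup, used only for illustration.
PAGE = """<html><head>
<title>Thank you</title>
<meta name="robots" content="noindex, follow">
</head><body>Thanks for your order.</body></html>"""

if __name__ == "__main__":
    print("noindex present:", has_noindex(PAGE))
```

This only checks the HTML you hand it. It does not fetch the page or check HTTP headers, so treat it as a quick sanity check rather than a full audit.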
How you go about editing your robots.txt file and adding a meta robots tag will depend on what sort of content management system you use for your website. We recommend consulting your CMS help center to learn how to take these steps.
Why is Google Ending this Support?
Early in July 2019, Google published a blog post announcing that they would no longer support unofficial rules and directives that aren’t part of the robots exclusion standard. Google made this decision in the "interest of maintaining a healthy ecosystem and preparing for potential future open source releases."
It’s also worth noting that an analysis by Google of the robots.txt files they’d discovered on the web found that these directives (such as crawl-delay) weren’t used very often, and when they were, they were often contradicted within the same robots.txt.
So, overall, these unsupported directives were actually more likely to hurt a site’s SEO than help it.
What is/was the noindex directive?
If you’re familiar with the noindex value of the meta robots tag, the noindex directive in a robots.txt file worked much the same way. It told any crawler specified in the preceding user-agent line not to index that page.
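To make the link between a noindex line and its user-agent line concrete, here is a rough sketch that groups noindex paths under the crawler they applied to. It assumes the usual robots.txt layout (one or more user-agent lines followed by that group’s rules) and the undocumented "Noindex: /path/" form. The function name and sample file are hypothetical, and the parsing is simplified compared with what a real crawler does.

```python
def noindex_by_user_agent(robots_txt):
    """Map each user-agent to the paths its group marked as noindex."""
    rules = {}
    current_agents = []
    in_rules = False  # True once the current group has started listing rules
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue  # blank or malformed line
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A user-agent line that follows rules starts a new group.
                current_agents = []
                in_rules = False
            current_agents.append(value)
            rules.setdefault(value, [])
        else:
            # Simplification: any other field counts as a rule of the group.
            in_rules = True
            if field == "noindex":
                for agent in current_agents:
                    rules[agent].append(value)
    return rules


# Hypothetical robots.txt content, used only for illustration.
SAMPLE = """\
User-agent: Googlebot
Noindex: /thank-you/

User-agent: *
Disallow: /cart/
Noindex: /internal-search/
"""

if __name__ == "__main__":
    for agent, paths in noindex_by_user_agent(SAMPLE).items():
        print(agent, "->", paths)
```

In this sample, the /thank-you/ rule was addressed only to Googlebot, while the /internal-search/ rule was addressed to every crawler matched by the wildcard.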
You can learn a bit more about the non-standard directives that other search engines might use in our robots.txt guide’s section on non-standard directives.