Robots.txt - AIOSEO

When to use NOINDEX or the robots.txt?

One of the questions we are most often asked is what the difference is between the NOINDEX robots meta tag and the robots.txt, and when each should be used. This article addresses this question.

The NOINDEX robots meta tag

The NOINDEX robots meta tag is used to prevent content from appearing in search results. It is placed in the source code of your content and tells search engines not to include that content in their results.

The NOINDEX robots meta tag looks like this in your page source code:

<meta name="robots" content="noindex" />

The robots.txt file

The robots.txt file tells search engines where their crawlers can and cannot go on a website. It includes “Allow” and “Disallow” directives that guide a search engine as to which directories and files it should or should not crawl. 

However, it does not stop your content from being listed in search results. If a blocked directory or file is linked to from a page on your website or on another website, search engines can still index that URL and show it in search results, even though they won't crawl it.

An example of how you’d use the robots.txt file is to instruct search engines not to crawl the “/cgi-bin/” directory that may exist on your server, because there’s nothing in the directory that is of use to search engines.
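For example, a rule that does this for all crawlers would look like this in your robots.txt (assuming the directory sits at the root of your site):

User-agent: *
Disallow: /cgi-bin/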

The default robots.txt for WordPress looks like this:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

The difference between NOINDEX and robots.txt

The difference between the two is as follows:

  • The robots.txt file is used to guide a search engine as to which directories and files it should crawl. It does not stop content from being indexed and listed in search results.
  • The NOINDEX robots meta tag tells search engines not to include content in search results and, if the content has already been indexed, to drop it from their index entirely. It does not stop search engines from crawling content.

The biggest difference to understand is that if you want search engines to not include content in search results, then you MUST use the NOINDEX tag and you MUST allow search engines to crawl the content. If search engines CANNOT crawl the content then they CANNOT see the NOINDEX meta tag and therefore CANNOT exclude the content from search results.

So if you want content not to be included in search results, then use NOINDEX. If you want to stop search engines from crawling a directory on your server because it contains nothing they need to see, then use the “Disallow” directive in your robots.txt file.

You can find documentation on using the NOINDEX feature in All in One SEO in our article on Showing or Hiding Your Content in Search Results here.

You can find documentation on using the Robots.txt feature in All in One SEO in our article on Using the Robots.txt Tool in All in One SEO here.

Using the Robots.txt Tool in All in One SEO

Are you looking to customize the robots.txt on your site? This article will help.

The robots.txt module in All in One SEO lets you manage the robots.txt that WordPress creates.

This enables you to have greater control over the instructions you give web crawlers about your site.

About the Robots.txt in WordPress

First, it’s important to understand that WordPress generates a dynamic robots.txt for every WordPress site.

This default robots.txt contains the standard rules for any site running on WordPress.

Second, because WordPress generates a dynamic robots.txt, there is no static file to be found on your server. The content of the robots.txt is stored in your WordPress database and displayed in a web browser. This is perfectly normal and is much better than using a physical file on your server.

Lastly, All in One SEO doesn’t generate a robots.txt itself; it simply provides you with an easy way to add custom rules to the default robots.txt that WordPress generates.

Using the Robots.txt Editor in All in One SEO

To get started, click on Tools in the All in One SEO menu.

Tools menu item in the All in One SEO menu

You should see the Robots.txt Editor and the first setting will be Enable Custom Robots.txt. Click the toggle to enable the custom robots.txt editor.

Click the Enable Custom Robots.txt toggle in the Robots.txt Editor

You should see the Custom Robots.txt Preview section at the bottom of the screen which shows the default rules added by WordPress.

Robots.txt Preview section in the Robots.txt Editor

Default Robots.txt Rules in WordPress

The default rules that show in the Custom Robots.txt Preview section (shown in the screenshot above) ask robots not to crawl your core WordPress files. It’s unnecessary for search engines to access these files directly because they don’t contain any relevant site content.

If for some reason you want to remove the default rules that are added by WordPress, you’ll need to use the robots_txt filter hook in WordPress.
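For example, a small snippet in your theme’s functions.php file or a custom plugin might look like this. This is a minimal sketch; the /example-private/ path is just a placeholder for whatever rules you actually want to output:

// Replace the default WordPress robots.txt rules with your own.
add_filter( 'robots_txt', function ( $output, $public ) {
    // Whatever string you return here becomes the body of the robots.txt.
    return "User-agent: *\nDisallow: /example-private/\n";
}, 10, 2 );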

Adding Rules Using the Rule Builder

The rule builder is used to add your own custom rules for specific paths on your site.

For example, if you would like to add a rule to block all robots from a temp directory, you can use the rule builder to add this.

Adding a rule in the robots.txt rule builder
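Assuming the directory lives at /temp/ in the root of your site, the finished rule would appear in your robots.txt like this:

User-agent: *
Disallow: /temp/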

To add a rule, click the Add Rule button and then complete the fields which are described below.

User Agent

First, enter the user agent in the User Agent field.

For example, if you want to specify Google’s crawler then enter “Googlebot” in the User Agent field.

If you want a rule that applies to all user agents then enter * in the User Agent field.

Directive

Next, select the rule type in the Directive drop-down. There are four rule types you can select from:

  • Allow will allow crawlers with the specified user agent to access the directory or file in the Value field.
  • Block will block crawlers with the specified user agent from accessing the directory or file in the Value field.
  • Clean-param lets you exclude pages with URL parameters that can serve the same content at a different URL. Yandex, the only search engine that currently supports this directive, has a good explanation with examples in its documentation.
  • Crawl-delay tells crawlers how frequently they can crawl your content. For example, a crawl delay of 10 tells crawlers to wait at least 10 seconds between requests (see the example after this list).
    Currently this directive is only supported by Bing, Yahoo and Yandex. You can change the crawl rate of Google’s crawler in Google Search Console.
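For example, a Crawl-delay rule of 10 seconds applied to Bing’s crawler would be written to your robots.txt like this (the user agent and value are only illustrative):

User-agent: Bingbot
Crawl-delay: 10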

Value

Next, enter the directory path or filename in the Value field.

You can enter a directory path such as /wp-content/backups/ or a file path such as /wp-content/backups/temp.png.

You can also use * as a wildcard, such as /wp-content/backup-*.
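For example, a Block rule for all user agents with the value /wp-content/backup-* would be written to your robots.txt as:

User-agent: *
Disallow: /wp-content/backup-*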

If you want to add more rules, then click the Add Rule button and repeat the steps above.

When you’re finished, click the Save Changes button.

Your rules will appear in the Custom Robots.txt Preview section and in your robots.txt, which you can view by clicking the Open Robots.txt button.

Completed custom robots.txt

Editing Rules Using the Rule Builder

To edit any rule you’ve added, just change the details in the rule builder and click the Save Changes button.

Editing a custom robots.txt rule in the rule editor

Deleting a Rule in the Rule Builder

To delete a rule you’ve added, click the trash icon to the right of the rule.

Deleting a custom robots.txt rule in the rule editor

Changing the Order of Rules in the Rule Builder

You can easily change the order in which your custom rules appear in your robots.txt by dragging and dropping the entries in the rule builder.

Click and hold the drag and drop icon to the right of the rule and move the rule to where you want it to appear as seen below.

Changing the order of custom rules in the Robots.txt editor

Google has a good explanation here of why the order in which you place your rules is important.

Importing Your Own Robots.txt into All in One SEO

You can import your own robots.txt or rules from another source very easily.

First, click the Import button to open the Import Robots.txt window.

Import button shown in the rule builder in All in One SEO

In the Import Robots.txt window, you can either import from a URL by entering the URL of a robots.txt in the Import from URL field, or paste the contents of a robots.txt into the Paste Robots.txt text field.

Import Robots.txt window showing the Import from URL field and the Paste Robots.txt text

Once you’ve done this, click the Import button.

Using Advanced Rules in the Rule Builder

The Robots.txt Rule Builder also supports the use of advanced rules. This includes regex patterns as well as URL parameters.

Here are three examples of how advanced rules can be used:

  • /search$ – this uses regex to allow access to the exact path “/search”
  • /search/ – this blocks access to paths that start with “/search/” but are not an exact match
  • /?display=wide – this allows access to the homepage with the matching URL parameter
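Taken together, and assuming they apply to all user agents, these three rules would look something like this in your robots.txt:

User-agent: *
Allow: /search$
Disallow: /search/
Allow: /?display=wide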

Advanced rules such as these allow granular control over your site’s robots.txt file so that you have full control over how user agents access your website.

Robots.txt Editor for WordPress Multisite

There is also a Robots.txt Editor for Multisite Networks. Details can be found in our documentation on the Robots.txt Editor for Multisite Networks here.

Robots.txt Editor for Multisite Networks

Check out our video on using All in One SEO on WordPress Multisite Networks here.

If you have a WordPress Multisite Network, you can manage global robots.txt rules that will be applied to all sites in the network.

To get started, go to the Network Admin and then click on All in One SEO > Network Tools in the left-hand menu.

You will see our standard Robots.txt Editor where you can enable and manage custom rules for your robots.txt.

Click on the Enable Custom Robots.txt toggle to enable the rule editor.

You should see the Robots.txt Preview section at the bottom of the screen which shows the default rules added by WordPress.

Default Robots.txt Rules in WordPress

The default rules that show in the Robots.txt Preview section (shown in screenshot above) ask robots not to crawl your core WordPress files. It’s unnecessary for search engines to access these files directly because they don’t contain any relevant site content.

If for some reason you want to remove the default rules that are added by WordPress, you’ll need to use the robots_txt filter hook in WordPress.

Adding Rules Using the Rule Builder

The rule builder is used to add your own custom rules for specific paths on your site.

For example, if you would like to add a rule to block all robots from a temp directory, you can use the rule builder to add this.

To add a rule, enter the user agent in the User Agent field. Using * will apply the rule to all user agents.

Next, select either Allow or Disallow to allow or block the user agent.

Next, enter the directory path or filename in the Directory Path field.

Finally, click the Save Changes button.

If you want to add more rules, click the Add Rule button, repeat the steps above, and then click the Save Changes button.

Your rules will appear in the Robots.txt Preview section and in your robots.txt which you can view by clicking the Open Robots.txt button.

Any rule you add here will apply to all sites in your network and cannot be overridden at the individual site level.

Editing Rules Using the Rule Builder

To edit any rule you’ve added, just change the details in the rule builder and click the Save Changes button.

Deleting a Rule in the Rule Builder

To delete a rule you’ve added, click the trash can icon to the right of the rule.

Managing Robots.txt for Subsites

In the multisite Robots.txt Editor, it’s very easy to edit robots rules for any site on your network.

To get started, simply click on the site selector dropdown at the top of the Robots.txt Editor. From there, select the site you wish to edit the rules for, or search for it by typing in the domain.

Once you’ve selected a site, the rules below will automatically refresh and you can modify the rules for just that site. Make sure to click Save Changes after modifying the rules for the subsite.

NGINX rewrite rules for Robots.txt

All in One SEO no longer generates its own robots.txt as a dynamic page. Instead, it uses the robots_txt filter in WordPress to tap into the default robots.txt created by WordPress.

This means that rewrite rules are no longer needed for NGINX servers.

When you use the Robots.txt feature in AIOSEO, you’re modifying the default robots.txt created by WordPress.
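If your NGINX configuration previously included a dedicated rewrite rule for robots.txt, it can simply be removed. A standard WordPress location block is enough, because a request for /robots.txt falls through to WordPress, which then serves the dynamic robots.txt. Here’s a typical sketch; your own server configuration may differ:

location / {
    try_files $uri $uri/ /index.php?$args;
}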
