What Is a Robots.txt File? And How to Use a Robots.txt File on a Website/Blog

If you are new to blogging or have just started designing websites, you may not know much about the robots.txt file. But I am sure you know how important SEO is for a website, whether the site is new or old. SEO helps you improve your search ranking and can drive traffic to your site (read more about SEO). robots.txt is another factor in SEO. As you start your journey you will come across the robots.txt file, which is just a simple text file that you can create with a robots.txt file generator or write yourself in Notepad. One more fact about the robots.txt file: around 40% of new users are not aware of it and never try to implement it on their own website. Maybe that is just because they don’t know how important the robots.txt file is.

In this post I will try to clear up everything about the robots.txt file. I have included the most important questions about it that I found on forums and FAQ websites, such as:

  • What is a robots.txt file?
  • How does robots.txt work?
  • Why is a robots.txt file important for a website?
  • How to create and check a robots.txt file using Google Webmaster Tools?

What Is a Robots.txt File?

A robots.txt file is a simple text file at the root of a website. With it, the website owner tells search engine bots (also called spiders) which sections of the site they are not allowed to crawl. Put simply, when you create a robots.txt file you are telling Googlebot, Bingbot, Yahoo!’s bot, and all other third-party bots what to crawl and what not to crawl on your website.
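Because the file always sits at the root of the site, its address can be worked out from any page URL. Here is a minimal Python sketch of that idea. The helper name robots_url and the example page URL are only assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt lives at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page URL, used only as an example.
print(robots_url("http://www.websitename.com/blog/welcome?ref=home"))
```

Whatever page you start from, the path, query string, and fragment are dropped and replaced with /robots.txt.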

As we all know, to improve our search engine ranking we need our site’s links indexed by search engines. During indexing, search engine spiders crawl our data. But there are some things we don’t want to share with search engines, such as our admin panel, plugin directories, etc.

Simply type websitename.com/robots.txt in your browser’s address bar and you will see the list of directories that the site owner is asking search bots to “disallow” or “skip”. Here is what it looks like:

[Image: example robots.txt file]

In the example above you can easily see the Allow and Disallow sections. In this example the website owner has allowed search engine bots to crawl only the affiliate program and the sitemaps (video, image, text).
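If you would rather pull the Allow and Disallow entries out of a robots.txt file with a script than read them by eye, a small Python sketch like the one below could do it. The SAMPLE text and the list_rules helper are made up for illustration and are not taken from any real website.

```python
# Hypothetical robots.txt content, used only for illustration.
SAMPLE = """\
User-agent: *
Allow: /affiliate/
Allow: /sitemap.xml
Disallow: /admin/
"""


def list_rules(robots_text):
    """Collect the Allow and Disallow paths found in robots.txt text."""
    rules = {"allow": [], "disallow": []}
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()
        if field in rules:
            rules[field].append(value.strip())
    return rules


print(list_rules(SAMPLE))
```

This sketch ignores which User-agent each rule belongs to, so it is only good for a quick overview of a file.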

How Does Robots.txt Work?

As we all know, search engines crawl our sites with the help of “spiders” or “bots”. A spider scans your site and brings information back to the search engine so that your pages can be indexed and visitors can find your website. The robots.txt file simply gives these spiders two commands, Allow and Disallow. Well-behaved spiders follow these commands, although the file is only a request and nothing technically forces a bot to obey it. Suppose you have a welcome page on your website and you don’t want bots to crawl it.

It works like this: a robot wants to visit a particular page, say http://www.websitename.com/welcome. Before it does, the bot has to check http://www.websitename.com/robots.txt, where it will see this:

User-agent: *
Disallow: /welcome

Here “User-agent: *” means that this section applies to all robots.

The “Disallow: /welcome” line tells the robot that it is not allowed to visit that particular page, so it should not crawl it. Note that there is no space between the slash and the page name.
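You can try this rule out without a real crawler. The sketch below uses Python's standard urllib.robotparser module to ask whether a bot may fetch a page under the two-line rule from this section. The is_allowed helper and the bot name "AnyBot" are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# The rule discussed in this section, with no space after the slash.
RULES = """\
User-agent: *
Disallow: /welcome
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# "AnyBot" is a made-up bot name; the "*" section applies to it too.
print(is_allowed(RULES, "AnyBot", "http://www.websitename.com/welcome"))
print(is_allowed(RULES, "AnyBot", "http://www.websitename.com/about"))
```

The first check comes back as not allowed and the second as allowed, because only paths starting with /welcome are covered by the rule.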

You can also block a particular search engine’s bot like this:

User-agent: Googlebot
Disallow: /welcome

With this command (User-agent: Googlebot) you block only Google’s bot, while other bots still have access to the page.

By using the “*” character instead, you specify that the command applies to all bots.
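To see the difference between naming one bot and using “*”, here is a short Python sketch, again with the standard urllib.robotparser module. It checks the same page for two bots under the Googlebot-only rule. The bot names are passed as plain strings, which is a simplification of the full user-agent headers real crawlers send.

```python
from urllib.robotparser import RobotFileParser

# Rule that names only Googlebot, as in the example above.
RULES = """\
User-agent: Googlebot
Disallow: /welcome
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

page = "http://www.websitename.com/welcome"
# Ask the same question once per bot.
results = {bot: parser.can_fetch(bot, page) for bot in ("Googlebot", "Bingbot")}
print(results)
```

Googlebot is refused and Bingbot is allowed, since no section in the file applies to Bingbot.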

Why Is a Robots.txt File Important for a Website?

After reading all the points above, I think robots.txt is much clearer to you. There are three reasons why we should block some pages using the robots.txt file. The first is duplicate content: if a page on your site is a duplicate of another page, search engines may index both pages and treat them as duplicate content, which can hurt your SEO and negatively affect your ranking. You can also deal with this by redirecting one link to the other (read how to redirect duplicate links).

The second reason is privacy: you don’t want bots crawling private areas of your website such as cgi-bin, wp-admin, etc. Keep in mind that the robots.txt file itself is public and only asks bots to stay away, so it is not a substitute for password protection. The third reason is that sometimes there are certain pages you simply don’t want indexed in search engines.

So it is worth including a robots.txt file for all these reasons. It lets you keep bots out of the places you choose and control search bots the way you want.

How to Create and Check a Robots.txt File Using Google Webmaster Tools

Google Webmaster Tools lets you edit and test the robots.txt file that tells search bots which parts of your website not to crawl. Log in to your Google Webmaster Tools account and select your website. Now click on “robots.txt Tester” under the “Crawl” section.

{ Dashboard > Crawl > robots.txt Tester }

If there are any errors for your website, you can see them in the current status section. You can also check it yourself with the robots.txt Tester: type robots.txt after the website URL (the box is at the bottom of the page) and click the “Test” button. If the result is “Allowed” in green, there is nothing to worry about; the robots.txt file for your website is working properly. You can run the check for different bots such as Googlebot-Image, Googlebot-Video, Googlebot-News, etc.

The most common structure of a robots.txt file for a website is:

User-agent: *
Disallow: /wp-admin/
Disallow: /cgi-bin/

With this default structure you are telling all bots (User-agent: *), including Googlebot, not to crawl the cgi-bin folder and the admin panel of your website. You can add any other folders here that you don’t want bots to crawl.
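If you maintain the list of blocked folders in one place, a few lines of Python can write out this structure for you and double-check it. This is only a sketch: build_robots_txt is a hypothetical helper, and the check uses the standard urllib.robotparser module with a made-up bot name and example URLs.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_folders, user_agent="*"):
    """Build robots.txt text that disallows each folder for user_agent."""
    lines = [f"User-agent: {user_agent}"]
    for folder in blocked_folders:
        # Normalise to "/name/" so the rule covers the whole folder.
        lines.append(f"Disallow: /{folder.strip('/')}/")
    return "\n".join(lines) + "\n"


text = build_robots_txt(["wp-admin", "cgi-bin"])
print(text)

# Check the generated rules against two example URLs.
parser = RobotFileParser()
parser.parse(text.splitlines())
print(parser.can_fetch("AnyBot", "http://www.websitename.com/wp-admin/options.php"))
print(parser.can_fetch("AnyBot", "http://www.websitename.com/blog/first-post"))
```

To block another folder, add its name to the list and save the printed text as robots.txt at the root of your site.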

Conclusion

Now you can check your robots.txt file yourself. Just type yoursitename.com/robots.txt in your browser and press Enter. Update your robots.txt file whenever you add new pages, files, or directories to your site that you don’t want indexed in search engines. It will keep bots away from those areas and help you stay in Google’s good books. Share with us: do you get any error messages from Google Webmaster Tools? Want to know more about the robots.txt file or how to create one?
