You may have heard of robots.txt without knowing what it is for. A common misconception is that a robots.txt file protects folders and pages from hackers and thieves. In fact, it has nothing to do with security. Few internet marketing companies use a robots.txt file as part of their SEO services. So what can robots.txt do for you? It helps keep files that you do not want indexed out of search engines. Put simply, it is a plain text file of directives that search engine crawlers read, and used well it can help your site's search traffic.
The robots.txt file is the first file that search engine crawlers (robots) check, and based on it the search engine decides which pages to crawl and index and which to skip. If you do not want search engines to access certain folders, you can use a simple robots.txt directive such as “Disallow: /cgi-bin/” (without the quotes), placed under a “User-agent:” line, and crawlers that follow the rules will stay out of that directory. Some search engine optimization experts claim that bots do not follow the rules. The major search engines' crawlers do honor them, although compliance is voluntary and badly behaved bots can ignore the file. You can also use robots.txt to declare your sitemap with a “Sitemap:” line, instead of creating Google and Yahoo accounts and submitting it manually.
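To see how a crawler might read such rules, here is a minimal sketch using Python's standard urllib.robotparser module. The domain www.example.com, the sitemap URL, and the helper name load_rules are assumptions made for illustration, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block /cgi-bin/ and declare a sitemap.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/

Sitemap: https://www.example.com/sitemap.xml
"""


def load_rules(text):
    """Parse robots.txt text and return the populated parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A rule-following crawler asks before fetching each URL.
print(rules.can_fetch("*", "https://www.example.com/cgi-bin/form.pl"))
print(rules.can_fetch("*", "https://www.example.com/index.html"))

# Sitemap URLs declared in the file (None if there are none).
print(rules.site_maps())
```

The first check fails because the path starts with /cgi-bin/, while the second succeeds because no rule matches it.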
If you have a new website, you can use a robots.txt file to tell search engine robots (crawlers) how to treat your site. Suppose your website is about dogs, but you have also published some content about cats. A search engine may conclude that the site does not focus on one particular theme. This is called theme bleeding, and the search engine may rank your pages lower because of it. To prevent theme bleeding, you can use the robots.txt file to stop search engines from crawling pages that do not match the theme of your site.
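As a rough sketch of the idea, the following Python function builds robots.txt text from a list of off-theme paths. The function name build_robots_txt and the cat-related paths are hypothetical and chosen only to match the dog-site example.

```python
def build_robots_txt(disallowed_paths, user_agent="*"):
    """Return robots.txt text that blocks each given path for one user-agent."""
    lines = [f"User-agent: {user_agent}"]
    for path in disallowed_paths:
        # Paths in robots.txt are relative to the site root and start with "/".
        if not path.startswith("/"):
            path = "/" + path
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


# Hypothetical off-theme content on a site about dogs.
off_theme = ["/cats/", "articles/cat-food.html"]
robots_text = build_robots_txt(off_theme)
print(robots_text)
```

The resulting text would then be saved as robots.txt in the root directory of the site.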
Now that you know what robots.txt is, you will want to know how to create it. A plain text editor such as Notepad is a good option. Save the file with the exact name “robots.txt” and upload it to the root directory of your server, since crawlers look for it only there. If you do not know the directives, you can use one of the free robots.txt generators available online. The robots.txt file works as a messenger that tells search engines which pages they may crawl and which they may not. You can keep search engines away from private pages or files with directives such as the following:
User-agent: *
Disallow: /YourFiles
Disallow: /PrivateD
In this example, the star (*) on the first line makes the rules apply to all user-agents. The second line denies crawling of the /YourFiles directory of your site. The third line does the same for /PrivateD, so crawlers may visit every page except the ones you list, for example a login page.
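The rules in this example can be checked before uploading the file. Below is a small sketch with Python's standard urllib.robotparser. The helper is_allowed and the test paths are made up for illustration, and the "User-agent: *" line is included because the explanation of the example refers to it.

```python
from urllib.robotparser import RobotFileParser

# The example rules, one directive per list item.
SAMPLE_RULES = [
    "User-agent: *",
    "Disallow: /YourFiles",
    "Disallow: /PrivateD",
]


def is_allowed(rule_lines, path, agent="*"):
    """Return True if the given agent may crawl the path under these rules."""
    parser = RobotFileParser()
    parser.parse(rule_lines)
    return parser.can_fetch(agent, path)


# Hypothetical paths to try against the rules.
for path in ["/YourFiles/report.pdf", "/PrivateD", "/index.html"]:
    print(path, is_allowed(SAMPLE_RULES, path))
```

Note that this parser matches rules by path prefix, so "Disallow: /YourFiles" would also block a path such as /YourFilesOld/x.html. Ending the rule with a slash limits it to the directory itself.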
Some content on your website may be generated by server-side languages such as ASP or PHP. Not every search engine crawler handles dynamically generated content well. With a carefully written robots.txt, you can stop search engines from crawling this content.
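One possible way to do this is to list each dynamic script explicitly. The sketch below picks out paths ending in .php or .asp from a list of site paths and writes a Disallow line for each. The function name disallow_dynamic, the suffix set, and the sample paths are all assumptions for illustration.

```python
from pathlib import PurePosixPath

# File extensions treated as dynamically generated (an assumption).
DYNAMIC_SUFFIXES = {".php", ".asp"}


def disallow_dynamic(paths):
    """Return robots.txt text that blocks every path with a dynamic suffix."""
    lines = ["User-agent: *"]
    for path in paths:
        # Compare the extension case-insensitively, e.g. ".ASP" counts too.
        if PurePosixPath(path).suffix.lower() in DYNAMIC_SUFFIXES:
            lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


# Hypothetical pages on the site.
site_paths = ["/index.html", "/search.php", "/cart/view.ASP", "/about.html"]
dynamic_rules = disallow_dynamic(site_paths)
print(dynamic_rules)
```

Static pages such as /index.html are left out of the output, so they remain open to crawlers.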