How to block search engines from indexing your SharePoint Site

March 15, 2012

If you want to block search engines from indexing your site, create a robots.txt file and place it in the root of your root site.

What is robots.txt?

Robots.txt is a plain-text (not HTML) file placed in the root of your site that tells search robots which pages they should and should not visit and index. Search engines are not obliged to follow the instructions in robots.txt, but the major ones generally honor them.

It is important to note that robots.txt does not actually stop search engines from crawling your site (i.e. it is not a firewall). Having a robots.txt file on your site is like putting a “Please do not enter” note on your unlocked front door: it will not keep thieves out, but the good guys will not open the door and enter.

It follows that if you have sensitive data, you cannot rely on robots.txt alone to keep it from being indexed and displayed in search results.

The location of robots.txt is very important. It must be in the main directory because otherwise user agents (search engines) will not be able to find it. They do not search the whole site for a file named robots.txt. Instead, they look first in the main directory (i.e. http://www.sitename.com/robots.txt) and if they don’t find it there, they simply assume that this site does not have a robots.txt file and therefore they index everything they find along the way. So, if you don’t put robots.txt in the right place, don’t be surprised that search engines index your whole site.
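To make the lookup rule concrete, here is a minimal Python sketch of how a crawler might work out where to look for robots.txt, starting from any page on the site. The function name `robots_url` and the sample page path are made up for illustration; real crawlers may differ in the details.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the single location a crawler checks for robots.txt."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the page's own path, query string
    # and fragment play no part in where robots.txt is looked up.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical deep link into a SharePoint site:
print(robots_url("http://www.sitename.com/sites/team/Pages/default.aspx"))
# prints http://www.sitename.com/robots.txt
```

A robots.txt saved anywhere else, for example inside a subsite or a document library, never matches this address, so crawlers never see it.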

Creating a robots.txt file

  1. Launch Notepad
  2. Put the following in your robots.txt file:

User-agent: *
Disallow: /

  3. Save the file as: robots.txt
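If you want to see how a well-behaved crawler would read these two rules, Python's standard library includes a robots.txt parser. The snippet below is only a sketch: the helper name `is_allowed`, the crawler names and the page URLs are examples, and the actual search engines use their own parsers.

```python
from urllib.robotparser import RobotFileParser

# The same two rules as in the robots.txt file above.
RULES = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(user_agent: str, url: str) -> bool:
    """Ask the parsed rules whether this crawler may fetch this URL."""
    return parser.can_fetch(user_agent, url)


# "User-agent: *" applies to every crawler and "Disallow: /" covers every path.
print(is_allowed("Googlebot", "http://www.sitename.com/Pages/default.aspx"))  # False
print(is_allowed("SomeOtherBot", "http://www.sitename.com/"))                 # False
```

Note that the check happens on the crawler's side. Nothing on your server enforces the answer, which is why robots.txt is a request rather than a lock.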

Adding a robots.txt file to the root of your public anonymous SharePoint site

  1. Open up your root site in SharePoint Designer.
  2. Double-click the All Files folder.
  3. Drag and drop the newly created robots.txt to the All Files folder.
  4. Exit SharePoint Designer.

Alternatively you can create the robots.txt from within SharePoint Designer itself.

To confirm the file is accessible to search engines, browse to your site URL with “/robots.txt” added at the end.

Example: http://www.sitename.com/robots.txt

You should see the contents of your robots.txt file displayed in the browser.
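The same check can be scripted, which is handy if you look after several sites. This is a rough sketch using only the Python standard library; the function names `fetch_text` and `robots_matches` are made up for this example, and it assumes the file is served as UTF-8 text to anonymous visitors.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch_text(url: str) -> str:
    """Download a URL and decode it as UTF-8, dropping a leading BOM if present."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8-sig")


def _rule_lines(text: str) -> list:
    # Ignore blank lines and surrounding whitespace so that Windows (CRLF)
    # and Unix (LF) line endings compare as equal.
    return [line.strip() for line in text.splitlines() if line.strip()]


def robots_matches(site_url: str, expected: str, fetch=fetch_text) -> bool:
    """Return True if the site's /robots.txt serves the expected rules."""
    served = fetch(urljoin(site_url, "/robots.txt"))
    return _rule_lines(served) == _rule_lines(expected)
```

Calling `robots_matches("http://www.sitename.com", "User-agent: *\nDisallow: /\n")` should return True once the file is in place. If SharePoint returns an error page or a login prompt instead, the call either raises an error or returns False, which tells you crawlers cannot read the file either.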

References:

http://www.agileit.com/Blog/Lists/Posts/Post.aspx?Id=5

http://sharepointingitout.blogspot.com/2011/06/sharepoint-2010-adding-robotstxt.html

http://blog.drisgill.com/2009/01/adding-robotstxt-to-sharepoint.html
