Spidering Websites
By Sharon Housley

Website spidering refers to the automated process of indexing a website by a search engine. An automated program, known as a web crawler or spider, goes through a website, following the links on each page and gathering pertinent information from each page until it has indexed the entire website.

If a search engine is unable to spider a website, it may be unable to index some or all of the content on that site. As a result, the website may not appear in the search results from that search engine, even when associated keywords are searched for. Potential customers may use search engines to seek out a product or service, but if a website does not appear in the search results due to missing or incomplete indexing, that website may be losing out on an opportunity.

As such, it is very important to make sure that search engine spiders can indeed "crawl" and index your website. There are a number of things that webmasters can do to improve the "crawlability" of their websites and make them more spider-friendly.

Display Using HTML

HTML is by far the easiest type of content for search engines to spider. If the webmaster uses scripting or Flash to display some of the site's content, the search engine spiders may have a difficult time following the links.

Use a Sitemap

Sitemaps are simply roadmaps for a website. A sitemap helps ensure that all the pages on the website are indexed by the search engine. Create a proper sitemap for the website, and then submit the sitemap to the major search engines.

Sitemap Details - http://www.small-business-software.net/ins-and-outs-of-sitemaps.htm

Robots.txt

A properly formatted robots.txt file helps direct search engine spiders to the parts of the website that should be indexed, and specifies any parts that should not be indexed. The robots.txt file should be placed in the website's root directory.
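As a rough illustration of how a robots.txt file directs spiders, the sketch below uses Python's standard urllib.robotparser module to check which pages a crawler would be allowed to fetch. The robots.txt contents, the www.example.com addresses, and the /private/ directory are all hypothetical and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (illustrative only):
# every spider may crawl the site except the /private/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: http://www.example.com/sitemap.xml
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask whether a generic spider ("*") may fetch two hypothetical pages.
can_index_home = parser.can_fetch("*", "http://www.example.com/index.html")
can_index_private = parser.can_fetch(
    "*", "http://www.example.com/private/report.html"
)

print("Home page crawlable:", can_index_home)
print("Private page crawlable:", can_index_private)
```

Running a check like this against a site's own robots.txt is one way to confirm that important pages have not been blocked by accident.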
Secure

Keep in mind that a search engine spider cannot follow links behind a password prompt or a login on a secure server (https). Any important web pages that require indexing should never be located behind a login.

Avoid ID=

Avoid using "ID=" or similar parameters in webpage URLs. Some search engines may ignore URLs that include "ID=" as a parameter.

No Frames

Avoid using frames if possible. Content that is contained in a frame may not be spidered properly by search engines.

Consider implementing these few easy steps to increase the spiderability of your website and help ensure that the site will be properly indexed.

About the Author:
Sharon Housley manages marketing for FeedForAll http://www.feedforall.com software for creating, editing, and publishing RSS feeds and podcasts. In addition, Sharon manages marketing for RecordForAll http://www.recordforall.com audio recording and editing software.

**********************************************************
This article may be used freely in opt-in publications and websites, provided that the resource box is included and the links are active. A courtesy copy of the issue or a link to any online posting would be greatly appreciated; send an email to sharon@notepage.net.

Additional articles available for publication at http://www.small-business-software.net/free-website-content.htm
**********************************************************