What are Search Engines?

"Search engine" has become a generic term covering both crawler-based (Google) and directory-based (Yahoo!) site search companies. The two types compile their listings in significantly different ways and can therefore produce widely varying results. Crawler-based search engines create listings automatically by crawling websites and compiling information about each site into an index. When you search one of these indices, results are presented with emphasis on specific words. The methodology that determines which results are selected and how they are delivered is known as an algorithm, and these algorithms are proprietary and confidential to each search engine company.
 

Crawling a Site

Search engines that use bots or spiders to crawl a site create references automatically. When people type in "key" words, the engine finds those words in its list of referenced words and serves the pages where those words appear.

These engines like to see site content change on a frequent basis, since changes are taken as a sign of new and revised information. A search engine that crawls your site with bots or spiders eventually finds those changes, and that can affect your ranking in its listings. How you lay out your pages matters: page titles, page content, links to your site, and other items all help determine how relevant your pages rank against the words being searched.
 

Crawler-Based Search Engines Dissected

The first of three major components of a crawler-based search engine is the spider (or bot), known as the crawler. A bot visits a site, reads the main words while discarding connector words, and follows every link it can find to other pages within the site. If you change one of your web pages, search engines eventually find those changes, which can affect how you are listed. Bots return to sites on a schedule known only to the search engine company, but more frequently to sites that update content continually.
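
To make that crawl step concrete, here is a minimal Python sketch under a few stated assumptions: it uses only the standard library, treats a short hand-picked stop-word list as the "connector words," and caps the crawl at a handful of pages. It is an illustration, not how any real engine's bot works.

    # Fetch a page, keep the meaningful words, and queue the links found
    # on it. The stop-word list and page cap are illustrative assumptions.
    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in"}  # connector words

    class PageParser(HTMLParser):
        def __init__(self):
            super().__init__()
            self.words, self.links = [], []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

        def handle_data(self, data):
            self.words += [w.lower() for w in data.split()
                           if w.isalpha() and w.lower() not in STOP_WORDS]

    def crawl(start_url, max_pages=10):
        """Breadth-first crawl returning {url: [words]} for indexing."""
        seen, queue, pages = set(), [start_url], {}
        while queue and len(pages) < max_pages:
            url = queue.pop(0)
            if url in seen:
                continue
            seen.add(url)
            parser = PageParser()
            parser.feed(urlopen(url).read().decode("utf-8", "ignore"))
            pages[url] = parser.words
            queue += [urljoin(url, link) for link in parser.links]
        return pages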

The words found by the bots are placed in a database called an index or catalog, the second component of a crawler-based search engine. This is what is actually searched when you submit a query: the keywords recorded for each indexed page determine whether that page appears on a results page and in what order. If a page changes, the index is updated with the new information the next time a bot visits the site.
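
One common way to picture the catalog is as an inverted index: a table mapping each word to the pages that contain it and how often it appears there. A minimal sketch, building on the hypothetical crawl function above:

    # Build an inverted index from the crawler's output. Rebuilding it
    # (or re-feeding a changed page) is how the catalog picks up the
    # changes a returning bot finds.
    from collections import defaultdict

    def build_index(pages):
        """pages: {url: [words]} -> {word: {url: count}}."""
        index = defaultdict(dict)
        for url, words in pages.items():
            for word in words:
                index[word][url] = index[word].get(url, 0) + 1
        return index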

New sites and new pages can take some time before a spider finds and adds the page to the index, and depending on how the site is constructed and whether or not a spider can "read?€? a page or links to a page, the page may never be found, read, or indexed. And pages are not immediately added to the index as soon as they have been crawled. It may take some time to get from the read stage to the index. Until the content is added to the index, the content is not available to the search engines.

Search software, the third component, may be the most important to the user in producing relevant results from the indexed content. The software is a program that searches through billions of pages of indexed content, finding matches and determining relevance to the terms searched. Each search engine handles this task differently, and that is why each may produce varying results. You may rank near the top on one and be nearly invisible on another.
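
A toy version of that third component might score pages by how many query terms they contain and how often. The additive scoring below is an illustrative assumption, nothing like the proprietary algorithms described above.

    # Look up each query term in the inverted index and rank pages by
    # total term occurrences. Real engines weigh many more signals.
    def search(index, query):
        scores = {}
        for term in query.lower().split():
            for url, count in index.get(term, {}).items():
                scores[url] = scores.get(url, 0) + count
        return sorted(scores, key=scores.get, reverse=True)

    # Example, using the hypothetical helpers sketched earlier:
    # pages = crawl("https://example.com")
    # print(search(build_index(pages), "football scores"))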
 

The Human Element

A directory-based site depends more on people for its listings. Site owners, reviewers, and fee-based listing services submit a short description for each site, and the search engine looks for matches in those submitted descriptions rather than in the pages themselves.
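
The contrast with the crawler model is easy to sketch: a directory search scans only the short submitted descriptions, so nothing on the pages themselves enters into it. The entries below are invented for illustration.

    # A directory searches human-submitted descriptions, not page content.
    DIRECTORY = {
        "https://example.com/recipes": "Home cooking recipes and spice guides",
        "https://example.com/football": "Football news, scores, and history",
    }

    def directory_search(query):
        """Return every listed site whose description mentions a query term."""
        terms = query.lower().split()
        return [url for url, desc in DIRECTORY.items()
                if any(term in desc.lower() for term in terms)]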

Content changes have little to no effect on your listing. The items crucial to good ranking on crawler-based sites have almost nothing to do with ranking in a directory. The exception is that both types of engines appreciate good content: a site with good content is more likely to earn a favorable review than one without.
 

One or the Other?

Search engines used to present only their own type of results, but today competitive pressures demand that both types be presented to the searcher. A search engine will typically favor its core function over the other, which produces varying results depending on whether the engine is directory-based or crawler-based.
 

Search Engine Result Listings

Type a word or phrase into your favorite search engine and almost instantly you are presented with an overwhelming amount of information containing your search words, ordered by the engine's determination of relevance. Of course, these are not always the most relevant results to you; sometimes it takes a bit more refining of your keywords to get what you are really looking for.

When you consider the alternatives, a bit of refining in your search is well worth the time. Think about searching for almost anything by hand: a reference to a phrase in the volumes of material at the Library of Congress, all the people who live on a specific street in the phone book, or where your favorite spice is located in a new grocery store, particularly if you do not know the name of the spice, just what texture and flavor it adds.

To determine relevancy, search engines follow a specific set of rules, called algorithms, each unique and proprietary to its search engine company. Though the algorithms are unique, they all follow these basic steps to some degree:
 

Location and Frequency

A primary rule in a ranking algorithm revolves around the location and frequency of keywords on a web page.

If you went to the library to learn more about football, it would make sense to first look at books that have "football" in the title. Search engines operate the same way. Pages with the search terms appearing in the HTML title tag are often assumed to be more relevant than others to the topic.

Search engines likewise assume that keywords appearing in a headline or in the first few sentences of a web page's content are more relevant to the topic than those that appear later in the page.

How often the keywords appear in natural, readable content also helps a search engine determine the page's relevance. An engine analyzes not only the frequency of the keywords but how they appear in relation to other relevant words on the page; pages whose surrounding words are frequently used in combination with the keywords are judged more relevant to the search term.
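
Put together, the location and frequency rules might look something like the sketch below. The weights are illustrative assumptions; no engine publishes its actual values.

    # Score one term against one page: a title match counts most, and
    # early body occurrences count more than later ones. body_words is
    # assumed to be a list of lowercased words; all weights are invented.
    def location_frequency_score(term, title, body_words, lead_size=50):
        term = term.lower()
        score = 0.0
        if term in title.lower().split():
            score += 5.0                                   # title match
        lead = body_words[:lead_size]
        score += 2.0 * lead.count(term)                    # early occurrences
        score += 0.5 * body_words[lead_size:].count(term)  # later occurrences
        return score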

No two search engine companies do this exactly the same way, and not all search engines index the same quantity of pages. Each company therefore holds different information in its indices, which in turn produces different results.

Factor in that some search engine companies may exclude pages or penalize sites they believe are presenting content solely to inflate their relevance (known to the search engines as "spamming"), and you begin to get a feel for the ever-changing landscape of search results.
 

Other Relevance Factors

Search engine companies try to stay one step ahead of the tricks and techniques used to fool them into ranking sites higher. In addition to frequency, location, and relevancy, search engines have added "off page" factors to rank site pages. Chief among these are links from relevant, authoritative sites: the more authoritative a site appears to be, the higher it ranks in relevance.

By analyzing how pages link to each other, a search engine can both determine what a page is about and whether that page is deemed "important." Search engines may also monitor clicks on their results pages to gauge relevance: the more clicks a result draws, the more relevant it is judged to be.
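
Link analysis of this kind is best known from Google's PageRank; a simplified, assumption-laden version of that idea is sketched below. The damping factor and iteration count are conventional illustrative choices, not any engine's real settings.

    # Iteratively share each page's score across its outgoing links, in
    # the spirit of PageRank. Dangling pages simply leak score in this
    # simplification; real implementations handle them explicitly.
    def link_scores(links, damping=0.85, iterations=20):
        """links: {url: [urls it links to]} -> {url: authority score}."""
        pages = set(links) | {u for targets in links.values() for u in targets}
        rank = {page: 1.0 / len(pages) for page in pages}
        for _ in range(iterations):
            new_rank = {page: (1 - damping) / len(pages) for page in pages}
            for page, targets in links.items():
                for target in targets:
                    new_rank[target] += damping * rank[page] / len(targets)
            rank = new_rank
        return rank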