Friday, April 3, 2009

How a Search Engine Works

Many of today’s search engines use a two-step process to retrieve pages related to a user’s query. In the first step, traditional text processing finds all documents that contain the query terms or are semantically related to them. This can be done with a look-up in an inverted file, with a vector space method, or with a query expander that uses a thesaurus. Given the massive size of the web, this first step can return thousands of pages related to the query. To make the list manageable for a user, many search engines sort it by some ranking criterion. One popular way to create this ranking is to exploit the additional information inherent in the web’s hyperlink structure, so link analysis has become a primary means of ranking. One successful and well-publicized link-based ranking system is PageRank, the ranking system used by the Google search engine. In practice, for pages related to a query, an IR (Information Retrieval) score is combined with a PR (PageRank) score to determine an overall score.
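The two steps described above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is implemented: the function names, the term-frequency IR score, the damping factor of 0.85, and the linear `weight` used to blend the IR and PR scores are all assumptions made for the example, since the post does not say how the two scores are combined.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Step 1: ids of the documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over a dict mapping page -> list of pages it links to."""
    pages = set(links) | {q for outs in links.values() for q in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            # A page with no outlinks spreads its rank over all pages.
            targets = outs if outs else pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new_rank[q] += share
        rank = new_rank
    return rank


def ir_score(text, query):
    """Toy IR score (an assumption): fraction of words that are query terms."""
    words = text.lower().split()
    terms = set(query.lower().split())
    return sum(w in terms for w in words) / len(words) if words else 0.0


def search(docs, links, query, weight=0.5):
    """Step 2: rank the retrieved pages by a blend of IR and PR scores."""
    candidates = retrieve(build_inverted_index(docs), query)
    pr = pagerank(links)
    overall = {
        d: weight * ir_score(docs[d], query) + (1.0 - weight) * pr.get(d, 0.0)
        for d in candidates
    }
    return sorted(overall, key=overall.get, reverse=True)


# Hypothetical three-page "web" used only to exercise the sketch.
docs = {
    "a": "search engine ranking",
    "b": "search engine",
    "c": "cooking recipes",
}
links = {"a": ["b"], "b": ["a"], "c": ["a"]}
print(search(docs, links, "search engine"))
```

Setting `weight` to 1.0 ranks purely by the text match, while 0.0 ranks purely by link structure. A real engine would use a far richer IR model and a much larger link graph.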
