Search 101 - how search engines work - part 2

Posted by Mark Rogers on Mar 13, 2008


In Part 1 we covered how a search engine crawler visits web pages. In this part we're going to investigate how words on web pages are indexed.

You'll recall the three phases of search engines: crawling (covered in Part 1), indexing (this part), and ranking the results (Part 3).

A search engine index works much like the index in a book. In a book, each word in the index lists the page numbers the word appears on; in a search index, each word has a list of the web pages the word appears on.
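To make the book analogy concrete, here's a minimal sketch of building such an index in Python. The page names and text are made up for illustration, and `build_index` is a hypothetical helper; a real search engine does far more work (handling punctuation, word variants, sheer scale), so treat this as a toy rather than how any particular engine does it.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of pages it appears on."""
    index = defaultdict(set)
    for page, text in pages.items():
        # Naive tokenizing: lowercase the text and split on whitespace.
        for word in text.lower().split():
            index[word].add(page)
    return index


# Hypothetical pages, standing in for what a crawler collected.
pages = {
    "page1.html": "blue widgets for sale",
    "page2.html": "red widgets",
    "page3.html": "blue sky",
}

index = build_index(pages)
print(sorted(index["widgets"]))  # ['page1.html', 'page2.html']
```

Just like a book index, looking up a word is now a single step: you go straight to the word and read off its list of pages, without rereading every page.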

Here's what happens when you search for "blue widgets":

  1. Get the list of pages containing the word "blue"
  2. Get the list of pages containing the word "widgets"
  3. Return the pages that appear in both lists
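The three steps above can be sketched in a few lines of Python. This assumes the index is a plain dictionary mapping each word to a set of page names; the `search` function and the sample data are hypothetical, made up for this example.

```python
def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Steps 1 and 2: get the list of pages for each word
    # (an empty set if the word isn't in the index at all).
    page_lists = [index.get(word, set()) for word in words]
    # Step 3: keep only the pages that appear in every list.
    return set.intersection(*page_lists)


# A tiny hand-built index for illustration.
sample_index = {
    "blue": {"page1.html", "page3.html"},
    "widgets": {"page1.html", "page2.html"},
}

print(search(sample_index, "blue widgets"))  # {'page1.html'}
```

Note that this only finds pages containing both words; it says nothing about what order to show them in.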

The really clever part of a search engine is deciding which pages are most relevant and should be listed first, which we'll cover in Part 3.


First posted Mar 2008