A search engine is a software program or script, available through the Internet, that searches documents and files for keywords and returns a list of the files containing those keywords. Today, there are thousands of different search engines available on the Internet, each with its own abilities and features. Archie, which was used to search for FTP files, is considered the first search engine ever developed, and Veronica is considered the first text-based search engine. Today, the most popular and well-known search engine is Google.
How a search engine works
Because large search engines index millions, and sometimes billions, of pages, many search engines do not just search the pages but also order the results by importance. This importance is commonly determined using various ranking algorithms.
As illustrated in the image on the right, the source of all search engine data is a spider or crawler, which automatically visits pages and indexes their contents.
Once a page has been crawled, the data contained within the page is processed. Often, this can involve the steps below.
- Strip out stop words.
- Record the remaining words in the page and how often each occurs.
- Record links to other pages.
- Record information about images or other embedded media.
The data collected above is used to rank the page and is the primary basis a search engine uses to determine whether a page should be shown and in what order.
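The processing steps listed above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not how any particular search engine does it. The names `STOP_WORDS`, `PageProcessor`, and `process_page` are hypothetical, and the stop-word list is a tiny sample chosen for the example.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Assumption: a very small sample stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on"}


class PageProcessor(HTMLParser):
    """Collects word counts, links, and image information from one HTML page."""

    def __init__(self):
        super().__init__()
        self.word_counts = Counter()  # word -> number of occurrences
        self.links = []               # href values of <a> tags
        self.images = []              # src and alt text of <img> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            # Record links to other pages.
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            # Record information about embedded images.
            self.images.append({"src": attrs["src"], "alt": attrs.get("alt") or ""})

    def handle_data(self, data):
        # Simplification: text inside <script> or <style> is counted as well.
        words = re.findall(r"[a-z0-9]+", data.lower())
        # Strip stop words, then record the remaining words and their frequency.
        self.word_counts.update(w for w in words if w not in STOP_WORDS)


def process_page(html):
    """Run the processing steps on one crawled page and return the collected data."""
    parser = PageProcessor()
    parser.feed(html)
    parser.close()
    return {"words": parser.word_counts, "links": parser.links, "images": parser.images}
```

Calling `process_page` on the HTML of a crawled page returns a dictionary with the word counts, the links found, and the image details, which a ranking step could then use.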
Finally, once the data is processed, it is broken up into one or more files, moved to different computers, or loaded into memory, where it can be accessed when a search is performed.
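One common way to keep processed data in memory so it can be searched quickly is an inverted index, which maps each word to the pages that contain it. The sketch below is an assumption made for illustration: the text does not say which structure or ranking method a given search engine uses. The functions `build_index` and `search` are hypothetical, the ranking is a simple sum of word counts, and a page must contain every query word to match.

```python
from collections import Counter, defaultdict


def build_index(pages):
    """Build an inverted index: word -> {page_id: count}.

    `pages` maps a page id (for example, a URL) to a Counter of its words.
    """
    index = defaultdict(dict)
    for page_id, counts in pages.items():
        for word, count in counts.items():
            index[word][page_id] = count
    return index


def search(index, query):
    """Return ids of pages containing every query word, best match first.

    Assumption: a page's score is the total number of times the query
    words occur in it. Ties are broken by page id.
    """
    words = query.lower().split()
    if not words:
        return []
    # One {page_id: count} mapping per query word (empty if the word is unknown).
    postings = [index.get(word, {}) for word in words]
    # Keep only pages that appear in every mapping.
    matching = set(postings[0]).intersection(*postings[1:])
    scores = {page: sum(p[page] for p in postings) for page in matching}
    return sorted(scores, key=lambda page: (-scores[page], page))


# Example data: three pages that have already been reduced to word counts.
pages = {
    "a.html": Counter({"search": 3, "engine": 1}),
    "b.html": Counter({"search": 1, "engine": 1}),
    "c.html": Counter({"engine": 5}),
}
index = build_index(pages)
results = search(index, "search engine")  # pages containing both words
```

On a large system, an index like this could be split by word or by page across several files or computers, as described above, with each part answering its share of the query.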