Search engines have two major functions: crawling and building an index, and providing answers by calculating relevancy and serving ranked results.
1. Crawling and indexing the billions of documents, pages, files, news stories, videos, and media on the world wide web.
2. Providing answers to user queries, most frequently through lists of relevant pages that the engines retrieve and rank.

Crawling and Indexing

Imagine the World Wide Web as a network of stops in a big city subway system.

Each stop is its own unique document (usually a web page, but sometimes a PDF, JPG, or other file). The search engines need a way to “crawl” the entire city and find all the stops along the way, so they use the best path available: links.
“The link structure of the web serves to bind all of the pages together.”
Through links, search engines’ automated robots, called “crawlers” or “spiders,” can reach the many billions of interconnected documents on the web.
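The crawling process described above can be sketched as a breadth-first traversal of the link graph. The snippet below is a minimal illustration, not a real crawler: the `WEB` dictionary is a hypothetical in-memory stand-in for fetched pages and the links extracted from them.

```python
from collections import deque

# A toy in-memory "web": each URL maps to the links found on that page.
# (Hypothetical data standing in for pages a real crawler would fetch.)
WEB = {
    "/home": ["/about", "/blog"],
    "/about": ["/home"],
    "/blog": ["/blog/post-1", "/blog/post-2"],
    "/blog/post-1": ["/blog"],
    "/blog/post-2": ["/blog", "/about"],
}

def crawl(seed):
    """Breadth-first crawl: follow links from the seed, visiting each page once."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)                 # "index" the page we just reached
        for link in WEB.get(url, []):     # discover new stops via its links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

print(crawl("/home"))
# → ['/home', '/about', '/blog', '/blog/post-1', '/blog/post-2']
```

Note that every page is reached purely by following links from the seed, which is why pages with no inbound links are invisible to crawlers.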
Once the engines find these pages, they decipher the code from them and store selected pieces on massive hard drives, to be recalled later when needed for a search query. To accomplish the monumental task of holding billions of pages that can be accessed in a fraction of a second, the search engines have constructed datacenters all over the world.
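The “store selected pieces, then recall them in a fraction of a second” idea is commonly implemented with an inverted index: instead of scanning every page at query time, the engine precomputes a map from each word to the documents containing it. A minimal sketch, using a hypothetical three-document corpus:

```python
from collections import defaultdict

# Hypothetical mini-corpus: document ID -> extracted page text.
DOCS = {
    "doc1": "subway stops and links bind the web together",
    "doc2": "search engines crawl the web through links",
    "doc3": "datacenters hold billions of pages",
}

def build_index(docs):
    """Inverted index: each word maps to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return documents containing every query word (simple AND retrieval)."""
    hits = [index.get(word, set()) for word in query.lower().split()]
    return sorted(set.intersection(*hits)) if hits else []

INDEX = build_index(DOCS)
print(search(INDEX, "web links"))
# → ['doc1', 'doc2']
```

The work happens at indexing time; answering a query is then just a few set lookups and intersections, which is what makes sub-second retrieval over billions of documents feasible. Real engines add ranking signals on top of this retrieval step.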