Search Engines: A View From the Bridge

The Internet, with more than 500 million Web pages, is a vast database with few signposts. It’s incredibly easy to lose your way, but searching it methodically is possible. A good understanding of search engines and how they work is essential; it provides the roadmap for finding the people you need.

First, what they’re not. Directories aren’t search engines. Places like Yahoo, Newhoo, or the Mining Company are directories, a bit like the telephone yellow pages. You can find a category of interest and browse the Web listings under it. Directories are great if you want a collection of links on a particular subject, but with few exceptions, they’re not much help in finding the passive candidates you’re looking for. And, because they’re built by people who review Web sites or receive URL submissions from others, they just don’t have as many listings as search engines do.

Search engines don’t have built-in categories, and they don’t rely on people. Instead, they use an automated program, often called a spider, that traverses the Web and hunts for pages. It reads each page and puts the information into the search engine’s index. This is the key difference between search engines and directories. The indexes of the major search engines contain millions of pages.

When you enter your search query, the search engine’s software looks at your keywords and then scans everything in its index. Once the matching pages are found, they’re ranked in order of relevance and you get the results.

But, as you know from experience, a simple keyword search can bring back thousands, even millions, of matches, many of which have little to do with the candidates you’re seeking. How come? Much of it has to do with where on the page, and how often, your keywords were found. Let’s say you look for “resume.” Many search engines will go through their indexes, looking at the title of each Web page, any headers on it, and even the meta tags (which are part of the HTML coding and don’t even show up on the page).
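To make the idea of ranking by location and frequency concrete, here is a minimal sketch in Python. It is a toy illustration, not any real search engine's formula: the field weights, the page records, and the function names (`count_keyword`, `score_page`, `rank_pages`) are all assumptions made up for this example.

```python
import re

# Hypothetical weights: real engines keep their formulas private,
# and each engine weighs these locations differently.
FIELD_WEIGHTS = {"title": 5, "headers": 3, "meta": 2, "body": 1}


def count_keyword(text, keyword):
    """Count whole-word, case-insensitive occurrences of keyword in text."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def score_page(page, keyword):
    """Add up keyword counts per field, scaled by that field's weight."""
    return sum(
        weight * count_keyword(page.get(field, ""), keyword)
        for field, weight in FIELD_WEIGHTS.items()
    )


def rank_pages(pages, keyword):
    """Return (url, score) pairs for matching pages, highest score first."""
    scored = [(page["url"], score_page(page, keyword)) for page in pages]
    matches = [item for item in scored if item[1] > 0]
    return sorted(matches, key=lambda item: item[1], reverse=True)


# Two made-up pages standing in for an index of millions.
pages = [
    {
        "url": "recruiter.example",
        "title": "Resume Bank: Post Your Resume",
        "headers": "Resume Writing Tips",
        "meta": "resume, resume help, jobs",
        "body": "Send us your resume. We review every resume we receive.",
    },
    {
        "url": "candidate.example",
        "title": "Jane Smith, Software Engineer",
        "headers": "Experience",
        "meta": "",
        "body": "This resume summarizes ten years of work.",
    },
]

for url, score in rank_pages(pages, "resume"):
    print(url, score)
```

With these made-up weights, the recruiting page scores 19 and the individual's page scores 1, so the page that repeats the keyword in its title, headers, and meta tags lands first.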
If the word appears often, and in prominent spots such as the title, headers, or meta tags, the page gets a higher ranking. So a page that contains lots of instances of “resume” will rank more highly than Joe Doe’s simple resume page, where “resume” appears only once (if at all) in the title. Instead of Mr. Doe, you’ll get back loads of recruiting sites looking for resumes, companies offering to write resumes, and career centers with information about resumes. In other words, this page, because of this “resume” paragraph alone, will probably appear higher in the search results, leaving poor Mr. Doe at number 2147.

What to do? Learn the quirks of each search engine you use. Understand how each one indexes pages and what “formula” it uses to determine relevancy. You’ll find that some are better than others at finding certain information, just because of the way they work. Take a look, too, at the Web resumes you already have, and see if you can spot a trend in the types of words that appear often. Consider the help section of each search engine required reading for developing your Internet research skills.

Over the next few weeks, I’ll talk about individual search engines and how to get the most out of each of them. Next week – Site Visit: AltaVista.