The Macworld Web Searcher's Companion
- 28 March, 2000 12:01
SAN FRANCISCO (03/28/2000) - The best thing about the Web is that it contains just about anything you'd ever want to find. The worst thing about the Web is that sometimes it's almost impossible to find what you want. The bigger the Internet gets, the more difficult it is to focus in on what you're looking for among its 800 million Web pages -- and the more important it is that you learn how to search the Web effectively.
Notice that we said effectively. It's easy to search the Web these days, but in most cases, you can't easily find what you want -- or you do find what you want, but it's mixed in with irrelevant sites. Cast your line into the Web ocean with some search engines, and you're likely to find 43,965 sites wriggling on your hook.
There's no need to slog through tons of results, hoping that you stumble across what you're looking for. You can be an expert Web searcher, reaching your destination with a minimum of fuss (as well as fewer visits to out-of-the-way, irrelevant sites). It's as easy as choosing the right search sites -- we'll show you how -- and following our advice to expertly home in on what you seek.
Once you've learned the skills, you'll view the Web as the world's biggest library -- not as the world's biggest haystack with a valuable needle lodged deep inside.
Know Your Tools
When you start searching, knowing a bit about the tools you're using helps. The most basic distinction between Internet search sites is that some are search engines and others are directories.
A search engine builds an index by using pieces of software called spiders, which crawl the Web, indexing pages as they encounter them. Spiders return to a site periodically to check for changes, and the changes eventually get posted to the search engine's index.
Search engines rely on keyword searching -- you type a word or a string of words, and the site searches its index for those specific words. Web-site designers can affect how highly their site is ranked by search engines through careful selection of page titles, body copy, and invisible HTML tags called metatags. (If you're interested in the ins and outs of search engines and how to design Web pages with their capabilities in mind, check out Search Engine Watch, at http://www.searchenginewatch.com.)
The metatag system should give you nicely relevant results, but many Web designers manipulate this system by filling their metatags with lots of extra information. For example, some search engines used to judge a Web site's relevance to a particular topic by how many times that topic was mentioned on the site. Sneaky Webmasters simply put a couple hundred repetitions of the topic keyword into the page's metatag or put hidden text on their pages that was the same color as the page's background color. A reader couldn't see the hidden text, but search engines could.
Most often used by pornography sites, these methods of duping search engines have become so rampant that they've developed a name: spamdexing. The better search engines have fine-tuned their indexing software to exclude most spamdexed results, but some lag. If you use Excite (http://www.excite.com) or Lycos Inc. (http://www.lycos.com) to search for Monica Lewinsky, for example, you'll get links to porn sites as your top results.
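To see why ranking by repetition was so easy to game, here is a minimal Python sketch of a relevance score that does nothing but count how often a keyword appears on a page. The function name and the two sample pages are hypothetical, and this is not a description of any real search engine's scoring.

```python
import re

def keyword_score(page_text: str, keyword: str) -> int:
    """Naive relevance: count whole-word occurrences of keyword, ignoring case."""
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    return words.count(keyword.lower())

# Hypothetical pages: one honest, one stuffed with repeated keywords.
honest_page = "Our kennel raises Labrador puppies. Each Labrador is vet-checked."
stuffed_page = "Buy now! " + "labrador " * 200

pages = {"honest": honest_page, "stuffed": stuffed_page}
ranked = sorted(pages, key=lambda name: keyword_score(pages[name], "labrador"), reverse=True)
print(ranked)  # the stuffed page comes first
```

Because the score rewards sheer repetition, the stuffed page outranks the page a reader would actually want, which is the weakness spamdexers exploited.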
The Best Ones
One of the oldest (and still among the best) search engines is AltaVista Co. (http://www.altavista.com). It has one of the largest indexes, which is important, since a larger index means the engine covers more of the Web and your search will more likely be successful.
Google (http://www.google.com) has been around only since 1998, but it's already become a major player, thanks to its superb ability to find relevant results. Google lets you search by keyword, but it ranks its results based on how many other sites have linked to the sites that contain the term that you're looking for. The logic here is that if a bunch of sites link to a particular site, that site is more likely to have useful information. Google recently announced a specialized Apple-specific search engine (http://www.google.com/mac.html) that is devoted to information about Macs and Apple.
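The link-counting idea can be sketched in a few lines of Python. This is a deliberate simplification that ranks matching pages by their raw number of inbound links, as described above; Google's actual ranking is more elaborate. The site names, link graph, and page text are all made up for illustration.

```python
# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "a.example": ["c.example", "b.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example", "b.example"],
}

# Hypothetical page text.
text = {
    "a.example": "monster trucks schedule",
    "b.example": "monster trucks photos",
    "c.example": "monster trucks history",
    "d.example": "gardening tips",
}

def inbound_counts(links):
    """Count how many distinct other pages link to each page."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] = counts.get(target, 0) + 1
    return counts

def search(term, text, links):
    """Return pages containing the term, most-linked-to first."""
    counts = inbound_counts(links)
    hits = [page for page, body in text.items() if term.lower() in body.lower().split()]
    return sorted(hits, key=lambda page: (-counts.get(page, 0), page))

results = search("trucks", text, links)
print(results)  # ['c.example', 'b.example', 'a.example']
```

All three truck pages contain the keyword, but the one that the most other sites point to is listed first.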
When to Use a Search Engine
Search engines shine when you're looking for a topic that can be easily described by a keyword as well as when you're looking for things that are very specific or obscure. For example, use one if you want to know everything about "Monster Trucks" or are looking for Web pages that include your name.
Directories, by contrast, aren't built by automated software; people construct them. The biggest and most successful example, Yahoo (http://www.yahoo.com), maintains a staff that accepts site suggestions from Yahoo users, categorizes the sites, and adds them to the directory. The staff (their official name is the Yahoo Surfers) assigns sites a relevancy score, which is why the most popular sites in a given category show up at the top of a Yahoo listing, with lesser-known or lower-ranked sites listed below them in alphabetical order.
Because people are a lot better at evaluating Web pages than software is, directory-based sites tend to give you not only fewer results but also results that are more relevant to your search. But because people are also slower than indexing software, it can take a long time for a site to get added to a directory, no matter how good it is.
Another important directory is the fast-growing Open Directory Project (http://www.dmoz.com), which was acquired by America Online when it bought Netscape. Using the slogan "Humans do it better," the Open Directory Project is a Web directory with more than 20,000 volunteer editors, who had categorized more than 1.4 million sites as of January 2000. The remarkable thing about the Open Directory Project, however, is that it's freely available for licensing and use by other search sites, and many commercial services draw on it to supplement their own offerings. If someone adds a site to the Open Directory Project, you'll immediately find it on all the search sites that subscribe to the Open Directory Project.
When to Use a Directory
It's often a good idea to start your Net search at a directory such as Yahoo, because most common searches there will get good results. For example, a search on Yahoo for Charon turned up several sites about the moon of Pluto, clearly indicated by the category Science > Astronomy > Solar System > Planets > Pluto.
This was just what my son needed for his science report. Looking for information about Charon with search engines brought up a bewildering mix of irrelevant things such as companies with that word in their name and sites dedicated to Greek mythology's ferryman of the dead, who also went by that name.
Neither Fish nor Fowl
As you might guess, some search sites use a hybrid approach, with spiders backing up humans. Even Yahoo, the granddaddy of the human-based approach, does this now. When you do a search for something that Yahoo can't find in its own directory, your query gets bounced to a search engine from Inktomi, which searches through its spider-built index. Similarly, the search engines at HotBot (http://www.hotbot.com) and AltaVista are now backed up by the Open Directory Project's database.
Ask a Simple Question
The Ask Jeeves site (http://www.ask.com) combines the directory and search-engine approaches, too, but in a unique, user-friendly way. It allows you to ask questions in plain English, rather than requiring keywords. You simply type in a question such as "Where can I find a cost comparison for mortgages?" and Jeeves returns a choice of answers.
How does it work? The site parses your question and compares it to the millions of questions, compiled by Jeeves staff, already on file. When it finds a match, Jeeves displays the answer to your question. If your question isn't on file, Jeeves performs a keyword search on your question, returning results from several other search sites.
The advantages to using Ask Jeeves are that you can phrase your question the way you would ask it in the real world and that you don't have to learn any search techniques. Plus, it almost always gives you useful results.
Another category of search site is the metasearch site. This is a search site that doesn't do its own Web indexing but instead searches other search sites.
Most metasearch sites take your search term and submit it to several search engines, eliminating duplicate results; a good example of this kind of site is Go2Net (previously known as MetaCrawler), at http://www.go2net.com.
There are also metasearch utilities, such as Apple's own Sherlock, which can do smart searches of multiple sites (see "Sherlock Power Searching," Secrets, March 1999, for more tips about using Sherlock). Metasearching is useful because it winnows out a lot of the repeated hits you get from some search engines. But results are at the mercy of the accuracy of the individual sites in the metasearch. Following the ancient computing law "garbage in, garbage out," a metasearch site is only as good as its component sites.
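The duplicate-removal step of a metasearch can be pictured with the following Python sketch. It assumes each engine has already returned a list of URLs; the engine result lists, the URLs, and the rule for deciding that two URLs are "the same" are all illustrative assumptions, not how any particular metasearch site works.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Treat URLs as equal if they differ only in host case or a trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}{query}"

def merge_results(*result_lists):
    """Combine result lists in order, keeping only the first copy of each page."""
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged

# Hypothetical results from two engines.
engine_a = ["http://www.example.com/", "http://trucks.example.org/news"]
engine_b = ["http://WWW.example.com", "http://other.example.net/page"]

merged = merge_results(engine_a, engine_b)
print(merged)
```

The merged list holds three pages instead of four, but note that nothing here improves the quality of the hits themselves: whatever the component engines return is what you get.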
Instead of going to a lot of sites, why not just build a site that lists other sites and lets you pick which one to search? If a search turns out to be not useful, another site is as close as your browser's Back button. That's the idea behind all-in-one search sites. One of my favorites is Search It All (http://www.search-it-all.com), which gives you a single query box that lets you search from any of 23 other search sites. The site also has specialized areas, such as Biomedical, Government, Reference, and Sports, for more specific searches, and each area gives you several information sources to choose from.
Find What You Want
For most of your searches, you'll probably start with a directory or Ask Jeeves. There are going to be times, however, when you can't beat a search engine. To zero in on the information you really want, you'll need to learn how to talk the language that search engines use.
Let's say that you've just bought a new Power Mac G4 and you want to add a second monitor and a fast video card to it. A search for G4 multiple monitors on AltaVista produces 1,940 results -- more than you can comfortably look through.
But a search for G4 "multiple monitors" -- using quotation marks to group the last two words -- trims the results list to 125, and adding voodoo3 (the name of a video card) narrows the results to a manageable 34 sites.
Power-Searching with AltaVista
AltaVista is an amazingly powerful search engine if you know all of its little shortcuts and helpers. In case, like most of us, you don't, here are some ways to make your searching more efficient.
You'll want to learn AltaVista's methods for two reasons: first, it's one of the largest and most popular search engines, and second, AltaVista has been around longer than many of the other search engines, and lots of them have adopted AltaVista's way of doing things. (Check the More Information or Advanced Search sections of other sites for their unique features.) Except when noted, you can employ all these search tips from the main AltaVista page, rather than from the Advanced Search page.
Where Will You Search?
Below the search field on AltaVista's opening page, you'll see the Find Results On choices. Here, AltaVista lets you choose where to search: The Web, News, Discussion Groups, or Products. Selecting News limits your search to current news stories, whereas selecting Products instructs AltaVista to search shopping sites.
Specify Your Native Tongue
One way to cut down on the number of irrelevant pages your search returns is to specify what language you speak. Chances are, you're only interested in Web pages in a single language. Why wade through hundreds of sites in Romanian? You can cut back the number of results dramatically if you use the pop-up menu to specify your preferred language.
Use the Right Case
AltaVista allows you to use wild cards -- partial words -- in your searches by adding an asterisk to your words. Searching for auto* returns all pages with words that begin with auto (for example, automobile, automotive, and automatic). Use wild cards to easily capture the singular and plural forms of your keywords.
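If you think of a wild card as "this prefix, followed by any letters," the behavior is easy to sketch in Python with a regular expression. The helper name and sample sentence are hypothetical, and AltaVista's own wild-card rules (for instance, any limits on how short the prefix may be) are not modeled here.

```python
import re

def wildcard_to_regex(term):
    """Turn a term like 'auto*' into a regex matching whole words with that prefix."""
    escaped = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(rf"\b{escaped}\b", re.IGNORECASE)

sample = "Automobile repair, automotive parts, an automatic gate, and a semiautomatic door"
found = wildcard_to_regex("auto*").findall(sample)
print(found)  # ['Automobile', 'automotive', 'automatic']

# The same trick catches singular and plural forms.
plurals = wildcard_to_regex("retriever*").findall("One retriever, two retrievers")
print(plurals)  # ['retriever', 'retrievers']
```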
If you want to learn all about your favorite breed of dog, you might assume you can simply type Labrador retrievers in the AltaVista search field. Not so fast: all you've asked the search engine to do is find all pages that include either of those words. That means you'll get results that include pages on Labrador (the Canadian region) and pages on golden retrievers, as well as a few pages about Labrador retrievers. If you don't want to spend time scrolling through search results that are off the mark, you need to be more precise.
To find only pages that contain the exact phrase Labrador retrievers, put quotes around the phrase (like so: "Labrador retrievers"). However, if you want to know about taking your pet to a dog show, you probably won't have luck searching for "Labrador retriever dog show" unless you happen to find a page that has exactly those four words in that order. And if you search for "Labrador retriever" "dog show", you'll get all the dog-show pages and all the Labrador retriever pages, not just the ones about showing your breed.
You need to be able to tell the search engine that you want to find only pages that contain both the phrases Labrador retriever and dog show. For that, you'll need to move beyond typing simple phrases and begin adding powerful symbols and commands to your searches.
To tell AltaVista that each phrase must be on the pages it returns, add a plus sign -- which signifies that the item following it must appear -- before each phrase.
Searching for +"Labrador retriever" +"dog show" requires that the pages contain both phrases. And as you'd expect, adding a minus sign before a word or phrase means that a returned page cannot contain that topic. That lets you narrow your search request -- so +"Labrador retriever" +"dog show" -California would exclude shows held in the Golden State.
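The plus, minus, and quotation-mark rules can be expressed as a small Python sketch. It assumes simple case-insensitive substring matching against a page's text, and the function names and sample pages are hypothetical; a real engine matches against an index and handles many more cases.

```python
import re

# A token is an optional + or -, followed by a quoted phrase or a single word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]+)"|(\S+))')

def parse_query(query):
    """Split a query into required (+), excluded (-), and optional terms."""
    required, excluded, optional = [], [], []
    for sign, phrase, word in TOKEN.findall(query):
        term = (phrase or word).lower()
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            optional.append(term)
    return required, excluded, optional

def matches(page_text, query):
    """True if the page satisfies the query's +, -, and plain terms."""
    text = page_text.lower()
    required, excluded, optional = parse_query(query)
    if any(term not in text for term in required):
        return False
    if any(term in text for term in excluded):
        return False
    if required or not optional:
        return True
    # With no required terms, any one of the plain terms is enough.
    return any(term in text for term in optional)

query = '+"Labrador retriever" +"dog show" -California'
print(matches("Labrador retriever wins Boston dog show", query))      # True
print(matches("Labrador retriever wins California dog show", query))  # False
print(matches("Golden retrievers at play", "Labrador retrievers"))    # True
```

The last line shows why an unadorned two-word search casts such a wide net: a page only has to contain one of the words to qualify.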
Try an Advanced Search
You may still get too many results, even when you've narrowed things down with the above techniques. When you need a more precise search, you can click on the Advanced Search tab on AltaVista's main page. The Advanced Search option doesn't allow you to use pluses and minuses to specify whether to include or exclude pages. Instead, you need to use Boolean logic -- a mixture of keywords separated by ANDs, ORs, and NOTs -- which gives much more specific results.
Let's say that you're searching for a review of the latest G4 Macs, trying to figure out which one to buy. You could go to AltaVista's main search bar and enter +G4 +review, but that search results in reviews of lots of strange things (such as a page on cell aging), all of which have G4 in their names. But if you search for +G4 +review +"power mac", you'll miss all the pages that refer to the machine just as a Mac or Macintosh. It's time to try a Boolean search.
A regular search doesn't let you search for options where any one of a list of items must be true. But a Boolean search lets you use the command OR to do just that. On the advanced-search page, therefore, you could search for G4 AND review AND ("power mac" OR mac*). You need to put your OR possibilities within parentheses so the search engine knows where the list begins and ends. Also, note that on the Advanced Search page, you use the command AND instead of the plus sign.
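One way to picture how the parentheses group the OR choices is to build the query out of small Python functions. This is only a sketch: the helper names (term, all_of, any_of) and the sample page titles are hypothetical, and it tests one page's text at a time rather than consulting an index the way a search engine would.

```python
import re

def term(word):
    """Match a whole word or phrase; a trailing * matches any word ending."""
    pattern = re.escape(word.rstrip("*"))
    if word.endswith("*"):
        pattern += r"\w*"
    regex = re.compile(rf"\b{pattern}\b", re.IGNORECASE)
    return lambda text: regex.search(text) is not None

def all_of(*tests):
    """AND: every test must pass."""
    return lambda text: all(test(text) for test in tests)

def any_of(*tests):
    """OR: at least one test must pass."""
    return lambda text: any(test(text) for test in tests)

# G4 AND review AND ("power mac" OR mac*)
query = all_of(term("G4"), term("review"), any_of(term("power mac"), term("mac*")))

print(query("Review: the new Macintosh G4 tower"))  # True
print(query("G4 protein review in cell aging"))     # False
```

The nested any_of call plays the role of the parentheses: it is evaluated as a single unit, so the page needs G4, review, and at least one of the Mac terms.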
If you want to narrow the search down further, you can use the AND NOT option, which works like the minus sign in a regular search. For example, if you want to skip all results from http://www.apple.com because you know their opinion about the G4 already, your search would be G4 AND review AND ("power mac" OR mac*) AND NOT host:apple.com. (What's that funky host:apple.com thing we just did? See the table, "8 Ways to Find Things Faster", for these special commands.)
The Last Word
Automated searching has come a long way in the last couple of years, but the best software still can't match a human editor when it comes to finding relevant results. That's why your first stop when looking for information on the Web should usually be a directory, such as Yahoo or the Open Directory Project. Ask Jeeves's plain-English queries also deserve special mention, because the site is very easy to use and nearly always gives useful results.
If you're looking for a very specific topic, you'll want to use one of the keyword-based search engines. AltaVista and Google, although they work in vastly different ways, both excel at providing relevant results. Google in particular does a great job of sparing you most of the junk sites.
Finding information on the World Wide Web is still far from a perfect process, but with your newfound knowledge and a little persistence, there's no doubt that your search will be a success.
Contributing Editor Tom Negrino's latest book is Quicken 2000 for Macintosh, Visual QuickStart Guide (Peachpit Press, 1999). Negrino's Web log can be found at http://www.backupbrain.com.
The Best Places
Best Place to Find Music
You've got to love a technology that makes acquiring and listening to music easier while simultaneously scaring the pants off fat-cat record-company moguls. MP3 audio has achieved those goals. Most people are using MP3 programs just to rip their CD collections to their hard disks so they can cart them around on an iBook or download them to a sporty MP3 player. But there are many musicians who use easily transportable MP3 files to reach a worldwide audience.
At MP3.com, you'll find some new music from well-known artists as well as a lot more music from artists unsigned by record companies. You'll discover that in many cases, there are darn good reasons why these people haven't been signed to record deals, but you'll also find some real gems. MP3.com has hi-fi and lo-fi versions of most of its music; the lo-fi versions are best for modem users.
Best Place to Track Down Old and Rare Books
Bibliofind (http://www.bibliofind.com)
It's nice that you can get practically any book in print at any number of online bookstores, from Amazon (http://www.amazon.com) to Powell's Books (http://www.powells.com). For those who are looking for a hard-to-find volume, however, Bibliofind has listings of more than 10 million used and rare books at booksellers around the world. Once you find a book in Bibliofind's database, you can buy the book directly from the seller; Bibliofind doesn't take a cut.
Best Place to Find Discussion Boards
Net newcomers tend to think the Web is the Internet, but that's not so; a huge amount of the data transmitted over the Net is part of a large number of discussion boards called Usenet. There are more than 30,000 of these boards (called newsgroups) on Usenet, and chances are good that your ISP (Internet service provider) doesn't carry the entire newsfeed. That's why you'll want to turn to Remarq, which not only has virtually all of Usenet but also stores it so you can search down discussions about, well, just about anything. Remarq organizes newsgroups into subject areas and puts a friendlier face on Usenet's arcane naming structure, so you don't have to remember the name of the comp.sys.mac.misc newsgroup.
Best Place to Decipher Tech Terms
Computer Currents High-Tech Dictionary (http://www.currents.net/resources/dictionary/inex.html)
Can't tell a GIF from a GIMP or a VAR from a VAD? You're not alone. There are so many acronyms and technical terms to keep track of in the computer field that it's good we have the High-Tech Dictionary to separate the real terms from the technobabble. In addition to a searchable database of tech terms, the site has a list of emoticons (those sideways facial expressions made from keystrokes), HTML tags, file extensions, and Internet domain suffixes. The site can also generate a random term.
Best Place to Look for a Job
Tired of your job? Looking for a new one but don't have the time to pound the pavement or scan through the Sunday paper? Monster.com may just have your new job waiting for you. You can use the site by browsing the job listings, sorted (of course) by geography, category, and keywords. Or if you're ready to make the jump to "e-lancing," you can list yourself in an auction where employers bid for your skills. The site also includes job-hunting resources; lets you store your résumé online; and has thousands of company profiles, so you can check out a potential employer before you apply.
Best Place to Find Movie and TV Info
The Internet Movie Database (http://www.imdb.com)
The ultimate resource for settling bar bets about virtually any movie or television program, IMDb (now owned by Amazon.com) has listings of titles, people (actors, actresses, and crew), characters, plots, and famous quotes.
You'll also find their lists of the top films of all time (as well as the worst!) and Academy Award winners. You can even discover which entertainers share your birthday. My favorite IMDb features are the listings of trivia and goofs associated with individual movies. If you've ever wondered about the significance of oranges in the Godfather movies or thought you saw a mistake in Terminator 2, you can get the facts here.
In recent months, the site has transformed from a directory of film and TV information into an entertainment portal with discussion boards, local movie times, and entertainment news. IMDb accepts some reader submissions, so if you know something about a movie that's not already in the database, you can add it (they'll check what you tell them, of course).
Best Site for Finding Web Graphics
Art Today (http://www.arttoday.com)
Looking for graphics for your Web site? Want a snappy look but aren't so hot with Photoshop? Look no further than Art Today, which has 150GB of images available for paid members to download (membership starts at $29.95 per year and goes up to $99.95 per year). But a free membership still gives you access to over 40,000 Web graphics. The free images include buttons, icons, backgrounds, and other useful graphics. When you cough up some cash, you get access to clip art, photographs, and fonts.
Best Way to Find Web Logs
Eatonweb Blog Portal (http://www.eatonweb.com/portal/)
Not yet familiar with the term Web log? You soon will be. Web logs are personal Web sites operated by people who make lists of links to stuff that interests them, usually adding their own personal spin. A Web log (or weblog, sometimes shortened to blog) can range from simple pointers to other sites, to deeply personal online diaries, to random thoughts and jokes. There are now hundreds of these Web logs, and Web loggers become guides to the information that interests them.
Some blogs focus on specific topics, such as Web design, politics, or tech news. Others have a wider scope and take on just about any subject.
Sidebar: 8 Ways to Find Things Faster
AltaVista has a variety of search attributes you can use to make your searches even more precise. To use one, go to AltaVista's Advanced Search page and type it in the search field, followed by a colon and the text you want to require or disallow. These search attributes let you zoom in on images, links, or portions of Web pages.
Search Attribute: anchor:
What It Does: Finds pages that contain the specified word or phrase in the text of a hyperlink. Searching for anchor:"Click here to make money fast!" would bring you pages with the words Click here to make money fast! as links.
How It Can Help You: This search attribute lets you find links with particular wording.

Search Attribute: domain:
What It Does: Finds pages within a particular top-level domain or two-letter country code. Use domain:gov to find pages from government sites or domain:uk to find pages on British sites.
How It Can Help You: This is particularly helpful if you're trying to find information on an official site. For example, you might want to see what the Justice Department itself has to say about Microsoft.

Search Attribute: host:
What It Does: Searches just the host-name portions of URLs. For example, host:apple.com would find pages on Apple's Web site, and adding -host:geocities.com to your search would keep pages with annoying pop-up windows out of your results.
How It Can Help You: With this search attribute, you can narrow your search to just a single host but still get all the sites under that host. For example, you would get pages at info.apple.com as well as www.apple.com.

Search Attribute: image:
What It Does: Finds pages with images that have a specific file name. Use image:pikachu to find all the pages with an image named pikachu.
How It Can Help You: This will save you a lot of time if you want to see something instead of just read about it.

Search Attribute: link:
What It Does: Finds pages with a link to another page with the specified URL text. You can use link:www.yoursite.com to find all the Web pages that link to your site.
How It Can Help You: This search attribute lets you find out how popular a site is. One indicator of a site's popularity is the number of other sites that link to it. If you're a Webmaster, this search lets you find out which sites have linked to yours.

Search Attribute: text:
What It Does: Finds pages that contain the specified text, excluding image tags, links, or URLs.
How It Can Help You: This is much the same as a regular Web search; it's not used very often.

Search Attribute: title:
What It Does: Finds pages that contain the specified word or phrase in their titles. Searching for title:"welcome to adobe golive" gives you all the Web pages of GoLive users who couldn't figure out how to turn off the program's default title (there are more than 30,000 Web pages with this title out there, at last count).
How It Can Help You: Since the title of a page usually refers to the topic or the most important information of a page, this search attribute helps you find relevant results and cut down on the number of hits.

Search Attribute: url:
What It Does: Finds pages with a specific word or phrase in the URL. Use url:macintosh to find all pages on all servers that have the word macintosh anywhere in the URL.
How It Can Help You: This search attribute lets you do broad searches for terms in URLs.
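To make the difference between host:, domain:, and url: concrete, here is a rough Python sketch that applies each test to a single URL. The function names and sample URLs are hypothetical, and the matching rules are assumptions inferred from the descriptions in the table (for example, that host:apple.com also covers info.apple.com); AltaVista's actual matching may differ.

```python
from urllib.parse import urlsplit

def host_matches(url, host):
    """host: true if the URL's host is `host` or a subdomain of it."""
    name = (urlsplit(url).hostname or "").lower()
    host = host.lower()
    return name == host or name.endswith("." + host)

def domain_matches(url, suffix):
    """domain: true if the URL's host ends in the given top-level domain."""
    name = (urlsplit(url).hostname or "").lower()
    return name.endswith("." + suffix.lower())

def url_contains(url, word):
    """url: true if the word appears anywhere in the URL."""
    return word.lower() in url.lower()

print(host_matches("http://info.apple.com/support", "apple.com"))  # True
print(host_matches("http://www.pineapple.com/", "apple.com"))      # False
print(domain_matches("http://www.example.gov/report", "gov"))      # True
print(url_contains("http://www.example.com/Macintosh/faq.html", "macintosh"))  # True
```

Checking for a leading dot before the host name is what keeps a look-alike such as pineapple.com from slipping through a host:apple.com filter.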