PRINCIPLES OF SMART SEARCHING
The key to successful searching: search engines are fast, but they are not smart. Use your intelligence, and practice.
A search engine does not understand what your keywords mean or why they are important to you. To a search engine a keyword is just a string of characters. It does not distinguish between cancer the crab and cancer the disease.
You do know what you are looking for and what your query means. The search engine will supply the raw computing power and you must supply the brains.
The success of your searches depends on three factors:
1. Your ability to create exact matches between terms you search for and terms used in the documents you want to find.
2. The size and contents of the database you choose.
3. The features used for searching its contents.
Before beginning your search, determine: Do you want a specific hard-to-find document on an esoteric subject, or general information on a broader topic? Do you need to search the entire Web, or is what you are seeking likely to be found on a number of sites, or only the most popular sites?
For any one query, you will considerably improve your chances of finding the information you want by using several search tools or a 'meta search engine' that combines search results from multiple engines. Studies have revealed that combining the results of six engines turned up 3.5 times the number of documents on average compared with the results from only one engine.
Most experienced Web searchers use at least two or three different search tools regularly, and have mastered their features. It is wise to check more than one search tool for any topic, because search results vary widely from one to another.
Know Where To Look First
Internet search engines, indexes, catalogs or directories are used to find:
Web documents, databases, sites, pages, books, articles, ads, product descriptions, etc.
Pictures, photos, music, videos, games, maps, software, etc.
People, phone numbers, email addresses, newsgroup postings, many other types of information and media
If you are searching for a company, person, job, wife, software, email address or phone number there are various databases containing specific information that might be more useful to you than a general search engine.
If you're more interested in broad, general information, the first place to go is to a Web directory.
If you're after narrow, specific information, a Web search engine is probably a better choice.
Fine-tune Your Keywords
A keyword is simply one item or question for which you want information. Keyword searching allows you to search for words without concern about the order in which they appear. Since most nouns are subsets of other nouns, be specific and enter the smallest possible subset that describes what you want.
Prepare alternate, broader and narrower search terms so that you can expand or narrow the search as necessary: use synonyms, variant spellings, or related themes to find what you want. If you search on commonly used words, you will get many irrelevant documents that contain your words but not your subject.
Be Refined
Learn how to refine your queries using advanced techniques, e.g. using operators to include other keywords that you would expect to find in relevant documents. For example, exclude with the Boolean NOT; excluding has become important with the growth of the Web. Read and study the help files in order to take full advantage of advanced search options. Run your initial query again several times, each time adding further refinements to narrow down your list of relevant hits. Use phrases, if possible.
Query By Example
Utilize the option that many search engines offer where you can "query by example," or "find similar sites," to the ones that come up on your initial hit list.
Meta-Search Engines
In an ordinary search engine or search tool (such as AltaVista), you submit keywords to a single database of web-pages owned by the search tool, and you get back a display of documents from that search engine's unique database of web-pages.
In a meta-search engine, you submit keywords in its search box, and it transmits your search simultaneously to most of the popular search engines and their databases of web pages. What you get back is a compilation of results containing matching sites from all of the search engines queried.
Meta-search engines act as intelligent middle-agents to pass your search through, gather the responses from the individual search tools they query, and then give you a more unified report of results from many different resources.
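The "unified report" step can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the engine names and addresses are made up, the hit lists are supplied by hand rather than fetched from real engines, and real meta-search tools may rank and merge quite differently.

```python
def merge_results(results_by_engine):
    """Combine ranked hit lists from several engines into one list without duplicates."""
    merged = []
    seen = set()
    # Walk the lists rank by rank so that every engine's top hits surface early.
    longest = max((len(hits) for hits in results_by_engine.values()), default=0)
    for rank in range(longest):
        for hits in results_by_engine.values():
            if rank < len(hits) and hits[rank] not in seen:
                seen.add(hits[rank])
                merged.append(hits[rank])
    return merged


# Hypothetical hit lists; engine names and addresses are invented for illustration.
sample = {
    "engine_a": ["site1.example", "site2.example", "site3.example"],
    "engine_b": ["site2.example", "site4.example"],
    "engine_c": ["site5.example", "site1.example"],
}

unified = merge_results(sample)
print(unified)
# ['site1.example', 'site2.example', 'site5.example', 'site4.example', 'site3.example']
```

Three engines returned seven hits between them, but only five distinct sites; the merged list shows each one once.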
General searching is sufficient to find the information you are looking for in most cases. While using an engine's advanced features is normally reserved for more precise searches, utilize these when needed. And remember, each search engine has different search principles and different methods of searching.
SEARCH OPERATORS:
HOW TO SEARCH MORE EFFECTIVELY
Search engines are great tools for finding information on the Internet. At the most basic level, you type a word or phrase into the search engine, and you get a list of Web sites that contain those words or phrases. Unfortunately, the list of sites can be so long you may never find the one you really need. To limit the number of "hits" you get from your searches, there are ways to pose your query more effectively by using operators or connectors.
Operators are the rules or specific instructions used for composing a query in a keyword search. While each search engine has its own operators, some operators are used in common by a number of search engines. The following are the most commonly used operators.
BOOLEAN OPERATORS
Boolean logic is a symbolic logic system invented by the English mathematician George Boole in the mid-19th century. Boole formalized a set of rules which transformed formal logic from a philosophical discipline to a mathematical one. Known as Boolean algebra, these rules use operators (AND, OR, NOT) to create relationships among words and concepts. In addition to making modern binary digital computers possible, Boolean logic is essential to the operation of search engines.
Boolean searching is a powerful way to search online information resources. The effective use of these special words as "connectors" in your search statements will greatly improve the results of your online searching as they will help you expand or narrow the scope of your search.
The Boolean operators are AND, OR, and NOT; many search engines also accept NEAR alongside them. These operators should be capitalized, which also helps you differentiate your operators from your search words.
What they do:
AND indicates that the documents found must contain all the words joined by the AND operator. To find documents that contain the words los, angeles, and dodgers, you would enter: los AND angeles AND dodgers. The AND operator narrows a search to include only documents that contain all the keywords.
AND operators are a great way of limiting the numbers of search results because they link subjects to create a compound subject.
OR indicates that the documents found must contain at least one of the words joined by the OR operator. To find documents that contain the word republicans or the word democrats, enter: republicans OR democrats. The OR operator simply broadens searches to include documents that contain any keywords of a search.
OR operators can be useful when searching for alternative spellings, such as color OR colour, or in order to broaden a query when searching for synonyms, such as city OR urban. In this case, the combined keywords will cover the topic better than a search for either topic alone, because some documents may use only one of these words. The OR operator is also helpful when you want to search for a keyword that is commonly abbreviated, such as Kentucky OR KY.
NOT or AND NOT operators indicate that the documents found cannot contain the word that follows the NOT or AND NOT operator. To find documents that contain the word football but not the word modell, enter: football AND NOT modell. The NOT or AND NOT operator narrows a search to exclude certain keywords.
If you wanted to search synonyms of the keyword ghosts, you could use the OR operator to broaden the search and submit the query phrase ghosts OR apparitions OR spirits, but this search probably would also result in documents about moonshine. You can use the NOT operator to restrict the search: ghosts OR apparitions OR spirits NOT moonshine NOT alcohol.
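One way to picture what AND, OR and NOT do is as operations on sets of documents. The Python sketch below assumes a tiny, made-up collection and a simple word index; it is not how any particular search engine is built, only an illustration of the logic.

```python
# Hypothetical mini-collection: document number -> text.
documents = {
    1: "ghosts haunt the old castle",
    2: "spirits and moonshine distilled at home",
    3: "apparitions reported near the lake",
    4: "football scores from sunday",
}


def build_index(docs):
    """Map every word to the set of documents that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


index = build_index(documents)


def docs_with(word):
    """Return the set of documents containing the word (empty if none do)."""
    return index.get(word.lower(), set())


# AND narrows: keep only documents found in both sets (intersection).
ghosts_and_castle = docs_with("ghosts") & docs_with("castle")

# OR broadens: keep documents found in any of the sets (union).
any_ghostly = docs_with("ghosts") | docs_with("apparitions") | docs_with("spirits")

# NOT excludes: remove documents that contain the unwanted words (difference).
no_moonshine = any_ghostly - docs_with("moonshine") - docs_with("alcohol")

print(ghosts_and_castle)  # {1}
print(any_ghostly)        # {1, 2, 3}
print(no_moonshine)       # {1, 3}
```

The OR query picks up the moonshine document because it contains spirits; the NOT step removes it again.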
PROXIMITY SEARCH
NEAR operators are not found in true Boolean logic. They work similarly to AND operators, but further limit the search results by requiring that the keywords be close together, typically within ten words of each other. For example, tobacco NEAR cancer looks for documents in which the word tobacco is found ten words or fewer from the word cancer.
Or you can be more specific with the N (near) operator specifying how many words intervene between search terms. For example, men N2 space locates items where the words men and space occur within two words of each other. This would find men in space but not men who have been in space.
The W (within) operator allows you to specify how many words intervene between search terms. This operator requires the second term to appear after the first term. For example, book W2 Shakespeare should locate documents where the word Shakespeare comes within two words after the word book.
FOLLOWED BY or ADJ designates that one term must directly follow the other.
WITH searches for terms in the same field of a record (e.g. in the author field or in the same subject field), but in any order and not necessarily next to each other.
SAME searches for terms in the same group of fields in a record (e.g. all subject fields).
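The NEAR, N and W operators described above can be sketched in Python as follows. This is a simplified illustration, not any engine's actual rule: the function names are made up, and distance is assumed to be the difference between word positions, so that men in space counts as a distance of two.

```python
def positions(words, term):
    """Return every position at which the term occurs in the word list."""
    return [i for i, word in enumerate(words) if word == term]


def near(text, first, second, max_distance=10):
    """True if the two terms occur within max_distance words of each other, in either order."""
    words = text.lower().split()
    return any(
        abs(i - j) <= max_distance
        for i in positions(words, first.lower())
        for j in positions(words, second.lower())
    )


def within(text, first, second, max_distance):
    """True if the second term occurs after the first, at most max_distance words later."""
    words = text.lower().split()
    return any(
        0 < j - i <= max_distance
        for i in positions(words, first.lower())
        for j in positions(words, second.lower())
    )


print(near("men in space", "men", "space", 2))                 # True
print(near("men who have been in space", "men", "space", 2))   # False
print(within("a book by Shakespeare", "book", "Shakespeare", 2))   # True
print(within("Shakespeare wrote a book", "book", "Shakespeare", 2))  # False
```

The last example fails only because of word order: W requires the second term to follow the first, while NEAR and N accept either order.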
PARENTHESES
Boolean queries can be grouped together by parentheses to make your searches more specific. You can use parentheses to create "nested" queries within your larger query. Examples: To find documents that contain the word rocket and either the word nasa or the word apollo, enter: rocket AND (nasa OR apollo). Or, the search string (oil OR corporations) AND (environment AND disasters) will find any documents that mention oil or corporations in association with environmental disasters.
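To see how nesting works, the Python sketch below evaluates a nested query against the words of a single document. It assumes the query has already been broken into nested groups written as Python tuples; parsing the typed text and its parentheses is left out, and the function name is made up for the example.

```python
def matches(query, words):
    """Evaluate a nested Boolean query against a set of words from one document."""
    if isinstance(query, str):
        # A plain keyword matches if the document contains it.
        return query in words
    operator, *operands = query
    if operator == "AND":
        return all(matches(part, words) for part in operands)
    if operator == "OR":
        return any(matches(part, words) for part in operands)
    if operator == "NOT":
        return not matches(operands[0], words)
    raise ValueError(f"unknown operator: {operator}")


# rocket AND (nasa OR apollo), written as nested tuples.
query = ("AND", "rocket", ("OR", "nasa", "apollo"))

doc_a = set("the apollo rocket launched".split())
doc_b = set("a model rocket kit".split())

print(matches(query, doc_a))  # True
print(matches(query, doc_b))  # False
```

The inner group is worked out first, exactly as with parentheses in arithmetic; the second document fails because it has rocket but neither nasa nor apollo.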
+ AND - SIGNS; FORCE & EXCLUDE WORDS:
If you type a plus sign (+) directly in front of a word or phrase, you force the word or phrase to appear in the search results. To find documents that contain parks in cincinnati ohio enter: parks +cincinnati +ohio. (Note there is no space to the right of the '+' sign.)
Conversely, if you type a minus sign (-) directly in front of a word or phrase that word or phrase is excluded in the search results. To find documents that do contain airplanes but do not contain boeing enter: airplanes -boeing. (Note there is no space to the right of the '-' sign.)
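The plus and minus signs can be sketched in Python like this. The sketch rests on assumptions: the function names are made up, and a word with no sign is treated as "at least one such word must appear," whereas real engines often use unsigned words only to rank results.

```python
def parse_query(query):
    """Split a query into required (+), excluded (-) and plain words."""
    required, excluded, plain = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            plain.append(token)
    return required, excluded, plain


def passes(query, text):
    """True if the text has every + word, no - word, and at least one plain word (if any)."""
    required, excluded, plain = parse_query(query)
    words = set(text.lower().split())
    if any(word in words for word in excluded):
        return False
    if not all(word in words for word in required):
        return False
    return not plain or any(word in words for word in plain)


print(passes("airplanes -boeing", "airbus builds airplanes"))           # True
print(passes("airplanes -boeing", "boeing builds airplanes"))           # False
print(passes("parks +cincinnati +ohio", "city parks in cincinnati ohio"))  # True
print(passes("parks +cincinnati +ohio", "parks in columbus ohio"))      # False
```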
QUOTATION MARKS
Enclosing a multiword phrase in "quotation marks" tells the search engine to list only sites that contain those words in that exact order. To find documents that contain the phrase who will rid me of this meddlesome priest (as in The Murder Of Thomas Becket) enter: "who will rid me of this meddlesome priest". If you want to find documents on the San Francisco Golden Gate Bridge, your query should look like: "Golden Gate Bridge".
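The difference between a phrase search and an ordinary keyword search is that the words must be adjacent and in order. A minimal Python sketch, with a made-up function name and ignoring punctuation, might look like this:

```python
def contains_phrase(text, phrase):
    """True if the words of the phrase appear in the text, adjacent and in the same order."""
    words = text.lower().split()
    target = phrase.lower().split()
    size = len(target)
    # Slide a window of the phrase's length across the text and compare word for word.
    return any(words[i:i + size] == target for i in range(len(words) - size + 1))


print(contains_phrase("we walked across the golden gate bridge today", "Golden Gate Bridge"))  # True
print(contains_phrase("the bridge near the golden gate", "Golden Gate Bridge"))                # False
```

The second text contains all three words, so a plain keyword search would return it, but the words are not in the quoted order.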
STEMMING [Truncation]
The ability of a search to include the stem or the main part of a word can be automatic, or it may require use of a wildcard, symbolized by an asterisk [*], to initiate. The search wildcard is a handy shortcut plunked down anywhere in the midst of your keywords. The wildcard, or truncation symbol, allows you to search for a substring (or part) of a word. For example, comput* retrieves compute, computing, computers, computations, computerized, etc.
Wildcards find slight variations of your search terms, such as plural forms or differences of spelling within words. For example, wine* will find wines, wineries, winery, etc., and wom*n will find woman or women.
The question mark (?) is also sometimes used for this purpose. On some engines it works as a truncation symbol, so that garden? will find hits for gardens, gardening, gardeners, etc.; on others it stands for a single character, so that poe?s will find poets or poems.
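Wildcard matching can be tried out with Python's standard fnmatch module, which happens to use similar symbols. Note the assumptions: in fnmatch the asterisk stands for any run of characters and the question mark for exactly one character, which matches some search engines but not all, and the word list here is made up.

```python
from fnmatch import fnmatchcase

# Hypothetical list of words found in an index.
vocabulary = [
    "compute", "computing", "computers", "company",
    "woman", "women", "poets", "poems", "poetry",
]


def expand(pattern, words):
    """Return the words that match a wildcard pattern (* = any run, ? = one character)."""
    return [word for word in words if fnmatchcase(word.lower(), pattern.lower())]


print(expand("comput*", vocabulary))  # ['compute', 'computing', 'computers']
print(expand("wom*n", vocabulary))    # ['woman', 'women']
print(expand("poe?s", vocabulary))    # ['poets', 'poems']
```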
TITLE SEARCH
Using title: or t: will restrict searches to the title portion of web documents. To find documents that contain the word law in their titles, enter: title:law or t:law. (Note there is no space between the letters and the colon.)
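A title search simply ignores everything except the title field of each page. The Python sketch below assumes a small made-up list of pages, each with a title and a body, and a made-up function name; it accepts the title: and t: prefixes described above.

```python
# Hypothetical pages; titles and bodies are invented for illustration.
pages = [
    {"title": "Law School Admissions", "body": "how to apply"},
    {"title": "Gardening Tips", "body": "local law on hedge height"},
    {"title": "Maritime Law Basics", "body": "shipping rules"},
]


def title_search(query, page_list):
    """Return the titles of pages whose title contains the word given after title: or t:."""
    prefix, _, term = query.partition(":")
    if prefix.lower() not in ("title", "t") or not term:
        raise ValueError("expected a query such as title:law or t:law")
    term = term.lower()
    return [page["title"] for page in page_list if term in page["title"].lower().split()]


print(title_search("title:law", pages))  # ['Law School Admissions', 'Maritime Law Basics']
print(title_search("t:law", pages))      # ['Law School Admissions', 'Maritime Law Basics']
```

The gardening page mentions law in its body, but it is left out because the word does not appear in its title.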
TYPES OF SEARCH METHODS
Although all search tools are commonly referred to as search engines, there is a distinct difference. There are really directories and engines. Think of a Web directory as a subject catalog--something like the subject catalog in your local library. Think of a Web search engine as an index that enables you to seek out specific words and phrases.
Search engines also use a web searcher (otherwise known as a crawler, spider or robot) which traverses the web, cataloging everything in its path. It is estimated that the best-rated search engines include over 90% of existing web sites.
On-line web directories are manually maintained indexes that list and categorize pages for your surfing pleasure. They can also be very helpful. However, the scope and integrity of these lists are dependent on the diligence of the person or organization who builds them.
There are essentially four search methods. The following describe each one:
1. A directory search tool searches for information by subject matter. It is a hierarchical search that starts with a general subject heading and follows with a succession of increasingly more specific sub-headings. It is also known as a subject search.
Tip: Choose a subject search when you want general information on a subject or topic. Often, you can find links in the references provided that will lead to specific information you want.
2. A search engine searches databases by using keywords. It responds to a query with a list of references or hits. It is also known as a keyword search. Because of their broader scope and greater complexity, keyword searches require far greater explanation than subject searches.
Tip: Choose a keyword search to obtain specific information, since its comprehensive database is likely to contain the information sought.
3. A directory with search engine uses both the subject and keyword search methods described above. As a directory search, it follows a directory path through increasingly more specific subject matter. At each stop along the path, a search engine option is provided to enable the searcher to convert to a keyword search. A subject search and keyword search are thus said to be coordinated. The further down the path the keyword search is made, the narrower is the search field and the fewer and more relevant the hits.
Tip: Use when you are uncertain which of the two search methods, subject or keyword, will provide the best results.
4. A multi-engine search, also called a meta-search, utilizes a number of search engines simultaneously. The search is conducted via keywords employing commonly used operators or plain language. It then lists the hits either by search engine or by integrating the results into a single listing.
Tip: Use to speed up the search process.
GLOSSARY OF SEARCH TERMS
This glossary contains terms applicable to searching the World Wide Web (WWW) and the Internet.
Bookmark: A page on the Netscape Browser that lists URLs or Web addresses. Bookmarks serve as links for easy access to Web addresses.
Boolean Search: A keyword search that uses Boolean Operators for obtaining a precise definition of a query.
Browsing: In the WWW browsing refers to a directory search. In popular use, browsing, or surfing, is casually looking for information on the Internet.
Browser: A computer program used to connect to Web sites on the World Wide Web and access information.
Concept Search: A query that implies a term’s broader meaning, rather than its literal meaning. Concept searching makes your keywords trigger additional matches on the same concept: shame might also retrieve guilt; or: dogs might also retrieve pet grooming.
Data: Information such as text, numbers, images and sound contained in a form that can be processed on a computer.
Database: Stored information at a Search Tool’s Web site. For search engines, a robot is used to keep the database current by an automated procedure called spidering. For directories, the database is kept current through reviews conducted by qualified people.
Directory Search: A hierarchical search starts with a general heading and proceeds through selection of increasingly more specific headings or subjects. It provides a means of focusing more closely on the object of the search. It is also referred to as subject search, directory guide or directory tree.
False Drops: Documents that are retrieved but are not relevant to the user’s interest.
Fields: Components of a Web page such as a title, URL, summary, text and images often displayed by a search engine to help narrow a search.
Full-Text Indexing: A database index that includes all terms and URLs. In practice, each search tool uses a filter to remove words it considers unnecessary.
Hierarchical: A ranking of subjects or things from the most general to the most specific.
Hits: A list of links or references to documents that are returned in response to a query, also called matches or matching queries.
Home Page: The first page of a search tool’s Web site.
Hypertext Link: A highlighted word or image [shown in color] on a Web page that when clicked connects or links to another location with related information.
Index or catalog: A file that designates the location of specific data in a search engine’s database.
Internet: The Internet, with a large I, refers to a worldwide system of linked computer networks that serve as a communication system. With a small i, internet means a group of interconnected local networks.
Keyword: A term that a computer can recognize and use as the basis for executing a search.
Keyword Search: A search that utilizes terms that define the user's interest.
Link: More accurately hypertext link. It is a connection between two Web pages or sites that have related information. For example, highlighted data such as text and graphics at one Web site when clicked provide related information residing at another Web site.
Location Box, or Address Box: A designated place within a browser for an address [URL]. It is the starting point for accessing a Web site.
Multi-Engine Search: A search that uses a number of search engines in parallel to provide a response to a query.
Operator: A rule or a specific instruction used in composing a query.
Phrase Search: A search that uses a string of adjacent, related words enclosed in quote marks as the query.
Popular Items: A search category created to cover frequently sought subjects and services. Most search tools list Popular Items on their Home Page.
Precision: A standard measure of information retrieval, defined as the number of relevant documents obtained divided by the total number of documents retrieved.
Proximity: How closely words appear together within a document. In this context, adjacency or phrase usually means that words must appear exactly in the order specified with no intervening words.
Query: A search request. A combination of words and symbols that defines the information that the user is seeking. Queries are used to direct search tools to appropriate Web sites to obtain information.
Query By Example: Use of an example document to request more information like it.
Ranking: A means of listing hits in the order of their relevancy. It is usually determined by some selection of the number, location and frequency of the term in the document being searched.
Relevance: The usefulness of a response to a query.
Robot: The software for indexing and updating Web sites. It operates by scanning documents on the Internet via a network of links. A robot is also known as a spider, crawler and indexer.
Search Box: A place within a search engine’s Web site to enter a query. It should not be confused with the browser’s location box or address box, which takes a URL.
Search Engine: A host computer that serves a Web site and provides information from within its own sites and via links with other Web sites. This is accomplished by using the keywords of a query to match index terms in the search engine’s database.
Search Tool: A computer program which conducts a search on the World Wide Web.
Site: The location of a page on the Internet. In WWW, it is called a Web site and identified by its URL.
Spider: To spider is to scan Web sites in order to add new pages and update existing ones. As a noun, a spider is the same as a robot.
Stemming: The use of a stem [i.e. root] of a word to search words that are derived from it. For example, "child" would retrieve information on child, children, childhood, childless and so on.
Term: A single word or combination of words used in a query.
Truncation: See Stemming.
Uniform Resource Locator [URL]: Uniform Resource Locator is the Internet designation of a Web address.
Web Page: The address of a Web site. It can also refer to a page within a Web site. When Web pages are part of the same document, they are also collectively known as a Web site.
Web Site: In search use, it is a specific address or URL on the WWW. In function, it is a computer system that is set up to distribute documents stored in its database. Web sites range in size from as little as one page to a vast number of pages, such as those of a search engine’s database or a full text book.
World Wide Web [WWW] or the Web: The World Wide Web is a collection of graphical documents made available on computers around the world called servers.
The World Wide Web is not the Internet. The Internet is not the World Wide Web. They are distinctly different, and it is important that you understand the difference.
GOOD LUCK with all your searches.