web.pdx.edu/~rueterj/bi445/notes/boolean.htm
NOTES on Searching and Boolean Logic
A. The structure of information
- can you divide the life sciences into subject areas?
- identify key words that are associated with each area
- try to make a map
- meta-information and the importance of "Being Digital"
B. Key words
- value of a key word (ref 1)
- N = number of documents in the total collection
- Tf = term frequency, the number of occurrences of the term in a particular document
- Df = number of documents in which the term occurs
- IDf = inverse document frequency, a measure of how good a particular
term is as a discriminator between documents
- IDf = log(N/Df)
- the importance of a term
- the best indexing terms are those that occur often in a particular
document but are infrequent (i.e. have a high IDf) in the total set
of documents
- Tf x IDf = Tf x log(N/Df)
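A minimal sketch of the weighting above, to make the arithmetic concrete. The three short documents are invented for illustration, the helper names `idf` and `weight` are hypothetical, Tf is assumed to be counted within a single document, and the natural log is an arbitrary choice of base.

```python
import math
from collections import Counter

# Hypothetical three-document collection (not from the notes).
docs = {
    "d1": "diatom silica diatom frustule algae",
    "d2": "algae chlorophyll light algae",
    "d3": "kelp algae holdfast",
}

N = len(docs)  # N: number of documents in the total collection

# Tf: occurrences of each term within each document (assumed per-document).
tf = {name: Counter(text.split()) for name, text in docs.items()}

# Df: number of documents in which each term occurs at least once.
df = Counter(term for counts in tf.values() for term in counts)


def idf(term):
    """IDf = log(N / Df), using the natural log."""
    return math.log(N / df[term])


def weight(term, doc):
    """Tf x IDf for one term in one document."""
    return tf[doc][term] * idf(term)


for term in ("algae", "diatom"):
    print(term, round(weight(term, "d1"), 3))
```

Here "algae" occurs in every document, so its IDf is log(3/3) = 0 and it is useless as a discriminator. "diatom" occurs twice in d1 and nowhere else, so its weight in d1 is 2 x log(3), about 2.2.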
C. Boolean Logic
Used extensively in searching
- AND - true only if both values are true
- OR - true if either one or both of the values are true
- NOT - inverts a value: true becomes false and false becomes true
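In searching, the three operators can be read as set operations on the documents that contain each key word. A minimal sketch under that reading; the index contents, document ids, and function names are all made up for illustration.

```python
# Hypothetical index: key word -> set of ids of documents containing it.
index = {
    "algae": {1, 2, 3, 4},
    "marine": {2, 3, 5},
    "toxin": {3, 4, 6},
}
all_docs = {1, 2, 3, 4, 5, 6}  # every document in the collection


def search_and(a, b):
    """Documents that are in both sets."""
    return a & b


def search_or(a, b):
    """Documents that are in either set, or in both."""
    return a | b


def search_not(a):
    """Documents that are not in the set."""
    return all_docs - a


# algae AND marine
print(search_and(index["algae"], index["marine"]))
# marine OR toxin
print(search_or(index["marine"], index["toxin"]))
# algae AND (NOT toxin)
print(search_and(index["algae"], search_not(index["toxin"])))
```

Note that NOT only makes sense relative to the whole collection, which is why the sketch needs `all_docs`.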
Why are there eight regions?
Write a statement for each of the 8 regions.
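Assuming the eight regions are those of a Venn diagram with three overlapping key-word sets, each region is one combination of "inside" or "outside" for each set, which gives 2 x 2 x 2 = 8. A short sketch that writes one Boolean statement per region; the names A, B, and C are placeholders, not terms from the notes.

```python
from itertools import product

terms = ["A", "B", "C"]  # placeholder key words (assumed three sets)

regions = []
# Each region is one True/False choice per term: inside or outside its set.
for membership in product([True, False], repeat=len(terms)):
    parts = [t if inside else f"NOT {t}" for t, inside in zip(terms, membership)]
    regions.append(" AND ".join(parts))

for statement in regions:
    print(statement)
```

The first statement printed is "A AND B AND C" (the central overlap) and the last is "NOT A AND NOT B AND NOT C" (everything outside all three circles).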
Assignment:
Pick an area of algae that you wish to describe and index. Make a list of key
words. Describe how this area relates to other areas. Pay particular attention
to what terms are unique to your chosen area.
REFERENCES
1. Cheong, F-C. 1996. Internet Agents: Spiders, Wanderers, Brokers, and Bots. New Riders Publishing. p. 92.
2. Boole, G. 1854. An Investigation of the Laws of Thought. Walton and Maberly, London.
3/29/2000 John Rueter