NSI SCRIPT DAY 6

Journal Starters:

Pick an article in the paper (or a magazine) that describes the results of a scientific study. Analyse the description of the study like we did with the examples in class.

"The most important function of education at any level is to develop the personality of the individual and the significance of his life to himself and to others. This is the basic architecture of a life; the rest is ornamentation and decoration of the structure."
-- Grayson Kirk
-------------------------------------------------------------------------------------------------------
Any questions or interesting things to discuss related to the pseudoscience projects you're doing?

STATISTICS: Basic intro so we'll have some common language for dealing with data. We'll need to come back to these ideas in more detail when we have data to work with in projects, but we at least wanted to make sure everyone is familiar with the terms before the Global Warming activity. Also, it's very useful in letting you look critically at reports you might read, so you are not intimidated by the terminology.

Point out that these are just tools to help us see the truth about what works and what doesn't. Use example of Bayer commercial: I'm not convinced by scientific research, by charts and graphs. I'm convinced because it makes my headache go away.

Central Tendency: a way to determine a "typical" data point, or what the best guess of the "real" value is, in the face of random errors in your data

average=arithmetic mean= (sum of values)/(# of values)

You can also have weighted means, like your GPA. Your GPA is calculated by :

[sum over all classes (class grade* # of credit hours)]/total # of credit hours

So, if you took 2 classes last term and made an A in a course that was 1 credit hour and a B in a course that was 3 credit hours, your GPA for the term would be
(4*1+3*3)/(1+3) = 3.25
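
The GPA arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name weighted_mean is ours, and the grade points assume the usual A = 4, B = 3 scale used in the example.

```python
def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weights)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Grade points (A = 4, B = 3) and credit hours from the example above.
grade_points = [4, 3]
credit_hours = [1, 3]

term_gpa = weighted_mean(grade_points, credit_hours)
print(term_gpa)  # 3.25
```

Note that the 3-credit B pulls the result much closer to 3 than to 4, which is the whole point of weighting.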

When we talk about a "mean", it's usually an arithmetic mean, but there are cases when you'll want to use a geometric mean instead. Geometric means are useful when you're trying to calculate an average rate of change.

geometric mean = Nth root(product of N values)
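
To see why the geometric mean suits rates of change, here is a small sketch with made-up numbers (they are not from any real data set): something doubles one year and grows eightfold the next. The helper name geometric_mean is our own.

```python
import math


def geometric_mean(values):
    """Return the Nth root of the product of N positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, all positive")
    return math.prod(values) ** (1 / len(values))


# Hypothetical yearly growth factors: x2 in year one, x8 in year two.
growth_factors = [2.0, 8.0]

average_factor = geometric_mean(growth_factors)
arithmetic_average = sum(growth_factors) / len(growth_factors)

# Growing by the geometric mean each year reproduces the true total growth
# (2 * 8 = 16); growing by the arithmetic mean each year overshoots it.
print(average_factor, average_factor ** 2)
print(arithmetic_average, arithmetic_average ** 2)
```

The geometric mean comes out as 4 per year, and 4 * 4 gives the actual overall growth of 16, while the arithmetic mean of 5 would imply 25.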


Means are pretty useful, but they can be heavily affected by outliers and aren't always indicative of the quantity of interest. Imagine a community of 100 people where 1 person makes 50 million dollars a year while everyone else makes $20,000.

The average income would be
[(50,000,000)+ 99*20,000]/100=$519,800

If you read that the average income was this high, you might conclude that you couldn't afford to live in this community, even if you were making $100,000, 5 times what the great majority of the residents make. A better measure to look at would be the MEDIAN income.

The median value is the middle value, if you ordered your data from the lowest to the highest (or vice-versa). So, if you had 5 data points, you'd take the third (highest or lowest) value. If you have an even number of data points, you average the two values closest to the middle.
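
A quick way to see the difference between the mean and the median is to build the income list for the community described above and let Python's standard statistics module do the work. This is just a sketch; the variable names are ours.

```python
import statistics

# One resident earning $50,000,000 and 99 residents earning $20,000 each.
incomes = [50_000_000] + [20_000] * 99

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print("mean:  ", mean_income)
print("median:", median_income)
```

With 100 data points the median is the average of the 50th and 51st values in sorted order, both of which are $20,000, so the single outlier has no effect on it.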

Variability is a measure of how far from the mean the data points tend to lie. It's important because it gives you an idea of how representative the mean actually is, and because sometimes it's the spread of the points that's actually important. A good example is the weather report. Usually you're given the high and low temperatures for the day rather than the average temperature. An average temperature of 70 degrees could mean a pleasant fall day or it could mean a day where it got as cold as -30 degrees and as hot as 110 degrees! (other examples: quality control, financial risk, etc.)

Standard deviation is a common measure of variability or spread in data. It's particularly useful when you are trying to make comparisons between methods, to see whether they are really different (e.g. to see if a medicine reduces incidence of a disease).
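
The notes don't give a formula, so as a reminder: the (population) standard deviation is the square root of the average squared distance of the data points from their mean. The sketch below uses two sets of invented temperature readings that share the same mean of 70 degrees but have very different spreads. The readings are illustrative only.

```python
import statistics

# Hypothetical temperature readings (degrees); both sets average 70.
mild_day = [68, 69, 70, 71, 72]
wild_day = [30, 50, 70, 90, 110]

mild_mean = statistics.mean(mild_day)
wild_mean = statistics.mean(wild_day)

# pstdev treats the list as the whole population (divides by N, not N - 1).
mild_spread = statistics.pstdev(mild_day)
wild_spread = statistics.pstdev(wild_day)

print("means:   ", mild_mean, wild_mean)
print("std devs:", round(mild_spread, 2), round(wild_spread, 2))
```

The means are identical, so only the standard deviation tells you that the second day swung far more widely than the first.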

Comparing Numbers:

Whenever you hear numbers quoted, it's a good idea to compare them to something else to get a feel for their significance. Risk analysis is one example of an area where people sometimes forget to do this.

There's often a lot of irrationality associated with risk, because we are naturally afraid of things, and we wish we could reduce risks to zero. So it's easy to think we can just put everything in a category of "good" or "bad" and pass regulations to eliminate or restrict the bad, and then we will be safe. But of course we cannot do this - every activity entails some risk.

A good illustration of this was made in a science project of a middle school student a few years ago. The student was interested in a substance called "dihydrogen monoxide." He went to a shopping mall with a petition to ban this substance, pointing out the dangers associated with it:

The Invisible Killer

Dihydrogen monoxide is colorless, odorless, tasteless, and kills uncounted
thousands of people every year. Most of these deaths are caused by
accidental inhalation of DHMO, but the dangers of dihydrogen monoxide do not
end there. Prolonged exposure to its solid form causes severe tissue damage.
Symptoms of DHMO ingestion can include excessive sweating and urination, and
possibly a bloated feeling, nausea, vomiting and body electrolyte imbalance.
For those who have become dependent, DHMO withdrawal means certain death.

Dihydrogen monoxide:

* is also known as hydroxyl acid, and is the major component of acid rain.
* contributes to the "greenhouse effect."
* may cause severe burns.
* contributes to the erosion of our natural landscape.
* accelerates corrosion and rusting of many metals.
* may cause electrical failures and decreased effectiveness of automobile
brakes.
* has been found in excised tumors of terminal cancer patients.

Contamination Is Reaching Epidemic Proportions!

Quantities of dihydrogen monoxide have been found in almost every stream,
lake, and reservoir in America today. But the pollution is global, and the
contaminant has even been found in Antarctic ice. DHMO has caused millions
of dollars of property damage in the midwest, and recently California.

Despite the danger, dihydrogen monoxide is often used:

* as an industrial solvent and coolant.
* in nuclear power plants.
* in the production of styrofoam.
* as a fire retardant.
* in many forms of cruel animal research.
* in the distribution of pesticides. Even after washing, produce remains
contaminated by this chemical.
* as an additive in certain "junk-foods" and other food products.

Companies dump waste DHMO into rivers and the oceans, and nothing can be done
to stop them because this practice is still legal. The impact on wildlife is
extreme, and we cannot afford to ignore it any longer!

All of these claims are true. But does anybody know what the common name for dihydrogen monoxide is? (A molecule with 2 hydrogen atoms and 1 oxygen atom.) = water!!

So clearly, we cannot just ban everything that has potentially harmful effects - most things are double-edged swords, both beneficial and harmful depending on amounts and contexts.

What we really want to do is to compare risks, and focus on reducing the major ones to a level that will blend in with the background. Whenever you're thinking about some risk, and the cost associated with reducing it, what you care about is not the ABSOLUTE risk, but the risk in comparison to other things.

(put up slide showing risk of different things)

As an example, maybe you hear that natural cosmic rays contribute as much to your annual radiation exposure as do typical medical and dental X-rays. That sounds scary, so you decide this risk should be prevented. Well, we could require all buildings to be coated in a thick layer of lead, to shield out most of the radiation. This would reduce our exposure, but there are trade-offs. The industrial work to make this happen would result in construction deaths, would produce more pollution, etc. We may very well end up causing more deaths in the process of reducing this risk than if we had just accepted it. The point is that you need to compare, quantitatively, before it makes sense to decide on a course of action. Noting that something has some bad effects is not enough. (Nothing is more terrible than activity without insight)

Another example: Imagine that you are trying to decide whether to rent a house which is right next to a nuclear power plant. You're told that the average risk of dying from cancer caused by radiation from the plant is 1 part per million. Is this something to worry about? To answer that question, you might want to look at the average risk of other activities that you're involved in. According to a table in Physics: Concepts and Connections by Art Hobson, the following activities have the SAME average risk of death as living by the power plant for 5 years:

Ionizing Radiation:
one chest X ray
traveling cross-country once by jet
living 1 week in a building
living 5 weeks outdoors

Internal Consumption:
smoking 1.4 cigarettes
drinking 0.5 liters of wine

Travel:
3 miles by motorcycle
30 miles by car
800 miles by train
1000 miles by commercial airplane

Other:
spending 1 hour in a coal mine (black lung disease)
spending 3 hours in a coal mine (accident)
living 2 days in NY or Boston (air pollution)

Often it's useful to look at a number as a percentage of something else. Just to make sure that everyone is starting off on the same track, let's review what percentages are:

If you say 30% of the American people believe something, that means that, on average, 30 out of 100 people believe something.

If someone tells you that 1 out of every 4 Oregonians uses a certain kind of toothpaste, you can convert this into a percentage by expressing the fraction 1/4 as a fraction out of 100:

1/4 = 25/100 --> 25 percent

Percentages can be very handy, especially if you don't know the typical numbers for the quantity that you're concerned about. If you're shopping for food and a product says "only X grams" of sodium, fat, or whatever, you should compare that to the % of the RDA. A number that sounds pretty low may turn out to be a large percentage of what's recommended.

Percentages can be misleading, though. For example, imagine that you read that moving to a particular state will increase your chance of being struck by lightning by 200% (to 3 times what it was). This is a huge percentage increase, but you probably won't worry too much about it, because the magnitude of the chance that you will be struck by lightning is so small that increasing it by a factor of 3 isn't a big deal. On the other hand, if the report said that your risk of having a heart attack would go up by a factor of 3, you'd probably pay attention, because you know that that's a leading cause of death for men and women. (What you're really interested in is the percentage increase in your overall risk of death.)
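
The lightning versus heart attack comparison can be made concrete with a short sketch. The baseline risks below are placeholders invented for illustration, NOT real statistics; the point is only that the same tripling produces very different absolute changes.

```python
def absolute_increase(baseline_risk, factor):
    """Extra risk added when a baseline risk is multiplied by `factor`."""
    return baseline_risk * factor - baseline_risk


# Placeholder baseline risks (assumed for illustration, not real data).
lightning_baseline = 1 / 1_000_000   # "1 in a million"
heart_attack_baseline = 1 / 100      # "1 in a hundred"

lightning_extra = absolute_increase(lightning_baseline, 3)
heart_attack_extra = absolute_increase(heart_attack_baseline, 3)

print("extra lightning risk:   ", lightning_extra)
print("extra heart attack risk:", heart_attack_extra)
print("ratio:", heart_attack_extra / lightning_extra)
```

Both risks tripled, but with these assumed baselines the added heart attack risk is thousands of times larger than the added lightning risk.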

The following is taken from Stephen Carey's "A Beginner's Guide to Scientific Method":

Correlations:
1. correlation is not causation:

income, age, reading glasses--income does not cause you to require reading glasses
mere correlations-- stock price and the number of people bowling

2. correlation can be positive or negative

above case is positive: as the distance increases, the time to PSU increases
might see a negative correlation between the distance to PSU and the cost of housing (availability of parking!)

Draw on the board a graph of "time to PSU" vs. "distance to PSU" (positive correlation) and a graph of "housing costs" vs. "distance to PSU" (negative correlation)

3. correlation is rarely perfect

notice outliers in above graph--a student with a good score on the SAT doesn't necessarily have a high GPA

Note that on our graph of time to PSU vs. distance to PSU, we have points off the best fit line. Some people close by may take the bus or walk while someone living far away may have only interstate driving.

there are cases of perfect correlation, however --number of rings on a tree and its age

4. level of effect of a causal factor usually limited--the fact that smoking is a cause of lung cancer does not mean that everyone who smokes will get lung cancer

5. most effects don't have a single causal factor: matching
In a causal experiment, have a control group and an experimental group. (The suspected causal agent is introduced to the experimental group.) Want all other contributing factors to be equally represented in both groups. Sometimes this requires disqualifying some subjects from the experiment (or requiring the factor) to make sure that you have similar subjects in each group. Sometimes disqualify any subject associated with another causal factor. Have to watch out for self-selecting samples: subjects influence the composition of the experimental and control groups (i.e. subjects determine which group they're in). Ex: Professor wants to see the effect of attendance on how much people learn, so he teaches 2 sections of the same class, requiring attendance in one but not the other. If the students find out about the difference between the 2 sections, the less serious students will sign up for or switch to the section that doesn't have required attendance.

6. experimental conditions can make a causal factor appear more significant than it is (experimenter expectations, experimental subject expectations). A sugar pill (with no medication) is an example of a "placebo," a device used to keep the subject from knowing whether he's in the experimental group or the control group
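
The positive, negative, and perfect correlations described in the list above can be put on a numerical scale. One common choice (not named in these notes, so treat it as our addition) is the Pearson correlation coefficient, which runs from -1 (perfect negative) through 0 (none) to +1 (perfect positive). The sketch below uses invented data for the "distance to PSU" examples. None of these numbers are real measurements.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical data for five students (made up for illustration).
distance_miles = [1, 2, 5, 10, 20]
commute_minutes = [10, 12, 20, 30, 55]      # rises with distance
monthly_rent = [900, 850, 700, 600, 450]    # falls with distance

r_time = pearson_r(distance_miles, commute_minutes)
r_rent = pearson_r(distance_miles, monthly_rent)

print("distance vs. time:", round(r_time, 3))
print("distance vs. rent:", round(r_rent, 3))
```

A value near +1 or -1 says the points lie close to a straight line; it still says nothing about whether one quantity causes the other.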

Types of causal experiment:
1. randomized
subjects selected and randomly divided into 2 groups prior to administering the suspected causal agent: experimental and control groups
provide good causal evidence, but tend to be expensive (need large number of subjects) and time-consuming, particularly if something like a longevity study
ethical considerations--don't want to intentionally subject people, in particular, to suspected causal agents that would cause something bad

2. prospective
experimental group already has the suspected causal factor (they're already smokers, in the lung cancer example) --wait to see level of difference in the effect between this group and the control group
problem: other causal factors may come into play (smokers may also tend not to eat well, exercise, etc.): the only criterion for selection was the suspected causal factor (not random like in 1.) Can do matching to account for this. If matching is good, these experiments can provide pretty good evidence for causal link.
less expensive, time-consuming, easier to use a large number of subjects (don't need to manipulate the experimental subjects as much...take "wait and see" approach)

3. retrospective
experimental and control groups taken from a population in which the effect is already present--look for causal effect
only weak evidence for causal link because difficult to control for other potential causal factors (may not have any information on other factors)
don't provide an estimate for the level of difference in the effect being studied...the design of the experiment ensures that 100% of the experimental group but none of the control group will show the effect
best used to identify potential causal links
quick, inexpensive (often only analysis of data already available)
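
For the randomized design (type 1 above), the key step is that chance, not the subjects or the experimenter, decides who lands in which group. A minimal sketch of that step follows; the subject labels, the group size, and the function name randomly_assign are all hypothetical.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)   # a seed makes the split reproducible
    shuffled = list(subjects)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Twenty hypothetical subjects.
subjects = [f"subject_{i}" for i in range(1, 21)]

experimental, control = randomly_assign(subjects, seed=6)

print("experimental:", experimental)
print("control:     ", control)
```

Because the subjects have no say in the split, this avoids the self-selection problem described in the attendance example.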

"Reading Between the Lines" (questions to ask of stories)
1. What is the causal hypothesis at issue?
2. What kind of causal experiment is undertaken?
3. What crucial facts and figures are missing from the report? (think about the number of subjects, composition of the control and experimental groups, and other things we just talked about)
4. Given the information at your disposal, can you think of any major flaws in the experiment? (suggestions for correcting)
5. Given the info available, what conclusion can be drawn about the causal hypothesis? How certain are you about those conclusions?

Case 1: #23, pg 47 (coffee drinking)
1. Does coffee drinking raise the risk of developing heart disease?
2. prospective experiment
3-4. large number of subjects (good), doesn't say anything about matching, don't know how long they've been drinking coffee, different kinds of coffee have different amounts of caffeine (and other ingredients), did they look at different age groups or were they all lumped together? Doesn't say anything about how they handled subjects that already had heart disease. Doesn't say how many were in experimental and control groups. Only used men who were health professionals.
5. indicative, not certain

Case 2: #11, pg 11 (effect of exercise on sleeping)
randomized experiment,
Did they self-select what group they went into? Certainly they knew which group they were in.
mix of men/women?
(Does taking longer than 25 minutes to fall asleep really indicate a sleeping problem?)
very few people in experiment, wide age range
How could they tell when they fell asleep?
Change in exercise may have changed other things (eating patterns, social behavior, etc.)
How do they define "healthy?"
Gave average change in sleeping patterns, but no idea of variability.

Case 3: #15, pg 42
randomized experiment
large number of participants, but small number of events, so statistics not good
doesn't give ages of men
don't know long term effect of taking aspirin
(students wanted to know who sponsored the study)

point out that percentages use one number as a reference: For example, say people who didn't take aspirin were twice as likely to have a heart attack. You could say people who took aspirin were 50% less likely to have a heart attack or that people who didn't take aspirin were 100% more likely to have a heart attack.
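
The "50% less" versus "100% more" point can be checked directly. The counts below are invented to match the "twice as likely" example (they are not the study's actual figures), and percent_change is our own helper name.

```python
def percent_change(value, reference):
    """Percentage by which `value` differs from `reference`."""
    return (value - reference) / reference * 100


# Hypothetical heart attacks per 1,000 people (made up for illustration).
with_aspirin = 10
without_aspirin = 20

# Same two numbers, two different reference points.
aspirin_vs_none = percent_change(with_aspirin, without_aspirin)
none_vs_aspirin = percent_change(without_aspirin, with_aspirin)

print("aspirin takers vs. non-takers:", aspirin_vs_none, "%")
print("non-takers vs. aspirin takers:", none_vs_aspirin, "%")
```

Both statements describe the same data; which one a report chooses depends on which group it treats as the reference.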

