Please note that the midterm exam this term will cover Lessons 1 to 5 in my Basic Stats 2 book.
I am so happy to see that they have returned to the traditional approach to solving two-sample problems, as discussed in Lesson 4 of my book. If you look at the formula sheet I give you on page 1 of my book, the Standard Error formula in the first line comes with an insanely complicated formula to compute the degrees of freedom. I also show this same formula at the bottom of page 191 in Lesson 4.
The formula I show is for what I refer to as the generalized method. When you have a two-sample problem (two sample sizes n1 and n2, two sample means x-bar1 and x-bar2, and two sample standard deviations s1 and s2), you first have to decide whether to assume the population variances are equal. You use the Rule of Thumb to decide which assumption to make. If the ratio of the two sample standard deviations is less than 2 (always dividing the bigger by the smaller), you should assume the two population variances are equal and use the pooled two-sample method.
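The Rule of Thumb check can be sketched in a few lines of Python. The function name `assume_equal_variances` and the sample values are made up for illustration. Treating a ratio of exactly 2 as "not equal" is an assumption here, since the rule as stated only covers ratios below 2 and above 2.

```python
def assume_equal_variances(s1, s2):
    """Rule of Thumb: divide the bigger sample SD by the smaller one."""
    ratio = max(s1, s2) / min(s1, s2)
    # Assumption: a ratio of exactly 2 is treated as "variances not equal".
    return ratio < 2


# Hypothetical sample standard deviations
print(assume_equal_variances(4.0, 6.0))  # ratio is 1.5
print(assume_equal_variances(3.0, 7.5))  # ratio is 2.5
```

Because the function always divides the bigger value by the smaller, the order in which you pass s1 and s2 does not matter.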
Everything is the same in your course as in my book for this method. These are the formulas given in Line 2 of my formula sheet on page 1 of my book, also shown as #2 on your formula sheet this term. When the Rule of Thumb ratio is less than 2, we assume the variances are equal and use the pooled method. We compute the pooled sample variance and use that to compute the Standard Error of the difference between the two sample means. The pooled method has df = n1 + n2 - 2, as noted on your formula sheet in Line 2.
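As a rough illustration of the pooled calculation, here is a minimal Python sketch. It uses the standard textbook pooled-variance formula, on the assumption that this is what Line 2 of the formula sheet shows. The function name and the data are hypothetical.

```python
import math


def pooled_standard_error(n1, s1, n2, s2):
    """Return (standard error, df) for the pooled two-sample method."""
    df = n1 + n2 - 2
    # Pooled variance: a weighted average of the two sample variances,
    # each weighted by its own degrees of freedom (n - 1).
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return se, df


# Hypothetical data: n1 = 10, s1 = 4 and n2 = 15, s2 = 6 (ratio 1.5, so pooling is fine)
se, df = pooled_standard_error(10, 4.0, 15, 6.0)
print(round(se, 3), df)
```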
The change to the course this year (which is a return to the traditional methods) is that they are no longer interested in the generalized method (Line 1 on my formula sheet). That method has too complicated a formula for degrees of freedom. It is the true method when the population variances are not equal, and it is the method that computer software would use, but it is too much work to do by hand. Even in the past few years when they did teach this approach, they knew it was too much work to do by hand, and students were never put in a position to use that df formula on an exam.
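For reference, the degrees-of-freedom formula that software typically uses when the variances are not assumed equal is the Welch-Satterthwaite approximation. The formula sheet itself is not reproduced here, so take this as the standard textbook form, on the assumption that it matches Line 1:

```latex
\mathrm{df} =
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
     {\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2
      + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}
```

The result is usually not a whole number, which is one more reason it is awkward to use with a t table by hand.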
Instead, they are now using the conservative method. When the Rule of Thumb suggests it is inappropriate to assume the variances are equal (the ratio of the sample standard deviations is bigger than 2), we use the conservative approach. We compute the Standard Error for the difference between the two means using the formula outlined on Line 1 of your formula sheet this term. This is exactly the same as the standard error formula I give you on Line 1 of my formula sheet. The change is that, rather than use the horrible formula in my version of Line 1 to compute the best estimate of the degrees of freedom, we merely use df = min{n1 - 1, n2 - 1}. Which is to say, df = the smaller of n1 - 1 and n2 - 1. You simply subtract 1 from each of the sample sizes and use the smaller answer as your degrees of freedom.
For example, if you have decided to use the conservative method and n1 = 10 and n2 = 15, then df = 9, since that is the smaller of 9 and 14 (n1 - 1 and n2 - 1). If n1 = 20 and n2 = 13, then df = 12, since that is the smaller of 19 and 12. If n1 = 19 and n2 = 19, then df = 18, since n1 - 1 and n2 - 1 are both 18.
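The conservative calculation is simple enough to sketch in Python. The function names are made up for illustration. The standard error uses the usual unpooled formula, sqrt(s1^2/n1 + s2^2/n2), on the assumption that this is what Line 1 of the formula sheet shows. The standard deviations at the end are hypothetical.

```python
import math


def conservative_df(n1, n2):
    """Conservative degrees of freedom: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


def unpooled_standard_error(n1, s1, n2, s2):
    """Standard error of the difference in means without pooling the variances."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


# The three sample-size examples from the text
print(conservative_df(10, 15))  # 9
print(conservative_df(20, 13))  # 12
print(conservative_df(19, 19))  # 18

# Hypothetical standard deviations (ratio 2.5, so we would not pool)
print(unpooled_standard_error(10, 3.0, 15, 7.5))
```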
This usually gives us degrees of freedom a little smaller than the truth, but the payoff is that it takes mere seconds to establish the degrees of freedom. This is called the conservative method because we are playing it safe. We don't know the true degrees of freedom because we have avoided using the insane df formula and have used a slightly lower number instead. Using lower degrees of freedom favours the null hypothesis: the critical value of t gets larger (and the P-value gets larger), making it even more difficult to reject H0. And that is always our philosophy in statistics: assume the null hypothesis is true, and only reject it if there is significant evidence that the alternative is true.
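To put numbers on "smaller than the truth", the sketch below compares the conservative df with the Welch-Satterthwaite df that software would typically report. The function name and the data are hypothetical, and this assumes the generalized method is the Welch-Satterthwaite approximation. The software-style df always lands somewhere between the conservative df and the pooled df (n1 + n2 - 2), so how much smaller the conservative value is depends on the data.

```python
def welch_df(n1, s1, n2, s2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Hypothetical data: n1 = 10, s1 = 7.5 and n2 = 15, s2 = 3 (ratio 2.5)
n1, s1, n2, s2 = 10, 7.5, 15, 3.0
print(welch_df(n1, s1, n2, s2))  # software-style df, not a whole number
print(min(n1 - 1, n2 - 1))       # conservative df
```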