Stat 2000: ICYMI Tips for Assignment 4

Published: Thu, 11/12/15

Don't have my book or audio lectures? You can download a sample containing some of these lessons here:
Did you read my tips on how to study and learn Stat 2000?  If not, here is a link to those important suggestions:
Did you read my Calculator Tips?  If not, here is a link to those important suggestions:
Did you see my tips for Assignment 1? Click here.
Did you see my tips for Assignment 2? Click here.
Did you see my tips for Assignment 3? Click here.
Tips for Assignment 4
Please note that I made major changes to my book in September 2014.  If you are using a book older than September 2014, you are missing about 100 pages of new material and an entirely new lesson on Probability.

Study Lesson 8: Inferences about Proportions (if you are using an older edition of my book, this may be Lesson 7).  You will also need to study the first half of Lesson 9: Chi-Square Tests (up to the end of question 4; you do not need to study the Goodness-of-Fit Test at this time).

To type in the formulas you are using and to show your numbers subbed into the formulas, click the button in the toolbar that looks like the Sigma Summation symbol (you have to click the "..." other options button to see the sigma formula input button).  Then click the various buttons to make your fractions and enter the symbols.

Exception: Always do any JMP stuff open-book.  Have my tips in front of you, and let me guide you step-by-step through any JMP stuff.  JMP is just "busy" work.  The sooner you get it done and can move on to productive things like understanding the concepts and interpreting the JMP outputs, the better off you will be.
Question 1
This is very similar to my question 1(c) and (d) in Lesson 8. Be careful to note which is the true proportion, p, and which is the sample proportion p^.

Be careful that you don't lose accuracy by rounding off too much.  I suggest you round off to no less than 5 or 6 decimal places while computing things like the standard deviation of p^ to ensure that you get accurate z-scores.  Better yet, store exact answers in memory in your calculator.
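
To see how much early rounding can move a z-score, here is a minimal Python sketch.  The values p = 0.37, n = 250 and p^ = 0.40 are made up for illustration and are not taken from the assignment; the function name is also just for this example.

```python
from math import sqrt

def z_score(p_hat, p, n, places=None):
    """z-score for a sample proportion; optionally round the SD of p-hat first."""
    sd = sqrt(p * (1 - p) / n)      # standard deviation of p-hat
    if places is not None:
        sd = round(sd, places)      # simulate rounding off too early
    return (p_hat - p) / sd

# Hypothetical numbers, not the assignment's
p, n, p_hat = 0.37, 250, 0.40
z_exact = z_score(p_hat, p, n)             # SD kept at full precision
z_rough = z_score(p_hat, p, n, places=2)   # SD rounded to 2 decimal places
print(round(z_exact, 4), round(z_rough, 4))
```

The two z-scores differ in the second decimal place, which is enough to change the probability you read off the table.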
Question 2
This is standard sample size stuff, like my questions 6 to 8 in Lesson 1.  I think they screwed up here, because you would have expected this to be the proportion sample size stuff from Lesson 8, but instead it is just a rehash of the mean sample size stuff they already did in Assignment 1.

Note that part (c) is talking about the Inverse-Square Relationship for sample size which I introduced in Lesson 1, question 8.

Here is another way to think about the Inverse-Square Relationship.  Essentially, if you want your margin of error to get smaller, then you want your sample size to get larger by the square of the factor.  If you want your margin of error to get larger, then you want your sample size to get smaller by the square of the factor. 
  • This means, if you want to multiply the margin of error, you divide the sample size.
  • If you want to divide the margin of error, you multiply the sample size.
For example, if I want to divide my margin of error by a factor of 7, then I multiply my sample size by a factor of 49 (7-squared).  If I want to multiply my margin of error by a factor of 5, then I divide my sample size by a factor of 25 (5-squared).
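
One way to see where the Inverse-Square Relationship comes from, assuming the usual sample size formula for a mean with critical value z*, standard deviation sigma, and margin of error m (this notation is an assumption and may differ from the symbols your course uses):

```latex
n = \left(\frac{z^{*}\sigma}{m}\right)^{2}
\quad\Longrightarrow\quad
\frac{n_{\text{new}}}{n_{\text{old}}}
  = \left(\frac{m_{\text{old}}}{m_{\text{new}}}\right)^{2}
```

So, as long as z* and sigma stay the same, dividing m by 7 multiplies n by 49, and multiplying m by 5 divides n by 25.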

Don't use the sample size formula in part (d)!  It may give an answer that is more exact than the one they are looking for.  They want you to use the inverse-square relationship.

If you are having trouble identifying the multiplier they are using in part (d), here is a trick you can use.  Divide the larger margin of error by the smaller margin of error as given in parts (a) and (d).  That is the factor you want.  Then, as always, square it to establish the factor you need to determine your new sample size.  Do you want to multiply by that value, or divide?  Do you want the sample size to get larger or smaller?
Make sure you still follow the Paint-Can Principle for your final answer!
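
Here is a rough Python sketch of that trick.  The sample size and margins of error are hypothetical, the function name is invented for this example, and the final step rounds up to a whole number, which is how the Paint-Can Principle is read here.

```python
from math import ceil

def rescale_sample_size(n_old, me_old, me_new):
    """New sample size via the inverse-square relationship (same confidence level)."""
    # Squaring each margin separately keeps the arithmetic as exact as possible.
    return n_old * me_old ** 2 / me_new ** 2

# Hypothetical values: n = 100 gave a margin of error of 6; we now want 2.
factor = 6 / 2                             # larger margin divided by smaller margin
n_exact = rescale_sample_size(100, 6, 2)   # margin shrinks, so n grows by factor squared
n_final = ceil(n_exact)                    # round up to a whole number
print(factor, n_exact, n_final)
```

Swapping the two margins (going from 2 up to 6) divides the sample size by the same squared factor instead.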

Parts (e) and (f) return to just using your sample size formula and making observations about the results.
Question 3
This is a good run through of confidence intervals and hypothesis testing as I teach in Lesson 8 (see my questions 2 and 3). 

Be careful that you don't lose accuracy by rounding off too much.  I suggest you round off to no less than 5 or 6 decimal places while computing things like the standard deviation of p^ to ensure that you get accurate margins of error, and accurate test statistics.  Better yet, store exact answers in memory in your calculator.

Note that you will need to use the z* critical value you found in part (f) to compute p^*, the critical value for p^ where you will reject Ho (the p^ decision rule) needed to answer part (g).  We derive p^* from the standardizing formula for p^ bell curves.
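
Written out, assuming p_0 stands for the value of p stated in Ho and n is the sample size (the subscript notation is an assumption for this sketch), the standardizing formula and its rearrangement look like this:

```latex
z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}
\quad\Longrightarrow\quad
\hat{p}^{*} = p_0 + z^{*}\sqrt{\frac{p_0(1-p_0)}{n}}
```

For a lower-tailed test z* is negative, so p^* falls below p_0.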


Part (h) requires an alpha/beta table. You should have the p^ decision rule found in (g) in the alpha column, and the reverse of that rule (when you do not reject Ho) in the beta column.  You should have what Ho told you is p in the alpha column, and the alternative value for p given in (h) in the beta column.  On the beta side, draw a p^ bell curve centred at the alternative value of p, and shade the values of p^ where you will not reject Ho.  That is beta, the probability of type II error.  But you want the power, so that is 1 - beta.  Use your p^ bell curve formula (as you did in question 1 above), using the p^ noted in the decision rule in (g) and the alternative p given in (h) as your p (that is the centre of your curve).
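
The beta and power steps can be sketched in Python like this.  Everything numeric here is hypothetical (an upper-tailed test with Ho: p = 0.50, an alternative p = 0.60, n = 100, alpha = 0.05); your problem will have its own values and possibly a different tail.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# Hypothetical upper-tailed test: Ho says p = 0.50, alternative p = 0.60
p0, p_alt, n, alpha = 0.50, 0.60, 100, 0.05

z_star = std_normal.inv_cdf(1 - alpha)                # critical value z*
p_hat_star = p0 + z_star * sqrt(p0 * (1 - p0) / n)    # reject Ho if p-hat > this

# Beta side: p-hat bell curve centred at the alternative p
sd_alt = sqrt(p_alt * (1 - p_alt) / n)
beta = std_normal.cdf((p_hat_star - p_alt) / sd_alt)  # P(do not reject Ho)
power = 1 - beta
print(round(p_hat_star, 5), round(beta, 4), round(power, 4))
```

Note that the decision rule uses the standard deviation built from the Ho value of p, while the beta curve uses the standard deviation built from the alternative p.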
Question 4
Part (a): Use the same standard two sentences as always to interpret your confidence interval, as I show you way back in Lesson 1, question 1(b).  But keep in mind that you are interpreting a confidence interval for a proportion, p, not the true mean, mu, as I am doing in my example.

Part (b): Follow my examples back in Lesson 2, question 6 to see how to interpret a P-value.  As always, first stress that you are assuming Ho is correct (what is Ho in your problem?).  Keep in mind that you are testing a hypothesis for two proportions, p1=p2 in your example, not a mean.
Question 5
This is very similar to my questions about confidence intervals and hypothesis tests for the difference between two proportions taught in the latter half of Lesson 8.

Be careful that you don't lose accuracy by rounding off too much.  I suggest you round off to no less than 5 or 6 decimal places while computing things like the standard error of p1^-p2^ to ensure that you get accurate z-scores.  Better yet, store exact answers in memory in your calculator.

Part (g) introduces the concept I teach in my question 4 in Lesson 9 (Chi-Square Tests).  Note that you don't really have to do any work for part (g) if you apply the concept that relates two-proportion z tests to 2 by 2 Two-Way Chi-Square analysis.  In other words, you already know the test statistic and P-value.
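
If you want to convince yourself of that relationship, here is a small Python check using made-up counts (45 out of 100 versus 30 out of 100, not the assignment data).  It assumes a two-sided alternative and the pooled two-proportion z test; the chi-square P-value line is a shortcut that only works for 1 degree of freedom.

```python
from math import sqrt, erfc
from statistics import NormalDist

# Hypothetical data: 45 successes out of 100 in sample 1, 30 out of 100 in sample 2
x1, n1, x2, n2 = 45, 100, 30, 100

# Two-proportion z test with the pooled proportion
p_pool = (x1 + x2) / (n1 + n2)
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (x1 / n1 - x2 / n2) / se
p_value_z = 2 * (1 - NormalDist().cdf(abs(z)))        # two-sided P-value

# The same data as a 2 by 2 table: rows are samples, columns are success/failure
observed = [[x1, n1 - x1], [x2, n2 - x2]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)
chi_sq = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / total) ** 2
    / (row_totals[i] * col_totals[j] / total)
    for i in range(2) for j in range(2)
)
p_value_chi = erfc(sqrt(chi_sq / 2))                  # chi-square P-value, 1 df only
print(round(z ** 2, 6), round(chi_sq, 6))
```

The squared z statistic matches the chi-square statistic, and the two P-values agree.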
Question 6
You will be using Table F for the first time in these last two questions.  Here is a link where you can download the table if you have not already done so:

This is standard Two-Way Table Chi-Square analysis as taught in questions 1 through 4 in Lesson 9 of my book.

When they ask in part (c) which four cells contribute most to the test statistic, they are asking which four cells have the largest chi-square values.
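
As a rough illustration of what "contributes most" means, this Python sketch computes the expected count and chi-square value for every cell of a small hypothetical table and then ranks the cells.  The counts and the function name are invented for the example; the real table is much larger.

```python
def chi_square_cells(observed):
    """Return (expected counts, cell chi-square values) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    # Cell chi-square = (observed - expected)^2 / expected
    cells = [[(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
             for o_row, e_row in zip(observed, expected)]
    return expected, cells

# Hypothetical 2 by 2 table of observed counts (not the assignment data)
observed = [[10, 20], [30, 40]]
expected, cells = chi_square_cells(observed)
test_statistic = sum(sum(row) for row in cells)

# Rank cells from largest to smallest contribution, labelled as (row, column)
ranked = sorted(
    ((value, (i + 1, j + 1)) for i, row in enumerate(cells) for j, value in enumerate(row)),
    reverse=True,
)
print(expected, round(test_statistic, 4), ranked[0][1])
```

The cells at the front of the ranked list are the ones contributing most to the test statistic.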

Make sure you have properly phrased your hypotheses in order to describe the Type I and Type II errors in this problem.  They told you this is a test for homogeneity.  If these distributions are homogeneous, that would mean that the three courses have the same grade distributions.

Here is how to do Contingency Tables (2-Way Tables) in JMP:

Click New Data Table. You will need a total of three columns. Double-click Column 1 and name it "Course" and change the Data Type to "Character" and the Modeling Type to "Nominal".

Double-click the space to the right of the Course column to create a new column. Name that column "Grade" and change the Data Type to "Character" and the Modeling Type to "Nominal".

Double-click the space to the right of the Grade column to create a new column. Name that column "Count" and keep the Data Type as "Numeric" but change the Modeling Type to "Nominal".

Make sure that you have the correct Data Type and Modeling Type for each of these three columns as I outline above!

Each row in the JMP data table is used to enter the information for a particular cell of the two-way table. The first row will represent the 1,1 cell; the second row will represent the 2,1 cell; etc.
  • For example, your 1,1 cell gives you the observed count for the people who took Biology and got an A+.  In the JMP data table, in row 1 type "Biology" in the Course column, "A+" in the Grade column, and type the given observed count in the "Count" column.
  • Type the info for the 2,1 cell into the second row of your JMP table. That is the observed count for the people who took Biology and got an A, so you will type "Biology" in the Course column, "A" in the Grade column and the observed count in the Count column.
  • In the third row you will type "Biology" in the Course column, "B+" in the Grade column, and the observed count for the 3,1 cell in the Count column.
  • Once you have entered all the info in the cells in the Biology column of your given 2-way table, you then proceed to enter all the info from the Chemistry column of the given 2-way table.  Then you enter all the info from the Physics column.
  • Continue in this fashion all the way to the 24th row where you will type "Physics" in the Course column, "F" in the Grade column, and the observed count for the 8,3 cell in the Count column.
  • You will notice that the first two columns of the JMP table are used to specify which row and column of the two-way table you are talking about, and the third column enters the observed count for that particular cell.
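
If it helps to see the whole layout at once, this Python sketch prints the 24 rows in the order described above.  The counts are placeholders, and the grade labels between "B+" and "F" are assumed for illustration; use the labels and counts from your own two-way table.

```python
import csv
import sys

courses = ["Biology", "Chemistry", "Physics"]
# Only A+, A, B+ and F are named in the steps above; the labels in between are assumed.
grades = ["A+", "A", "B+", "B", "C+", "C", "D", "F"]

# Placeholder observed counts, one row per grade and one column per course.
# These are made up; replace them with the counts from your two-way table.
observed = [
    [5, 4, 6],
    [8, 7, 9],
    [12, 10, 11],
    [15, 14, 13],
    [10, 12, 9],
    [7, 8, 6],
    [4, 5, 3],
    [2, 3, 1],
]

# Go down the Biology column first, then Chemistry, then Physics.
rows = [
    (course, grade, observed[g][c])
    for c, course in enumerate(courses)
    for g, grade in enumerate(grades)
]

writer = csv.writer(sys.stdout)
writer.writerow(["Course", "Grade", "Count"])
writer.writerows(rows)
```

Each printed line corresponds to one row of the JMP data table: Course, Grade, then Count.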
Once you have entered in all the observed counts, select Analyze, Fit Y By X. Select "Course" and click "Y, Response", select "Grade" and click "X, Factor", and select "Count" and click "Freq". Click "OK".

Click the red triangle next to the Contingency Analysis title at the top and deselect "Mosaic Plot" to remove that from the output. You now see a Contingency Table (or two-way table) and the "Tests" below it.  If your two-way table has the rows and columns the wrong way round compared to what the question has, that doesn't really matter, but you can fix that by changing which column you called X and which you called Y.

Click the red triangle next to Contingency Table and make sure that all that is selected is "Count", "Expected" and "Cell Chi Square" to display those values in each cell of the table. Note the Pearson ChiSquare is the test statistic for the problem (in the last row of the "Tests" output) and the Prob>ChiSq is the P-value for that test.

If you don't get the Contingency Table, you have not properly labeled your data columns!  Go back to the top of these steps and check your data columns to make sure they have the correct labels for Data Type and Modeling Type.
Question 7
This is standard Two-Way Table Chi-Square analysis as taught in questions 1 through 4 in Lesson 9 of my book.  I suggest you make a table in your answer box (that is an option in the toolbar) to summarize the expected counts and chi-square values.  If you find that too annoying and clunky (it is), you could also give the answers to part (a) like this:
  • Expected Count for 1,1 (Don't smoke, never drink) cell is <blank> and the chi-square value is <blank>.
  • Expected Count for 2,1 (Light smoker, never drink) cell is <blank> and the chi-square value is <blank>.
  • etc., etc.
  • Expected Count for 3,3 (Heavy smoker, drink often) cell is <blank> and the chi-square value is <blank>.