Stat 2000: Tips for Assignment 2

Published: Sun, 01/29/12

 
You are receiving this email because you indicated when you signed up for Grant's Updates that you are taking Stat 2000 this term.  If in fact, you do not want to receive tips for Stat 2000, please reply to this email and let me know.
 
Please note that my midterm exam prep seminar for Stat 2000 will be on Sunday, Feb. 26, in room 100 St. Paul's College, from 9 am to 9 pm.  I am now ready to take registrations.  Please click this link for more information about the seminar and to sign up if you are interested:
Grant's Stat 2000 Exam Prep Seminars 
 
Simply go to www.grantstutoring.com and click the Facebook and/or Twitter icons.
 
If you ever want to look back over a previous tip I have sent, do note that all my tips can be found in my archive.  Click this link to go straight to my archive:
Grant's Updates Archive
 
Did you miss my Tips on How to Do Well in this Course? Click here
 
Did you miss my Tips for Assignment 1? Click here
 
If you are taking the course by Distance/Online (Sections D01, D02, etc.), click here for my tips for your Assignment 2.
 
If you are taking the course by classroom lecture (Sections A01, A02, etc.), click here for my tips for your Assignment 2.
 
Tips for Assignment 2 (Classroom Lecture Sections A01, A02, etc.)
 
You need to study Lessons 4 and 5 in my book (if you have it) to prepare for this assignment.  I suggest you study Lesson 4, then attempt questions 1, 2 and 3.  Study Lesson 5, then attempt the rest of the assignment.  If you are using an older edition of my study book, please note that I used to teach Matched Pairs at the end of Lesson 2, but I now teach it early in Lesson 4.
 
You must ask yourself whether question 1 is a two-sample design or a matched pairs design and proceed accordingly.  Ask yourself the same question before answering questions 2 and 3.
 
Note: if JMP does not seem to offer the graphs or options I describe in my steps, double-click each column of data and confirm that it has the correct Data Type and Modeling Type.  Unless I specify otherwise, the Data Type should be Numeric and the Modeling Type should be Continuous.
 
To do the JMP work in questions 1, 2 and 3, follow these steps:
 
If you want to analyze Matched Pairs Data:
Let us assume you have matched pairs, where each pair has an A score and a B score.  Open a "New Data Table", double-click Column 1 and name it appropriately (I will call it "A").  Double-click the region to the right of Column 1 to create Column 2 and name it appropriately (I will call it "B").  Enter all the A scores and B scores in the columns, keeping the two scores from each pair in the same row.  Select "Analyze, Matched Pairs".

Be sure to read the entire problem to determine if they have specified the order they want you to subtract (A - B or B - A).  If not, you can, of course, do what you like.  JMP always does Second - First, so whichever column is listed second in the "Y, Paired Response" window comes first in the subtraction.  If you want JMP to do A - B, in the Matched Pairs pop-up menu, select B first, then click "Y, Paired Response", then select A and click "Y, Paired Response".  Thus, in the Y, Paired Response window, you would see B listed above A.  Click OK.

The output then gives you all you need.  The "t-Ratio" is your test statistic, and the three probabilities are the three P-values for the two-tailed, upper-tailed, and lower-tailed tests.  If you notice that JMP has subtracted the opposite way to what you desired, redo the Matched Pairs analysis and reverse the order in which you select A and B in the Matched Pairs pop-up window.
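If it helps to see what the subtraction order does to the test statistic, here is a minimal Python sketch (not JMP, and not part of the assignment).  The scores are made up, and the function name `paired_t` is my own.  It computes only the t statistic and degrees of freedom, not the P-values.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df) for the matched-pairs t test on first - second."""
    if len(first) != len(second):
        raise ValueError("matched pairs need two lists of equal length")
    # One difference per pair, in the order first - second.
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1


# Made-up scores for five pairs, for illustration only.
a = [12.0, 15.0, 11.0, 14.0, 13.0]
b = [10.0, 14.0, 12.0, 11.0, 11.0]

t_ab, df = paired_t(a, b)  # A - B
t_ba, _ = paired_t(b, a)   # B - A

print(f"A - B: t = {t_ab:.3f}, df = {df}")
print(f"B - A: t = {t_ba:.3f}, df = {df}")
```

Reversing the order only flips the sign of t.  The two-tailed P-value is unchanged, while the upper-tailed and lower-tailed P-values trade places, which is why the order matters for a one-tailed test.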

If you want to analyze Two-Sample Data:
This is done very differently. Let us assume we have two independent samples comparing the income of men and women: a set of income scores for Men, and a set of income scores for Women.  The key thing to understand is that you will type all the scores down the first column.  Double-click Column 1 and give it a name that describes the variable both sets of scores are measuring.  In my example, I would name the column "Income".  I would then type all the men's incomes down that column and continue to type all the women's incomes below that.  I would now double-click the region at the top to the right of Column 1 to create a new column.  That column would be named whatever variable distinguishes the two samples.  In my example, I would name the column "Sex".  I would then type a word down that column that distinguishes the two samples.  I would type Men repeatedly down Column 2 in all the rows that have men's incomes in Column 1.  Then I would type Women in the rest of the rows, which have women's incomes.  I suggest you type your first word, then copy and paste it in all the other relevant cells in Column 2, then type your second word and copy and paste it to ensure there are no typos.  Thus, I would have two columns of data.  The first column shows all the numerical data scores (all the incomes) and the second column labels the data in the first column, telling me which group each score belongs to (men or women).
 
Double-click Column 1 and confirm that its Data Type is Numeric and its Modeling Type is Continuous, changing the settings if necessary.  Double-click Column 2 and confirm that its Data Type is Character and its Modeling Type is Nominal, changing the settings if necessary.
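The two-column layout described above can be sketched in Python as follows.  This is only an illustration of how the data are arranged, with made-up incomes; the variable names are my own.

```python
# Made-up incomes, for illustration only.
men = [52000, 48000, 61000]
women = [50000, 55000, 47000, 58000]

# Column 1 ("Income"): every score, the men's first and then the women's.
income = men + women
# Column 2 ("Sex"): a label on each row saying which sample the score is from.
sex = ["Men"] * len(men) + ["Women"] * len(women)

rows = list(zip(income, sex))
for score, group in rows:
    print(f"{score:>7}  {group}")

# The label column is what lets software split the scores back into samples.
by_group = {}
for score, group in rows:
    by_group.setdefault(group, []).append(score)
```

Note that the two samples do not need to be the same size, because each row carries its own label.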
 
Once you have decided which group you are naming Sample 1 and which Sample 2 (and that means you will be subtracting in that order, "Sample 1" - "Sample 2"), or they have told you which order they want, make sure JMP does it the way you expect.  Click Column 2 to highlight the entire column.  Select "Cols, Validation, List Check..." (in the Columns toolbar at top).  It will show the labels you have written for Column 2.  The label written first is what JMP will consider Sample 1.  If you don't like the order it shows, highlight the label and click "Move Up" or "Move Down" to change the order of the labels.
 
Now select "Analyze, Fit Y By X".  Select Column 1 and click "Y, Response" and select Column 2 and click "X, Factor".  Click OK.  You will see a graph with dots plotted representing all the scores in two columns.  Sample 1 should be the first column of dots, Sample 2 the second column.  If the order is reversed, go back to your data spreadsheet, do the Column Validation List Check that I outlined in the paragraph above, and reverse the order in which the labels in Column 2 are listed.
 
Click the red triangle and select "Means and Std Dev" to get a summary of the means and standard deviations.  Click the red triangle and select "Means/Anova/Pooled t" to get JMP to do the pooled two-sample t test.  Click the red triangle again and select "t-Test" to get JMP to do the generalized two-sample t test (not pooling). JMP gives 95% confidence intervals for the difference in the two means by default.  If you want a different level of confidence, click the red triangle again and select "Set α level" to have the outputs change the confidence intervals to your desired level of confidence.  For example, if you want 98% confidence intervals you would set alpha to be .02, or, if you want 90% confidence intervals, you would set alpha to be .10.  In other words, α = 1 - C, where C is the confidence level written as a decimal. Finally, if they want to see box plots, click the red triangle and select "Display Options" and select Box Plots, or whatever else they request.  You can also deselect anything in the Display Options they don't want to see, but never remove anything unless they specifically request it.
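To see how the pooled and unpooled test statistics differ, and how α relates to the confidence level, here is a rough Python sketch with made-up scores.  The function names are my own.  For the unpooled test I have assumed the Welch-Satterthwaite approximation for the degrees of freedom, which is what software typically reports; a hand calculation in your course may use a simpler, more conservative value.

```python
import math
import statistics


def two_sample_t(sample1, sample2, pooled):
    """Return (t, df) for mean(sample1) - mean(sample2)."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    if pooled:
        # Pooled: combine the two sample variances into one estimate.
        df = n1 + n2 - 2
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    else:
        # Not pooled: keep the variances separate.
        se = math.sqrt(v1 / n1 + v2 / n2)
        # Welch-Satterthwaite approximation (an assumption, see above).
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return diff / se, df


def alpha_for_confidence(c):
    """Alpha for a confidence level c given as a decimal: alpha = 1 - C."""
    return round(1 - c, 10)


# Made-up scores, for illustration only.
sample1 = [52.0, 48.0, 61.0, 55.0]
sample2 = [50.0, 55.0, 47.0, 58.0, 45.0]

t_pooled, df_pooled = two_sample_t(sample1, sample2, pooled=True)
t_unpooled, df_unpooled = two_sample_t(sample1, sample2, pooled=False)
print(f"pooled:     t = {t_pooled:.3f}, df = {df_pooled}")
print(f"not pooled: t = {t_unpooled:.3f}, df = {df_unpooled:.2f}")

for c in (0.90, 0.95, 0.98):
    print(f"C = {c:.2f} -> alpha = {alpha_for_confidence(c):.2f}")
```

Both versions subtract in the order Sample 1 - Sample 2, so swapping the two samples flips the sign of t, just as with matched pairs.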
 
Never use JMP to answer a question unless they specifically tell you to.  Whenever they do tell you to use JMP, never go out of your way to click red triangles to add things to the graph (like put titles on histograms, or label axes).  Whatever JMP gives by default is all they require unless they specifically request you add something to the output or remove something from it.  Of course, I will always give you specific steps to add/remove anything they do require.
 
Note that there is no real work to do in question 3(g).  I teach the relationship between t and F in Lesson 5 of my book. 
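For reference, the standard relationship when comparing two independent samples (which I am assuming is the one this question has in mind) is that the Anova F statistic equals the square of the pooled two-sample t statistic, with matching degrees of freedom:

```latex
F = t^2, \qquad \mathrm{df}_F = (1,\; n_1 + n_2 - 2), \qquad \mathrm{df}_t = n_1 + n_2 - 2
```

The P-value for the F test is then the same as the two-tailed P-value for the pooled t test.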
 
You should study Lesson 5 in my book before attempting questions 4 and 5. 
 
Question 4 is an Anova question. In part (d), when they ask for the values of ni, all they mean is to tell them what n1, n2 and n3 equal.
 
To use JMP to do Anova:
Follow the same steps you do for two-sample data above.
The key thing to understand is that you will type all the scores down the first column.  Double-click Column 1 and give it a name that describes the variable all the scores are measuring.  For example, in question 4, I would call the column "Lifetimes".  I would then type all the numbers for the lifetimes for Duracell, and continue down the column typing in all the Energizer and Eveready lifetimes as well.  I would now double-click the region at the top to the right of Column 1 to create a new column.  That column would be named whatever variable distinguishes the three samples.  For example, in question 4, I would name the column "Battery".  I would then type a word down that column that distinguishes the three samples.  I would type "Duracell" repeatedly down Column 2 in all the rows that have Duracell lifetimes in Column 1.  Then I would type "Energizer" for the appropriate cells in Column 2 and type "Eveready" for the rest of the rows.  I suggest you type your first word, then copy and paste it in all the other relevant cells in Column 2, then do the same for your second word and your third word, to ensure there are no typos.  Thus, I would have two columns of data.  The first column shows all the numerical data scores (all the lifetimes) and the second column labels the data in the first column, telling me which group each score belongs to (Duracell, Energizer, or Eveready).
 
Double-click Column 1 and confirm that its Data Type is Numeric and its Modeling Type is Continuous, changing the settings if necessary.  Double-click Column 2 and confirm that its Data Type is Character and its Modeling Type is Nominal, changing the settings if necessary.
 
Now select "Analyze, Fit Y By X".  Select Column 1 and click "Y, Response" and select Column 2 and click "X, Factor".  Click OK.  You will see a graph with dots plotted representing all the scores in three columns.  Duracell should be the first column of dots, Energizer the second column, and Eveready the third column.  If the order is silly, go back to your data spreadsheet and do the Column Validation List Check that I outlined in the two-sample steps above.
 
Click the red triangle and select "Means and Std Dev" to get a summary of the means and standard deviations.  Click the red triangle and select "Means/Anova/Pooled t" to get JMP to do the Anova.  Click the red triangle and select "Display Options" and select the "Box Plot" they request.
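If you want to see what goes into the Anova F statistic, here is a minimal Python sketch of the one-way Anova calculation.  The lifetimes are made up (they are not the assignment data), and the function name `one_way_anova` is my own.  It computes F and its degrees of freedom only, not the P-value.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_groups, df_error) for a dict of label -> list of scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = statistics.mean(all_scores)
    k = len(groups)            # number of groups
    n_total = len(all_scores)  # n1 + n2 + ... + nk

    # Variation between the group means (sum of squares for groups).
    ssg = sum(
        len(scores) * (statistics.mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    # Variation within the groups (sum of squares for error).
    sse = sum(
        (x - statistics.mean(scores)) ** 2
        for scores in groups.values()
        for x in scores
    )

    df_groups, df_error = k - 1, n_total - k
    f_stat = (ssg / df_groups) / (sse / df_error)
    return f_stat, df_groups, df_error


# Made-up lifetimes, for illustration only.
lifetimes = {
    "Duracell": [10.0, 12.0, 11.0],
    "Energizer": [14.0, 15.0, 13.0],
    "Eveready": [9.0, 8.0, 10.0],
}

f_stat, df_groups, df_error = one_way_anova(lifetimes)
print(f"F = {f_stat:.2f} with df = ({df_groups}, {df_error})")
```

The dictionary plays the role of the label column: each score is tied to the name of the group it came from.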
 
Question 5 is very similar to my questions 3 and 4 in Lesson 5.
 
Tips for Assignment 2 (Distance/Online Sections D01, D02, etc.)

Study Lesson 3 in my book before you start this assignment.
 
Questions 3 to 6 should go quite smoothly for a person who has studied Lesson 3 in my book.
 
To be able to understand questions 7 and 8, you will need to read my "Errors in Hypothesis Testing Revisited" section in Lesson 6 of my study book (this is only in the last two editions of my book, white or blue covers).  You may also need to skim through some of the earlier questions in Lesson 6 to get a grasp of how to compute Discrete Probabilities and Binomial Probabilities.  I do not think mastery of these topics is important at this time; you will get an opportunity to learn this lesson more thoroughly later on in this course.