Stat 2000: Tips for Assignment 5

Published: Sat, 03/31/12


The final exam seminar will be on Sunday, April 15, 2012 from 9:00 am to 9:00 pm. 
Please click this link for more information about the seminar and to register if you are interested:
Grant's Stat 2000 Exam Prep Seminars 
 
If you ever want to look back over a previous tip I have sent, do note that all my tips can be found in my archive.  Click this link to go straight to my archive: 
Grant's Updates Archive
 
Did you miss my Tips on How to Do Well in this Course? Click here
 
Did you miss my Tips for Assignment 4? Click here
 
If you are taking the course by Distance/Online (Sections D01, D02, etc.), click here for my tips for your Assignment 5.
 
If you are taking the course by classroom lecture (Sections A01, A02, etc.), click here for my tips for your Assignment 5.
 
Tips for Assignment 5 (Classroom Lecture Sections A01, A02, etc.)
 
You need to study the Chi-Square Goodness of Fit part of Lesson 8: Chi-Square Tests (if you are using an older edition of my book, this may be Lesson 9).  You also will need to study Lesson 9: Review of Linear Regression and Lesson 10: Inferences for Linear Regression (up to the end of question 3, you do not need to study the Multiple Linear Regression section at this time).
 
Question 1 is not unlike my question 5 in Lesson 8.
 
Question 2 is not unlike my question 9 in Lesson 8.
 
Question 3 is not unlike my question 11 in Lesson 8.
 
Question 4 is a run-through of Linear Regression.  Be sure to study Lessons 9 and 10 in my book before attempting this and the rest of the questions in this assignment.  You should especially work through question 1 in Lesson 9 and questions 1 and 3 in Lesson 10.
 
To do Linear Regression in JMP:
Open a "New Data Table".  Double-click Column 1 and name it "Age" and click OK.  Type in all the Age data into that column.  Double-click the region to the right of column 1 to create Column 2 and name it "Price" and enter all its data.  Select "Analyze, Fit Y By X".  Highlight "Age" and click "X, Factor".  Highlight "Price" and click "Y, Response".  Click OK.
 
You should now be looking at a scatterplot.  Click the red triangle and select "Fit Line" to get the least-squares regression line.  You now have all the outputs you need.  Note that JMP gives you the coefficient of determination, r², which you can use to determine r.  Be careful to assign the correct sign to r (it has the same sign as the slope).  Be sure to read in Lesson 10 the connection between the t test statistic for the slope and the t test statistic for the correlation.  Although they want you to do the hypothesis test in part (c) by hand, JMP also does it for you and gives you the P-value.
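
If you want to check your hand work outside JMP, here is a minimal Python sketch of those two ideas. The function names and the numbers are made up for illustration and are not the assignment's data. It assumes simple linear regression, where r takes the sign of the slope and the t statistic for the correlation, t = r√(n − 2)/√(1 − r²), equals the t statistic for the slope.

```python
import math

def correlation_from_r_squared(r_squared, slope):
    """Recover r from r^2, giving r the same sign as the fitted slope."""
    r = math.sqrt(r_squared)
    return r if slope >= 0 else -r

def t_from_r(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical values: r^2 = 0.81 and a negative slope
# (for example, price falling as age rises).
r = correlation_from_r_squared(0.81, slope=-1.5)
print(round(r, 2))  # -0.9
```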
 
I show you how to make predictions and compute residuals in Lesson 9.  When they ask what does the sign of the residual tell you, they just want you to discuss whether the actual result is higher or lower than predicted.
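
As a quick illustration of what the sign of a residual means, here is a small Python sketch. The fitted line and the observed value are hypothetical, not taken from the assignment.

```python
def predict(b0, b1, x):
    """Predicted y from the least-squares line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

def residual(y_observed, y_predicted):
    """Residual = observed minus predicted."""
    return y_observed - y_predicted

# Hypothetical fitted line: y-hat = 20 - 1.5x
y_hat = predict(20.0, -1.5, 4)
e = residual(12.5, y_hat)
print(y_hat, e)  # 14.0 -1.5
```

A negative residual, as here, means the actual value came in lower than the line predicted; a positive residual means it came in higher.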
 
Part (f) is tricky.  Think of the ramifications of your conclusion in part (c).  Does x cause y?
 
Question 5 requires the use of JMP.
 
Open a "New Data Table" and create two columns.  Name the first column "Study Time" and the second column "Score".  Remember, to create a new column, simply double-click in the space at the top of the column, to the right of a pre-existing column.  Enter in the data manually, and we are now ready to analyze the data. Double-click both column names and confirm their Data Type is Numeric and their Modeling Type is Continuous.
 
Question 5(a) wants you to list the model.  No numbers.  I show you how to write the model in Lesson 10.  Be sure to use the symbols β0 and β1 rather than α and β to tie in with the symbols they use later in the question.
 
Question 5(b):  Select "Analyze" then "Fit Y by X".  You should be able to tell which is x and which is y.  Select the y variable and click "Y, Response" and select the x variable and click "X, Factor".  Click OK.  You will now see a scatterplot.  Click the red triangle next to "Bivariate Fit ..." and select "Fit Line" to have JMP compute and graph the least-squares regression line.
 
Question 5(c): Click the red triangle next to Linear Fit and select Residual Plot to get the graph of the residuals.
 
Question 5(d): JMP gives you this estimate in the Summary of Fit.  But it sounds like they want you to compute the standard deviation of the residuals by hand (as I show in question 1(k) in Lesson 9).  This isn't hard to do at all since you are given the sum of the squared residuals.
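
The hand computation can be sketched in Python as follows, assuming the usual simple linear regression formula s = √(SSE/(n − 2)). The SSE and n below are hypothetical, not the values you were given.

```python
import math

def residual_std_dev(sse, n):
    """Standard deviation of the residuals: sqrt(SSE / (n - 2))."""
    return math.sqrt(sse / (n - 2))

# Hypothetical values: sum of squared residuals 72 from n = 10 observations.
print(residual_std_dev(72, 10))  # 3.0
```

Your hand answer should agree with the value JMP reports in the Summary of Fit.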
 
Question 5(e) and (f):  These must be computed by hand using the appropriate formulas and numbers from JMP as I show in my question 3 of Lesson 10.  Note that you have been given some useful numbers in that regard at the start of this question.
 
Question 5(g): Click the red triangle next to "Linear Fit" and select "Confid Curve Indiv" and "Confid Curve Fit" to get the two intervals they want.  As I tell you in Lesson 10, the curves that are closer to the line are the confidence intervals for the mean; the outer curves are the prediction intervals.  It sounds like they want you to print the JMP output without these curves drawn in, and then print it again with the curves.
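
To see why the prediction curves always sit outside the confidence curves, here is a minimal Python sketch of the two standard errors, using the standard simple linear regression formulas. All names and numbers are hypothetical; sxx stands for the sum of squared deviations of x from its mean.

```python
import math

def se_mean_response(s, n, x_star, x_bar, sxx):
    """Standard error for the mean response at x = x_star."""
    return s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)

def se_prediction(s, n, x_star, x_bar, sxx):
    """Standard error for predicting one new individual response at x = x_star."""
    return s * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)

# Hypothetical values: s = 3, n = 10, x-bar = 5, Sxx = 40, predicting at x = 5.
print(round(se_mean_response(3, 10, 5, 5, 40), 3))  # 0.949
print(round(se_prediction(3, 10, 5, 5, 40), 3))     # 3.146
```

The extra "1 +" under the square root is what makes the prediction interval wider at every x.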
 
Question 5(h): JMP gives you most of the numbers you need for this confidence interval in the "Parameter Estimates".
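
A minimal Python sketch of this interval, assuming the usual form b1 ± t* × SE(b1), with t* looked up in the t table at n − 2 degrees of freedom. The slope, standard error, and t* below are hypothetical.

```python
def slope_confidence_interval(b1, se_b1, t_star):
    """Confidence interval for the slope: b1 +/- t* x SE(b1)."""
    margin = t_star * se_b1
    return b1 - margin, b1 + margin

# Hypothetical values: slope 2.0, standard error 0.5,
# and t* = 2.306 (95% confidence with 8 degrees of freedom).
low, high = slope_confidence_interval(2.0, 0.5, 2.306)
print(round(low, 3), round(high, 3))  # 0.847 3.153
```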
 
Question 5(i):  JMP already did this test for you when you selected "Fit Line".  The ANOVA table and the "Parameter Estimates" give you all the info you need.  However, they want you to do it by hand, so be sure to write out your hypotheses and conclusion in the file you are uploading.  This isn't so bad because of the values you have already computed in part (d) and the givens at the start of the problem.
 
Question 5(l): JMP already made this ANOVA table for you.
 
Question 5(n):  You should know what ratio they want here and how to determine it.  The proportion of variation is just another way of asking for the percentage of variation: the coefficient of determination.  I talk about this in Lesson 10, and show you how to interpret it in question 1 of Lesson 9.
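
As a sketch of how that ratio comes out of an ANOVA table, here it is in Python, assuming the coefficient of determination is the model sum of squares divided by the total sum of squares. The sums of squares below are hypothetical.

```python
def coefficient_of_determination(ss_model, ss_total):
    """Proportion of the variation in y explained by the regression on x."""
    return ss_model / ss_total

# Hypothetical ANOVA table: SS(Model) = 160, SS(Total) = 200.
print(coefficient_of_determination(160, 200))  # 0.8
```

With these made-up numbers, 80% of the variation in y is explained by the regression on x.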
 
Tips for Assignment 5 (Distance/Online Sections D01, D02, etc.)
 
Study Lesson 5 in my study book to prepare for this assignment.
 
For Question 1, note that ni simply means they want you to tell them the values of n1, n2, etc.  Simply tell them n1 = #, n2 = #, etc. (replacing # with the appropriate numerical value).  Depending on the specific question you were given, some of you will have the same values for all of n1, n2, etc., whereas, for others, the values of n1, n2, etc. will be different.  Of course, N and I are as I describe in Lesson 5.
 
For Question 2, you should certainly use the stat mode in your calculator to compute the means and standard deviations (which will, of course, enable you to know the variances), then do the rest of the problem by hand using the formulas for SSG, SSE, MSG, MSE, and F.  Make sure you have memorized those formulas (and the formula for the overall mean or grand mean).  There is almost certainly going to be a question or two on the exam that will check to see if you know these formulas (although it is rare to see an exam question that makes you do an entire ANOVA by hand).  It is common that an exam will make you compute MSG or MSE by hand having been given the sample means and standard deviations.  Note, throughout the question they tell you to do "second - first", so make sure you do.  That makes their hypotheses confusing.  I assume they mean by μ1 the mean of the first coinage, but, then again, it doesn't really matter.
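
To check your hand calculations, here is a minimal Python sketch of a one-way ANOVA computed from summary statistics, using the standard formulas for the grand mean, SSG, SSE, MSG, MSE, and F. The function name and the sample sizes, means, and standard deviations are hypothetical, not the coinage data.

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA from group sizes, sample means, and sample standard deviations."""
    N = sum(ns)          # total number of observations
    I = len(ns)          # number of groups
    grand_mean = sum(n * m for n, m in zip(ns, means)) / N
    ssg = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    sse = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    msg = ssg / (I - 1)  # mean square for groups
    mse = sse / (N - I)  # mean square for error
    return {"SSG": ssg, "SSE": sse, "MSG": msg, "MSE": mse, "F": msg / mse}

# Hypothetical summary: three groups of 5, means 4, 6, 8, all with sd 1.
result = anova_from_summary([5, 5, 5], [4, 6, 8], [1, 1, 1])
print(result["MSG"], result["MSE"], result["F"])  # 20.0 1.0 20.0
```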
 
Here is how to do the JMP part of Question 3:
It is done the same way you did the JMP in the previous assignment.  Open a New Data Table and type the data in manually in this manner (don't bother pasting and stacking; it is not worth the effort):  Name your first column "Silver" or something like that, and type all the silver contents down that column.  That is, type in the numbers from the First coinage down the column, then continue with all the numbers from the Second coinage, and finally all the numbers from the Third coinage.  Double-click at the top to the right of the "Silver" column heading to create a new column and name it something like "Coinage".  Type "First" down that column in all the rows that have the data for the First coinage.  Then type "Second" in the rows that have data for the Second coinage.  Finally, type "Third" for the rest of the column.  You may want to type the word once and then copy and paste it down the rest of the relevant rows to ensure there are no typos.  Once you have done that, double-click the "Coinage" column heading, confirm that the Data Type is Character and the Modeling Type is Nominal, and click OK.
 
Select Analyze, then Fit Y By X.  Highlight the numeric column "Silver" and click the Y, Response button.  Highlight the character column "Coinage" and click the X, Factor button.  Click OK.
 
You should now see a graph with three vertical arrays of dots showing the silver content for the three different coinages separately.  (If you do not see a graph like this, but, for example, see a Mosaic Plot, that means you did not set the Data Type and Modeling Type properly.  Go back to your data table, double-click each column and make sure the Data Type and Modeling Type for each column are as I indicate above.)  Click the red triangle and select "Means and Std Dev" to get a summary of the means and standard deviations of the three samples.  Click the red triangle again and select "Means/Anova/Pooled t" to get the output you need.
 
By the way, be sure to study my questions 5 and 6 in Lesson 5 thoroughly to better understand what they are getting at in part (d).
 
In Question 4 the applet is pretty straightforward to learn from.  Note that the pooled standard error is essentially the square root of MSE, so a larger pooled standard error means a larger MSE.  Remember, F = MSG/MSE and think about what affects the value of MSG and MSE and how those two values affect the value of F.  The applet pretty well teaches you what happens.  Messing with one part changes MSG, messing with the other part changes MSE.
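
If you want to experiment away from the applet, here is a rough Python sketch of the same idea. It assumes equal group sizes and treats the pooled standard deviation as the square root of MSE (the usual definition); the function name and all numbers are hypothetical.

```python
def f_ratio(n_per_group, means, pooled_sd):
    """F = MSG / MSE for equal-sized groups, with MSE = pooled_sd squared."""
    I = len(means)
    grand_mean = sum(means) / I
    msg = n_per_group * sum((m - grand_mean) ** 2 for m in means) / (I - 1)
    mse = pooled_sd ** 2
    return msg / mse

print(f_ratio(5, [4, 6, 8], 1))   # 20.0  baseline
print(f_ratio(5, [4, 6, 8], 2))   # 5.0   more spread within groups: MSE up, F down
print(f_ratio(5, [2, 6, 10], 1))  # 80.0  means further apart: MSG up, F up
```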
 
Do the JMP in Question 5 just like I showed you what to do in Question 3 above.  Here at least you can copy and paste the data into JMP.  Be sure to paste it into JMP by selecting Edit, then, while holding down the Shift key, select Paste in order to paste the column headings properly (or, after you have copied the data, select "Edit" then "Paste with Column Names").  Note, double-click the "bfed" column and confirm that its Data Type is Character and Modeling Type is Nominal; double-click the "energy" column and confirm that its Data Type is Numeric and Modeling Type is Continuous.  Always make the numeric column Y and the character column X when you select Fit Y By X.
 
Re-read my section on the P-value in Lesson 2 of my book if you still are not sure how to interpret a P-value.