Stat 2000: Tips for Assignment 5
Published: Sat, 03/31/12
The final exam seminar will be on Sunday, April 15, 2012 from 9:00 am to 9:00 pm.
Please click this link for more information about the seminar and to register if you are interested.
Did you miss my Tips on How to Do Well in this Course? Click here
Did you miss my Tips for Assignment 4? Click here
If you are taking the course by classroom lecture (Sections A01, A02, etc.), click here for my tips for your Assignment 5.
You need to study the Chi-Square Goodness of Fit part of Lesson 8: Chi-Square Tests (if you are using an older edition of my book, this may be Lesson 9). You will also need to study Lesson 9: Review of Linear Regression and Lesson 10: Inferences for Linear Regression (up to the end of question 3; you do not need to study the Multiple Linear Regression section at this time).
Question 1 is not unlike my question 5 in Lesson 8.
Question 2 is not unlike my question 9 in Lesson 8.
Question 3 is not unlike my question 11 in Lesson 8.
Question 4 is a run-through of Linear Regression. Be sure to study Lessons 9 and 10 in my book before attempting this and the rest of the questions in this assignment. You should especially work through question 1 in Lesson 9 and questions 1 and 3 in Lesson 10.
To do Linear Regression in JMP:
Open a "New Data Table". Double-click Column 1 and name it "Age" and click OK. Type in all the Age data into that column. Double-click the region to the right of column 1 to create Column 2 and name it "Price" and enter all its data. Select "Analyze, Fit Y By X". Highlight "Age" and click "X, Factor". Highlight "Price" and click "Y, Response". Click OK.
You should now be looking at a scatterplot. Click the red triangle and select "Fit Line" to get the least-squares regression line. You now have all the outputs you need. Note that JMP gives you the coefficient of determination, r², which you can use to determine r. Be careful to assign the correct sign to r (it always has the same sign as the slope). Be sure to read in Lesson 10 about the connection between the t test statistic for the slope and the t test statistic for the correlation. Although they want you to do the hypothesis test in part (c) by hand, JMP does it for you and gives you the P-value.
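To see how r and the t statistic tie together, here is a minimal sketch in Python (not JMP). The function names and all the numbers are made up for illustration; they are not from the assignment. It assumes the usual formula t = r√(n − 2)/√(1 − r²) with n − 2 degrees of freedom.

```python
import math

def correlation_from_r2(r_squared, slope):
    """Recover r from r^2, giving it the same sign as the fitted slope."""
    r = math.sqrt(r_squared)
    return -r if slope < 0 else r

def t_from_correlation(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical values, NOT from the assignment:
# r^2 = 0.64, a negative slope, and n = 11 data points.
r = correlation_from_r2(0.64, slope=-1.5)
t = t_from_correlation(r, n=11)
print("r =", r)
print("t =", t)
```

The t you get this way should match the t ratio JMP reports for the slope, apart from rounding.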
I show you how to make predictions and compute residuals in Lesson 9. When they ask what does the sign of the residual tell you, they just want you to discuss whether the actual result is higher or lower than predicted.
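As a rough illustration of the sign of a residual, here is a short Python sketch. The intercept, slope, and data point are hypothetical (they are not the values JMP will give you), and it assumes the usual convention that residual = actual y − predicted y.

```python
def predict(b0, b1, x):
    """Predicted y from the least-squares line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

def residual(actual_y, predicted_y):
    """Residual = actual minus predicted (assumed convention)."""
    return actual_y - predicted_y

# Hypothetical line and observation, NOT from the assignment.
b0, b1 = 20000.0, -1500.0        # made-up intercept and slope
age, actual_price = 4, 13000.0   # made-up data point

y_hat = predict(b0, b1, age)
res = residual(actual_price, y_hat)
print("predicted:", y_hat)
print("residual:", res)
if res < 0:
    print("The actual price is lower than predicted.")
else:
    print("The actual price is at least as high as predicted.")
```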
Part (f) is tricky. Think of the ramifications of your conclusion in part (c). Does x cause y?
Question 5 requires the use of JMP.
Open a "New Data Table" and create two columns. Name the first column
"Study Time" and the second column "Score". Remember, to create a new
column, simply double-click in the space at the top of the column, to
the right of a pre-existing column. Enter in the data manually, and we
are now ready to analyze the data. Double-click both column names and
confirm their Data Type is Numeric and their Modeling Type is
Continuous.
Question 5(a) wants you to list the model. No numbers. I show you how to write the model in Lesson 10. Be sure to use the symbols β0 and β1 rather than α and β to tie in with the symbols they use later in the question.
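For reference, the simple linear regression model is commonly written as below. This is the standard textbook form, not copied from your assignment; the subscript i and the N(0, σ) notation for the errors are my assumptions, so match whatever notation your course uses.

```latex
% Simple linear regression model (standard form; notation is an assumption)
\[
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma) \text{ independently}
\]
```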
Question 5(b):
Select "Analyze" then "Fit Y by X". You should be able to tell which
is x and which is y. Select the y variable and click "Y, Response" and
select the x variable and click "X, Factor". Click OK. You will now
see a scatterplot. Click the red triangle next to "Bivariate Fit ..."
and select "Fit Line" to have JMP compute and graph the least-squares
regression line.
Question 5(c): Click the red triangle next to Linear Fit and select Residual Plot to get the graph of the residuals.
Question 5(d): JMP gives you this estimate in the Summary of Fit, but it sounds like they want you to compute the standard deviation of the residuals by hand (as I show in question 1(k) in Lesson 9). This isn't hard to do at all, since you are given the sum of the squared residuals.
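The hand computation comes down to s = √(SSE/(n − 2)). Here is a minimal Python sketch with made-up numbers (the SSE and n below are hypothetical, not the ones you were given):

```python
import math

def residual_std_dev(sse, n):
    """Standard deviation of the residuals: sqrt(SSE / (n - 2))."""
    return math.sqrt(sse / (n - 2))

# Hypothetical values, NOT from the assignment.
sse = 72.0   # made-up sum of squared residuals
n = 10       # made-up number of data points
print("s =", residual_std_dev(sse, n))
```

You can check your hand answer against the "Root Mean Square Error" line in JMP's Summary of Fit.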
Question 5(e) and (f): These must be computed by hand using
the appropriate formulas and numbers from JMP as I show in my question 3
of Lesson 10. Note that you have been given some useful numbers in that regard at the start of this question.
Question 5(g): Click the red triangle next to "Linear Fit" and select "Confid Curve
Indiv" and "Confid Curve Fit" to get the two intervals they want. As I
tell you in Lesson 10, the curves that are closer to the line are the
confidence intervals for the mean; the outer curves are the prediction
intervals. It sounds like they want you to print the JMP output without these curves drawn in, and then print it again with the curves.
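If you are wondering why the prediction curves always sit outside the confidence curves, compare the standard formulas for the two intervals. The symbols here are my own choices and may differ from the ones in your course: s is the standard deviation of the residuals, x* is the x value of interest, S_xx is the sum of squared deviations of x, and t* is the critical value with n − 2 degrees of freedom.

```latex
% Assumed notation: S_{xx} = \sum (x_i - \bar{x})^2, t^* has n - 2 degrees of freedom
\[
\hat{y} \pm t^{*}\, s \sqrt{\frac{1}{n} + \frac{(x^{*} - \bar{x})^{2}}{S_{xx}}}
\quad \text{(confidence interval for the mean response)}
\]
\[
\hat{y} \pm t^{*}\, s \sqrt{1 + \frac{1}{n} + \frac{(x^{*} - \bar{x})^{2}}{S_{xx}}}
\quad \text{(prediction interval for an individual response)}
\]
```

The extra 1 under the square root is what makes the prediction interval wider at every value of x.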
Question 5(h): JMP gives you most of the numbers you need for this confidence interval in the "Parameter Estimates".
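The interval for the slope has the usual form b1 ± t* × SE(b1). Here is a minimal Python sketch. The slope, standard error, and sample size are hypothetical, and t* is something you look up in a t table using n − 2 degrees of freedom (JMP's Parameter Estimates will not hand you t* directly).

```python
def slope_confidence_interval(b1, se_b1, t_star):
    """Confidence interval for the slope: b1 +/- t* times SE(b1)."""
    margin = t_star * se_b1
    return b1 - margin, b1 + margin

# Hypothetical values, NOT from the assignment.
b1 = 2.0        # made-up slope estimate
se_b1 = 0.5     # made-up standard error of the slope
t_star = 2.306  # t table value for 95% confidence with 8 degrees of freedom (n = 10)

low, high = slope_confidence_interval(b1, se_b1, t_star)
print("95% CI for the slope:", (low, high))
```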
Question 5(i):
JMP already did this test for you when you selected "Fit Line". The
ANOVA table and the "Parameter Estimates" give you all the
info you need to check your answer. However, they want you to do the test by hand, so be sure to write out your hypotheses and conclusion
in the file you are uploading. This isn't so bad because of the values you have already computed in part (d) and the givens at the start of the problem.
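Here is a rough Python sketch of the by-hand test statistic for the slope. I am assuming the givens include the sum of squared deviations of x (called sxx below, a name I made up), and that s is the standard deviation of the residuals from part (d). All numbers are hypothetical.

```python
import math

def slope_t_statistic(b1, s, sxx):
    """Return (t, SE(b1)) for testing whether the slope is zero.

    SE(b1) = s / sqrt(Sxx), and t = b1 / SE(b1) with n - 2 degrees of freedom.
    """
    se_b1 = s / math.sqrt(sxx)
    return b1 / se_b1, se_b1

# Hypothetical values, NOT from the assignment.
b1 = 2.0     # made-up slope estimate
s = 3.0      # made-up standard deviation of the residuals
sxx = 36.0   # made-up sum of (x - xbar)^2

t, se_b1 = slope_t_statistic(b1, s, sxx)
print("SE(b1) =", se_b1)
print("t =", t)
```

Compare your t with the t ratio for the slope in JMP's Parameter Estimates to confirm your hand calculation.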
Question 5(l): JMP already made this ANOVA table for you.
Question 5(n): You
should know what ratio they want here and how to determine it. The proportion of variation is just another way of asking for the percentage of variation: the coefficient of determination. I
talk about this in Lesson 10, and show you how to interpret it in
question 1 of Lesson 9.
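As a quick illustration, the ratio can be read straight off an ANOVA table as the Model sum of squares over the total ("C. Total" in JMP) sum of squares. A minimal Python sketch with made-up sums of squares:

```python
def r_squared(ss_model, ss_total):
    """Proportion of the variation in y explained by the regression."""
    return ss_model / ss_total

# Hypothetical sums of squares, NOT from the assignment.
ss_model = 80.0    # made-up Model sum of squares
ss_total = 100.0   # made-up total sum of squares

r2 = r_squared(ss_model, ss_total)
print("r^2 =", r2)
print("percentage of variation explained:", 100 * r2)
```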
Study Lesson 5 in my study book to prepare for this assignment.
For Question 1, note that ni simply means they want you to tell them the values of n1, n2, etc. Simply tell them n1 = #, n2 = #, etc. (replacing # with the appropriate numerical value). Depending on the specific question you were given, some of you will have the same values for all of n1, n2, etc., whereas, for others, the values of n1, n2, etc. will be different. Of course, N and I are as I describe in Lesson 5.
For Question 2, you should certainly
use the stat mode in your calculator to compute the means and standard
deviations (which will, of course, enable you to know the variances),
then do the rest of the problem by hand using the formulas for SSG, SSE,
MSG, MSE, and F. Make sure you have memorized those formulas (and the
formula for the overall mean or grand mean). There is almost certainly
going to be a question or two on the exam that will check to see if you
know these formulas (although it is rare to see an exam question that
makes you do an entire ANOVA by hand). It is common that an exam will
make you compute MSG or MSE by hand having been given the sample means
and standard deviations. Note, throughout the question they tell you to do "second - first", so make sure you do. That makes their hypotheses confusing. I assume they mean by μ1 the mean of the first coinage, but, then again, it doesn't really matter.
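To check your hand calculations, here is a minimal Python sketch of the standard one-way ANOVA formulas, working only from the sample sizes, means, and standard deviations. The function name and the numbers are made up for illustration; they are not the coinage data.

```python
def one_way_anova(ns, means, sds):
    """Compute SSG, SSE, MSG, MSE and F from group summary statistics."""
    N = sum(ns)    # total number of observations
    I = len(ns)    # number of groups
    grand_mean = sum(n * m for n, m in zip(ns, means)) / N
    ssg = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    sse = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    msg = ssg / (I - 1)   # mean square for groups, DFG = I - 1
    mse = sse / (N - I)   # mean square for error, DFE = N - I
    return {"grand_mean": grand_mean, "SSG": ssg, "SSE": sse,
            "MSG": msg, "MSE": mse, "F": msg / mse}

# Hypothetical summary statistics, NOT from the assignment.
result = one_way_anova(ns=[5, 5, 5], means=[10.0, 12.0, 14.0], sds=[2.0, 2.0, 2.0])
for name, value in result.items():
    print(name, "=", value)
```

Note that the grand mean is weighted by the sample sizes, which matters when n1, n2, etc. are not all the same.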
Here is how to do the JMP part of Question 3:
It is done the same way you did the JMP in the
previous assignment. Open a New Data Table and type the data in
manually in this manner (don't bother pasting and stacking, it is not
worth the effort): Name your first column "Silver" or something like
that, and type all the silver contents down that column. Which is to
say, type in the numbers from the First coinage down the column, then
continue to type all the numbers from the Second coinage, and finally
continue to type all the numbers for the Third coinage. Double-click at
the top to the right of the "Silver" column heading to create a new
column and name it something like "Coinage". Type "First" repeatedly
down that column in all the rows that have the
data for the First coinage. Then type "Second" repeatedly down the
column in the rows that have data for the Second coinage. Finally, type
"Third" for the rest of the column. You may want to type the phrase
once and then copy and paste it down the rest of the relevant rows to
ensure there are no typos. Once you have done that, double-click the
"Coinage" column heading and confirm that the Data Type is Character and
the Modeling Type is Nominal and click OK.
Select Analyze, then Fit Y
By X. Highlight the numeric column "Silver" and click the Y, Response
button. Highlight the character column "Coinage" and click the X,
Factor button. Click OK.
You should now see a graph
with three vertical arrays of dots showing the silver content for the
three different coinages separately. (If you do not see a graph like this, but, for example, see a Mosaic Plot, that means you did not label the Data Type and Modeling Type properly. Go back to your data table, double-click each column and make sure the Data Type and Modeling Type for each column is as I indicate above.) Click the red triangle and select
"Means and Std Dev" to get a summary of the means and standard
deviations of the three samples. Click the red triangle again and
select "Means/Anova/Pooled t" to get the output you need.
By the way, be sure to
study my questions 5 and 6 in Lesson 5 thoroughly to better understand
what they are getting at in part (d).
In Question 4
the applet is pretty straightforward to learn from. Note that the
pooled standard error is essentially the square root of MSE. Remember, F = MSG/MSE and
think about what affects the value of MSG and MSE and how those two
values affect the value of F. The applet pretty well teaches you what
happens. Messing with one part changes MSG, messing with the other part
changes MSE.
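To see which part affects which, it helps to look at the standard formulas side by side. The notation is my assumption (I groups, N observations in total, and s_p for the pooled standard deviation), so match it to your course.

```latex
% Assumed notation: \bar{x}_i and s_i are the mean and standard deviation of group i,
% \bar{x} is the grand mean, s_p is the pooled standard deviation
\[
\mathrm{MSG} = \frac{\sum_{i=1}^{I} n_i (\bar{x}_i - \bar{x})^{2}}{I - 1},
\qquad
\mathrm{MSE} = \frac{\sum_{i=1}^{I} (n_i - 1)\, s_i^{2}}{N - I} = s_p^{2},
\qquad
F = \frac{\mathrm{MSG}}{\mathrm{MSE}}
\]
```

Spreading the group means apart makes MSG (and so F) larger, while increasing the spread within the groups makes MSE larger and F smaller.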
Do the JMP in Question 5
just like I showed you what to do in Question 3 above. Here at least
you can copy and paste the data into JMP. Be sure to paste it into JMP
by selecting Edit, then, while holding down the Shift key, select Paste
in order to paste the column headings properly (or, after you have
copied the data, select "Edit" then "Paste with Column Names"). Note,
double-click the "bfed" column and confirm that its Data Type is
Character and Modeling Type is Nominal; double-click the "energy" column
and confirm that its Data Type is Numeric and Modeling Type is
Continuous. Always make the numeric column Y and the character column X
when you select Fit Y By X.
Re-read my section on the P-value in Lesson 2 of my book if you still are not sure how to interpret a P-value.