Stat 2000: Assignment 8 Tips (Distance/Online Sections)

Published: Wed, 03/06/13


 
My tips for Assignment 8 are coming below, but first a couple of announcements.
 
Please note that I am planning on splitting my final exam seminar for Stat 2000 into two days since we will have to cover Lesson 6 in Volume 1 as well as all of Volume 2.  The plan is to meet from 9:00 am to 6:00 pm each day.  Each day will cost $40 or, if you attend Day One, you can attend Day Two for half-price (you will pay a total of $60, in other words).
 
I plan to teach Day One on Easter Sunday, March 31, and Day Two two weeks later on Sunday, April 14.  I will send more details and start taking registrations once everything is finalized next week.
 
Did you read my Tips on How to Do Well in this Course? 
Make sure you do:  Tips on How to Do Well in Stat 2000 
 
Did you read my Tips on what kind of calculator you should get?
Tips on what calculator to buy for Statistics
 
Did you miss my Tips for Assignment 7?
Tips for Stat 2000 Distance Assignment 7
 
If you are taking the course by Classroom Lecture (Sections A01, A02, etc.), there is no Assignment 8.
 
Tips for Assignment 8 (Distance/Online Sections D01, D02, D03, etc.)
 
Don't have my book?  You can download a free sample containing Lesson 3 at my website here:
Grant's Tutoring Study Guides (Including Free Samples)
 
Continue to study Lesson 10, especially the section on Multiple Linear Regression that begins after question 3.
 
Note that the values you get for your coefficients and their test statistics in a multiple linear regression are likely to differ from the values you would get if you did a simple linear regression of y versus just one of the explanatory variables.  That is because a simple linear regression looks at the effect that one explanatory variable alone has on y, while a multiple linear regression looks at the effect a particular explanatory variable has on y while holding all the other explanatory variables constant (in a sense, filtering out the effects of the other explanatory variables).

In a simple linear regression, you can always find r, the correlation coefficient, by taking the square root of the r-squared given by JMP, but remember that r can be positive or negative (r always has the same sign as b, the slope).  In multiple linear regression, r no longer has much meaning since the model uses several explanatory variables, but you can still compute it by taking the square root of the r-squared given by JMP.  In multiple linear regression, r is always considered to be positive: it cannot isolate the effect of any particular explanatory variable, and it is always possible that some of the explanatory variables have a negative association with y while others have a positive association.
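
If you want to see this outside of JMP, here is a minimal Python sketch.  The data is invented (it is not from the assignment), and the function names are my own for illustration.  The data is built so that y = 1 + 2*x1 - 3*x2 exactly, which makes it easy to see how the slope for one variable alone differs from its coefficient when both variables are in the model.

```python
import math

def centered_sum(u, v):
    """Sum of (u - mean(u)) * (v - mean(v)) over paired values."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def simple_slope(x, y):
    """Least-squares slope of y on a single explanatory variable x."""
    return centered_sum(x, y) / centered_sum(x, x)

def multiple_slopes(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 together (normal equations)."""
    s11, s22 = centered_sum(x1, x1), centered_sum(x2, x2)
    s12 = centered_sum(x1, x2)
    s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def signed_r(r_squared, slope):
    """Simple regression only: r is the square root of r-squared with the slope's sign."""
    return math.copysign(math.sqrt(r_squared), slope)

# Invented data, built so that y = 1 + 2*x1 - 3*x2 exactly.
x1 = [1, 2, 3, 4, 5]
x2 = [1, 1, 2, 2, 3]
y = [0, 2, 1, 3, 2]

print(simple_slope(x1, y))         # slope of y on x1 alone
print(simple_slope(x2, y))         # slope of y on x2 alone
print(multiple_slopes(x1, x2, y))  # slopes with both variables in the model
print(signed_r(0.64, -1.5))        # negative slope, so r comes out negative
```

With this made-up data, the slope for x1 alone is about 0.5, but its coefficient in the two-variable model is about 2.  The slope for x2 alone is positive (about 0.57), yet its coefficient in the two-variable model is about -3, because x1 and x2 tend to rise together.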
 
You will use JMP for question 1.  Open a "New Data Table" and copy and paste in the given data set.  Be sure to select "Edit" and "Paste with Column Names".  Double-click the GPA, IQ and Concept column names and make sure their Data Type is Numeric and their Modeling Type is Continuous.
 
Question 1(a):  Select "Analyze" then "Multivariate Methods" then "Multivariate".  Select the GPA, IQ and Concept columns and click the "Y, Columns" button to make them all Y columns, then click OK.  That takes you to an output showing a correlation matrix where you can read off the desired correlations.  Note that when they ask for the proportion of total variation, they are asking for the coefficient of determination, r-squared (see my Lesson 9, question 1 part (d) for a discussion of the coefficient of determination).  But make sure you compute the correct r-squared.
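
To show what the correlation matrix is doing, here is a rough Python sketch.  The numbers are hypothetical values I made up, not the assignment's data set, and pearson_r is just an illustrative helper name.  It prints each pairwise correlation, then squares one of them (GPA versus IQ, chosen arbitrarily) to get a coefficient of determination.

```python
import math

def pearson_r(u, v):
    """Correlation coefficient r between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical values, NOT the assignment's data set.
data = {
    "GPA": [2.0, 3.0, 2.5, 3.5, 4.0],
    "IQ": [95, 110, 100, 115, 120],
    "Concept": [40, 55, 60, 50, 70],
}

# Print the correlation matrix one row at a time.
names = list(data)
for row in names:
    print(row, [round(pearson_r(data[row], data[col]), 3) for col in names])

# r-squared for one pair is just that pair's correlation, squared.
r = pearson_r(data["GPA"], data["IQ"])
print("r-squared for GPA vs IQ:", round(r ** 2, 3))
```

The diagonal of the matrix is always 1 (each variable is perfectly correlated with itself), and the matrix is symmetric, so each correlation appears twice.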
 
Question 1(b): Select "Analyze" then "Fit Model" and select GPA and click the "Y" button to make it a Y.  Select both IQ and Concept and click the "Add" button to add them as explanatory variables in the model.  Make sure the "Personality" drop-down list is set at Standard Least Squares.  If it is not, and it is not even available as an option, your data has been corrupted.  Go back to the data spreadsheet, double-click on each of GPA, IQ and Concept and make sure their Data Type is Numeric and their Modeling Type is Continuous and try this again.  Click "Run Model" to have it perform the multiple linear regression.  Everything you need is in the Parameter Estimates.  (See my question 4 in Lesson 10 for an example of how to read the various outputs.)
 
Question 1(c): They just want the coefficient of determination you gave in part (a) again.  The additional percentage is just the difference between that coefficient of determination and the new coefficient of determination your multiple linear regression model now has (given in the Summary of Fit).  Make sure you type in the percent, not the decimal.  For example, 67.1%, not 0.671.
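
The arithmetic for the additional percentage can be sketched in a few lines of Python.  The two r-squared values below are hypothetical placeholders, not the assignment's answers, and additional_percent is my own name for the helper.

```python
def additional_percent(r2_simple, r2_multiple):
    """Extra percentage of variation explained by the larger model."""
    return (r2_multiple - r2_simple) * 100

# Hypothetical r-squared values, not the assignment's answers.
r2_simple = 0.400     # r-squared from the simple (one-variable) relationship
r2_multiple = 0.671   # r-squared from the multiple regression's Summary of Fit

# Report the answer as a percent, rounded to one decimal place.
print(round(additional_percent(r2_simple, r2_multiple), 1))  # 27.1
```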
 
Question 1(d): The Parameter Estimates gives you the t statistic they are looking for.  Make sure you give them the correct one.
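
As a sanity check on whatever t statistic you read off, remember that each t statistic in the Parameter Estimates is just the estimate divided by its standard error.  Here is a minimal sketch with hypothetical numbers (the function names are mine, for illustration only):

```python
def t_statistic(estimate, std_error):
    """Test statistic for H0: coefficient = 0."""
    return estimate / std_error

def error_df(n, k):
    """Error degrees of freedom with n observations and k explanatory variables."""
    return n - k - 1

# Hypothetical numbers, not the assignment's output.
print(t_statistic(0.5, 0.125))  # 4.0
print(error_df(78, 2))          # 75
```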
 
Question 2: Just read off the appropriate values from the given tables.  Note that (h) is just a very tricky way of asking you for the confidence interval for the appropriate coefficient (slope).  Recall the formula to compute the coefficient of determination from an ANOVA table (see my Lesson 10, question 3 part (a)).
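
Both calculations can be sketched in Python, assuming the usual formulas: r-squared is the model sum of squares divided by the total sum of squares, and the confidence interval is the estimate plus or minus t* times its standard error.  All the numbers below are hypothetical, and the t* of 2.0 is only a placeholder; you would look up the real critical value in the t table using the error degrees of freedom.

```python
def r_squared_from_anova(ss_model, ss_total):
    """Coefficient of determination from the ANOVA sums of squares."""
    return ss_model / ss_total

def slope_confidence_interval(estimate, std_error, t_star):
    """Confidence interval for a coefficient: estimate +/- t* x SE."""
    margin = t_star * std_error
    return estimate - margin, estimate + margin

# Hypothetical numbers, not taken from the assignment's tables.
print(r_squared_from_anova(30.0, 40.0))          # 0.75
print(slope_confidence_interval(2.0, 0.5, 2.0))  # (1.0, 3.0)
```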
 
Question 3: Copy and paste the data into JMP just as in question 1, then perform a multiple linear regression using "Fit Model" as shown in question 1 above.  Parts (b), (c) and (d) are asking you for the relevant outputs in the JMP tables.  You do parts (a), (e) and (f) by hand.  Again, make sure you double-click all the column names and confirm their Data Type is Numeric and their Modeling Type is Continuous before you do the JMP analysis.