Stat 1000: Tips for Assignment 2

Published: Sun, 01/29/12

 
You are receiving this email because you indicated when you signed up for Grant's Updates that you are taking Stat 1000 this term.  If, in fact, you do not want to receive tips for Stat 1000, please reply to this email and let me know.
 
Please note that my first midterm exam prep seminar for Stat 1000 will be on Saturday, Feb. 4, in room 100 St. Paul's College, from 9 am to 9 pm.  I am now ready to take registrations.  Please click this link for more information about the seminar and to sign up if you are interested:
Grant's Stat 1000 Exam Prep Seminars 

Join Grant's Tutoring on Facebook or follow Grant on Twitter.

Simply go to www.grantstutoring.com and click the Facebook and/or Twitter icons.
 
If you ever want to look back over a previous tip I have sent, do note that all my tips can be found in my archive.  Click this link to go straight to my archive:
 
Grant's Updates Archive
 
Did you miss my Tips on How to Do Well in this Course? Click here
 
Did you miss my Tips for Assignment 1? Click here
 
If you are taking the course by Distance/Online (Sections D01, D02, etc.), click here for my tips for your Assignment 2.
 
If you are taking the course by classroom lecture (Sections A01, A02, etc.), click here for my tips for your Assignment 2.
 
Tips for Classroom, Old-Fashioned Paper Assignment 2
 
You should study Lesson 2: Regression and Correlation and Lesson 3: Designing Samples and Experiments in the current edition of my study book to prepare for this assignment.  Lesson 2 teaches the concepts for questions 1 and 2.  Lesson 3 teaches the concepts for questions 3, 4 and 5.  If you are using an older edition of my book, note that these are Lessons 3 and 4 in older editions.
 
Question 1 is supposed to be done by hand, but why not get JMP to do it for you first (see my steps further below for how to do linear regression in JMP)?  Then you can just copy out by hand the scatterplot JMP makes for you.  I think you will find my Lesson 2, question 1 very helpful in understanding how to do this question.
 
I am certain that, even though you are supposed to do it by hand, you are still allowed to use the Stat mode on your calculator to compute the mean and standard deviation of both variables to assist you in the computation of r, a and b.  You should clarify this with your prof, but surely they are not going to make you work out the means and standard deviations by hand also.  Follow the steps in the Appendix of my book showing you how to enter x,y data pairs into your calculator in Linear Regression mode.  I show you how the calculator gives you r, a and b, but your calculator also gives you x-bar, y-bar, Sx, and Sy.  Just click the appropriate buttons.  For example, on Sharps, you click "RCL 4" to get x-bar, "RCL 5" to get Sx, "RCL 7" to get y-bar, and "RCL 8" to get Sy.  Record every single decimal place the calculator gives you to ensure your computations are accurate.
 
1(a).  Read my tips when I teach Lesson 2, question 1(b) in my book to make sure you make the scatterplot correctly.  You may also want to have JMP help you here.
 
1(b).  To compute the correlation coefficient by hand, follow my example in Lesson 2, question 1, part (c).  Note, you are not given the means and standard deviations for x and y already, so I am sure you are allowed to use the Linear Regression Stat Mode on your calculator to tell you the means and standard deviations of both x and y.  Put your calculator in Linear Regression Stat Mode (see Appendix D of my book).  After you enter all the (x,y) data points, you can ask it for the mean and standard deviation of the x values and the mean and standard deviation of the y values.  For example, Sharps use "RCL 4" to get x-bar and "RCL 7" to get y-bar.  "RCL 5" gives you Sx and "RCL 8" gives you Sy.
 
Record every single decimal place your calculator gives you for each calculation, or else your answers won't be accurate enough.  Of course, your calculator actually tells you the value of r, so you can use that as a check.
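
If you would like one more way to check your hand computation of r, here is a minimal Python sketch.  It assumes the usual by-hand formula (standardize each x and each y, multiply the pairs, add them up, and divide by n - 1).  The data values are made up purely for illustration and are not from the assignment.

```python
from statistics import mean, stdev

# Hypothetical (x, y) data, NOT the assignment data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (divide by n - 1)

# Standardize each value, multiply the pairs, add them up, divide by n - 1
z_products = [((xi - x_bar) / s_x) * ((yi - y_bar) / s_y) for xi, yi in zip(x, y)]
r = sum(z_products) / (n - 1)

print(round(r, 4))  # 0.7746
```

Notice the code keeps every decimal place until the very end, which is exactly why I tell you not to round your means and standard deviations early.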
 
When they ask, "What does this value tell us?" I assume they want you to interpret the value of r.
 
1(c).  Use the formulas I show you in question 1(e) of my book, Lesson 2, to compute a and b (also given on page 1 of my book on the formula sheet).  Of course, you can compare the answers you get with the values your calculator gives you in the Linear Regression Stat mode.
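
As a rough illustration of that computation, here is a short Python sketch.  It assumes the standard least-squares formulas, with b as the slope (b = r times Sy divided by Sx) and a as the intercept (a = y-bar minus b times x-bar); check the formula sheet to confirm which letter is which.  The data are made up and are not from the assignment.

```python
from statistics import mean, stdev

# Hypothetical (x, y) data, NOT the assignment data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations

# Correlation coefficient from the deviations
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x      # slope (assumed naming: y-hat = a + b*x)
a = y_bar - b * x_bar  # intercept

print(round(a, 4), round(b, 4))
```

For this made-up data the intercept should come out to about 2.2 and the slope to about 0.6.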
 
Here is how you can use JMP to do Linear Regression:
First copy and paste the data into a New Data Table the usual way (see my previous homework tips if you are not sure how to paste the data).  If you have to type the data in manually, simply double-click the space to the right of "Column 1" to create "Column 2".  Enter the X data down column 1 and the Y data down column 2.  Be sure to double-click each column to give it an appropriate name and to ensure the Data Type is Numeric and the Modeling Type is Continuous.
 
Select Analyze, then Fit Y By X.  Highlight the column you have determined should be X, and click the X, Factor button.  Highlight the column you have determined should be Y and click the Y, Response button.  Click OK.
 
You should now see a scatterplot.  Click the red triangle next to "Bivariate Fit" and select "Density Ellipse, .99".  A stupid ellipse shows up on your scatterplot that you don't want, but you will also see an output called "Correlation" show up below the scatterplot.  Click the blue triangle next to that to open it up and it shows you the mean and standard deviation of x and y and also shows you r, the correlation.  Click the red triangle under the scatterplot where it says "Bivariate Normal Ellipse" and deselect "Line of Fit" to remove that stupid ellipse from your scatterplot.
 
Click the red triangle above the scatterplot and select Fit Line and JMP will draw in the least-squares regression line.  Note, it shows you the regression equation directly below the scatterplot.  Here JMP shows you the value of r-squared (the coefficient of determination), rather than r, the correlation coefficient.  Remember, the coefficient of determination is the percentage of y's variation explained by the regression equation.  You can always square root this number to get r, the correlation coefficient, but a square root always comes out positive, so use your scatterplot (or the sign of the slope) to decide if r is negative or positive.
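
To make the sign issue concrete, here is a tiny Python sketch of getting r back from r-squared.  It relies on the fact that r always has the same sign as the slope of the least-squares line.  The two numbers are hypothetical, not JMP output from the assignment.

```python
from math import copysign, sqrt

# Hypothetical values read off a regression output
r_squared = 0.64  # coefficient of determination
slope = -1.5      # slope of the fitted line (downhill pattern)

# The square root gives the size of r; the slope supplies the sign
r = copysign(sqrt(r_squared), slope)

print(round(r, 4))  # -0.8
```

If the scatterplot had sloped uphill instead, the same r-squared would give r = 0.8.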
 
If you want to get rid of anything, click the red triangle and deselect anything you don't want to see.  Note, if you click the blue triangle next to something, that will make part of the output disappear as well, if you wish.  Just click the blue triangle again to make it reappear.
 
Use JMP as I show above to answer question 2.  Be sure to read my question 1(a) for tips on how to identify the explanatory and response variable.  Note that JMP does not answer part (e); you have to compute the residual yourself (see my question 1 for examples of all these things).  Also, take a look at my question 3 for key concepts about the correlation coefficient and question 8 for a discussion of influential observations.
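
Since the residual has to be computed yourself, here is a minimal Python sketch of the arithmetic: a residual is the observed y minus the y predicted by the regression line.  The intercept, slope and data point below are made up for illustration (and I am assuming a is the intercept and b is the slope).

```python
# Hypothetical regression line y-hat = a + b*x
a = 2.2  # intercept (assumed naming)
b = 0.6  # slope

# Hypothetical observed data point
x_obs, y_obs = 4, 4

y_hat = a + b * x_obs     # predicted y for this x
residual = y_obs - y_hat  # observed minus predicted

print(round(residual, 4))  # -0.6
```

A negative residual like this one means the point lies below the regression line; a positive residual means it lies above.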
 
Question 3 is similar to my question 7, in Lesson 3.
 
Be sure to study up on the Principles of Experimental Design and the Types of Experiments to help answer question 4.
 
Questions 5 and 6 involve the concepts I discuss in the first half of Lesson 3, up to the end of question 5.

Tips for Distance/Online Web Assign Assignment 2
 
Continue to study Lesson 1 in my study book (if you have it) to learn the concepts involved in HW 02.
 
Ignore any references to JMP 6SE or Crunchit!.  You are using JMP 8 in this course.  The assignment is just an old assignment that they forgot to update.  Use JMP 8 anytime they tell you to use computer stuff.
 
Question 3 should be done manually.  Note that when they say to enter the answers "correct to 0.1", they mean round your answers off to one decimal place.
 
Question 4 should be done manually.  Be sure to read the Appendix at the back of my book to learn how to use Stat Mode in your calculator to compute a mean and standard deviation quickly.  By "nearest decimal place", they mean round your answers off to one decimal place.
 
Question 5 (the IQ and GPA question):
Click the link to the data file, then select and copy the entire data set (you can click "Ctrl A" on your keyboard to select all, then click "Ctrl C" to copy it all).  Having opened a "New Data Table" in JMP, select "Edit" then "Paste with Column Names" to paste the data in.  Double-click the "iq" column name at top and confirm that JMP has the "Data Type" as "Numeric" and the "Modeling Type" as "Continuous", changing those settings in the drop-down list if necessary.  Click OK.  Do the same for the "gpa" column.  Important: Double-click the "gender" column and make sure that JMP has the "Data Type" as "Character" (it probably doesn't) and the "Modeling Type" as "Nominal" (it probably doesn't), changing those settings in the drop-down list if necessary.  Click OK.  Finally, take a look at the last row of data that has been pasted into JMP.  If it just shows a bunch of dots instead of numbers, click that row to highlight it then right-click and select "delete rows" to delete that row.  Of course, do not delete any row that has numbers (data) in it!
 
To find the mean, standard deviation and median in part (a):
Select "Analyze" then "Distribution".  Highlight "iq" in the pop-up menu and click the "Y, Columns" button.  Click OK.  You are then taken to a screen that shows a histogram among other things.  You will find the mean and standard deviation in the "Moments" section and the median in the "Quantiles" section.
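
If you want a quick sanity check on what JMP reports, here is a small Python sketch that computes the same three summaries.  The scores are hypothetical, not the data file from the question, and the variable name iq is just for illustration.

```python
from statistics import mean, median, stdev

# Hypothetical scores, NOT the assignment data file
iq = [100, 110, 95, 120, 105]

iq_mean = mean(iq)      # same idea as Mean in JMP's "Moments"
iq_sd = stdev(iq)       # sample standard deviation (divides by n - 1)
iq_median = median(iq)  # same idea as Median in JMP's "Quantiles"

print(iq_mean, round(iq_sd, 2), iq_median)  # 106 9.62 105
```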
 
To make the boxplots and histogram in part (b):  In the toolbar at the top of your data spreadsheet, select "Analyze" then "Distribution".  Select the "gpa" column and click the "Y, Columns" button.  Click OK.  Your histogram appears sideways but they didn't ask you to switch it horizontally, so don't bother.  If they want to see it the typical way (and they will request that if they want it), click the red triangle next to your variable above the histogram and select Histogram Options from the drop-down menu.  Select Horizontal Layout.  Click the red triangle next to "gpa" and select "quantile boxplot" (if it isn't checked already) and "outlier boxplot" as well to get the desired boxplots.  Click the blue triangles next to "Quantiles" and "Moments" to hide that stuff, then "select all" (click "Ctrl A" on your keyboard) and then "copy" (click "Ctrl C").  Paste it into your document.  Be sure to type in your answers to the question they ask in part (b) underneath the graphs you pasted into your document.  Remember how skewness and/or outliers affect a mean and median.
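
As a reminder of that last point, here is a short Python sketch with made-up numbers showing that one outlier drags the mean toward it while the median barely notices.

```python
from statistics import mean, median

# Hypothetical data sets, for illustration only
no_outlier = [70, 72, 75, 78, 80]
with_outlier = [70, 72, 75, 78, 150]  # largest value replaced by an outlier

print(mean(no_outlier), median(no_outlier))      # 75 75
print(mean(with_outlier), median(with_outlier))  # 89 75
```

The mean jumps from 75 to 89, but the median stays at 75.  That is the pattern to look for when a histogram or boxplot shows skewness or outliers.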
 
To make the side-by-side boxplots in part (c):  Back in your data spreadsheet, select "Analyze" then "Fit Y By X".  Highlight "gpa" and click "Y, Response".  Highlight "gender" and click "X, Factor".  Click OK.  Now click the red triangle and select "Display Options", then select "Box Plots" to get your side-by-side boxplots.  Select all and copy and paste into the same document you already have in part (b).  Make sure you type your answer to their question below these boxplots in your document.  You can now save the file and upload it into Web Assign.
 
Question 6 should be done manually.  Read my section in Lesson 1 on "The Effect of Changing Units on Centre and Spread" to properly prepare for this question.
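
Here is a minimal Python sketch of the idea, assuming a change of units of the form new = a + b times old.  I use Celsius to Fahrenheit as a hypothetical example (it is not taken from the assignment).  Measures of centre get both the multiplying and the adding; measures of spread only get the multiplying.

```python
from statistics import mean, median, stdev

# Hypothetical temperatures in Celsius
temps_c = [10, 15, 20, 25, 30]

# Change of units: F = 32 + 1.8 * C
temps_f = [32 + 1.8 * c for c in temps_c]

# Centre: the mean and median are converted just like a single value
print(round(mean(temps_c), 4), round(mean(temps_f), 4))      # 20 68.0
print(round(median(temps_c), 4), round(median(temps_f), 4))  # 20 68.0

# Spread: the standard deviation is multiplied by 1.8; adding 32 does nothing
sd_c, sd_f = stdev(temps_c), stdev(temps_f)
print(round(sd_c, 4), round(sd_f, 4), round(1.8 * sd_c, 4))
```

The last line should show the Fahrenheit standard deviation matching 1.8 times the Celsius one.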