Stat 1000: Tips for Assignment 4

Published: Fri, 10/28/11

Hi,

You are receiving this email because you indicated when you signed up for Grant's Updates that you are taking Stat 1000 this term. If, in fact, you do not want to receive tips for Stat 1000, please reply to this email and let me know.

Please note that my second midterm exam prep seminar for Stat 1000 will be on Saturday, Nov. 5, from 9 am to 9 pm. If you would like complete info, and/or would like to register for the seminar, please click this link:

Seminar Info and Registration

Join Grant's Tutoring on Facebook or follow Grant on Twitter.

Simply go to www.grantstutoring.com and click the Facebook and/or Twitter icons.

If you ever want to look back over a previous tip I have sent, do note that all my tips can be found in my archive. Click this link to go straight to my archive:

Grant's Updates Archive

Did you miss my Tips on How to Do Well in this Course? Click here

Did you miss my Tips for Assignment 3? Click here

If you are taking the course by Distance/Online (Sections D01, D02, etc.), click here for my tips for your Assignment 4.

If you are taking the course by classroom lecture (Sections A01, A02, etc.), click here for my tips for your Assignment 4.

Tips for Assignment 4 (Sections A01, A02, etc.)

You will need to study Lesson 6: The Binomial Distribution and Lesson 7: The Distribution of the Sample Mean in my study book to prepare for this assignment. These lessons may be in reverse order if you have an old edition of my study book.

Question 1:

I teach you what a parameter and a statistic are at the start of "The Distribution of the Sample Mean" lesson and give you examples in my question 1.

Question 2:

I taught the Uniform Distribution at the start of Lesson 4: Density Curves and the Normal Distribution (Lesson 2 in older editions). Take a look at my questions 1 and 2 (especially question 2) in that lesson.
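As a rough illustration of the idea behind uniform distribution questions, here is a minimal Python sketch. It assumes X is uniform on an interval from a to b, so the density curve is a rectangle and a probability is just width times height. The function name and the numbers are made up for illustration and are not taken from the assignment.

```python
def uniform_probability(a, b, lo, hi):
    """P(lo <= X <= hi) when X is uniform on [a, b]: the area of a rectangle."""
    if b <= a:
        raise ValueError("need a < b")
    height = 1.0 / (b - a)            # height that makes the total area equal 1
    lo, hi = max(lo, a), min(hi, b)   # clip the requested interval to [a, b]
    width = max(hi - lo, 0.0)         # an interval outside [a, b] has zero width
    return width * height

# Hypothetical example: X uniform on [0, 10], probability X falls between 2 and 5
print(uniform_probability(0, 10, 2, 5))
```

The same rectangle picture works by hand: sketch the density curve, shade the region, and multiply its width by its height.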

Question 3:

If you are ever asked to decide if a particular situation is binomial or not, remember, to be binomial, four conditions must be satisfied:

(i) There must be a fixed number of trials, n.

(ii) Each trial must be independent.

(iii) Each trial can have only two possible outcomes, success or failure, and the probability of success on each trial must have a constant value, p.

(iv) X, the number of successes, is a discrete random variable where

X = 0, 1, 2, ... n.
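If the four conditions hold, the probabilities for X follow the binomial formula. Here is a minimal Python sketch of that formula, assuming the standard form P(X = k) = C(n, k) p^k (1 - p)^(n - k). The values n = 5 and p = 0.5 are made up for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial with n fixed independent trials and constant p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical example: n = 5 trials with p = 0.5 on each trial
n, p = 5, 0.5
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]  # X = 0, 1, ..., n
print(probs)
print(sum(probs))  # the probabilities over X = 0, 1, ..., n should total 1
```

Note how the list covers exactly the values X = 0, 1, 2, ... n from condition (iv).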

Question 4:

If you are solving a binomial problem, and they ask you to compute a mean and/or standard deviation, read carefully. Do they want the mean of X, or do they want the mean of p-hat, the sample proportion? Be sure to study the sections about the Distribution of X and the Distribution of p-hat in my Binomial Distribution lesson (Lesson 6 in my new edition, Lesson 7 in older editions). Take a look, especially, at question 10 of that lesson as a good run-through of these concepts.
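To see the difference between the two sets of answers side by side, here is a minimal Python sketch. It assumes the standard formulas: X has mean np and standard deviation sqrt(np(1 - p)), while p-hat has mean p and standard deviation sqrt(p(1 - p)/n). The values n = 100 and p = 0.2 are made up for illustration.

```python
from math import sqrt

def count_mean_sd(n, p):
    """Mean and standard deviation of X, the number of successes."""
    return n * p, sqrt(n * p * (1 - p))

def proportion_mean_sd(n, p):
    """Mean and standard deviation of p-hat = X / n, the sample proportion."""
    return p, sqrt(p * (1 - p) / n)

# Hypothetical example: n = 100 trials, p = 0.2
mean_x, sd_x = count_mean_sd(100, 0.2)
mean_phat, sd_phat = proportion_mean_sd(100, 0.2)
print(mean_x, sd_x)
print(mean_phat, sd_phat)
```

Notice that the p-hat answers are just the X answers divided by n, which is a handy check.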

Question 5:

To determine if data is normally distributed, you should construct a Normal Quantile Plot. If the points on a normal quantile plot look roughly linear, then the distribution is approximately normal. If the points look clearly nonlinear, the distribution is not normal.

To make a Normal Quantile Plot in JMP:

Open a "New Data Table" and enter the given data in Column 1. Double-click column 1 and name it something like "Weight" and make sure the Data Type is Numeric and the Modeling Type is Continuous and click OK. Now select "Analyze, Distribution" and select "Weight" and click the "Y, Columns" button and click OK. You should now see a histogram along with some other output. Click the red triangle next to "Weight" and select "Normal Quantile Plot" to get the graph you want. You can remove any of the other outputs if you wish by clicking the red triangle and deselecting the other outputs (such as Histogram Options, etc.) and/or clicking the blue triangles.
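If you are curious what a normal quantile plot is actually plotting, here is a minimal Python sketch. It is not how JMP computes its plot. It assumes the plotting positions (i - 0.5)/n, which is just one common convention, and the "straightness" number is only a rough stand-in for eyeballing the graph, not a formal test. The weights are made-up data.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_quantile_points(data):
    """Pair each sorted data value with the z-score expected at its rank."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Assumed convention: the i-th smallest value sits at percentile (i - 0.5) / n
    zs = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(zs, xs))

def straightness(points):
    """Correlation between the z-scores and the data (near 1 means nearly linear)."""
    zs = [z for z, _ in points]
    xs = [x for _, x in points]
    z_bar, x_bar = mean(zs), mean(xs)
    szx = sum((z - z_bar) * (x - x_bar) for z, x in points)
    szz = sum((z - z_bar) ** 2 for z in zs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return szx / sqrt(szz * sxx)

weights = [61, 64, 66, 67, 68, 69, 70, 72, 75]  # hypothetical data
points = normal_quantile_points(weights)
print(points)
print(round(straightness(points), 3))
```

Plotting the pairs with z on one axis and the data on the other gives the same kind of picture JMP draws: points hugging a straight line suggest a normal distribution.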

Question 6:

Questions that give you μ and σ are undoubtedly dealing with bell curves. Make sure you have studied my lesson on the Distribution of the Sample Mean (Lesson 7 in the new edition, Lesson 6 in older editions). Always be very careful to note whether they are asking you for the probability of one individual value (X), or for the probability of the average or mean of n values (x-bar, the sample mean). If you are dealing with X, use the X standardizing formula. If you are dealing with x-bar, use the x-bar standardizing formula.

Also, note that you can only do probabilities for X in these cases if you are specifically told that X is normally distributed. Otherwise, there is no X-bell curve, and the probability is unknown. However, thanks to the Central Limit Theorem, we can always assume there is an x-bar bell curve (the sample mean is approximately normally distributed), as long as n is large.
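Here is a minimal Python sketch comparing the two standardizing formulas. It assumes the standard forms z = (x - μ)/σ for one individual value and z = (x-bar - μ)/(σ/√n) for a sample mean. The values μ = 100, σ = 15 and n = 25 are made up for illustration, and the X probability is only valid if X itself is normal.

```python
from math import sqrt
from statistics import NormalDist

def z_for_individual(x, mu, sigma):
    """Standardize one individual value X."""
    return (x - mu) / sigma

def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean x-bar; its standard deviation is sigma / sqrt(n)."""
    return (x_bar - mu) / (sigma / sqrt(n))

# Hypothetical example: mu = 100, sigma = 15, sample size n = 25
mu, sigma, n = 100, 15, 25
z_x = z_for_individual(110, mu, sigma)
z_xbar = z_for_sample_mean(110, mu, sigma, n)

std_normal = NormalDist()
p_x = 1 - std_normal.cdf(z_x)        # P(X > 110), assuming X is normal
p_xbar = 1 - std_normal.cdf(z_xbar)  # P(x-bar > 110)
print(z_x, p_x)
print(z_xbar, p_xbar)
```

The same cutoff of 110 gives a much larger z-score for x-bar than for X, because averages vary less than individual values.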

Tips for Assignment 4 (Distance/Online Sections D01, D02, etc.)

Study Lesson 2: Regression and Correlation in my book, if you have it, to prepare for this assignment. (In older editions of my book this was Lesson 3.)

Question 1:

To compute the correlation coefficient by hand, follow my example in Lesson 2, question 1, part (c). Note, you are not given the means and standard deviations for x and y already, so you are certainly allowed to use the Linear Regression Stat Mode on your calculator to tell you the means and standard deviations of both x and y. Put your calculator in Linear Regression Stat Mode (see Appendix D of my book). After you enter all the (x,y) data points, you can ask it for the mean and standard deviation of the x values and the mean and standard deviation of the y values. For example, Sharps use "RCL 4" to get x-bar and "RCL 7" to get y-bar. "RCL 5" gives you Sx and "RCL 8" gives you Sy.

Even though they tell you to do everything to three decimal places, don't do that. Record every single decimal place your calculator gives you for each calculation, or else your answers won't be accurate enough. I suggest you do everything on paper first, then you can type in the results, rounding all of your numbers off to 3 decimal places at that time (even though you actually did the calculations using all the decimal places). Of course, your calculator actually tells you the value of r, so you can use that as a check.
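To show why the rounding advice matters, here is a minimal Python sketch of the by-hand calculation. It assumes the standard formula r = (1/(n - 1)) × the sum of (z-score of x)(z-score of y), which may be laid out differently in your notes. The data points are made up for illustration.

```python
from statistics import mean, stdev

def correlation_by_hand(xs, ys):
    """r = (1 / (n - 1)) * sum of (x z-score) * (y z-score)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data points
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation_by_hand(xs, ys)  # keep every decimal place while calculating
print(r)
print(round(r, 3))               # round only the final answer you type in
```

Every intermediate value keeps full precision and only the final answer is rounded to 3 decimal places.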

Question 2 is just an algebra question. They give you three of x, y, a, and b and want you to figure out the missing one. Substitute the given values into the appropriate places of

y = a + bx and solve for what is missing.
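The algebra for each possible missing value can be sketched in Python as follows. The function name is made up for illustration, and the numbers are not from the assignment. Note the sketch would fail with a division by zero if you needed b when x = 0, or x when b = 0.

```python
def solve_line(x=None, y=None, a=None, b=None):
    """Given any three of x, y, a, b in y = a + b*x, return the missing one."""
    values = {"x": x, "y": y, "a": a, "b": b}
    missing = [name for name, value in values.items() if value is None]
    if len(missing) != 1:
        raise ValueError("exactly one of x, y, a, b must be left out")
    if y is None:
        return a + b * x    # predicted value
    if a is None:
        return y - b * x    # intercept
    if b is None:
        return (y - a) / x  # slope
    return (y - a) / b      # x

# Hypothetical example: intercept a = 1, slope b = 3, x = 2
print(solve_line(x=2, a=1, b=3))
```

Each branch is the same equation y = a + bx rearranged for a different letter.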

Question 3 is a good run through of the formulas I show you in Lesson 3.

Question 4 uses JMP.

Here is how to use JMP for linear regression. First copy and paste the data into a New Data Table the usual way (see my previous homework tips if you are not sure how to paste the data). If you have to type the data in manually, simply double-click the space to the right of "Column 1" to create "Column 2". Enter the X data down column 1 and the Y data down column 2. Be sure to double-click each column to give it an appropriate name and to ensure the Data Type is Numeric and the Modeling Type is Continuous.

Select Analyze, then Fit Y By X. Highlight the column you have determined should be X, and click the X, Factor button. Highlight the column you have determined should be Y and click the Y, Response button. Click OK.

You should now see a scatterplot. Click the red triangle above the scatterplot and select Fit Line and JMP will draw in the least-squares regression line. Note, it shows you the regression equation directly below the scatterplot. JMP also shows you the value of r-squared (the coefficient of determination), rather than r, the correlation coefficient. Remember, the coefficient of determination is the percentage of y's variation explained by the regression equation. You can always take the square root of this number to get r, the correlation coefficient, but use your scatterplot to help you decide if r is negative or positive, because the square root on your calculator always comes out positive and can't tell you that.
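Here is a minimal Python sketch of recovering r from r-squared. It relies on the fact that r always has the same sign as the slope of the least-squares line, which is also what the direction of your scatterplot shows you. The function name and the numbers are made up for illustration.

```python
from math import copysign, sqrt

def r_from_r_squared(r_squared, slope):
    """Take the square root of r-squared and give it the sign of the slope."""
    return copysign(sqrt(r_squared), slope)

# Hypothetical example: r-squared = 0.64 with a downward-sloping line
print(r_from_r_squared(0.64, -1.5))
# Same r-squared with an upward-sloping line
print(r_from_r_squared(0.64, 2.0))
```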

If you want to get rid of anything, click the red triangle and deselect whatever you don't want to see. Note, clicking the blue triangle next to a section of the output will also hide that section. Just click the blue triangle again to make it reappear.