Stat 1000: ICYMI Tips for Assignment 2

Published: Tue, 02/03/15

Did you read my tips on how to study and learn Stat 1000? If not, here is a link to those important suggestions:

Stat 1000: 4 Tips on How to Study & Learn This Course

Did you read my Calculator Tips? If not, here is a link to those important suggestions:

Stat 1000: Calculator Tips

Did you see my tips for Assignment 1? Click here.

Tips for Assignment 2

Study Lessons 2 and 3 in my study book (if you have it) to learn the concepts involved in Assignment 2. Don't start working on the assignment too soon. Study and learn the lessons first, and use the assignment to test your knowledge. Of course, always seek out assistance from my book, your course notes, etc. if you ever hit a question you don't understand, but try not to be learning things as you do an assignment. Learn first, then put your learning to the test.

Exception: Always do any JMP stuff open-book. Have my tips in front of you, and let me guide you step-by-step through any JMP stuff. JMP is just "busy" work. The sooner you get it done and can move on to productive things like understanding the concepts and interpreting the JMP outputs, the better off you will be.

Don't have my book or audio lectures? You can download a free sample of my book and audio lectures containing Lesson 1:

Free sample of Grant's Tutoring book and audio

A Warning about StatsPortal

Make sure that you are using Firefox as your browser. Avoid Internet Explorer: it has some glitches in the HTML editor boxes.

Do note that every time you exit a question in StatsPortal, the data may very well change the next time you return to it. Do not press the "Back" button on your browser while in a question. That, too, will change the data. When you are prepared to actually do a question, open the link, keep it open, and do not close it until you have submitted your answers. Be sure to press "Save Answers" once you have done any calculations and entered any information to ensure the data does not change and force you to start over again.

After you submit the answer to a question, if you have been marked wrong on any parts, be sure that you write down the correct answers before you exit the screen (or grab a screenshot). To make a second attempt at the question, do not click the link to the question again; that will change the data and you will have to start all over again. Also, DO NOT click "try again" or make a "second attempt." That will also reset the data.

Instead, exit back to the home screen where they show the links for all the different questions on the assignment. Where it shows the tries for a question on the right side of your screen, you should see the "1" grayed out, showing that you have had 1 attempt. Click the number "2" to get your second attempt with the same data. That way you can enter the answers you already know are correct and focus on correcting your mistakes.

You should also have already downloaded the JMP statistical software which was provided with either one of the course options for StatsPortal as mentioned in your course outline.

Make sure you have gone through Assignment 0 completely to learn how to use the interface. I also suggest you print out a copy of question 8 in Assignment 0 (Long Answer Questions - Part 3) so that you have the steps for saving and uploading files into the HTML editor in front of you.

Question 1: Correlation

To compute the correlation coefficient by hand, DO NOT follow my example in Lesson 2, question 1(c). They have given you slightly different column headings so they want you to compute r by hand a slightly different way. They are using this formula for the correlation coefficient (click the link):

Alternative Formula for the Correlation

Put your calculator into Linear Regression Stat Mode and enter the data. Note that I show you how to do that in Appendix A of my book. You can also click the Calculator Tips above to see these steps. Make sure you are following the steps for Linear Regression (the second column) and not the Basic Data Problem steps (the first column).

Once you have entered the data, you can confirm that you have the same answers for x̅ , y̅ , Sx, and Sy that they have provided for you. Your answers should match theirs when rounded to two decimal places.

For example, after you have entered your (x,y) data points, Sharps use "RCL 4" to get x̅ and "RCL 7" to get y̅ . "RCL 5" gives you Sx and "RCL 8" gives you Sy.

A lot of Casio calculators (and some Texas Instruments) use the "σ" symbol ("sigma," the Greek lowercase "s") to denote "standard deviation". For example, in many Casios, after you have entered the data, you first select "S.VAR." You will find it written above one of your buttons, perhaps above the "2" or nearby on the keyboard. It is accessed by pressing "SHIFT" then "S.VAR" (Statistical Variables). Once you select S.VAR, you are shown a menu where you see the symbol "x̅" for the sample mean (select "1" and press "=" to get the sample mean). You are also told you can press "2" to get "xσn" or press "3" to get "xσn-1". That is Casio's way of designating the population standard deviation and the sample standard deviation, respectively. You will always want the sample standard deviation, Sx, so select "xσn-1" (number 3 in the menu). Similarly, if you select S.VAR and then press your right arrow button, you can scroll through other options. For example, you can select "y̅", the mean of the y values, or "yσn-1" to get Sy, the standard deviation of the y values.
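To see why the choice between the two standard deviation buttons matters, here is a minimal Python sketch. The six x values are invented purely for illustration (they are not assignment data). It uses the standard library's `statistics.pstdev` (divides by n, like "xσn") and `statistics.stdev` (divides by n-1, like "xσn-1").

```python
import statistics

# Hypothetical x values, invented for illustration only (not assignment data).
x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

mean_x = statistics.mean(x)        # the sample mean, x-bar
pop_sd = statistics.pstdev(x)      # divides by n: the "x sigma n" button
sample_sd = statistics.stdev(x)    # divides by n-1: the "x sigma n-1" button, i.e. Sx

print("mean:", round(mean_x, 2))
print("population SD:", round(pop_sd, 2))
print("sample SD (Sx):", round(sample_sd, 2))
```

The sample standard deviation always comes out a little larger than the population version for the same data, so picking the wrong button will give a value that does not match the Sx and Sy you were given.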

Here is how I suggest you do this problem:

1. Put your calculator into Linear Regression Stat Mode and enter all your (x,y) data points.
2. Ask your calculator for x̅, y̅, Sx, and Sy and confirm your answers match the given values when rounded off to two decimal places. If so, ask your calculator for r, the correlation coefficient, and note its value, rounded off to four decimal places (and make sure you round, don't trim: e.g. 0.61736 rounds to 0.6174). Once you have correctly found r, keep your data in the calculator, ready to proceed to Step 3.
3. ON PAPER, proceed to calculate and record all the entries you will eventually type into the boxes. USE THE VALUES FOR THE MEANS AND STANDARD DEVIATIONS THAT YOU WERE GIVEN IN THE QUESTION.
4. The first column is telling you to subtract x̅ from each of the six x values. In other words, all you are doing here is calculating the x deviations. Take the first given x score and subtract the given value for x̅. This should be a two-decimal place value already since you were given the mean rounded to two decimal places. Repeat this for the second, third, etc. x-values to fill in the first empty column in their chart.
5. The second column is telling you to subtract y̅ from each of the six y values. In other words, they want you to compute the y deviations. Do exactly as you did above to get the x deviations, except using each y score and the value of y̅, the given mean of the y-values rounded off to two decimal places.
6. The last column is telling you to multiply the entries in the first two columns together.
7. WRITE DOWN ON PAPER EVERY SINGLE DECIMAL PLACE YOUR CALCULATOR GIVES YOU. You should find that the products you are putting in the third column will have three or four decimal places (depending on the given values for the means). In the boxes they provide, you will enter these values rounded off to two decimal places as instructed.
8. Compute the total of that last column (that is the numerator in the alternative formula for the correlation I have given you above). Be sure to compute the total using the rounded two-decimal place values that you will enter in the boxes provided.
9. Compute the denominator in the alternative formula for r I have shown you above by multiplying n-1, Sx, and Sy together (using the two-decimal place values for Sx and Sy they have given you). Write down the complete answer you have found, keeping all the decimal places. Note that n-1 is 5 in this problem since there are n=6 pairs of data.
10. Now compute r by dividing the total you computed in Step 8 by the answer you computed in Step 9.

Hopefully, the answer you get for r, when you round off to four decimal places as they request, will be very close to the actual value you got for r by using the Stat mode in your calculator. I would expect the answer you have computed by hand should match the answer your Stat mode gives you for r accurate to about 2 decimal places. If your two methods for computing r are basically the same to 2 decimal places (maybe the last digit is off by 1 or 2), then you can safely assume you have not made a mistake in your calculations.
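The by-hand procedure and the comparison against Stat Mode can be sketched in Python. This is only an illustration under stated assumptions: the six (x,y) pairs are invented, the "given" means and standard deviations are simulated by rounding the exact values to two decimal places, and Python's built-in `round` stands in for calculator rounding (it can behave differently on exact ties).

```python
import math

# Hypothetical (x, y) pairs, invented for illustration; your assignment data will differ.
x = [2, 4, 4, 5, 7, 8]
y = [3, 6, 5, 9, 8, 12]
n = len(x)

# Full-precision values, standing in for what the calculator's Stat Mode reports.
mean_x = sum(x) / n
mean_y = sum(y) / n
sx = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
sy = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
r_exact = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / ((n - 1) * sx * sy)

# Simulated "given" values: the same quantities rounded to two decimal places.
gx, gy, gsx, gsy = round(mean_x, 2), round(mean_y, 2), round(sx, 2), round(sy, 2)

# Columns of the chart: x deviations, y deviations, and their products (2 decimals each).
x_dev = [round(xi - gx, 2) for xi in x]
y_dev = [round(yi - gy, 2) for yi in y]
products = [round(dx * dy, 2) for dx, dy in zip(x_dev, y_dev)]

numerator = sum(products)            # total of the last column
denominator = (n - 1) * gsx * gsy    # (n - 1) * Sx * Sy, using the rounded givens
r_by_hand = round(numerator / denominator, 4)

print("r by hand:", r_by_hand)
print("r from full precision:", round(r_exact, 4))
```

With this made-up data the two values agree to at least two decimal places, which is the kind of agreement described above.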

Once you have confirmed that you were able to compute the correct value of r by hand, enter all the numbers you computed into the appropriate boxes. My hunch is that the assignment has been programmed to mark the value of r you compute using the rounded off numbers, whereas the value you compute using the Stat Mode in your calculator will actually be too accurate, and possibly marked wrong if it isn't close to the rounded off answer you compute by hand.

I hope this works. If they mark your value for r wrong, try entering the value you computed using the Stat Mode instead (assuming it is slightly different from the value computed by hand).

Part (b): Remember that correlation does not imply causation.

Question 2: Mileage Regression

Note that they have told you that Horsepower is x and Mileage is y.

DO NOT USE THE STAT MODE in your calculator to compute the intercept and slope! It will be too accurate.

Use the rounded off values they have given you for the correlation coefficient, means and standard deviations and the formulas to compute the slope and intercept that I introduce in Lesson 2, question 1(e) and also use again in question 5 of that lesson. These are also the formulas numbered 2 and 3 on the Formula Sheet you will be provided on your exams.

Download the Formula Sheet for Stat 1000 here.

Again, I recommend you use the Linear Regression Stat Mode on your calculator to enter the data and check that you get the same answers for the means, standard deviations, and correlation coefficient as they have given you. Then you can confirm that you have used the formulas correctly by comparing your Stat Mode's answers for a, the intercept, and b, the slope, with the values you get from the formulas. I suspect that your Stat Mode answers will differ slightly from the values you compute because your computations are using rounded off values for the means and standard deviations.

Do not use your calculator's perfect values. Use the rounded off numbers you were given for the means and standard deviations to compute the slope and intercept.

If the answers you compute for a and b, rounded to four decimal places, do not precisely match the answers your Stat Mode gives you, I recommend you enter the values computed from the formulas because that is what they expect. Once you enter those values in the boxes in part (a), use those rounded off values to answer the remaining questions.

Make sure you round your answer for the slope to 4 decimal places before you proceed to use it to compute the intercept. Then, of course, round the intercept to 4 decimal places, too. Be sure to use these rounded off values for any other computations the question requires.
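Here is a minimal sketch of that rounding order in Python. It assumes the Formula Sheet uses the standard least-squares forms, slope b = r × Sy / Sx and intercept a = y̅ - b × x̅. Every number below is hypothetical and chosen only to illustrate the arithmetic.

```python
# Hypothetical "given" values (rounded as a question might present them).
r = -0.78          # correlation coefficient
mean_x = 150.25    # mean of x (Horsepower)
mean_y = 24.60     # mean of y (Mileage)
sx = 40.12         # standard deviation of x
sy = 5.33          # standard deviation of y

# Slope first, rounded to 4 decimal places BEFORE it is used again.
b = round(r * sy / sx, 4)

# Intercept uses the rounded slope, then is itself rounded to 4 decimal places.
a = round(mean_y - b * mean_x, 4)

print("slope b =", b)
print("intercept a =", a)
```

Computing the intercept from the unrounded slope would give a slightly different fourth decimal place in some cases, which is why the order of rounding matters here.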

Note, in part (b), they ask for a proportion, not a percentage, so leave your value for the coefficient of determination as a decimal (see my Lesson 2, question 1(d)). Do not change it into a percent.

Use the rounded off answers you submitted in part (a) to make the predictions requested in parts (c) and (d).

The answer they are looking for in part (f) should help you answer their previous question, part (e).

I show you how to compute a residual (part (g)) in my Lesson 2, question 1(j). This is a two-step process. You must first make the appropriate prediction, then compute the residual.
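The two-step process can be sketched as follows, using the usual definition residual = observed y minus predicted y. The intercept, slope, and data point are hypothetical values made up for this illustration.

```python
# Hypothetical rounded regression coefficients and one hypothetical observation.
a = 40.1659        # intercept
b = -0.1036        # slope
x_obs = 120        # observed x value (Horsepower)
y_obs = 26.5       # observed y value (Mileage)

# Step 1: make the prediction from the regression line.
y_predicted = a + b * x_obs

# Step 2: residual = observed - predicted.
residual = y_obs - y_predicted

print("predicted y:", round(y_predicted, 4))
print("residual:", round(residual, 4))
# A negative residual means the observed value fell below the prediction.
```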

Make sure you have taken a look at my Lesson 2, question 4 to learn some key facts about the correlation that may help you with part (h).

Question 3: JMP Regression

Click the "New Data Table" icon on the toolbar at top left in the JMP home screen. Double-click the region to the right of "Column 1" to create "Column 2." Rename Column 1 Temperature and Column 2 Viscosity by either double-clicking the columns and typing in the new name or by right-clicking the columns and selecting "Column Info," typing in the name and clicking OK.

Type in the data. You can move from one cell to the next in the data table by pressing "Enter", "Tab" or the arrow buttons on your keyboard.

Select "Analyze", then "Fit Y By X". Highlight Temperature, and click the "X, Factor" button. Highlight Viscosity, and click the "Y, Response" button. Click OK.

You should now see a scatterplot. (If you don't, your data is not properly formatted; go back and check the columns are Numeric and Continuous by right-clicking each column name and selecting "Column Info". The Data Type should be Numeric, and the Modeling Type should be Continuous.)

Click the red triangle above the scatterplot and select "Fit Line" and JMP will draw in the least-squares regression line. Note, it shows you the regression equation directly under "Linear Fit" below the scatterplot. JMP also shows you the value of r-squared (the coefficient of determination) in the "Summary of Fit", rather than r, the correlation coefficient. You can then take the square root of this number to get r, but use your scatterplot to help you decide if r is negative or positive because your calculator can't tell you that (r always has the same sign as the slope of the regression line).
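As a small illustration of recovering r from r-squared, here is a Python sketch. The r-squared and slope values are hypothetical stand-ins for what a JMP printout might show; the sign of r is taken from the sign of the slope.

```python
import math

# Hypothetical values read off a regression printout.
r_squared = 0.8464   # coefficient of determination from "Summary of Fit"
slope = -1.25        # slope from the fitted line; only its sign is used here

# The square root gives the size of r; the slope supplies the sign.
r = math.copysign(math.sqrt(r_squared), slope)

print("r =", round(r, 4))
```

A downward-sloping scatterplot means a negative slope, so r is negative even though the square root itself is positive.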

They don't ask you to hide the "Analysis of Variance" and "Parameter Estimates" parts of the output, but you can do so if you wish. Simply click the gray triangle next to those title bars, and you will see those parts of the output disappear.

If you are using a Windows PC:

Press "Alt" on your keyboard or click the thin blue line that is near the top of the window to get the toolbar icons to appear. Select "File" then "Save As" to get a pop-up window. Type in whatever name you want the file to have in the "File name" section. Click the "Browse Folders" arrow and select which folder you want to save the file in (I suggest you select "Desktop" so that the file will just appear right on your desktop home screen). Finally, click the drop down arrow in the "Save as type" section and select "JPEG File". Click "Save". You should now have your file ready to upload into the assignment.
To upload your file into the text box they provide: Click "HTML editor" below the text box (if you have not already done so) to make a toolbar appear in the text box. Click the toolbar option called "Link" and select "Image." In the pop-up window that appears, click the button called "Find/Upload File" (it is at the bottom of the pop-up window; you may have to enlarge the box or scroll down to see it). Click the "Browse" button and find the scatterplot file you just saved. Either double-click that file or select it and click "Open" and you should see the path to that file appear in the Browse box. Click "Upload File" and its name should appear in the "Uploaded Files" pop-up window. Select the file in the list of "Uploaded Files" to highlight it and click OK and you should see the file appear in the text box.

If you are using Apple/Mac:

You will need to take a screenshot of your output in order to upload it. To take a screenshot, hold down Command+Shift+4 and drag the cross-hairs over the image to capture it. The image will save as a .png file to your desktop by default.
To upload your file into the text box they provide: Click "HTML editor" below the text box (if you have not already done so) to make a toolbar appear in the text box. Click the toolbar option called "Link" and select "Image." In the pop-up window that appears, click the button called "Find/Upload File" (it is at the bottom of the pop-up window; you may have to enlarge the box or scroll down to see it). Click the "Browse" button and find the screenshot file you just saved. Either double-click that file or select it and click "Open" and you should see the path to that file appear in the Browse box. Click "Upload File" and its name should appear in the "Uploaded Files" pop-up window. Select the file in the list of "Uploaded Files" to highlight it and click OK and you should see the file appear in the text box.

Type your answers for the rest of the question into the box (make sure you have clicked HTML Editor).

For part (b), I show you how to interpret a slope throughout Lesson 2, and give you a specific example in question 1(f) of my book.

As I mention above, and illustrate in my question 7, you can determine the correlation from the JMP printout.

For part (d), first use the least-squares regression equation JMP has computed for you to compute the prediction for observation 5, then compute the residual they request. When they ask, "What does the sign of the residual tell us?" they merely want to know if the actual observation was higher or lower than we predicted it would be.

Question 4: Design of Experiments 1

Make sure you have studied Lesson 3 in my book before you answer this and the remaining questions in this assignment. You should especially look at questions 6 and 7 as illustrations of the Three Principles of Experimental Design and as examples of identifying the various factors, factor levels, treatments, experimental units, and response variable for an experiment, as well as identifying what type of experiment it may be (randomized comparative experiment, block design, matched pairs design).

Be sure to click the HTML Editor link before you type your answers into the box provided. When they ask for the treatments (part (e)), tell them not only how many treatments there are in the experiment, but what the exact treatments are. For example, in my Lesson 3, question 7(b), I wouldn't just say that there are 6 treatments. I would say the 6 treatments are: Dog Food A served early; Dog Food B served early; etc. up to Dog Food C served late.
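One way to picture where the 6 treatments come from: each treatment is one combination of a level from each factor. The sketch below lists them in Python, assuming the two factors in that dog food example are the food (A, B, C) and the serving time (early, late); the variable names are made up for the illustration.

```python
from itertools import product

foods = ["Dog Food A", "Dog Food B", "Dog Food C"]   # levels of the first factor
times = ["early", "late"]                            # levels of the second factor

# Every treatment is one combination of a level from each factor.
treatments = [f"{food} served {time}" for time, food in product(times, foods)]

print(len(treatments), "treatments:")
for t in treatments:
    print(" ", t)
```

Three levels of one factor times two levels of the other gives 3 × 2 = 6 treatments, and the list spells out exactly what each one is.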

Here are some extra things to clarify the three principles of experimental design which you may be asked to discuss in questions in this assignment. Do note that different students get different scenarios and questions, so I cannot be very specific:

Note that randomization is used in experiments to randomly determine which unit gets which treatment (when there are many units and each unit will be given exactly one treatment), or to randomly determine the order the treatments will be administered (when one unit is going to receive two or more treatments).
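Both uses of randomization can be sketched in a few lines of Python. Everything here is hypothetical: 12 made-up units, 3 made-up treatments, and a fixed seed used only so the sketch gives the same result every time it is run.

```python
import random

rng = random.Random(2015)  # fixed seed only so the sketch is repeatable

# Case 1: many units, and each unit gets exactly one treatment.
units = [f"unit {i}" for i in range(1, 13)]          # 12 hypothetical units
treatments = ["Treatment 1", "Treatment 2", "Treatment 3"]

shuffled = units[:]          # copy so the original list is left untouched
rng.shuffle(shuffled)        # random order decides who gets which treatment
group_size = len(units) // len(treatments)
assignment = {
    t: shuffled[i * group_size:(i + 1) * group_size]
    for i, t in enumerate(treatments)
}

# Case 2: one unit receives every treatment, in a randomly chosen order.
order = rng.sample(treatments, k=len(treatments))

for t, group in assignment.items():
    print(t, "->", group)
print("order for a single unit:", order)
```

In the first case, chance alone decides which units end up in which treatment group; in the second, chance decides the sequence in which a single unit receives the treatments.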

When discussing the principle of control, there is no need to speculate. Discuss the actual things they have obviously done to control outside factors or certainly should have done.

By repetition, they mean what I call replication; quite simply: how many times is each treatment being applied?

Note also that we learned in Lesson 2 that correlation does not imply causation. Just because a pattern is observed between x and y does not mean we have proven that x causes y. But the whole point of designing an experiment is to identify possible cause and effect. If an experiment has been designed properly, we have every right to believe we have proven that blank causes blank, provided we have seen a significant difference in the response variable when applying one treatment as compared to another.

Experiments can prove causation!

Question 5: Design of Experiments 2