Stat 2000: Tips for Web Assign HW 03
Published: Thu, 02/03/11
Hi,
The midterm exam prep seminar for Stat 2000 is on Saturday, February 26. Go to www.grantstutoring.com and click the "Seminars" button for more info and to register (if you have not done so already). (Yes, I know that is during Study Week, but don't blame me, blame them for scheduling the midterm exam the next week.)
Join Grant's Tutoring on Facebook or follow Grant on Twitter.
Simply go to www.grantstutoring.com and click the Facebook and/or Twitter icons.
You are receiving this email because you indicated when you signed up for Grant's Updates that you are taking Stat 2000 this term. If, in fact, you do not want to receive tips for Stat 2000, please reply to this email and let me know.
If you ever want to look back over a previous tip I have sent, do note that all my tips can be found in my archive. Click this link to go straight to my archive:
Throughout the term I will send you all sorts of tips to help you study and learn the course. You probably already have done so, but, if not, I strongly recommend you purchase my Basic Stats 2 Study Book. You will find it a great resource to learn the course. I pride myself on explaining things in clear, everyday language. I also provide numerous examples of all the key concepts with step-by-step solutions. You can order my book at UMSU Digital Copy Centre at University Centre at UM campus. They make the book to order so please allow at least one business day. The book is split into two volumes and each volume costs $45 + tax.
Tips for Web Assign HW 03
When working with Web Assign, always enter the answer to one specific box and then click "Submit Answer" to confirm that it is correct before you move on to another box. Do not enter several answers all at once in several boxes before you click "Submit Answer". You risk being marked wrong due to a typo in one of the boxes.
For some strange reason, JMP 8 occasionally computes wrong answers even if you have copied and pasted your data correctly. I suggest that, if it is feasible, you type the given data into your calculator (in Stat mode as shown in Appendix D of my book), and have your calculator compute the sample mean. Compare that answer with JMP's answer for the sample mean. If they are the same, everything is fine. If they are not the same, close JMP 8 and restart it, recopy and paste the data, and check again. Sometimes you have to do this 2 or 3 times before JMP finally works. If it is not feasible to use your calculator to compute the sample mean, have JMP do the question 2 or 3 times, being sure to restart JMP and recopy the data each time, and confirm that JMP gives you the same answer each time before risking entering the results into Web Assign.
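If you would rather cross-check JMP's sample mean on a computer than on a calculator, here is a minimal sketch in Python. The function name means_agree and the data are made up for illustration; they are not from the assignment or from JMP.

```python
import math
from statistics import mean

def means_agree(data, jmp_mean, places=4):
    """Return True if the mean computed here matches JMP's reported mean
    to within half a unit in the given decimal place."""
    tolerance = 0.5 * 10 ** (-places)
    return math.isclose(mean(data), jmp_mean, rel_tol=0.0, abs_tol=tolerance)

# Hypothetical data, typed in by hand just like you would on a calculator.
sample = [12.1, 9.8, 11.4, 10.3, 13.0]

print(mean(sample))                 # the mean to compare against JMP's output
print(means_agree(sample, 11.32))   # True if JMP reported 11.32
```

If the check fails, restart JMP and recopy the data as described above before trusting any of its other output.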
If you are taking the course by distance/online (Section D01) click here to see your tips for HW 03.
Study Lesson 4 in my study book to prepare for this assignment.
Be sure to use your Rule of Thumb (Lesson 4) for all of the questions in this assignment to determine if you are using the pooled method or the generalized method. Note, if you are using an older edition of my study book, you must use that insanely complicated degrees of freedom formula for any question that requires the generalized method. Refer to #1 on the formula sheet included in your course outline to see that formula if you can't find it in my book (it is in most of the recent editions of my book, but it depends how old your book is). Also, be sure to skim through the entire question to see if they ever specify which order they want you to subtract your means, and, if so, be sure to do as they say right from the start.
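Here is a rough sketch of that degrees of freedom calculation in Python. I am assuming the formula on your formula sheet is the standard Welch-Satterthwaite approximation for two samples with unequal variances; check it against your own sheet before relying on it. The function name welch_df is my own.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the two-sample t procedure
    with unequal variances (Welch-Satterthwaite approximation).

    s1, s2 are the sample standard deviations; n1, n2 are the sample sizes.
    """
    v1 = s1 ** 2 / n1   # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2   # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Hypothetical summary statistics, not from the assignment.
print(welch_df(2, 5, 6, 8))
```

The result is usually not a whole number, and it always falls between the smaller of n1 - 1 and n2 - 1 and the pooled value n1 + n2 - 2.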
Here is how to do the JMP part of Question 3:
Open a New Data Table and type the data in manually in this manner: Name your first column "Response Time" or something like that, and type all the data down that column. Which is to say, type the Cell Phone numbers down the column and then continue to type all the Control numbers below that. Double click at the top to the right of the "Response Time" column heading to create a new column and name it something like "Type of Distraction". Down that column type something like "cell phone" repeatedly down that column in all the rows that have the scores for cell phones. Then type something like "control" repeatedly down the column in the rows that have control scores. You may want to type the phrase once and then copy and paste it down the rest of the relevant rows to ensure there are no typos. Once you have done that, double-click the "Type of Distraction" column heading and confirm that the Data Type is Character and the Modeling Type is Nominal and click OK.
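The layout described above (one column of numbers, one column of labels) is often called "stacked" data. As a small illustration of what the finished table looks like, here is a Python sketch that builds the same two-column layout from two separate lists. The numbers are made up, and stack_groups is a hypothetical helper name.

```python
def stack_groups(groups):
    """Turn {label: [values]} into a list of (value, label) rows,
    one row per observation, in the order the groups are given."""
    rows = []
    for label, values in groups.items():
        for value in values:
            rows.append((value, label))
    return rows

# Hypothetical response times, not the assignment's data.
data = {
    "cell phone": [1.5, 2.0, 1.8],
    "control": [1.0, 1.2],
}

for response_time, distraction in stack_groups(data):
    print(response_time, distraction)
```

Every row carries its own label, which is what lets Fit Y By X split the numeric column into the two groups.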
Select Analyze, then Fit Y By X. Highlight the numeric column "Response Time" and click the Y, Response button. Highlight the character column "Type of Distraction" and click the X, Factor button. Click OK.
You should now see a graph with two vertical arrays of dots showing the response times for the cell phone and control groups separately. Click the red triangle above the graph and select "Display Options" and select Box Plots to see side-by-side boxplots. That will enable you to get a feel for the symmetry or skewness of the distributions to help you decide if use of t is acceptable. Even if use of t is not acceptable, you are going to use it anyway. Click the red triangle again and select "Means and Std Dev" to get a summary of the means and standard deviations of the two samples. Click the red triangle again and select "t-Test" to get the output for a hypothesis test and confidence interval assuming unequal variances. Click the red triangle again and select "Means/Anova/Pooled t" to get the output that includes a hypothesis test and confidence interval assuming equal variances. Click the red triangle again and select "Set α level" to have the outputs change the confidence intervals to your desired level of confidence. For example, if you want 98% confidence intervals you would set alpha to be .02, or, if you want 90% confidence intervals, you would set alpha to be .10. In other words, α = 1 - C.
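If you want to check by hand what the two JMP outputs are computing, here is a minimal sketch in Python of the two test statistics (unequal variances versus pooled) and of the α = 1 - C conversion. The function names and the summary statistics are my own, chosen for illustration. This is the textbook arithmetic, not a reproduction of JMP's internals.

```python
import math

def alpha_for_confidence(confidence):
    """Convert a confidence level such as 0.98 into alpha (here about 0.02)."""
    return 1 - confidence

def unpooled_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic assuming unequal variances (generalized method)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / se

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic assuming equal variances (pooled method)."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se

# Hypothetical summary statistics: mean, standard deviation, sample size.
print(unpooled_t(10, 2, 8, 7, 3, 12))
print(pooled_t(10, 2, 8, 7, 3, 12))
print(alpha_for_confidence(0.90))
```

When the two sample sizes are equal, the two statistics come out the same; only the degrees of freedom differ.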
When you are using JMP to do a two-sample hypothesis test or confidence interval, watch which way it is subtracting. It may not do it the way you expected. For example, you may have called "A" sample 1 and "B" sample 2, so you would expect to do A - B, but JMP may do B - A. It is difficult to get JMP to subtract a specific way, so it is better to let JMP do what it wants to, and you adjust to it. Look at the two sample means JMP computes for A and B, then check if the "difference" in its t test has computed A - B or B - A. If it has done B - A, then define your means accordingly. Which is to say, let μ1 = mean of B and μ2 = mean of A, then state your hypotheses accordingly.
If you do not do this, your signs will be all wrong. For example, the signs in your lower and upper limits for your confidence interval for the difference between the means would be the opposite of what they should be.
Tip: When you want to do a one-sided test, if JMP has a positive test statistic, you must be doing an upper-tailed test; if JMP has a negative test statistic, you must be doing a lower-tailed test. But, again, watch the way JMP has subtracted the two means to identify who is who.
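To see why the order of subtraction matters, here is a tiny Python sketch. Reversing the order flips the sign of the test statistic, and it both flips the signs and swaps the limits of the confidence interval. The function names and numbers are made up for illustration.

```python
def flip_interval(lower, upper):
    """Convert a confidence interval for (B - A) into the interval for (A - B)."""
    return (-upper, -lower)

def flip_statistic(t):
    """Convert the test statistic for (B - A) into the one for (A - B)."""
    return -t

# Hypothetical JMP output for B - A.
print(flip_interval(-5.2, -1.1))   # interval for A - B
print(flip_statistic(-2.75))       # statistic for A - B
```

A lower-tailed test on B - A is therefore the same test as an upper-tailed test on A - B, which is why you need to know who is who before choosing the tail.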
Do the JMP in Question 4 just like I showed you for Question 3 above. Your first column should have all the education scores and your second column will be a character column where you type Winnipeg and Toronto in the appropriate rows. Always make the numeric column Y and the character column X when you select Fit Y By X. Since they do not specify, you may want to use side-by-side boxplots again (like Question 3), and you could also ask for a normal quantile plot (by clicking the red triangle and selecting that) as well.
Again, this assignment focuses on Lessons 1 and 2 in my study book. The difference is that now you are using t instead of z because σ, the population standard deviation, is not given. Be especially sure to study the last two questions in Lesson 2 of my book where I teach how to test hypotheses for matched pairs.
Question 1 is standard stuff for a student who has studied my lessons. Be sure to use the Stat mode in your calculator to work out the mean and standard deviation. See Appendix D in my book if you don't know how.
Question 2 (bone formation):
Personally, I would not bother stacking this data like they suggest (good luck even successfully copying it and pasting it at all). I would merely type the data in manually.
Open a "New Data Table" in JMP. Double-click "Column 1" and name it something like "OC". Make sure the "data type" is numeric and the "modeling type" is continuous, and click OK. Now type the given data into the column on the spreadsheet and make sure you don't make a mistake.
If you insist on copying and stacking the data, here's how: Copy and paste the given data into a New Data Table in JMP. In the toolbar at the top, select "Tables", then select "Stack". Highlight all of the columns in the "Select Columns" box and click "Stack Columns" and click OK. You will now see all of the data stacked into one column (there will be another column showing all the column names which you can ignore). Name the column something like "OC" and make sure its Data Type is Numeric and its Modeling Type is Continuous. Click OK.
To get JMP to make confidence intervals for the mean:
Select "Analyze, Distribution" from the toolbar at top. Highlight the column you are interested in ("OC" in this case) and click the "Y, Columns" button. Click OK. You are now taken to a window showing a histogram and stuff. To get a confidence interval, click the red triangle next to your column variable directly above the histogram to get a drop-down list and select "Confidence Interval". In the pop-up window that appears, select "Other" (even if the level of confidence you desire is in the list) and type in the level of confidence you want (in decimal form, so 95% is 0.95). Make sure "Two-sided" is selected. You are not given a value for sigma in this question, so make sure the "Use known Sigma" checkbox is not selected. Click OK. A Confidence Intervals table will appear in your output screen at the bottom.
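If you want to verify JMP's confidence interval by hand, here is a minimal Python sketch of the usual t interval for a mean when sigma is unknown. It assumes you look up the critical value t* yourself in a t table (with n - 1 degrees of freedom), since the Python standard library has no t distribution. The function name t_interval and the data are hypothetical.

```python
import math
from statistics import mean, stdev

def t_interval(data, t_star):
    """Confidence interval for the mean: x-bar plus or minus t* times s / sqrt(n).

    t_star is the critical value from a t table with len(data) - 1
    degrees of freedom at the desired level of confidence.
    """
    n = len(data)
    x_bar = mean(data)
    margin = t_star * stdev(data) / math.sqrt(n)   # stdev is the sample s
    return (x_bar - margin, x_bar + margin)

# Hypothetical data with n = 8, so df = 7; t* = 2.365 for 95% confidence.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(t_interval(sample, 2.365))
```

Small differences from JMP in the last decimal place are expected, because table values of t* are rounded.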
Of course, JMP will already have made a histogram for you while you were getting the confidence interval, so I would use that graph. If you want the stemplot instead, click the red triangle and select "Stem and Leaf Plot". When they ask you to comment on the suitability, remember my discussion in Lesson 1 just before question 1 about the key sample size values of 15 and 40.
To get JMP to test hypotheses for the mean:
To test a hypothesis, click that same red triangle you used to make a confidence interval and select "Test Mean". Type in the value the null hypothesis believes the mean to be and type in the known value of sigma, if you have one (otherwise leave that value blank). Click OK. A Test Mean = Value table appears in your output where, among other things, JMP gives you the test statistic and three probability values. Those three probabilities are the P-values for the three possible alternative hypotheses. JMP will use a z statistic if you are given a sigma value to enter or a t statistic if sigma is unknown.
Prob > |t| is the two-tailed P-value.
Prob > t is the upper-tailed P-value.
Prob < t is the lower-tailed P-value.
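To see how the three probabilities relate to each other, here is a small Python sketch for the case where sigma is known, so the test statistic is z. I use z only because the Python standard library includes the normal distribution but not the t distribution; the same relationships hold for t. The function name z_p_values is my own.

```python
from statistics import NormalDist

def z_p_values(z):
    """Return the three P-values for a z test statistic."""
    lower = NormalDist().cdf(z)       # Prob < z, the lower-tailed P-value
    upper = 1 - lower                 # Prob > z, the upper-tailed P-value
    two_tailed = 2 * min(lower, upper)  # Prob > |z|, the two-tailed P-value
    return {"two_tailed": two_tailed, "upper": upper, "lower": lower}

# Hypothetical test statistic.
for name, p in z_p_values(1.96).items():
    print(name, round(p, 4))
```

The upper-tailed and lower-tailed values always add to 1, and the two-tailed value is double the smaller of the two.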
To get rid of any outputs you don't want to copy and paste, click the red triangle and deselect the unwanted things.
To copy and paste the parts of a JMP printout you do want, select the icon on the JMP toolbar that looks like a fat white plus sign "+" (the Selection tool). You can then click various parts of the printout to select the sections you want. Copy and paste into Word or something like that.
Questions 2 (e) to (g):
To make a column with the logarithms: Double-click on the empty space next to the last column of data you have to make JMP create a new column for you. Name it something like "log(OC)". Double-click that new column heading to get the pop-up menu. Click the "Column Properties" button and select Formula. Now click the Edit Formula button. In the formula pop-up screen select "Transcendental" in the Functions(grouped) menu and then select "Log" in the sub-menu. You will see Log appear in the section below with a set of brackets around a red box. Highlight the OC column in the "Table Columns" section of this screen to make OC appear in that red box. Now click OK a few times to get back to your data table and you should see your Log(OC) column filled in with numbers. Each of those numbers is the natural log of the original OC scores. Which is to say, it is identical to computing the "ln" of each OC score by pressing the "ln" button on your calculator (which is right next to the "log" button on your calculator). For example, if your OC score was 49.9, then log(OC) would be ln(49.9) = 3.91002...
You can now make the confidence interval for "Log(OC)" in the same way you made the confidence interval for "OC" except using the "Log(OC)" column, of course.
Once you have found the confidence interval limits for your Log(OC) scores, you can convert those back to OC limits by simply using the e^x button on your calculator. You get e^x by pressing "2nd F" "ln" or "SHIFT" "ln". For example, if your lower limit for Log(OC) was 5, then you would press "2nd F" "ln" 5 to compute e^5 = 148.413... to get the corresponding lower limit for your OC score. Do not think your answers in (g) have to match your answers in (a).
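The same log and back-transform steps can be sketched in Python, where math.log is the natural log (the "ln" button) and math.exp is e^x. The interval limits below are made up for illustration, and back_transform_interval is a hypothetical helper name.

```python
import math

def back_transform_interval(log_lower, log_upper):
    """Convert confidence limits on the ln scale back to the original scale."""
    return (math.exp(log_lower), math.exp(log_upper))

print(math.log(49.9))   # natural log of an OC score of 49.9, about 3.91002
print(math.exp(5))      # e^5, about 148.413

# Hypothetical limits for the mean of Log(OC).
print(back_transform_interval(3.2, 3.6))
```

Because e^x is an increasing function, the lower limit stays the lower limit after converting back.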
Why are they doing this? This all boils down to the reliability of our confidence intervals or hypothesis tests for means. Remember, our methods are only reliable if the sample mean is normally distributed. If n < 15, we can only trust our methods if our population is normal. If n ≥ 15, we can generally trust our methods even if the population is not normal. If the population is strongly skewed or has outliers, we should use n ≥ 40. That is why they are having you make graphs: to get an idea of the possible shape of the population and therefore the reliability of your methods. Statisticians sometimes transform the data (by taking logarithms, for example) in order to make a new population that is more normally distributed than the original population, and so to be able to get more reliable confidence intervals or hypothesis tests.
Which data do you think will make t more reliable in your problem? The OC data or the log(OC) data? Which confidence limits do you think are more reliable?
Question 3:
Make sure you are examining this data correctly! Again, look at my last two questions in Lesson 2. Use your calculator in Stat mode to work out the mean and standard deviation (see Appendix D in my book for how to do this on your calculator).
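Assuming Question 3 is a matched pairs problem (which is what the pointer to the last two questions in Lesson 2 suggests), the key step is to work with the differences within each pair, not with the two columns separately. Here is a minimal Python sketch of that calculation. The data and the function name paired_t are made up for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(first, second, mu0=0):
    """Matched pairs t statistic for the mean of (second - first).

    Returns (t, df), where df = number of pairs - 1.
    """
    if len(first) != len(second):
        raise ValueError("matched pairs need the same number of values")
    diffs = [b - a for a, b in zip(first, second)]   # one difference per pair
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)                 # standard error of mean difference
    return ((mean(diffs) - mu0) / se, n - 1)

# Hypothetical before and after scores for four subjects.
before = [10, 12, 9, 11]
after = [12, 15, 10, 14]
print(paired_t(before, after))
```

Note that the mean and standard deviation you enter into your calculator in Stat mode are those of the differences.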