Stat 1000: Tips for Distance Assignment 1 (classroom sections should take a look, too)

Published: Tue, 01/31/17

Did you read my tips on how to study and learn this course?  If not, here is a link to those important suggestions:
Tips for Assignment 1
Here is a link to the actual assignment, in case you don't have it:
Study Lesson 1: Displaying and Summarizing Data and Lesson 2: Regression and Correlation in my book, if you have it, to prepare for this assignment.

Of course, always seek out assistance from my book, your course notes, etc. if you ever hit a question you don't understand, but try not to be learning things as you do an assignment.  Learn first, then put your learning to the test.

So that you don't have to study too much at once, I recommend this course of action:
  1. Study my Lesson 1 then attempt questions 1-6 on the assignment.
  2. Study my Lesson 2 then attempt questions 7-15 on the assignment.
Question 1
This is a standard question about classifying variables, similar to my Lesson 1, #1.
Question 2
Remember, if you find the total of the second column (the frequency or count column) in a frequency table, that will tell you n, the sample size.

You know the sample size, n (the total of the Frequency column), so you can use the steps I teach in Lesson 1 to find the location of any quartile.  Then just keep a running total of the counts in the intervals.  How much data is in the first interval? (That is the count, or frequency, given in the second column.)  Now add the count in the second interval.  For example, if there are 3 scores in the first interval and 7 scores in the second interval, there are 3+7=10 scores in total in the first two intervals, and those must be the 10 lowest scores in the data set.  Continue adding the frequencies in each interval until you reach or exceed the count that marks the location of the first, second or third quartile, as desired.
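
If it helps to see the running-total idea spelled out, here is a minimal Python sketch.  It assumes you have already worked out the location (position) of the quartile using the Lesson 1 steps, so the position is simply an input here.  The function name and the frequency column are made up for illustration; only the first two counts (3 and 7) come from the example above.

```python
from itertools import accumulate

def interval_containing(position, counts):
    """Return the 0-based index of the interval that holds the score at
    `position` (1-based, counting up from the lowest score)."""
    running_totals = list(accumulate(counts))
    n = running_totals[-1]  # total of the frequency column = sample size
    if not 1 <= position <= n:
        raise ValueError("position must be between 1 and n")
    for index, total in enumerate(running_totals):
        if total >= position:  # reached or exceeded the location we want
            return index

# Hypothetical frequency column: 3 and 7 are from the example in the text,
# the remaining counts are invented.
counts = [3, 7, 12, 5, 3]
print(sum(counts))                      # sample size n
print(interval_containing(8, counts))   # 8th lowest score: second interval (index 1)
```

If the location falls between two positions (say, between the 15th and 16th scores), look up both positions; they may or may not land in the same interval.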
Question 3
To prepare for this question, make sure you compute the five-number summary by hand, as I demonstrate in Lesson 1, #4.  Now, they want you to use the 1.5 IQR Rule to identify the outliers: any score more than 1.5 times the IQR below Q1 or above Q3 is an outlier.
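
As a rough check on your hand work, the 1.5 IQR Rule can be sketched in Python like this.  The data set is invented, and Q1 and Q3 are typed in by hand (as you would get them from your five-number summary) rather than computed, since software often uses a different quartile method than the one taught in this course.  The function name is my own.

```python
def iqr_outliers(data, q1, q3):
    """Apply the 1.5 IQR Rule using quartiles you computed by hand."""
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return lower_limit, upper_limit, outliers

# Made-up data; q1 and q3 are assumed to come from your own five-number summary.
data = [2, 10, 12, 13, 15, 16, 18, 20, 35]
lower, upper, outliers = iqr_outliers(data, q1=11, q3=19)
print(lower, upper, outliers)
```

Here the IQR is 8, so the limits sit 12 below Q1 and 12 above Q3, and only the score of 35 lies outside them.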
Question 4
Be clear what this question is asking. It is not asking for the limits you would compute from your 1.5 IQR Rule. It is asking where the whiskers will be drawn. The whiskers in a modified or outlier boxplot are drawn to the lowest and highest data scores that are NOT outliers.
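
To make the difference between the limits and the whiskers concrete, here is a small Python sketch.  The data and the two limits are invented for illustration (they are not from the assignment), and the function name is hypothetical.

```python
def whisker_ends(data, lower_limit, upper_limit):
    """Whiskers reach the lowest and highest scores that are NOT outliers."""
    not_outliers = [x for x in data if lower_limit <= x <= upper_limit]
    return min(not_outliers), max(not_outliers)

# Made-up data, with made-up limits from a 1.5 IQR Rule calculation.
data = [2, 10, 12, 13, 15, 16, 18, 20, 35]
low_whisker, high_whisker = whisker_ends(data, lower_limit=-1, upper_limit=31)
print(low_whisker, high_whisker)
```

The limits are -1 and 31, but the whiskers are drawn to 2 and 20, because those are actual data scores inside the limits.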
Question 5
Remember, standard deviation is a measure of spread and indicates whether data tends to be close to the centre or far away from the centre.
Question 6
This is an example of a weighted mean. Unfortunately, I do not show examples of this in my book.  However, their Unit 1 Practice Questions 18-24 illustrate the weighted mean formula in action. This particular question has thrown in a twist because it makes you work backwards.
  1. Multiply the midterm score by its weight (.35) and the assignment score by its weight (.15). You now know the total mark the student has earned so far.
  2. So, what weighted mark must the student get on the final to bring their total up to 75 (a B)?
  3. Remember, the final is worth 50%, so the weighted mark needed is like a mark out of 50 on the exam. Divide it by 0.50 to convert it to a percentage.
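
The steps above can be sketched in Python as follows.  The weights (.35, .15 and .50) and the target of 75 come from the steps above; the midterm and assignment scores are invented, so substitute the ones from your assignment.  The function name is my own.

```python
def required_final(midterm, assignments, target=75.0,
                   w_midterm=0.35, w_assignments=0.15, w_final=0.50):
    """Work backwards from a target course mark to the final exam mark needed.
    All marks are percentages (0 to 100)."""
    earned_so_far = w_midterm * midterm + w_assignments * assignments
    weighted_needed = target - earned_so_far     # like a mark out of 50
    percent_needed = weighted_needed / w_final   # as a percentage on the exam
    return weighted_needed, percent_needed

# Hypothetical scores: 70% on the midterm, 80% on the assignments.
weighted_needed, percent_needed = required_final(midterm=70, assignments=80)
print(round(weighted_needed, 1), round(percent_needed, 1))
```

With these made-up scores the student has about 36.5 marks so far, so they need about 38.5 out of 50 on the final, which is about 77%.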
Questions 7 to 11
Similar to the concepts I discuss in my Lesson 2, #3.
  • In each case, ask yourself, would you expect a positive correlation, a negative correlation, or neither? Remember, a positive correlation tells us, as x gets bigger, y gets bigger, whereas a negative correlation tells us, as x gets bigger, y gets smaller.
  • If you do expect a nonzero correlation, would you expect it to be perfect (1 or -1)? Or would it just be something other than 0?
  • Unless it is obvious there would be a perfect correlation or no correlation at all (r=0), they are not expecting you to know the exact value of r, just whether a given value of r is plausible. For example, if you believe there would be a negative correlation but not a perfect falling line, r could conceivably be any number between 0 and -1. So r could be -0.38 for all you know, or -0.92. But r couldn't be -1.23 (because r is always between -1 and 1).
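
The reasoning in the bullets above boils down to a short checklist, sketched here in Python.  This is only an illustration of the logic, not something the assignment asks for; the function name and its arguments are hypothetical.

```python
def is_plausible_r(r, expected_sign, perfect=False):
    """expected_sign: +1 if you expect a positive correlation, -1 if negative,
    0 if you expect no correlation at all.  Set perfect=True only when it is
    obvious the points would fall exactly on a line."""
    if not -1 <= r <= 1:
        return False              # r is always between -1 and 1
    if expected_sign == 0:
        return r == 0
    if perfect:
        return r == expected_sign
    # Right sign, but strictly between 0 and a perfect correlation.
    return 0 < r * expected_sign < 1

# The values from the example: negative correlation expected, not perfect.
print(is_plausible_r(-0.38, -1))
print(is_plausible_r(-0.92, -1))
print(is_plausible_r(-1.23, -1))
```

The first two values pass the check and the third fails it, matching the example in the text.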
Question 12
Be sure to study Lesson 2, #1 at the very least before attempting this question.

NEVER FORGET:
r-squared tells you the proportion (or percentage) of the variation of y explained by the regression with x.  If they ever give you a percentage or a proportion in a linear regression context, they are almost certainly telling you r-squared. If they ever ask for "the proportion of the variation of [blank] ... explained by the regression with [blank]..." they are asking you for r-squared.

If you are given r-squared, then it is a simple matter of taking its square root to find the size of r.  But be careful! What sign should r have? r-squared does not tell you that. Look at the problem in context and think about that. Is there a positive correlation or a negative correlation? Remember that the slope always has the same sign as r.
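
Here is a minimal Python sketch of recovering r from r-squared, taking the sign from the slope.  The numbers are invented and the function name is hypothetical.

```python
import math

def r_from_r_squared(r_squared, slope):
    """Square root gives the size of r; the slope supplies the sign."""
    if not 0 <= r_squared <= 1:
        raise ValueError("r-squared must be between 0 and 1")
    return math.copysign(math.sqrt(r_squared), slope)

# Made-up example: 64% of the variation explained, falling regression line.
r = r_from_r_squared(0.64, slope=-2.5)
print(round(r, 2))
```

Because the slope in this made-up example is negative, r comes out as about -0.8 rather than 0.8.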
Question 13
I show you how to compute a residual in my Lesson 2, #1(j).  This is a two-step process.  You must first make the appropriate prediction, then compute the residual.
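
The two steps can be sketched in Python like this, using the standard definition residual = observed y minus predicted y.  The regression line (intercept 2, slope 3) and the data point are invented for illustration, and the function name is my own.

```python
def residual(x, observed_y, intercept, slope):
    """Step 1: predict y from the regression line.  Step 2: observed - predicted."""
    predicted_y = intercept + slope * x
    return observed_y - predicted_y

# Hypothetical line y-hat = 2 + 3x and a hypothetical data point (4, 15).
print(residual(x=4, observed_y=15, intercept=2.0, slope=3.0))
```

The prediction at x = 4 is 14, so the residual is 15 - 14 = 1.  A positive residual means the actual point sits above the regression line.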
Question 14
I show you how to interpret a slope throughout Lesson 2, and give you a specific example in #1(f) of my book. 
Question 15
Not sure what they expect you to say here. "Because." Predictions aren't perfect. Is it an extrapolation? No. Perhaps consider the value of the correlation.