ICYMI Stat 1000: Tips for Assignment 1

Published: Thu, 09/24/15

Try a Free Sample of Grant's Audio Lectures
Don't have my book or audio?  You can download a free sample of my book and audio lectures containing Lesson 1:
Did you read my tips on how to study and learn Stat 1000?  If not, here is a link to those important suggestions:
Did you read my Calculator Tips?  If not, here is a link to those important suggestions:
Tips for Assignment 1
Study Lesson 1 in my study book (see my free sample above if you don't have it) to learn the concepts involved in Assignment 1.  Remember my advice in the tips above.  Don't start working on the assignment too soon.  Study and learn the lesson first, and use the assignment to test your knowledge.  Of course, always seek out assistance from my book, your course notes, etc. if you ever hit a question you don't understand, but try not to be learning things as you do an assignment.  Learn first, then put your learning to the test.

To type in the formulas you are using and to show your numbers subbed into them, click the button in the toolbar that looks like the Sigma summation symbol (you have to click the "..." other options button to see the sigma formula input button).  Then click the various buttons to make your fractions and enter the symbols.

Exception to the learn-first advice: Always do any JMP stuff open-book.  Have my tips in front of you, and let me guide you step-by-step through any JMP stuff.  JMP is just "busy" work.  The sooner you get it done and can move on to productive things like understanding the concepts and interpreting the JMP outputs, the better off you will be.
Question 1
This is a standard question about classifying variables, similar to my question 1 in Lesson 1.
Question 2
This question deals with some aspects of quantitative distributions.

Remember, if you find the total of the second column (the frequency or count column) in a frequency table, that will tell you n, the sample size.

Part (a)
They want a decimal, not a percent.  For example, if you figured out that 20 out of 30 are in the given interval, then 20 divided by 30 is 0.6667, not 66.6667%.  The proportion is 0.6667.  Make sure you round off correctly!  They want four decimal places, so if the fifth decimal place is 5 or more, round up.

A relative frequency or proportion is the relevant count divided by n, the total sample size.  Throughout this course, always leave your answer in decimal form.  Do not change it into a percent unless they specifically request a percent.
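If it helps to see the arithmetic spelled out, here is a minimal Python sketch of a proportion rounded to four decimal places.  Python is not part of the course, the frequency table is made up (it is not the assignment data), and the function name proportion is just my own label.  It uses the decimal module so that a 5 in the fifth decimal place always rounds up.

```python
from decimal import Decimal, ROUND_HALF_UP

def proportion(count, n, places=4):
    """Return count / n rounded half-up to the given number of decimal places."""
    quantum = Decimal(10) ** -places          # 0.0001 for four places
    return (Decimal(count) / Decimal(n)).quantize(quantum, rounding=ROUND_HALF_UP)

# Made-up frequency table: (interval label, count). Not the assignment data.
table = [("0-9", 4), ("10-19", 6), ("20-29", 20)]

n = sum(count for _, count in table)   # total of the count column is n
print(n)                               # 30
print(proportion(20, n))               # 0.6667
```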

Part (b)

Remember that a frequency table is a precursor to a histogram.  Visualize the histogram (don't actually make a histogram, just picture it in your mind) to help answer the question about shape. 

Part (c)

You cannot actually compute the median, mean or quartiles because you do not have the actual data.  You don't need to.  As I discussed in Lesson 1, the shape of the distribution is enough to know if the mean is larger, smaller or the same as the median. 
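To convince yourself that shape alone settles the mean-versus-median question, here is a small Python sketch using made-up data sets (not the assignment data).  The long tail drags the mean toward it, while the median stays put.

```python
from statistics import mean, median

# Made-up data sets, chosen only to illustrate the three shapes.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]         # long tail to the right
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]    # long tail to the left
symmetric = [1, 2, 3, 4, 5]

print(mean(right_skewed), median(right_skewed))  # 4.75 3.0   (mean > median)
print(mean(left_skewed), median(left_skewed))    # 16.25 18.0 (mean < median)
print(mean(symmetric) == median(symmetric))      # True
```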

Part (d)

You do know the sample size, n (the total count in the Frequency column), so you can use the steps I teach in Lesson 1 to find the location of any quartile.  Then just make a running total of the counts in the intervals.  How much data is in the first interval? (That is the count or frequency given in the second column.)  Now add the count in the second interval.  For example, if there are 3 scores in the first interval and 7 scores in the second interval, there are 3+7=10 scores in total in the first two intervals.  Those must be the 10 lowest scores in the data set.  Continue adding the frequencies in each interval until you reach or exceed the count you are looking for that marks the location of the first, second or third quartile as desired.
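The running-total idea can be sketched in a few lines of Python.  This is only an illustration: the counts are made up, the function name interval_for_position is hypothetical, and the position you feed in still has to come from the Lesson 1 steps, which I do not reproduce here.

```python
from itertools import accumulate

def interval_for_position(counts, position):
    """Return the 0-based index of the interval that holds the observation
    at the given 1-based position in the sorted data."""
    for index, running_total in enumerate(accumulate(counts)):
        if running_total >= position:      # reached or exceeded the position
            return index
    raise ValueError("position exceeds the sample size")

counts = [3, 7, 12, 5, 3]                  # made-up frequency column
print(sum(counts))                         # 30, the sample size n
print(list(accumulate(counts)))            # [3, 10, 22, 27, 30]
print(interval_for_position(counts, 8))    # 1, i.e. the second interval
```

The 8th lowest score is in the second interval because the running total first reaches 8 there (3, then 10).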
Question 3
Part (a)
To set up the data for the side-by-side boxplots:

Open a "New Data Table" in JMP.

You will make two columns, but not the way you might think. DO NOT put CFL in one column and NFL in another!

Double-click Column 1 (or right-click and select Column Info) and name it Points Scored.  Type all 20 scores from the CFL first, then enter the 17 scores from the NFL data, giving you a total of 37 rows in the first column. 

Double-click the region to the right of Column 1 at the top to create Column 2 and name that column League.  Type CFL in the first 20 rows of that column (better yet, type it once, then copy and paste it into the next 19 rows; that way you ensure it is typed exactly the same in all 20 rows as is necessary).  Then type (or copy and paste) NFL in the remaining rows of column 2.

If you have done this correctly, you should now have two columns.  The first column shows the Points Scored of all 37 games you were given.  The second column shows the League each of those 37 games came from (CFL or NFL).
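If the two-column layout still feels backwards, this little Python sketch shows the same "one row per game" idea outside of JMP.  The scores are made up and there are far fewer of them than the real 20 CFL and 17 NFL values; it is only meant to show the layout.

```python
# Made-up scores, NOT the assignment data.
cfl_scores = [45, 52, 38]
nfl_scores = [41, 30]

# One row per game: (Points Scored, League), with all CFL rows first.
rows = [(score, "CFL") for score in cfl_scores] + \
       [(score, "NFL") for score in nfl_scores]

for points, league in rows:
    print(points, league)
```

Every score sits in the same first column, and the second column says which league that row belongs to.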

Make sure the column properties are correct!  Right-click at the top of each column and select Column Info to check what it says for Data Type and Modeling Type.

For Points Scored, the Data Type should be Numeric and the Modeling Type should be Continuous.  If it is not, click the drop-down lists to change them.

For League, the Data Type should be Character and the Modeling Type should be Nominal.  If it is not, click the drop-down lists to change them.

To make the side-by-side boxplots:
Select "Analyze" then "Fit Y By X".  Highlight Points Scored and click "Y, Response".  Highlight League and click "X, Factor".  Click OK.  This should open a pop-up window with a bunch of dots arranged vertically in two columns on a graph for CFL and NFL.

If that does not happen, you do not have the correct Data Type and Modeling Type for your data! Follow my instructions above to fix your column properties.

Now click the red triangle next to Oneway Analysis ... and select Quantiles.  Your side-by-side box plots should appear on the graph as well as a Quantiles output below that shows you the five-number summary among other things.  Click the red triangle again and select "Display Options" (down near the bottom of the menu), then deselect Grand Mean to get rid of the horizontal line in the graph showing the mean of all the scores (although, you are not asked to remove the Grand Mean line, so you don't have to if you don't want to). 
 
You are now ready to insert your boxplots into the box:
I confess that they have not made it very clear at all what you should do here, so this is what I have figured out.  If they clarify this or give you an easier approach, please let me know, and I will send revised tips.
  • If you are using Windows:
    • In the JMP output screen that shows your graphs, press "Alt" on your keyboard or click the thin blue line that is near the top of the window to get the toolbar icons to appear.  Select "File" then "Save As" to get a pop-up window.  Type in whatever name you want the file to have in the "File name" section. Click the "Browse Folders" arrow and select which folder you want to save the file in (I suggest you select "Desktop" so that the file will just appear right on your desktop home screen).  Finally, click the drop down arrow in the "Save as type" section and select "JPEG File".  Click "Save".  You should now have your file ready to upload into the assignment.
  • If you are using Apple/Mac:
    • You will need to take a screenshot of your output in order to upload it.  To take a screenshot, hold down Command+Shift+4 and drag the cross-hairs over the image to capture it.  The image will be saved as a .png file on your desktop by default.

  • To upload your file into the text box they provide:
    • Click the Resources option in the toolbar at the top of UMLearn, and select Locker.
    • Click the Upload Files button. 
    • In the pop-up menu that appears, click the Upload button down near the bottom of the pop-up screen, and select the file you wish to upload.  You can also drag and drop the file into this screen, if you prefer.
    • Click Save and the file should now appear in your list of files in your Locker.
  • Once the file has been uploaded, right-click the file in the locker and select Copy link location.
  • Now, return to the question you are answering.  In the toolbar at the top of the answer box where you type in all your answers for a question, click the icon on the far left called "Insert Image".
    • In the pop-up window that appears, paste the link you copied from your Locker and click the Add button.
    • The uploaded file should appear in your answer box.  You can click the Preview icon in the bottom right corner of your answer box if you want to confirm what it looks like.
Part (b)
This is just a matter of identifying the values of the scores depicted by dots above and below the whiskers of the boxplots for CFL and NFL.
Question 4
Part (a)
Make sure you compute the five-number summary by hand, as I demonstrate in Lesson 1, question 4.
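Once you have done it by hand, you could check yourself with a sketch like the one below.  Be careful: there is more than one convention for quartiles.  This sketch ASSUMES the "median of each half" approach (the overall median is left out of both halves when n is odd); if the steps in Lesson 1 or JMP give a slightly different quartile, trust those, not this code.  The function name and the data are made up.

```python
from statistics import median

def five_number_summary(data):
    """Min, Q1, median, Q3, max using the median-of-halves convention
    (an assumption; other quartile conventions give other answers)."""
    values = sorted(data)
    if len(values) < 2:
        raise ValueError("need at least two values")
    half = len(values) // 2
    lower = values[:half]     # values below the median position
    upper = values[-half:]    # values above the median position
    return (values[0], median(lower), median(values), median(upper), values[-1])

print(five_number_summary([2, 4, 5, 7, 8, 10, 13]))   # (2, 4, 7, 10, 13)
```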

Part (b)
They want you to make the standard boxplot, then comment on the shape you see.  However, don't even think about whether there are outliers or not!  They want you to assume there were no outliers at all (they don't even ask you to think about outliers until the next part).  In other words, do the whiskers make the distribution appear skewed?

Part (c)
Now, they want you to use the 1.5 IQR Rule to identify the cut-offs for outliers. 
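The 1.5 IQR Rule is just two subtractions and a multiplication.  Here is a minimal Python sketch with made-up quartiles (the function name outlier_fences is my own label): the lower cut-off is Q1 minus 1.5 times the IQR, and the upper cut-off is Q3 plus 1.5 times the IQR.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cut-offs given by the 1.5 IQR Rule."""
    iqr = q3 - q1                      # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Made-up quartiles, not the assignment data.
lower, upper = outlier_fences(4, 10)
print(lower, upper)                    # -5.0 19.0
```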

Part (d)
Now, count your outliers according to the limits found in part (c).
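Counting is then a matter of checking each score against the two cut-offs.  A rough Python sketch, with made-up scores and made-up cut-offs (count_outliers is a hypothetical name), treating a score as an outlier only if it falls strictly below the lower limit or strictly above the upper limit:

```python
def count_outliers(data, lower_fence, upper_fence):
    """Count values strictly below the lower fence or above the upper fence."""
    return sum(1 for x in data if x < lower_fence or x > upper_fence)

# Made-up scores and cut-offs, not the assignment data.
scores = [2, 4, 5, 7, 8, 10, 25]
print(count_outliers(scores, -5.0, 19.0))   # 1 (only the 25 is outside)
```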

Part (e)
Now, they want you to make the outlier boxplot which will almost certainly cause you to change your opinion about skewness.  They are trying to show you how it is important to identify outliers first before you comment on the shape of a distribution.
Question 5
Parts (a) and (b)
This question should be done by hand (i.e. with your calculator, not with JMP).  Use the Stat Mode on your calculator to compute the Mean and Standard Deviation.  Don't you dare waste your time using the formulas to compute the mean and standard deviation.  That is what your Stat Mode on your calculator is for!

Check the Appendix at the back of my book to learn how to use the Stat Mode on your calculator.  Here is a link to a digital copy of that appendix:

Make sure you round the answers off to 4 decimal places before proceeding to answer the other parts of the question.  Always use four decimal places throughout this course unless specifically instructed to do otherwise.

Part (c)
Consider this:  Let's say you are taking a course, and your average mark so far is 65.  What will happen to your average if you score higher on the next test?  What if you score lower on the next test?  What would you have to get on the next test to keep your average 65?
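You can try this thought experiment with a few made-up marks in Python (these are not the assignment scores).  A new score above the current average pulls the average up, a score below pulls it down, and a score equal to the current average leaves it where it is.

```python
from statistics import mean

marks = [60.0, 70.0, 65.0, 65.0]      # made-up marks
current = mean(marks)
print(current)                         # 65.0

print(mean(marks + [80.0]))            # 68.0, higher score raises the average
print(mean(marks + [50.0]))            # 62.0, lower score drops the average
print(mean(marks + [current]))         # 65.0, scoring the average keeps it
```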

Part (d)
There is no need to compute the standard deviation of these 8 scores!  Having decided what that new score must be in part (c), how much does that score deviate from the mean? If that is a larger deviation than the standard deviation you computed earlier, you have increased the overall standard deviation; if it is the same amount of deviation as earlier, you have not changed the standard deviation at all; if it has a smaller deviation, you have decreased your overall standard deviation. 

The closer a value is to the mean, the smaller its deviation from the mean.  Small deviations cause low standard deviations; large deviations cause high standard deviations.
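Here is a quick Python sketch of that idea with made-up scores (five of them, not the assignment's data).  Adding a score that sits right at the mean has zero deviation and pulls the standard deviation down; adding a score far from the mean pushes it up.

```python
from statistics import mean, stdev

scores = [55.0, 60.0, 65.0, 70.0, 75.0]        # made-up scores, mean 65
m = mean(scores)

print(round(stdev(scores), 4))                  # 7.9057
print(round(stdev(scores + [m]), 4))            # 7.0711, new score AT the mean
print(round(stdev(scores + [95.0]), 4))         # 14.1421, new score far away
```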
Question 6
First, you need to know the scores attached to each letter grade.  An A+ is 4.5, A is 4, B+ is 3.5, B is 3, etc.

To compute your grade point average:
First, make a new column where you multiply each grade score by the number of credit hours.  For example, if you got a B+ in a 3 credit-hour course, you would multiply 3.5 by 3 to get 10.5 in this new column. Find the total of this new column and find the total number of credit hours.  Divide the total of the new column by the total number of credit hours to get the GPA.  Put another way, if you got a B+ in a 3 credit-hour course, it is as though you scored 3.5 three separate times.  You could put your calculator in Stat Mode and enter 3.5 three separate times.  If you got an A in a 6 credit-hour course, you got 4.0 six times.  Enter 4.0 six separate times.  After you have entered all the data, your calculator will tell you the mean (your GPA).

An easy way to think of grade points is to consider the number of credit hours as the frequency of that grade.  Getting an A in a 3 credit-hour course is like getting an A 3 separate times.  Getting a C in a 6 credit-hour course is like getting a C 6 separate times.  It is like finding the average of three A's and six C's.  The credit hours add weight to each score.
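The weighted-average recipe above can be sketched in Python like this.  The course list is made up, the name gpa is my own label, and the lookup table only includes the grade scores I listed above (add the rest yourself).

```python
# Grade scores named in these tips; the remaining letter grades are omitted.
GRADE_POINTS = {"A+": 4.5, "A": 4.0, "B+": 3.5, "B": 3.0}

def gpa(courses):
    """courses is a list of (letter grade, credit hours) pairs."""
    total_points = sum(GRADE_POINTS[grade] * hours for grade, hours in courses)
    total_hours = sum(hours for _, hours in courses)
    return total_points / total_hours

# Made-up transcript: a B+ in a 3 credit-hour course, an A in a 6 credit-hour course.
courses = [("B+", 3), ("A", 6)]
print(round(gpa(courses), 4))      # 3.8333, i.e. (10.5 + 24) / 9
```

This gives the same answer as entering 3.5 three times and 4.0 six times in Stat Mode and asking for the mean.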