Stat 1000: Assignment 1 Tips (Classroom Lecture Sections)
Published: Wed, 01/30/13
My tips for Assignment 1 are coming below, but first a couple of announcements.
Please note that my first two-day review seminar for
Stat 1000 will be on Saturday, Feb. 2 and Sunday, Feb. 3, in room 100 St. Paul's College,
from 9 am to 6 pm each day. This seminar will cover the lessons in Volume 1 of my book.
For more info about the seminar, and to register if you have not done so already, click this link:
Please note that I am now taking registrations for my midterm exam
prep seminars. Please click this link for more info and to register, if
you are interested:
Make sure you read: Tips on How to Do Well in Stat 1000
Did you read my Tips on what kind of calculator you should get?
If you are taking the course by Distance/Online (Sections D01, D02, etc.), I sent tips for Assignment 1 long ago. Check my archive:
Tips for Assignment 1 (Classroom Lecture Sections A01, A02, A03, etc.)
Don't have my book? You can download a free sample containing Lesson 1 at my website here:
Study Lesson 1 in my study book (if you have it) to learn the concepts involved in Assignment 1.
For the JMP 10 part of the assignment, here are some tips:
If you have not done so already, you need to download JMP to
your computer. Here is the direct link where you can get it (you need
to know your UMNET ID and password):
Once you have installed JMP 10 and opened it, you are shown a
menu with various buttons to click. You will almost always click "New
Data Table" to enter new data. That is the icon on the far left of the
top toolbar (it looks like a tiny spreadsheet with a yellow star;
point your mouse at it and you should see the label "New Data Table"
pop up).
In the rare event that they have given you a
JMP file with the data already entered in it, simply open that
file, which will probably launch JMP for you. Otherwise, click the
"Open" icon on the same toolbar as the "New Data Table" icon, or, if you
already see the file in the "Recent Files" screen, simply double-click
it. If you
enter the data yourself and save the file (a good idea), you can select
"Open" to open the saved file later.
Question 2 deals with some aspects of
quantitative distributions. Remember that a frequency table is a
precursor to a histogram. Visualize the histogram (don't actually make a
histogram, just picture it) to help answer the questions. Remember,
quartiles break the data up into 25% sections. You cannot actually
compute the median, mean or quartiles because you do not have the actual
data. You don't need to. The shape of the distribution is enough to
know if the mean is larger, smaller or the same as the median. And, you
do know the sample size, n, (the total count in the Frequency column),
so you can use the steps I teach in Lesson 1 to find the location of any quartile. Then just count through the intervals. You know how much data
is in the first interval, second interval, etc., so you know which
interval must contain the first, second or third quartile.
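If you want to double-check your counting, the "count through the intervals" step can be sketched in a few lines of Python. This is only an illustration: the frequencies below are made up (they are not the assignment's data), the function name interval_of_rank is my own, and you still need the location rule from Lesson 1 to decide which position you are looking for.

```python
from itertools import accumulate

def interval_of_rank(frequencies, rank):
    """Return the 0-based index of the interval that holds the
    rank-th smallest observation (rank starts at 1)."""
    for index, cumulative in enumerate(accumulate(frequencies)):
        if rank <= cumulative:
            return index
    raise ValueError("rank exceeds the sample size")

# Hypothetical frequency column (NOT the assignment's data).
frequencies = [4, 9, 15, 8, 4]
n = sum(frequencies)  # sample size = total of the Frequency column

# With n = 40 the median lies between the 20th and 21st values.
print(n, interval_of_rank(frequencies, 20), interval_of_rank(frequencies, 21))
```

In this made-up table the running totals are 4, 13, 28, 36, 40, so the 20th and 21st values both fall in the third interval, and that interval must contain the median.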
Question 3 is to be done by hand. Note that part
(b) wants you to mis-read the skew or symmetry by looking at the
whiskers' full length, regardless of whether there are outliers. They
give you the chance to properly describe the shape later, after you have
done a modified boxplot.
Do not use JMP for the stemplot in part (a). You
can just type the stemplot directly into the textbox they provide. Note that you are told to trim the leaves. That means that you cut away the second decimal place (don't round off, just cut it off as though it was never there in the first place). For example, 17.37 would be trimmed to 17.3. That means 3 is the leaf and 17 is the stem. Make sure you split the stems, and also make sure you include a note explaining that the leaves are decimal values.
Use the vertical
line on your computer keyboard to separate the stem from the leaves
("SHIFT \" will give you " | "). Don't worry if your columns don't end
up perfectly lined up; just do the best you can. Be sure to label the
first line in your stemplot "Stem | Leaf", then enter all the stems and
leaves row-by-row underneath. Don't forget to comment on the shape of the distribution (peaks, symmetric, left-skewed, right-skewed, outliers).
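To see what trimming and splitting the stems look like in practice, here is a rough Python sketch. The data values are invented for illustration (only 17.37 comes from my example above), and the function names trim and split_stemplot are my own. You still need to type your own stemplot by hand; this just shows the idea.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_DOWN

def trim(value):
    """Cut off everything after the first decimal place (no rounding)."""
    return Decimal(value).quantize(Decimal("0.1"), rounding=ROUND_DOWN)

def split_stemplot(values):
    """Build a split-stem stemplot: leaves 0-4 on the first row of each
    stem, leaves 5-9 on the second row. Assumes positive values."""
    rows = defaultdict(list)
    for value in values:
        trimmed = trim(value)
        stem = int(trimmed)              # whole-number part
        leaf = int(trimmed * 10) % 10    # first decimal digit
        half = 0 if leaf <= 4 else 1
        rows[(stem, half)].append(leaf)
    stems = [stem for stem, _ in rows]
    lines = ["Stem | Leaf"]
    for stem in range(min(stems), max(stems) + 1):
        for half in (0, 1):
            leaves = "".join(str(leaf) for leaf in sorted(rows.get((stem, half), [])))
            lines.append(f"{stem:>4} | {leaves}")
    lines.append("Note: leaves are tenths (17 | 3 means 17.3)")
    return "\n".join(lines)

# Made-up values, entered as text so that no digits are lost.
print(split_stemplot(["17.37", "17.82", "18.05", "18.49", "16.91"]))
```

Notice that 17.82 becomes 17.8, not 17.9 or 17.82, and that each stem appears twice because the stems are split.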
If you want to make sure your stemplot looks prettier, you
could also make the stemplot in your word processing program (such as
Word), using the method I outline above, and using spaces or tabs to
make sure everything lines up nicely. Don't forget to write in your
comment on the shape, too. Then save the file as a PDF and upload it
into the textbox.
To make the histogram in question 4(b):
First, enter the data into JMP manually:
Click the "New Data
Table" icon on the toolbar at top left in the JMP home screen. You are
automatically taken to an empty spreadsheet with one
column. Double-click "Column 1" and change its name to "Day", or
right-click "Column 1" and select "Column Info" and type in the name
"Day" and click OK. You will need to make a second column, too. Double-click the
region to the right of your "Day" column at the top to create "Column
2". Double-click "Column 2" and rename it "Price" or right-click "Column
2" and select Column Info and rename it "Price."
Now just type the days and the price for each day into the appropriate cells, using your "Tab" or down arrow button
to move to each succeeding cell. You can also hit "Enter" after each
piece of data to enter it and move to the next cell. I suggest you
enter the data like this:
Type in "1" in row 1 of the Day column to represent Day 1,
press "Tab" to be automatically moved into the "Price" column. Type in
the price for Day 1. Press "Tab" again to be automatically
moved to row 2 in the Day column. Continue typing in the data,
pressing "Tab" to move from cell to cell.
Once you have entered all the data down your columns, you are ready to make your histogram. In the toolbar at the
top, select Analyze then select Distribution. In the "Select Columns" part of the pop-up window, click the column you
want the histogram for ("Price" in this case) to highlight it, and click the Y, Columns
button. You should see the "Price" column appear in the section to the right of the "Y, Columns" button. Click OK.
It now opens yet another pop-up window called "Distributions"
where your histogram should appear. Your histogram appears sideways. They want to
see it the typical way,
so click the red triangle next to "Price" above the histogram and
select Histogram Options from the drop-down menu. Deselect "Vertical"
and it will turn it the proper way.
Now, move your cursor below the horizontal axis of the
histogram and double-click to see a pop-up window titled "X Axis
Specification". Type "18.8" for the Minimum and "22.2" for the Maximum, as they have requested. In the
box labeled "Increments," type in "0.5". In the box labeled "# Minor
Ticks", type 0, then click OK. You should now see a histogram with a
scale counting by 0.5's, where each bar is 0.5 units wide (one bar will be from
18.8 to 19.3, for example) and the scale counts up to 22.2. The minor ticks
subdivide the bars in your histogram. For example, if you set the
number of minor ticks as "1", you will see one tick between each mark on
the scale (i.e. the scale would now go by 0.25's); if you set the number
of minor ticks as "2", you will get 2 ticks between each mark on the
scale.
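If you would like a rough check on the bar heights JMP draws, the Python sketch below counts how many prices fall in each 0.5-wide bar starting at 18.8. The prices are made up, the function name bin_counts is my own, and I am assuming that a price sitting exactly on a boundary (19.3, say) is counted in the higher bar and that prices are recorded to one decimal place. Compare against your own JMP output rather than trusting those assumptions.

```python
def bin_counts(prices, start=18.8, width=0.5, n_bins=7):
    """Count prices in bars [start, start+width), [start+width, ...), etc.
    Works in whole tenths so that 18.8 + 0.5 lands exactly on 19.3."""
    start_tenths = round(start * 10)
    width_tenths = round(width * 10)
    counts = [0] * n_bins
    for price in prices:
        index = (round(price * 10) - start_tenths) // width_tenths
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Hypothetical prices (NOT the assignment's data).
prices = [18.9, 19.3, 19.4, 20.1, 20.7, 21.0, 22.1]
print(bin_counts(prices))
```

Seven bars of width 0.5 run from 18.8 to 22.3, which is enough to cover the requested maximum of 22.2.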
If you want to hide all the other parts of the output (but
they said you don't have to), click that same red triangle again and
deselect "Outlier Box Plot" and anything else that has a check mark next
to it. Click the red triangle again, select "Display Options" and
deselect "Quantiles" and "Summary Statistics" to make those parts
disappear. Alternatively, you can make the Quantiles and Summary
Statistics disappear if you simply click the gray triangles (to the left
of the red triangles) next to their title bars. Click the gray
triangles again to make them reappear.
Press "Alt" on your keyboard or click the thin blue line that
is near the top of the window to get the toolbar icons to appear. Select "File" then "Save As" to get a pop-up window. Type in whatever name you want the file to have in the "File name" section. Click
the "Browse Folders" arrow and select which folder you want to save the
file in (I suggest you select "Desktop" so that the file will just
appear right on your desktop home screen). Finally, click the drop down
arrow in the "Save as type" section and select "PDF File". Click
"Save". You should now have your file ready to upload into the
assignment.
To upload your file into the text box they provide:
Click "HTML editor" below the text box to make a toolbar appear in the
text box. Click the toolbar option called "Link" and select
"Website/Uploaded File." In the pop-up window that appears, click the
button called "Find/Upload File" (it is at the bottom of the pop-up
window, you may have to enlarge the box or scroll down to see it).
Click the "Browse" button and find the histogram file you just saved.
Either double-click that file or select it and click "Open" and you
should see the path to that file appear in the Browse box. Click
"Upload File" and its name should appear in the "Uploaded Files" pop-up
window. Select the file in the list of "Uploaded Files" to highlight it
and click OK and you should see the file appear in the text box.
Question 4(c). To make a Time Series or Time Plot:
First, return to your data table where you entered in the Day and Price data for part (b). You should see the data file in the "Window
List" on the right in the JMP home screen. Double-click the name of the
file (perhaps it is called "Untitled" if you never saved it) to open up
the data table.
You are now ready to make the time series. Select Analyze in the
toolbar, then select Modeling in the drop-down list and finally select
Time Series. Select your time variable "Day" and click "X, Time ID", then select
the variable you are tracking, "Price", and click "Y, Time Series". Click OK.
Just ignore that other pop-up menu asking about time lags or
autocorrelations or whatever, click OK and move on. None of that has
anything to do with the time series.
You should now be looking at your Time Series with "Day" on the horizontal axis and "Price" on the vertical axis. Click the red triangle next to "Time Series Price" and deselect
"Autocorrelation" and "Partial Autocorrelation" to remove those parts
of the output. Click the red triangle again, select "Graph" then
deselect "Mean Line". That removes the horizontal line in your time
series showing the mean price.
They also ask you to describe the overall trend. You could
just type your description into the text box they provide after you have
uploaded the time series file. Alternatively, you can do
this by adding a note to the graph itself. Press "Alt" on your
keyboard, or click the thin blue line near
the top of the window that has the time series to reveal the
toolbar. Select the "Annotate" icon (it looks like a little white
notecard with a blue "T" on it and is near the right end of the
toolbar). Click a nice empty space in the output window below the time
series and drag to make
yourself a nice big box to type in your comments, being careful not to
place the box anywhere it will block out your graphs. Describe the
trend you see. One or two sentences is sufficient. If the trend is
consistent, say so. If there is a change in the trend, say so, and
indicate where the change occurred. (I am not saying there is a
change. I have no idea, I haven't made the graph.)
Press "Alt" on your keyboard or click the thin blue line that is near the top of the window to get the toolbar icons to appear. Select "File" then "Save As" to get a pop-up window. Type in whatever name you want the file to have in the "File name" section. Click
the "Browse Folders" arrow and select which folder you want to save the
file in (I suggest you select "Desktop" so that the file will just
appear right on your desktop home screen). Finally, click the drop down
arrow in the "Save as type" section and select "PDF File". Click
"Save". You should now have your file ready to upload into the
assignment.
To upload your file into the text box they provide: Click "HTML
editor" below the text box to make a toolbar appear in the text box.
Click the toolbar option called "Link" and select "Website/Uploaded
File." In the pop-up window that appears, click the button called
"Find/Upload File" (it is at the bottom of the pop-up window, you may
have to enlarge the box or scroll down to see it). Click the "Browse"
button and find the time series file you just saved. Either double-click
that file or select it and click "Open" and you should see the path to
that file appear in the Browse box. Click "Upload File" and its name
should appear in the "Uploaded Files" pop-up window. Select the file in
the list of "Uploaded Files" to highlight it and click OK and you
should see the file appear in the text box.
Question 5 uses the CFL and NFL data they provide.
Open a "New Data Table" in JMP. You will make two columns, but not the way you might think. Do not put the CFL data in one column and the NFL data in another!
Double-click Column 1 (or right-click and select Column Info) and name it "Points Scored". Type all 20 scores from the CFL data followed by the 17 NFL scores giving you a total of 37 rows in the first column. Double-click the region to the right of Column 1 at the top to create Column 2 and name that column "League." Type CFL in the first twenty rows of that column (better yet, type it once, copy and paste it into the next nineteen rows; that way you ensure it is typed exactly the same in all twenty rows as is necessary). Then type (or copy and paste) NFL in the remaining rows of column 2.
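The layout described above (all the scores in one column, the league label beside each score) is sometimes called "stacked" data. Here is a small Python sketch of the same idea. The scores are invented and far fewer than the 20 CFL and 17 NFL scores in the assignment, and the function name stack is my own.

```python
def stack(groups):
    """Turn {label: [scores]} into one row per score: (score, label)."""
    rows = []
    for label, scores in groups.items():
        for score in scores:
            rows.append((score, label))
    return rows

# Made-up scores (NOT the assignment's data).
rows = stack({"CFL": [24, 31, 17], "NFL": [20, 27]})

print("Points Scored  League")
for score, league in rows:
    print(f"{score:>13}  {league}")
```

Every row carries its own league label, which is what lets "Fit Y By X" split the scores into two groups.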
To make the side-by-side boxplots: Select "Analyze" then "Fit Y By X".
Highlight "Points Scored" and click "Y, Response". Highlight "League" and click
"X, Factor". Click OK. This should open a pop-up window with a bunch
of dots arranged vertically on a graph for "CFL" and "NFL".
If that does not happen, return to the data table and double-click each column (or right-click and select Column Info). The Points Scored column should have Data Type as "numeric" and Modeling Type as "continuous." Change those settings if not. The League column should have Data Type as "character" and Modeling Type as "nominal." Change those settings if not.
Now click the red triangle next to "Oneway Analysis ..." and
select "Quantiles." Your side-by-side box plots should appear on the
graph as well as a Quantiles output below that shows you the five-number
summary among other things. Click the red triangle again and select
"Display
Options" (down near the bottom of the menu), then deselect "Grand Mean"
to get rid of the horizontal line in the graph showing the mean of
all the scores. Click the red triangle once more, select Display
Options again, and select "Points Jittered" to spread out the data
points on the graph so you can identify all the data points.
You should be able to see the outliers on the graph. You can click a point that is an outlier and then return to
the data table (in the Window List of the JMP home screen)
and scroll down the table to see the corresponding row highlighted. You can now easily read off the scores that are outliers.
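If you want a second opinion on which points count as outliers, the sketch below applies the 1.5 x IQR rule that is commonly used for modified boxplots. Treat it as a rough check only: the scores are made up, the function names are my own, and Python's statistics.quantiles may compute quartiles a little differently from JMP or from the by-hand method in your course, so the fences can come out slightly different.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using statistics.quantiles."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, median, q3, max(data)

def outliers(data):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Hypothetical scores (NOT the CFL or NFL data).
scores = [10, 12, 13, 14, 15, 16, 18, 40]
print(five_number_summary(scores))
print(outliers(scores))
```

You would run this once per league, just as the side-by-side boxplots treat CFL and NFL separately.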
They ask you to compare the distributions according to shape,
centre and spread and identify the outliers. You could
just type your comments into the text box they provide after you have
uploaded the boxplot file. Alternatively, you can do
this by adding a note to the graph itself. Press "Alt" on your
keyboard, or click the thin blue line near
the top of the window that has the side-by-side boxplots to reveal the
toolbar. Select the "Annotate" icon (it looks like a little white
notecard with a blue "T" on it and is near the right end of the
toolbar). Click a nice empty space in the output window below the
quantiles and drag to make
yourself a nice big box to type in your comments, being careful not to
place the box anywhere it will block out your graphs. Answer their
question. One or two sentences is sufficient. Don't forget to identify
the scores that are outliers.
To upload your file into the text box they provide: Click "HTML
editor" below the text box to make a toolbar appear in the text box.
Click the toolbar option called "Link" and select "Website/Uploaded
File." In the pop-up window that appears, click the button called
"Find/Upload File" (it is at the bottom of the pop-up window, you may
have to enlarge the box or scroll down to see it). Click the "Browse"
button and find the boxplot file you just saved. Either double-click
that file or select it and click "Open" and you should see the path to
that file appear in the Browse box. Click "Upload File" and its name
should appear in the "Uploaded Files" pop-up window. Select the file in
the list of "Uploaded Files" to highlight it and click OK and you
should see the file appear in the text box.
Question 6 should be done by hand (i.e. with your
calculator, not with JMP). Use the Stat Mode on your calculator to
compute the Mean and Standard Deviation. Check the Appendix at the back
of my book to learn how to use the Stat Mode on your calculator. Here
is a link to a digital copy of that appendix:
Note that 6(c) is introducing a
key concept about changing the units in data. Be sure to read the
"Effect of Changing Units on Centre and Spread" section of my book in
Lesson 1 and see questions 17 and 18 for examples. As they say, once you know the mean and standard deviation from parts (a) and (b), you can convert them into the mean and standard deviation for part (c) using the conversion formula they gave you at the start. But do it properly! Remember, you apply the formula differently depending on whether you are converting a measure of centre or a measure of spread.
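Here is a small Python sketch of that idea for a conversion of the form new = a + b * old. I am using Celsius to Fahrenheit (a = 32, b = 1.8) purely as a stand-in, since the assignment's own conversion formula is not reproduced here, and the data are made up. The mean goes through the whole formula, while the standard deviation is only multiplied by the size of b because adding a constant does not change the spread.

```python
import statistics

def convert_mean(mean, a, b):
    """A measure of centre uses the full formula: a + b * mean."""
    return a + b * mean

def convert_sd(sd, b):
    """A measure of spread ignores the added constant: |b| * sd."""
    return abs(b) * sd

# Hypothetical data and a stand-in conversion (Celsius to Fahrenheit).
celsius = [10.0, 20.0, 30.0]
a, b = 32.0, 1.8
fahrenheit = [a + b * c for c in celsius]

# Converting the summary statistics matches recomputing from scratch.
print(convert_mean(statistics.mean(celsius), a, b), statistics.mean(fahrenheit))
print(convert_sd(statistics.stdev(celsius), b), statistics.stdev(fahrenheit))
```

Both lines print a matching pair, which is why you do not need to convert every data value before finding the new mean and standard deviation.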
In 6(d),
consider this: Let's say you are taking a course, and your average
mark so far is 65. What will happen to your average if you score higher
on the next test? What if you score lower on the next test? What
would you have to get on the next test to have your average stay 65? For 6(e),
having decided what that 8th score must be in part (d), how much does
that score deviate from the mean? If that is a larger deviation than the
standard deviation you computed earlier, you have increased the overall
standard deviation; if it is the same amount of deviation as earlier,
you have not changed the standard deviation at all; if it has a smaller
deviation, you have decreased your overall standard deviation.
Question 7: First, you need to know the scores
attached to each letter grade. An A+ is 4.5, A is 4, B+ is 3.5, B is 3,
etc. To compute your grade point average:
First, make a new column where you multiply each grade score by the
number of credit hours. For example, if you got a B+ in a 3
credit-hour course, you would multiply 3.5 by 3 to get 10.5 in this new
column. Find the total of this new column and find the total number of
credit hours. Divide the total of the new column by the total number of
credit hours to get the GPA. Put another way, if you got a B+ in a 3
credit-hour course, it is as though you scored 3.5 three separate
times. You could put your calculator in Stat Mode, and enter 3.5 in
three separate times. If you got an A in a 6 credit-hour course, you
got 4.0 six times. Enter 4.0 six separate times. After you have
entered all the data, your calculator will tell you the mean (your GPA).
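The two approaches above (the weighted total, and entering each grade once per credit hour) give the same answer, as this short Python sketch shows. The two courses are the examples I used above; the table only lists the grade scores I mentioned, and the function name gpa is my own.

```python
import statistics

# Only the grade scores listed above; extend the table for other letters.
GRADE_POINTS = {"A+": 4.5, "A": 4.0, "B+": 3.5, "B": 3.0}

def gpa(courses):
    """courses is a list of (letter grade, credit hours) pairs."""
    total_points = sum(GRADE_POINTS[grade] * hours for grade, hours in courses)
    total_hours = sum(hours for _, hours in courses)
    return total_points / total_hours

courses = [("B+", 3), ("A", 6)]
print(round(gpa(courses), 2))  # (10.5 + 24) / 9, about 3.83

# Stat Mode approach: enter each grade score once per credit hour.
repeated = [3.5] * 3 + [4.0] * 6
print(round(statistics.mean(repeated), 2))
```

The B+ contributes 10.5 and the A contributes 24, for a total of 34.5 over 9 credit hours.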