Stat 1000: Tips for Assignment 1
Published: Wed, 09/26/12
Please note that my first midterm exam prep seminar for
Stat 1000 will be on Saturday, Oct. 6, in room 100 St. Paul's College,
from 9 am to 9 pm. For complete info about the seminar, and to register if you are interested, click this link:
Did you miss my Tips on How to Do Well in this Course? Click here
Did you miss my Tips on what kind of calculator you should get? Click here
If you are taking the course by classroom lecture (Sections A01, A02, etc.), click here for my tips for your Assignment 1.
Tips for Assignment 1 (Sections A01, A02, etc.)
Study Lesson 1 in my study book (if you have it) to learn the concepts involved in Assignment 1.
For the JMP 10 part of the assignment, here are some tips:
If you have not done so already, you need to download JMP to
your computer. Here is the direct link where you can get it (you need
to know your UMNET ID and password):
Once you have installed JMP 10 and opened it, you are shown a
menu with various buttons to click. You will almost always click "New
Data Table" to enter new data. That is the icon on the far left of the
top toolbar (it looks like a tiny little spreadsheet with a yellow star;
point your mouse at it and you should see the label "New Data Table"
pop up).
In the rare event they have given you a
JMP file with the data already entered in it, you will simply open that
file, which will probably launch JMP for you. Just click the
"Open" icon on the same toolbar as the "New Data Table" icon, or, if you
already see the file in the "Recent Files" screen, simply double-click
that. If you
enter data yourself and save the file (a good idea), you can select
"Open" to open up the saved file.
Question 2 deals with some aspects of quantitative distributions. Remember that a frequency table is a precursor to a histogram. Visualize the histogram (don't actually make a histogram, just picture it) to help answer the questions. Remember, quartiles break the data up into 25% sections. You cannot actually compute the median, mean or quartiles because you do not have the actual data. You don't need to. The shape of the distribution is enough to know if the mean is larger, smaller or the same as the median. And, you do know the sample size, n, (the total count in the Frequency column), so you can use the steps I teach in Lesson 1 to find the location of Q3. Then just count back through the intervals. You know how much data is in the last interval, second-last interval, etc., so you know which interval must contain the third quartile.
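Here is a minimal Python sketch of the counting idea in Question 2, using a made-up frequency table (not the assignment's data). It assumes the common textbook convention that Q3 is the median of the observations above the median's location; the steps in Lesson 1 may differ in detail, so treat `q3_position` as an illustration only. It counts forward with running totals, which lands on the same interval as counting back from the last interval.

```python
from itertools import accumulate

def q3_position(n):
    """Position of Q3 in the sorted data, ASSUMING Q3 is the median of the
    observations above the overall median's location."""
    upper_half = n // 2                       # observations above the median
    return (n - upper_half) + (upper_half + 1) / 2

def interval_containing(position, intervals, counts):
    """Return the first interval whose running total reaches `position`."""
    for interval, running_total in zip(intervals, accumulate(counts)):
        if position <= running_total:
            return interval
    raise ValueError("position is larger than the sample size")

# Hypothetical frequency table: interval labels and their counts.
intervals = ["0-10", "10-20", "20-30", "30-40", "40-50"]
counts = [4, 9, 12, 10, 5]

n = sum(counts)                               # sample size, 40
pos = q3_position(n)                          # 30.5, between the 30th and 31st values
print(n, pos, interval_containing(pos, intervals, counts))
```

With these made-up counts the running totals are 4, 13, 25, 35 and 40, so the 30th and 31st values both fall in the "30-40" interval.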
Question 3 requires the use of JMP.
I assume they do not want you to use JMP for the stemplot in part (a). Note that you are making the stemplot for the POINTS, not the games. You can just type the stemplot directly into the textbox they provide. Use the vertical
line on your computer keyboard to separate the stem from the leaves
("SHIFT \" will give you " | "). Don't worry if your columns don't end
up perfectly lined up, just do the best you can. Be sure to label the
first line in your stemplot "Stem | Leaf", then enter all the stems and
leaves row-by-row underneath. Don't forget to comment on the shape of the distribution (peaks, symmetric, left-skewed, right-skewed, outliers.)
If you want to make sure your stemplot looks prettier, you could also make the stemplot in your word processing program (such as Word), using the method I outline above, and using spaces or tabs to make sure everything lines up nicely. Don't forget to write in your comment on the shape, too. Then save the file as a PDF and upload it into the textbox.
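If you would like to check your typed stemplot against something, here is a small Python sketch that lays one out in the "Stem | Leaf" format described above. The points listed are made up for illustration (they are not the assignment's data), and the sketch assumes the tens (and hundreds) form the stem and the ones digit is the leaf.

```python
from collections import defaultdict

def stemplot(values):
    """Build a stemplot as text: stem = value // 10, leaf = ones digit."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(str(v % 10))
    lines = ["Stem | Leaf"]
    # Walk every stem from smallest to largest so empty stems still get a row.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem:>4} | " + "".join(leaves[stem]))
    return "\n".join(lines)

# Hypothetical points scored in ten games.
points = [98, 105, 87, 112, 93, 101, 99, 120, 95, 108]
print(stemplot(points))
```

The code does not comment on the shape for you; you still have to look at the rows and describe the peaks, skew and outliers yourself.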
To make the histogram in part (b):
First, enter the data into JMP manually: Click the "New Data
Table" icon on the toolbar at top left in the JMP home screen. You are automatically taken to an empty spreadsheet with one
column. Double-click "Column 1" and change its name to "Game", or right-click "Column 1" and select "Column Info" and type in the name "Game" and click OK. You will need to make a second column, too. Double-click the
region to the right of your "Game" column at the top to create "Column
2". Double-click "Column 2" and rename it "Points" or right-click "Column
2" and select Column Info and rename it "Points."
Now just type the game numbers and the points for each game into the appropriate cells, using your "Tab" or down arrow button to move to each succeeding cell. You can also hit "Enter" after each piece of data to enter it and move to the next cell. I suggest you enter the data like this:
Type in "1" in row 1 of the Game column to represent Game 1, press "Tab" to be automatically moved into the "Points" column. Type in the points scored in Game 1. Press "Tab" again to be automatically moved to row 2 in the Game column. Continue typing in the data, pressing "Tab" to move from cell to cell.
Once you have entered all the data down your columns, you are ready to make your histogram. In the toolbar at the
top, select Analyze then select Distribution. In the "Select Columns" part of the pop-up window, click the column you
want the histogram for ("Points" in this case) to highlight it, and click the Y, Columns
button. You should see the "Points" column appear in the section to the right of the "Y, Columns" button. Click OK.
It now opens yet another pop-up window called "Distributions"
where your histogram should appear. Your histogram appears sideways. They want to
see it the typical way,
so click the red triangle next to "Points" above the histogram and
select Histogram Options from the drop-down menu. Deselect "Vertical"
and it will turn it the proper way.
Now, move your cursor below the horizontal axis of the histogram and double-click to see a pop-up window titled "X Axis Specification". In the box labeled "Maximum," type in "130." In the box labeled "Increments," type in "10." In the box labeled "# Minor Ticks", type 0, then click OK. You should now see a histogram with a scale counting by tens up to 130, where each bar is ten units wide (one bar will be from 80 to 90, for example). The minor ticks subdivide the bars in your histogram: if you set the number of minor ticks to "1", you will see one tick between each ten on the scale (i.e. the scale would now go by fives); if you set the number of minor ticks to "2", you will get two ticks between each ten on the scale.
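To see what a scale that counts by tens does to the data, here is a rough Python sketch that sorts made-up points (not the assignment's data) into bars ten units wide. It assumes a value sitting exactly on a boundary, such as 90, is counted in the bar that starts there (90 to 100); that is an assumption for the sketch, so check how JMP treats boundary values before relying on it.

```python
from collections import Counter

def bin_counts(values, width=10):
    """Count the values in each bar [lower, lower + width).
    ASSUMPTION: the left edge of each bar is included, the right edge is not."""
    counts = Counter((v // width) * width for v in values)
    lows = range(min(counts), max(counts) + width, width)
    return {low: counts[low] for low in lows}   # empty bars show up as 0

# Hypothetical points scored in ten games.
points = [98, 105, 87, 112, 93, 101, 99, 120, 95, 108]

for low, count in bin_counts(points).items():
    print(f"{low:>3} to {low + 10:<3} | " + "*" * count)
```

Each row of stars is one bar of the histogram turned on its side, much like the sideways histogram JMP first shows you.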
If you want to hide all the other parts of the output (but they said you don't have to), click that same red triangle again and deselect "Outlier Box Plot" and anything else that has a check mark next to it. Click the red triangle again, select "Display Options" and deselect "Quantiles" and "Summary Statistics" to make those parts disappear. Alternatively, you can make the Quantiles and Summary Statistics disappear if you simply click the gray triangles (to the left of the red triangles) next to their title bars. Click the gray triangles again to make them reappear.
Press "Alt" on your keyboard or click the thin blue line that
is near the top of the window to get the toolbar icons to appear. Select "File" then "Save As" to get a pop-up window. Type in whatever name you want the file to have in the "File name" section. Click
the "Browse Folders" arrow and select which folder you want to save the
file in (I suggest you select "Desktop" so that the file will just
appear right on your desktop home screen). Finally, click the drop-down
arrow in the "Save as type" section and select "PDF File". Click
"Save". You should now have your file ready to upload into the
assignment.
To upload your file into the text box they provide: Click "HTML editor" below the text box to make a toolbar appear in the text box. Click the toolbar option called "Link" and select "Website/Uploaded File." In the pop-up window that appears, click the button called "Find/Upload File" (it is at the bottom of the pop-up window, you may have to enlarge the box or scroll down to see it). Click the "Browse" button and find the histogram file you just saved. Either double-click that file or select it and click "Open" and you should see the path to that file appear in the Browse box. Click "Upload File" and its name should appear in the "Uploaded Files" pop-up window. Select the file in the list of "Uploaded Files" to highlight it and click OK and you should see the file appear in the text box.
Question 3, part (c). To make a Time Series or Time Plot: First, return to your data table where you entered in the Game and Points data for part (b). You should see the data file in the "Window List" on the right in the JMP home screen. Double-click the name of the file (perhaps it is called "Untitled" if you never saved it) to open up the data table.
You are now ready to make the time series. Select Analyze in the
toolbar, then select Modeling in the drop-down list and finally select
Time Series. Select your time variable "Game" and click "X, Time ID", then select
the variable you are tracking, "Points", and click "Y, Time Series". Click OK.
Just ignore that other pop-up menu asking about time lags or
autocorrelations or whatever, click OK and move on. None of that has
anything to do with the time series.
You should now be looking at your Time Series with "Game" on the horizontal axis and "Points" on the vertical axis. Click the red triangle next to "Time Series Points" and deselect "Autocorrelation" and "Partial Autocorrelation" to remove those parts of the output. Click the red triangle again, select "Graph" then deselect "Mean Line". That removes the horizontal line in your time series showing the mean points score.
They also ask you to describe the overall trend. You could just type your description into the text box they provide after you have uploaded the time series file. Alternatively, you can do
this by adding a note to the graph itself. Press "Alt" on your keyboard, or click the thin blue line near
the top of the window that has the time series to reveal the
toolbar. Select the "Annotate" icon (it looks like a little white
notecard with a blue "T" on it and is near the right end of the
toolbar). Click a nice empty space in the output window below the time series and drag to make
yourself a nice big box to type in your comments, being careful not to
place the box anywhere it will block out your graphs. Describe the trend you see. One or two sentences is sufficient. If the trend is consistent, say so. If there is a change in the trend, say so, and indicate where the change occurred. (I am not saying there is a change. I have no idea, I haven't made the graph.)
Press "Alt" on your keyboard or click the thin blue line that is near the top of the window to get the toolbar icons to appear. Select "File" then "Save As" to get a pop-up window. Type in whatever name you want the file to have in the "File name" section. Click
the "Browse Folders" arrow and select which folder you want to save the
file in (I suggest you select "Desktop" so that the file will just
appear right on your desktop home screen). Finally, click the drop-down
arrow in the "Save as type" section and select "PDF File". Click
"Save". You should now have your file ready to upload into the
assignment.
To upload your file into the text box they provide: Click "HTML
editor" below the text box to make a toolbar appear in the text box.
Click the toolbar option called "Link" and select "Website/Uploaded
File." In the pop-up window that appears, click the button called
"Find/Upload File" (it is at the bottom of the pop-up window, you may
have to enlarge the box or scroll down to see it). Click the "Browse"
button and find the time series file you just saved. Either double-click
that file or select it and click "Open" and you should see the path to
that file appear in the Browse box. Click "Upload File" and its name
should appear in the "Uploaded Files" pop-up window. Select the file in
the list of "Uploaded Files" to highlight it and click OK and you
should see the file appear in the text box.
Question 4 is to be done by hand. Note that part (b) wants you to mis-read the skew or symmetry by looking at the whiskers' full length, regardless of whether there are outliers. They give you the chance to properly describe the shape later, after you have done a modified boxplot.
Question 5 uses the Oscar Winners data. First click the "Resources" tab and click the Oscar Winners jmp file to open the data directly in JMP. Alternatively, you can download the file to your computer as they instruct if you wish to save the file and open it with JMP later.
Once you have opened the Oscar Winners file in JMP, first make sure the variables are correctly formatted. Select the "Award" column and right-click to get the menu and select "Column Info". The Data Type should be Character and the Modeling Type should be Nominal. If that is not correct, fix that in the drop-down menus. Select the "Age" column and right-click to get the menu and select
"Column Info". The Data Type should be Numeric and the Modeling Type
should be Continuous. If that is not correct, fix that in the drop-down
menus.
To make the side-by-side boxplots: Select "Analyze" then "Fit Y By X".
Highlight "Age" and click "Y, Response". Highlight "Award" and click
"X, Factor". Click OK. This should open a pop-up window with a bunch
of dots arranged vertically on a graph for "Actor" and "Actress".
If that does not happen, you probably did not follow my instructions
above to make sure the "Award" column is Character and Nominal. In
that case, close the window and make the appropriate adjustments in your
data table.
Now click the red triangle next to "Oneway Analysis ..." and select "Quantiles." Your side-by-side box plots should appear on the graph as well as a Quantiles output below that shows you the five-number summary among other things. Click the red triangle again and select "Display
Options" (down near the bottom of the menu), then deselect "Grand Mean" to get rid of the horizontal line in the graph showing the mean age of all the performers. Click the red triangle once more, select Display Options again, and select "Points Jittered" to spread out the data points on the graph so you can identify all the data points.
You should be able to see the outliers on the graph. As they suggest, you can click a point that is an outlier and then return to the Oscar Winners data table (in the Window List of the JMP home screen) and scroll down the table to see the corresponding performer highlighted. Alternatively, you can sort the data in the spreadsheet by age. In the data table screen, select "Tables" in the toolbar and select "Sort". Highlight "Age" and click "By" and click OK. You are now given a new data table with the performers sorted in order of ascending age. You can now easily read off the performers with an outlying age (either much younger or much older than the others).
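As a rough illustration of sorting by age and picking off the outliers, here is a Python sketch with made-up performers (not the Oscar Winners data). It assumes the usual 1.5 x IQR rule for a modified boxplot and takes the quartiles as the medians of the lower and upper halves. JMP may compute its quantiles slightly differently, so use the numbers in JMP's Quantiles output for the assignment itself.

```python
from statistics import median

def quartiles(sorted_values):
    """Q1 and Q3 as the medians of the lower and upper halves (ASSUMED
    convention; the overall median is left out when n is odd)."""
    n = len(sorted_values)
    lower = sorted_values[: n // 2]
    upper = sorted_values[(n + 1) // 2 :]
    return median(lower), median(upper)

def outliers(records):
    """Sort (name, age) records by age and return those outside the fences."""
    ordered = sorted(records, key=lambda record: record[1])
    q1, q3 = quartiles([age for _, age in ordered])
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [r for r in ordered if r[1] < low_fence or r[1] > high_fence]

# Hypothetical performers and ages.
records = [("Performer A", 34), ("Performer B", 29), ("Performer C", 41),
           ("Performer D", 80), ("Performer E", 37), ("Performer F", 33),
           ("Performer G", 45), ("Performer H", 26), ("Performer I", 38)]
print(outliers(records))
```

For these made-up ages the quartiles are 31 and 43, so the fences sit at 13 and 61 and only the 80-year-old is flagged.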
They ask you to compare the distributions according to shape, centre and spread and identify the outliers. You could
just type your comments into the text box they provide after you have
uploaded the boxplot file. Alternatively, you can do
this by adding a note to the graph itself. Press "Alt" on your
keyboard, or click the thin blue line near
the top of the window that has the side-by-side boxplots to reveal the
toolbar. Select the "Annotate" icon (it looks like a little white
notecard with a blue "T" on it and is near the right end of the
toolbar). Click a nice empty space in the output window below the quantiles and drag to make
yourself a nice big box to type in your comments, being careful not to
place the box anywhere it will block out your graphs. Answer their question. One or two sentences is sufficient. Don't forget to identify the actors or actresses whose ages are outliers.
To upload your file into the text box they provide: Click "HTML
editor" below the text box to make a toolbar appear in the text box.
Click the toolbar option called "Link" and select "Website/Uploaded
File." In the pop-up window that appears, click the button called
"Find/Upload File" (it is at the bottom of the pop-up window, you may
have to enlarge the box or scroll down to see it). Click the "Browse"
button and find the file with your side-by-side boxplots (saved as a PDF the same way as in Question 3). Either double-click
that file or select it and click "Open" and you should see the path to
that file appear in the Browse box. Click "Upload File" and its name
should appear in the "Uploaded Files" pop-up window. Select the file in
the list of "Uploaded Files" to highlight it and click OK and you
should see the file appear in the text box.
Question 6 should be done by hand (i.e. with your calculator, not with JMP). Use the Stat Mode on your calculator to compute the Mean and Standard Deviation. Check the Appendix at the back of my book to learn how to use the Stat Mode on your calculator. Here is a link to a digital copy of that appendix:
Note that part (c) in question 6 is introducing a key concept about changing the units in data. Be sure to read the "Effect of Changing Units on Centre and Spread" section of my book in Lesson 1 and see questions 17 and 18 for examples. In part (d), consider this: Let's say you are taking a course, and your average mark so far is 65. What will happen to your average if you score higher on the next test? What if you score lower on the next test? What would you have to get on the next test to have your average stay 65? For part (e), having decided what that 8th score must be in part (d), how much does that score deviate from the mean? If that is a larger deviation than the standard deviation you computed earlier, you have increased the standard deviation; if it is the same amount of deviation as earlier, you have not changed the standard deviation at all; if it has a smaller deviation, you have decreased your standard deviation.
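If you want to convince yourself of these ideas before doing Question 6 by hand, here is a small Python sketch. The seven marks, and the multiplier and shift used for the change of units, are made up for illustration and have nothing to do with the assignment's numbers. `stdev` is the sample standard deviation (the one that divides by n - 1).

```python
from statistics import mean, stdev

scores = [60, 62, 65, 65, 66, 68, 69]        # hypothetical marks, mean 65

# Changing units: new = a * old + b (a and b are illustrative values).
a, b = 2, 10
rescaled = [a * x + b for x in scores]
print(mean(rescaled), a * mean(scores) + b)  # the mean is multiplied AND shifted
print(stdev(rescaled), a * stdev(scores))    # the SD is only multiplied

# Adding an 8th score below, at, and above the current mean.
for extra in (50, 65, 80):
    longer = scores + [extra]
    print(extra, mean(longer), round(stdev(longer), 3))
```

Compare the three rows at the end with the mean and standard deviation of the original seven marks to see which extra score leaves the mean alone and what it does to the spread.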
Question 7: First, you need to know the scores attached to each letter grade. An A+ is 4.5, A is 4, B+ is 3.5, B is 3, etc. To compute your grade point average:
First, make a new column where you multiply each grade score by the number of credit hours. For example, if you got a B+ in a 3 credit-hour course, you would multiply 3.5 by 3 to get 10.5 in this new column. Find the total of this new column and find the total number of credit hours. Divide the total of the new column by the total number of credit hours to get the GPA. Put another way, if you got a B+ in 3 credit-hour course, it is as though you scored 3.5 three separate times. You could put your calculator in Stat Mode, and enter 3.5 in three separate times. If you got an A in a 6 credit-hour course, you got 4.0 six times. Enter 4.0 six separate times. After you have entered all the data, your calculator will tell you the mean (your GPA).
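The two ways of getting the GPA described above can be sketched in Python like this. The transcript is made up, and only the four grade scores spelled out above are included (the "etc." grades are left out on purpose).

```python
from statistics import mean

# Grade scores stated above; extend the table for the other letter grades.
GRADE_POINTS = {"A+": 4.5, "A": 4.0, "B+": 3.5, "B": 3.0}

def gpa(courses):
    """Total of (grade score x credit hours) divided by total credit hours."""
    total_points = sum(GRADE_POINTS[grade] * hours for grade, hours in courses)
    total_hours = sum(hours for _, hours in courses)
    return total_points / total_hours

def gpa_by_repeats(courses):
    """Same answer the Stat Mode way: enter each score once per credit hour."""
    entries = [GRADE_POINTS[grade] for grade, hours in courses for _ in range(hours)]
    return mean(entries)

# Hypothetical transcript: (letter grade, credit hours).
courses = [("B+", 3), ("A", 6), ("B", 3)]
print(gpa(courses), gpa_by_repeats(courses))
```

Here the new column is 10.5, 24 and 9, which totals 43.5 over 12 credit hours, so both methods give 3.625.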
Study Lesson 1 in my study book (if you have
it) to learn the concepts involved in Assignment 1. This lesson will
also set you up for Assignment 2.
Never use JMP to answer a question unless they specifically tell you to. Whenever
they do tell you to use JMP, never go out of your way to click red
triangles to add things to the graph (like put titles on histograms, or
label axes). Whatever JMP gives by default is all they require unless
they specifically request you add something to the output or remove
something from it. Of course, I will always give you specific steps to
add/remove anything they do require.
In the questions asking you what you are or are not allowed in exams, please note that a "slide rule" is just an ancient type of pathetic calculator. If you are allowed a calculator, you are allowed a slide rule (but, then again, if you use a slide rule, you probably still use clay tablets to write on).
Anytime a question wants you to "fill in the
blanks" with key vocabulary terms, go to the appropriate section of your
textbook (remember you have an online version of the textbook in Stats
Portal if you selected the electronic option on your book list), and you
will find the exact sentence they are giving you with the obvious word
they want you to type in.
The question asking you what graphs to use is really asking
you whether the data is "quantitative" or "categorical". If you are collecting
two sets of quantitative data, then you would use a back-to-back stemplot
or side-by-side boxplot to compare them (and no other graphs). If you
are collecting quantitative data as time goes by, you would use a time
series or timeplot and nothing else. By "how old are students' cars", assume you would say the cars are 0, 1, 2, 3, ... years old.
To type in the split stemplot they request, use the vertical
line on your computer keyboard to separate the stem from the leaves
("SHIFT \" will give you " | "). Don't worry if your columns don't end
up perfectly lined up, just do the best you can. Be sure to label the
first line in your stemplot "Stem | Leaf", then enter all the stems and
leaves row-by-row underneath. Don't forget to comment on the shape of the distribution (peaks, symmetric, left-skewed, right-skewed, outliers.)
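For the split stemplot, here is a short Python sketch of the layout, using made-up numbers (not the assignment's data). It assumes each stem is split into two rows, the first holding leaves 0 to 4 and the second holding leaves 5 to 9.

```python
def split_stemplot(values):
    """Stemplot with each stem split in two: leaves 0-4, then leaves 5-9."""
    rows = {}
    for v in sorted(values):
        key = (v // 10, (v % 10) // 5)        # (stem, 0 = low half, 1 = high half)
        rows.setdefault(key, []).append(str(v % 10))
    lines = ["Stem | Leaf"]
    (stem, half), last = min(rows), max(rows)
    while (stem, half) <= last:               # include empty rows in between
        lines.append(f"{stem:>4} | " + "".join(rows.get((stem, half), [])))
        stem, half = (stem, 1) if half == 0 else (stem + 1, 0)
    return "\n".join(lines)

# Hypothetical data.
values = [12, 14, 15, 18, 21, 23, 23, 27, 29, 30, 36]
print(split_stemplot(values))
```

Each stem shows up twice in the output, once for its low leaves and once for its high leaves, which is what spreads the distribution out enough to see its shape.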
Ignore any references to "Crunchit!". You
are using JMP 10 in this course. The assignment is just an old
assignment that they forgot to update. Use JMP 10 anytime they tell you
to use computer stuff.
For the JMP 10 part of the assignment, here are some tips:
If you have not done so already, you need to download JMP to your computer. Here is the direct link where you can get it (you need to know your UMNET ID and password):
Once you have installed JMP 10 and opened it, you are shown a
menu with various buttons to click. You will almost always click "New
Data Table" to enter new data. That is the icon on the far left of the top toolbar (it looks like a tiny little spreadsheet with a yellow star; point your mouse at it and you should see the label "New Data Table" pop up).
In the rare event they have given you a
JMP file with the data already entered in it, you will simply open that
file, which will probably launch JMP for you. Just click the "Open" icon on the same toolbar as the "New Data Table" icon, or, if you already see the file in the "Recent Files" screen, simply double-click that. If you
enter data yourself and save the file (a good idea), you can select
"Open" to open up the saved file.
Question 9:
To copy and paste data into JMP: First, of
course, click the hyperlink to the data they have given you. Now, select and copy the given data set. Now, open JMP and click
"New Data Table". A pop-up window should appear showing a spreadsheet with one column labeled "Column 1". In the toolbar of this pop-up window select "Edit" then "Paste with
Column Names". That pastes all the data in and names the column
appropriately.
If you have done this correctly, you should now be looking at a column labeled "tuition" and a whole bunch of numbers representing various tuitions lined down the rows of that column.
Click the column heading "tuition" to select the column (the column name cell should be highlighted). Right-click and select "Column Info" in the menu that appears. Make sure the Data Type is Numeric and the
Modeling Type is Continuous, using the drop-down menus to fix that if
necessary. Click OK.
To make a histogram: In the toolbar at the
top, select Analyze then select Distribution. In the "Select Columns" part of the pop-up window, click the column you
want the histogram for ("tuition" in this case) to highlight it, and click the Y, Columns
button. You should see the "tuition" column appear in the section to the right of the "Y, Columns" button. Click OK.
It now opens yet another pop-up window called "Distributions" where your histogram should appear. Your histogram appears sideways. If they want to
see it the typical way,
click the red triangle next to your variable above the histogram and
select Histogram Options from the drop-down menu. Deselect "Vertical" and it will turn it the proper way. They did not request this, so you aren't obligated to do that. However, that would be a good idea if you wanted to properly read if the distribution is left-skewed, right-skewed or symmetric. But, they didn't ask you to describe the distribution.
Press "Alt" on your keyboard or click the thin blue line that is near the top of the window to get the toolbar icons to appear. Select "File" then "Save As" to get a pop-up window. Type in whatever name you want the file to have in the "File name" section. Click the "Browse Folders" arrow and select which folder you want to save the file in (I suggest you select "Desktop" so that the file will just appear right on your desktop home screen). Finally, click the drop-down arrow in the "Save as type" section and select "PDF File". Click "Save". You should now have your file ready to upload into the assignment.
Question 10
For the pole-vault question: You will have to enter the data manually into JMP. Click the "New Data Table" icon to get a fresh spreadsheet to enter new data.
Click the link to the Wikipedia data and be sure to scroll down to the Women's outdoor pole-vault data.
To enter data into JMP manually: Click "New
Data Table" and you are automatically taken to an empty spreadsheet with
one column. If you ever need two or more columns, simply double-click
the space to the right of "Column 1" to create "Column 2". You can
repeat this to create "Column 3", etc. You can then type in the data,
using "enter" or "tab" or your arrow buttons on your keyboard to move
from one cell to the next.
In this particular pole-vault question, double-click "Column
1" and name it "Year". Click OK. Double-click the space to the right
of Column 1 to create Column 2. Name that column "Height". Type in the
data you have been given. Only type in the years and heights; the rest of the columns given in Wikipedia are irrelevant. Type the heights in metres only: do not include the m for metres, and do not include the height in feet and inches at all (for example, for 1991, type in 4.05 as the height). Be sure to highlight each column and right-click and select "Column Info" like you did in question 9 and confirm that the "Data Type" is "Numeric" and the "Modeling Type" is "Continuous" for both columns.
To make a Time Series or Time Plot: Select Analyze in the
toolbar, then select Modeling in the drop-down list and finally select
Time Series. Select your time variable "Year" and click "X, Time ID", then select
the variable you are tracking, "Height", and click "Y, Time Series". Click OK.
Just ignore that other pop-up menu asking about time lags or
autocorrelations or whatever, click OK and move on. None of that has
anything to do with the time series.
To change the vertical scale, double-click the region on the vertical axis between the label (height) and the actual scale on the axis to get a pop-up window called "Y Axis Specification". Type 3.5 in the Minimum box, 5.2 in the Maximum box, and 0.1 in the Increment box. Click OK.
Press "Alt" on your keyboard or click the thin blue line that is near the top of the window to get the toolbar icons to appear. Select "File" then "Save As" to get a pop-up window. Type in whatever name you want the file to have in the "File name" section. Click
the "Browse Folders" arrow and select which folder you want to save the
file in (I suggest you select "Desktop" so that the file will just
appear right on your desktop home screen). Finally, click the drop-down
arrow in the "Save as type" section and select "PDF File". Click
"Save". You should now have your file ready to upload into the
assignment.