Stat 2000: Tips for Assignment 11
Published: Sun, 03/25/12
The final exam seminar will be on Sunday, April 15, 2012 from 9:00 am to 9:00 pm.
Please click this link for more information about the seminar and to register if you are interested:
If you ever want to look back over a previous tip I have sent, do note that all my tips can be found in my archive. Click this link to go straight to my archive:
Did you miss my Tips on How to Do Well in this Course? Click here
Did you miss my Tips for Assignment 10? Click here
If you are taking the course by Distance/Online (Sections D01, D02, etc.), click here for my tips for your Assignment 11.
If you are taking the course by classroom lecture (Sections A01, A02, etc.), click here for my tips for your Assignment 11.
Tips for Assignment 11 (Sections A01, A02, etc.)
There is no Assignment 11 for the classroom lecture sections.
Tips for Assignment 11 (Sections D01, D02, etc.)
Study Lesson 8 in my book, if you have it, to prepare for this assignment.
In question 1, the joint distribution is simply the joint proportions found by
dividing the appropriate cell count by the Grand Total, and the marginal
distribution is the marginal proportions found by dividing the
appropriate row or column total by the Grand Total. I discuss this in more
detail at the start of Lesson 8 in newer editions of my book.
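If it helps to see the arithmetic, here is a minimal sketch in Python of the joint and marginal proportions described above. The 2x2 counts are made up for illustration; they are not the assignment's data, and the variable names are my own.

```python
# Hypothetical 2x2 two-way table of observed counts (NOT the assignment's data).
counts = [
    [20, 30],
    [10, 40],
]

grand_total = sum(sum(row) for row in counts)

# Joint distribution: each cell count divided by the Grand Total.
joint = [[cell / grand_total for cell in row] for row in counts]

# Marginal distributions: each row or column total divided by the Grand Total.
row_marginal = [sum(row) / grand_total for row in counts]
col_marginal = [sum(col) / grand_total for col in zip(*counts)]

print("Joint:", joint)
print("Row marginal:", row_marginal)
print("Column marginal:", col_marginal)
```

Note that the joint proportions add up to 1 over the whole table, and each marginal distribution adds up to 1 on its own.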
Question 2 plays with the concepts I discuss in question 4 of Lesson 8.
In question 3, writing the hypotheses is a matter of deciding whether they are doing a test for independence or a test for homogeneity, as I discuss in newer editions of my book.
Question 4. To enter this
data in JMP, click New Data Table. You will need a total of three columns.
Double-click Column 1 and name it "Music" and change the Data Type to
"Character" and the Modeling Type to "Nominal". Double-click the space to the
right of the Music column to create a new column. Name that column "Movies" and
change the Data Type to "Character" and the Modeling Type to "Nominal". Double-click
the space to the right of the Movies column to create a new column. Name
that column "Count" and keep the Data Type as "Numeric" but change the Modeling
Type to "Nominal".
Each row in the JMP data table is used to
enter the information for a particular cell of the two-way table. The first row
will represent the 1,1 cell; the second row will represent the 1,2 cell; etc.
For example, your 1,1 cell gives you the observed count for the young adults who
prefer Contemporary music and Action movies. In the JMP data table, in row 1
type "Contemporary" in the Music column, "Action" in the Movies column, and type
the given observed count in the "Count" column. Type the info for the 1,2 cell
into the second row of your JMP table. That is the observed count for the young
adults who prefer Contemporary music and Comedy movies, so you will type
"Contemporary" in the Music column, "Comedy" in the Movies column and the
observed count in the Count column. In the third row you will type "Contemporary"
in the Music column, "Drama" in the Movies column, and the observed count for the
1,3 cell in the Count column. Continue in this fashion all the way to the
sixteenth row where you will type "Rock" in the Music column, "Horror" in the
Movies column, and the observed count for the 4,4 cell in the Count
column.
You will notice that the first two columns
of the JMP table are used to specify which row and column of the two-way table
you are talking about, and the third column enters the observed count for that
particular cell.
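The layout described above, one data-table row per cell of the two-way table, can be sketched in Python as follows. This is only an illustration: the function name to_long_format is my own, and the small 2x2 table of counts is hypothetical, not the assignment's 4x4 data.

```python
def to_long_format(row_labels, col_labels, counts):
    """Return one (row_label, col_label, count) tuple per cell, row by row."""
    rows = []
    for i, row_label in enumerate(row_labels):
        for j, col_label in enumerate(col_labels):
            rows.append((row_label, col_label, counts[i][j]))
    return rows


# Hypothetical 2x2 example (the real table has four Music and four Movies levels).
music = ["Contemporary", "Rock"]
movies = ["Action", "Horror"]
observed = [
    [12, 5],
    [7, 9],
]

long_rows = to_long_format(music, movies, observed)
for music_level, movie_level, count in long_rows:
    print(music_level, movie_level, count)
```

The order matches the tip: the 1,1 cell comes first, then the 1,2 cell, and so on across each row of the two-way table before moving down to the next row.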
Once you have entered in all the observed
counts, select Analyze, Fit Y By X. Select "Movies" and click "Y, Response",
select "Music" and click "X, Factor", and select "Count" and click "Freq".
Click "OK". Click the red triangle next to "Contingency Analysis of Movies By
Music" at the top and deselect "Mosaic Plot" to remove that from the output.
You now see a Contingency Table (or two-way table) and the "Tests" below it.
Click the red triangle next to Contingency Table and make sure that all that is
selected is "Count", "Expected" and "Cell Chi Square" to display those values in
each cell of the table. Note the Pearson ChiSquare is the test statistic for
the problem (in the last row of the "Tests" output) and the Prob>ChiSq is the
P-value for that test.
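If you want to check by hand what JMP is displaying in each cell, here is a rough sketch in Python of the standard calculations: the expected count is (row total)(column total)/(Grand Total), the cell chi-square is (observed - expected)^2/expected, and the Pearson test statistic is the sum of the cell chi-squares. The function name and the 2x2 counts are my own, made up for illustration. The sketch does not compute the P-value, which needs the chi-square distribution with the degrees of freedom shown.

```python
def chi_square_table(observed):
    """Return expected counts, cell chi-squares, the Pearson statistic and df."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count = (row total * column total) / Grand Total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Cell chi-square = (observed - expected)^2 / expected.
    cell_chi2 = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]

    # Pearson statistic = sum of all the cell chi-squares.
    statistic = sum(sum(row) for row in cell_chi2)

    # Degrees of freedom = (rows - 1) * (columns - 1).
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, cell_chi2, statistic, df


# Hypothetical counts (NOT the assignment's data).
observed = [
    [20, 30],
    [10, 40],
]
expected, cell_chi2, statistic, df = chi_square_table(observed)
print("Expected:", expected)
print("Cell chi-squares:", cell_chi2)
print("Pearson statistic:", round(statistic, 4), "with df =", df)
```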
When they ask what two
cells contribute most to the test statistic, they are asking which two cells
have the largest Cell Chi Square values in the Contingency Table.
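As a small illustration of that idea, the sketch below computes each cell's chi-square contribution and picks out the two largest. The function name top_contributors, the 2x3 table and its labels are all hypothetical, chosen only to show the ranking; in the assignment you would simply read the Cell Chi Square values off the JMP output.

```python
def top_contributors(row_labels, col_labels, observed, k=2):
    """Return the k cells with the largest (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    contributions = []
    for i, row_label in enumerate(row_labels):
        for j, col_label in enumerate(col_labels):
            expected = row_totals[i] * col_totals[j] / grand_total
            cell_chi2 = (observed[i][j] - expected) ** 2 / expected
            contributions.append((cell_chi2, row_label, col_label))

    # Rank the cells from largest to smallest contribution.
    ranked = sorted(contributions, key=lambda item: item[0], reverse=True)
    return ranked[:k]


# Hypothetical 2x3 table (NOT the assignment's data).
music = ["Contemporary", "Rock"]
movies = ["Action", "Comedy", "Drama"]
observed = [
    [30, 10, 10],
    [20, 40, 40],
]

top_two = top_contributors(music, movies, observed)
for cell_chi2, music_level, movie_level in top_two:
    print(music_level, movie_level, round(cell_chi2, 3))
```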