Project Chi-Square

Problem I.  
The following data show the starting gate positions and how the horses finished in terms of Win (first place), Place (second place), or Show (third place). The data (all races were run in 2008) were compiled at a racetrack
  in Maywood, IL.

Gate              1      2      3      4      5      6      7      8      9
Win             215    200    181    147    138     80     64     35     47
Place           194    210    164    152    138     81     76     30     59
Show            141    177    159    151    151    119     96     48     66
Total Starts  1,105  1,105  1,105  1,105  1,103  1,095  1,067    799    419

A.  Show all four steps for each hypothesis test needed to test the claim.  At a 0.05 level
     of significance, test for the existence of a dependency between starting gate position and
     whether a horse finishes "win (first)" or "not win (not first)".

Gate          1      2      3      4      5      6      7      8      9
Win         215    200    181    147    138     80     64     35     47
Not Win     890    905    924    958    965  1,015  1,003    764    372

Step 1:   H0:   The variables are independent.
          H1:   The variables are dependent (claim).

Step 2: 

 

Use your TI-83 calculator to determine the Chi-Square test statistic
and its corresponding p-value:

  1. Press the "MATRIX" button (or "2nd" followed by the key that carries the MATRIX label).
  2. Use the arrow keys to highlight the "EDIT" command.
  3. With "1:" highlighted, press "ENTER" to go to matrix [A].
  4. Enter the dimensions of matrix [A], 2 rows by 9 columns, by entering "2" and pressing "ENTER", followed by "9" and pressing "ENTER".
  5. Enter the data row by row, pressing "ENTER" after each value.
  6. Press the "2nd" and "MODE" (QUIT) keys to complete the data entry.
  7. Press the "STAT" button.
  8. Use the arrow keys to highlight the "TESTS" command.
  9. Use the down arrow key to highlight "C: χ²-Test" and press "ENTER".
  10. Press "ENTER" to accept [A], telling the calculator that the observed data are in matrix [A].
  11. Press "ENTER" to accept [B], telling the calculator that the expected values are to be stored in matrix [B].
  12. Press "ENTER" to accept "Calculate".
  13. The p-value is approximately 5.827827*10^-42.
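The calculator result can be cross-checked by hand. The following is a minimal Python sketch, not part of the original TI-83 procedure; the function names are illustrative. It computes the expected counts, the chi-square statistic, and the p-value for the Win / Not Win table. The closed-form tail probability used here is an assumption of convenience: it holds only when the degrees of freedom are even, as they are here, (2 - 1)(9 - 1) = 8.

```python
import math

# Observed counts from the Win / Not Win table (gates 1-9).
win = [215, 200, 181, 147, 138, 80, 64, 35, 47]
not_win = [890, 905, 924, 958, 965, 1015, 1003, 764, 372]


def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi_square_p_value_even_dof(stat, dof):
    """Upper-tail probability; this closed form holds only for even dof."""
    if dof % 2 != 0:
        raise ValueError("closed form requires even degrees of freedom")
    half = stat / 2.0
    term, total = 1.0, 1.0
    # Sum of (stat/2)^k / k! for k = 0 .. dof/2 - 1.
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


stat, dof = chi_square_statistic([win, not_win])
p_value = chi_square_p_value_even_dof(stat, dof)
print(f"chi-square = {stat:.2f}, df = {dof}, p-value = {p_value:.3e}")
```

A p-value far below 0.05 leads to the same decision as the calculator output: reject the null hypothesis of independence.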

 

Step 3:

Step 4:
Reject the Null Hypothesis.
The data support the claim.  There appears to be a dependency between starting position and winning.

B.  If a dependency exists between "winning (first)" and "starting position", find 95% confidence
      intervals for the winning percentage for each gate.

Using the TI-83 to Calculate Confidence Intervals for Proportions.

1.  Press the "STAT" button.
2.  Use the right arrow to highlight "TESTS".
3.  Use the down arrow to select "A:1-PropZInt..." and press "ENTER".
4.  Enter 215 (the number of wins for gate 1) as the value for "x" and press "ENTER".
5.  Enter 1105 (the total starts for gate 1) as the sample size, n, and press "ENTER".
6.  Enter 0.95 as the "C-Level" and press "ENTER".
7.  With "Calculate" highlighted, press "ENTER".
     The 95% confidence interval (1-PropZInt) is (.17123, .21791).
     Repeat for the remaining gates.
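The same interval can be reproduced without the calculator. The sketch below assumes that 1-PropZInt uses the standard normal-approximation (Wald) formula, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n); the function name `one_prop_z_interval` is illustrative. It uses gate 1, which had 215 wins in 1,105 starts (215 wins plus 890 non-wins).

```python
from statistics import NormalDist


def one_prop_z_interval(x, n, confidence=0.95):
    """Wald (normal-approximation) interval for the proportion x/n."""
    p_hat = x / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


# Gate 1: 215 wins in 1,105 starts.
low, high = one_prop_z_interval(215, 1105)
print(f"{low:.5f} < P < {high:.5f}")
```

Calling the function with each gate's wins and total starts gives the remaining rows of the table.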

 

Gate 95% confidence interval 
1 0.17123 < P < 0.21791
2 0.15829 < P < 0.20370
3 0.14198 < P < 0.18562
4 0.11301 < P < 0.15306
5 0.10559 < P < 0.14464
6 0.05765 < P < 0.08847
7 0.04573 < P < 0.07423
8 0.02961 < P < 0.05800
9 0.08196 < P < 0.14239

C.  Which starting position is most probable to produce a winner?
     
Hint:  Which intervals overlap the interval with the largest lower limit?  Gates 1, 2 and 3.
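The overlap check in the hint can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, and the intervals are recomputed from the win counts and total starts using the normal-approximation formula rather than copied from the table.

```python
from statistics import NormalDist

# Wins and total starts for gates 1-9, taken from the Problem I data.
wins = [215, 200, 181, 147, 138, 80, 64, 35, 47]
starts = [1105, 1105, 1105, 1105, 1103, 1095, 1067, 799, 419]


def wald_interval(x, n, confidence=0.95):
    """Normal-approximation confidence interval for the proportion x/n."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


def gates_overlapping_best(intervals):
    """Gates whose interval overlaps the one with the largest lower limit."""
    best_low = max(low for low, _ in intervals.values())
    # Every lower limit is <= best_low, so an interval overlaps the best one
    # exactly when its upper limit reaches best_low.
    return sorted(g for g, (_, high) in intervals.items() if high >= best_low)


intervals = {gate: wald_interval(x, n)
             for gate, (x, n) in enumerate(zip(wins, starts), start=1)}
print(gates_overlapping_best(intervals))  # prints [1, 2, 3]
```

Gates whose intervals overlap cannot be distinguished from one another at this confidence level, which is why the answer is a group of gates rather than a single gate.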

D.   Show all four steps for each hypothesis test needed to test the claim.  At a 0.05 level
      of significance, test for the existence of a dependency between starting gate position and
      whether a horse finishes "in the money" (win, place or show) or "out of the money".

Gate                  1      2      3      4      5      6      7      8      9
In the Money        550    587    504    450    427    280    236    113    172
Not In the Money    555    518    601    655    676    815    831    686    247

Step 1:   H0:   The variables are independent.
          H1:   The variables are dependent (claim).

Step 2: 

Step 3:

Step 4:
Reject the Null Hypothesis.
The data support the claim.  There appears to be a dependency between starting position and finishing
"in the money".

E.  If a dependency exists between "finish in the money (win-place-show)" and "starting position",
     find 95% confidence intervals for the percentage of "finishing in the money" for each gate.

Gate 95% confidence interval 
1 0.46826 < P < 0.52722
2 0.50180 < P < 0.56064
3 0.42674 < P < 0.48548
4 0.37827 < P < 0.43621
5 0.35838 < P < 0.41587
6 0.22987 < P < 0.28155
7 0.19628 < P < 0.24608
8 0.11726 < P < 0.16559
9 0.36340 < P < 0.45760

F.  Which starting position is most probable to produce a horse "finishing in the money"?
      Hint:  Which intervals overlap the interval with the largest lower limit?   Gates 1 and 2

Problem II.  
The following data show the starting gate positions and how the horses finished in terms of Win (first place), Place (second place), or Show (third place). The data (all races were run in 2008) were compiled at a racetrack
  in Crete, IL.

Gate              1      2      3      4      5      6      7      8      9     10
Win             263    275    244    241    297    252    166    117     80     24
Place           288    278    223    253    281    243    163    124     75     24
Show            267    248    293    208    245    232    185    155     99     32
Total Starts  1,955  1,955  1,955  1,955  1,951  1,931  1,744  1,393    915    507

A.  Show all four steps for each hypothesis test needed to test the claim.  At a 0.05 level
     of significance, test for the existence of a dependency between starting gate position and
     whether a horse finishes "win (first)" or "not win (not first)".

Gate          1      2      3      4      5      6      7      8      9     10
Win         263    275    244    241    297    252    166    117     80     24
Not Win   1,692  1,680  1,711  1,714  1,654  1,679  1,578  1,276    835    483

Step 1:   H0:   The variables are independent.
          H1:   The variables are dependent (claim).

Step 2: 

Step 3:

Step 4:
Reject the Null Hypothesis.
The data support the claim.  There appears to be a dependency between starting position and winning.
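With ten gates the test has (2 - 1)(10 - 1) = 9 degrees of freedom, so the decision can also be reached by comparing the test statistic with a table critical value instead of a p-value. The sketch below is illustrative, not the handout's own method; the function name is hypothetical, and the critical value 16.919 is the standard chi-square table entry for alpha = 0.05 and 9 degrees of freedom.

```python
# Observed counts for gates 1-10 from the Problem II Win / Not Win table.
win = [263, 275, 244, 241, 297, 252, 166, 117, 80, 24]
not_win = [1692, 1680, 1711, 1714, 1654, 1679, 1578, 1276, 835, 483]

# Chi-square table value for alpha = 0.05 and df = 9.
CRITICAL_VALUE = 16.919


def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


stat, dof = chi_square_statistic([win, not_win])
reject_null = stat > CRITICAL_VALUE
print(f"chi-square = {stat:.2f}, df = {dof}, reject H0: {reject_null}")
```

The statistic falls well inside the rejection region, which agrees with the decision stated in Step 4.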

B.  If a dependency exists between "winning (first)" and "starting position", find 95% confidence
     intervals for the winning percentage for each gate.
     Hint:  See lesson on confidence intervals for proportions.

Gate 95% confidence interval 
1 0.11940 < P < 0.14965
2 0.12525 < P < 0.15608
3 0.11016 < P < 0.13946
4 0.10870 < P < 0.13785
5 0.13629 < P < 0.16817
6 0.11548 < P < 0.14553
7 0.08141 < P < 0.10896
8 0.06943 < P < 0.09856
9 0.06913 < P < 0.10573
10 0.02885 < P < 0.06582

C.  Which starting position is most probable to produce a winner?
      Hint:  Which intervals overlap the interval with the largest lower limit? 
      Gates 1, 2, 3, 4, 5 and 6.

D.   Show all four steps for each hypothesis test needed to test the claim.  At a 0.05 level
      of significance, test for the existence of a dependency between starting gate position and
      whether a horse finishes "in the money" (win, place or show) or "out of the money".

Gate                  1      2      3      4      5      6      7      8      9     10
In the Money        818    801    760    702    823    727    514    396    254     80
Not In the Money  1,137  1,154  1,195  1,253  1,128  1,204  1,230    997    661    427

Step 1:   H0:   The variables are independent.
          H1:   The variables are dependent (claim).

Step 2: 

Step 3:

Step 4:
Reject the Null Hypothesis.
The data support the claim.  There appears to be a dependency between starting position and finishing
"in the money".

E.  If a dependency exists between "finish in the money (win-place-show)" and "starting position",
      find 95% confidence intervals for the percentage of "finishing in the money" for each gate.

Gate 95% confidence interval 
1 0.39655 < P < 0.44028
2 0.38792 < P < 0.43152
3 0.36714 < P < 0.41036
4 0.33781 < P < 0.38034
5 0.39992 < P < 0.44375
6 0.35488 < P < 0.39810
7 0.27333 < P < 0.31612
8 0.26059 < P < 0.30797
9 0.24858 < P < 0.30661
10 0.12606 < P < 0.18952

F.  Which starting position is most probable to produce a horse "finishing in the money"?
      Hint:  Which intervals overlap the interval with the largest lower limit?   Gates 1, 2, 3 and 5
