Performing hypothesis testing on two proportions in StatCrunch

6/15/2018

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to use StatCrunch to perform hypothesis testing on two proportions. Here's our problem statement: A simple random sample of front seat occupants involved in car crashes is obtained. Among 2763 occupants not wearing seatbelts, 31 were killed. Among 7830 occupants wearing seatbelts, 19 were killed. Use a 5% significance level to test the claim that seatbelts are effective in reducing fatalities. Complete Parts A through C below.

Part A

OK, Part A asks us to test the claim using a hypothesis test, and the first part of the hypothesis test is forming our null and alternative hypotheses. We're doing a hypothesis test on proportions, and so our parameter in our hypothesis is going to be proportions, which we see here. But which of these answer options is the right one? Well, let's figure that out.

By definition, the null hypothesis is always a statement of equality. So right off the bat, we know that answer options A, C, and F are incorrect. That leaves us choosing between answer options B, D, and E, because all of these have the null hypothesis as a statement of equality. To determine the correct alternative hypothesis, we need to go and look at the claim. What is the claim that's being made? Back here in the problem statement, we can see we're testing the claim that seatbelts are effective in reducing fatalities.

OK, so we have two groups: one group who was not wearing seatbelts, the other group who was wearing seatbelts. And it says here that we're supposed to consider the group not wearing seatbelts as the first sample and the group wearing seatbelts as the second sample. This is the same order in which they're listed here in the problem statement. That's great, because now we see that there's no semblance of equality that's being made in the claim; it's just one is greater than the other. So we can adopt the claim as our alternative hypothesis.

If the claim is that seatbelts are effective in reducing fatalities, that means the group wearing seatbelts is going to have a lower proportion of deaths than the group not wearing seatbelts. So p2 (the group wearing seatbelts) will be less than p1. And that's what we see right here. So I'm going to check that answer. Excellent!
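Written out as a rough sketch, the hypotheses described above look like this. The notation is my own shorthand, with p1 as the fatality proportion for occupants not wearing seatbelts and p2 for occupants wearing seatbelts; it is not a copy of the answer options on screen.

```latex
H_0:\; p_1 = p_2 \qquad\qquad H_1:\; p_1 > p_2
```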

Now we're asked to identify the test statistic. And to do that I'm going to pull up StatCrunch. Notice there's no icon or any data that you have to dump into StatCrunch. And because there's no icon to click on, it's often a good idea for you to keep a copy of StatCrunch open, just in case you need to access it like you do here but you don't have any data. We don't need any data for the data table. We just need the functionality of StatCrunch. So for that, I'm going to go to Stat –> Proportion Stats –> Two Sample –> With Summary.

Here in the options window, I’m going to list my summary stats. And they're listed here in the problem statement. Again, we're asked to make the first sample the group that's not wearing seatbelts and the second sample the group that is wearing seatbelts. That's the same order that they're listed here. The number of successes is the part of the whole that we're looking at. So for that first sample, it's going to be 31. I know it's really weird to consider that people dying is considered success, but try not to think of it that way. Try to think of it as you're looking for the part of the whole that you're trying to examine. And since we're looking at fatality rate, we want to look at the number of deaths. I put the total number from that group in here. And I'm going to do the same thing for the second sample.

Now down here I want to make sure this radio button for hypothesis test is selected. This is the default selection, so we're already there. And now I want to make sure that these fields match the hypotheses that we established over here. Notice how the formatting is a little different. Here in StatCrunch you've got p1 minus p2. Over here you've got p1 equals p2. If I just subtract p2 from both sides here, I get p1 minus p2 equals 0, so that's okay. I leave that alone, and then I make sure that this inequality sign matches. And now I'm all ready to go.

I hit Compute! and here in my results window, the second-to-last number in my table is my test statistic. I'm asked to round to two decimal places. Well done!
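If you want to check the test statistic by hand, here is a minimal Python sketch. It assumes the usual pooled two-proportion z formula, which is what I'd expect a test of p1 = p2 to use; the function name `two_proportion_z` is just something I made up for illustration and is not part of StatCrunch.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for testing H0: p1 = p2 (hypothetical helper)."""
    p1_hat = x1 / n1                     # sample proportion, first group
    p2_hat = x2 / n2                     # sample proportion, second group
    p_pool = (x1 + x2) / (n1 + n2)       # pooled proportion under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Sample 1: not wearing seatbelts; sample 2: wearing seatbelts
z = two_proportion_z(31, 2763, 19, 7830)
print(f"z = {z:.2f}")
```

The order of the arguments matters: the first sample is the group not wearing seatbelts, just as in the options window.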

And now I'm asked for the P-value. The P-value is right next door, the last number listed in the column. Notice when we have this “< 0.0001” listed here, that's practically zero. So I can just put zero in here. Excellent!

And now I'm asked to make a conclusion based on the hypothesis test. We do that by comparing the P-value with the significance level. We're asked to test at 5%. The P-value is practically zero, so that's going to be lower than any significance level that we would use for testing. So we're definitely less than the significance level. That means we're inside the region of rejection, so we're going to reject the null hypothesis. And because we reject the null hypothesis, there is sufficient evidence to support the claim. Good job!
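The P-value and the decision can be sketched the same way. This assumes a right-tailed test (the alternative is p1 greater than p2) and a normal approximation; the variable names are my own, and `NormalDist` comes from Python's standard `statistics` module.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 31, 2763    # killed / total, not wearing seatbelts
x2, n2 = 19, 7830    # killed / total, wearing seatbelts
alpha = 0.05         # significance level from the problem statement

# Pooled z statistic, same formula as for the test statistic
p_pool = (x1 + x2) / (n1 + n2)
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (x1 / n1 - x2 / n2) / se

# Right-tailed P-value: P(Z >= z), computed as the lower tail at -z
p_value = NormalDist().cdf(-z)

# Reject H0 when the P-value is below the significance level
reject_null = p_value < alpha
print(f"P-value = {p_value:.2e}, reject H0: {reject_null}")
```

A P-value this small is why the results table only shows it as less than 0.0001.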

Part B

Well, so much for Part A. Now Part B asks us to test the claim by constructing an appropriate confidence interval. I could go back here into StatCrunch and go back to the menu options, but it's much quicker if I go back to my results window, click on Options, and then Edit, to go right back to the options window. Then all I have to do is flip this button, and now I'm going to be constructing the confidence interval.

The question is “What's the appropriate confidence level for an appropriate confidence interval?” To do this, we need to go back and look at our alpha (significance level), which is 5%. Normally in constructing a confidence interval, we would take 1 minus alpha, but in this case, because we're running a one-tailed test, we need to select 1 minus 2 alpha. So I need to take twice alpha. Twice 5% is 10%, and subtracting that from 1 gives 90%. This is the appropriate confidence level for my confidence interval. I press Compute! and then here at the end of my table I see the lower and upper limits that I need to put into my answer field. I'm asked to round to three decimal places. Well done!
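Here is a minimal Python sketch of the 90% interval for p1 minus p2. It assumes the standard unpooled standard error for a two-proportion interval and a normal critical value; I'm assuming that is close to what StatCrunch does behind the scenes, and the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 31, 2763    # killed / total, not wearing seatbelts
x2, n2 = 19, 7830    # killed / total, wearing seatbelts
alpha = 0.05
confidence = 1 - 2 * alpha           # 90% level to pair with the 5% test

p1_hat, p2_hat = x1 / n1, x2 / n2
diff = p1_hat - p2_hat

# Unpooled standard error: each sample keeps its own proportion
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

# A 1 - 2*alpha interval leaves alpha in each tail
z_crit = NormalDist().inv_cdf(1 - alpha)

lower = diff - z_crit * se
upper = diff + z_crit * se
print(f"{confidence:.0%} CI: {lower:.3f} < p1 - p2 < {upper:.3f}")
```

Both limits come out positive, so zero is not inside the interval.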

Now we're asked to make a conclusion from the confidence interval. And to do this, we always look for where is zero with respect to our confidence interval. Zero is outside the confidence interval. It's not inside the confidence interval, so our confidence interval limits do not include zero. And so therefore, because zero is not inside the confidence interval, there's going to be a significant difference between the two proportions. So they're not equal. And which side of zero is my confidence interval on? Well, zero is over here to the left, so all of these values here are positive. And because all these values are positive, that means this difference is always going to be positive.

Well, what does that mean? If this difference is always positive, that means p1 is always going to be greater than p2. So go back and look at what these correspond to. p1, remember, was the proportion of deaths among people who did not wear seatbelts. p2 is the proportion of deaths among people who were wearing seatbelts. So wearing a seatbelt goes with a lower fatality rate. In other words, the fatality rate is higher for those not wearing seatbelts. Nice work!

Part C

And now Part C asks, “What did this suggest? What did the results suggest about the effectiveness of seatbelts?” If we go back and we look at what we've actually concluded from the hypothesis test and the confidence interval — remember that for proportions, they don't necessarily match up and when they don't match up you want to go with the hypothesis test — in this case they actually are matching up. Both the hypothesis test and the confidence interval lead to the conclusion that the fatality rate is higher for those not wearing the seatbelts. And so we have a pretty good statistical case for suggesting that the use of seatbelts is associated with a lower fatality rate than not using the seatbelt. So I'm going to select that answer. Excellent!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

4 Comments

Austin

11/18/2020 06:31:08 pm

You do a perfect job of explaining these problems, i’m a college student taking statistics and my teacher doesn’t do nearly as well as you for walking us through these problems. You make it short and concise and i’m so glad to have come across your website.

Tyler

12/10/2020 01:23:13 am

Love this!!! I was having the biggest problems with StatCrunch and this saved my life. Thank you!!!

11/15/2021 11:09:34 pm

I appreciate this ! Literally saved me 5-6 hours of hard work just from StatCrunch!

Alexis G

2/21/2023 12:05:55 pm

You do a AWESOME job explaining these. My stats professor does not walk us through examples or problems and does not show us how to utilize stat crunch so I was insanely lost on what do even do. THANK YOU!!!!!!!!