Intro
Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to conduct mean hypothesis testing on IQ and lead level data. Here's our problem statement: Listed in the data table are IQ scores for a random sample of subjects with medium lead levels in their blood. Also listed are statistics from a study done of IQ scores for a random sample of subjects with high lead levels. Assume that the two samples are independent simple random samples selected from normally distributed populations. Do not assume that the population standard deviations are equal. Complete Parts A and B below. Use a 10% significance level for both parts.
Part A
OK, Part A asks us to test the claim that the mean IQ score for subjects with medium lead levels is higher than the mean for subjects with high lead levels. And the first thing we're asked to do is to provide the null and alternative hypotheses. We're also instructed to assume that Population 1 consists of subjects with medium lead levels and Population 2 consists of subjects with high lead levels. So the null hypothesis is going to be a statement of equality, which it always is by definition. So we're going to be looking at Answer options B and D. And then to select the one with the proper alternative hypothesis, we look at this assumption that we were given here at the very end of our problem statement. Population 1 has the subjects with the medium lead levels; Population 2 has the subjects with the high lead levels. And we're testing the claim that the medium lead level group has a higher mean score than the high lead level group. So the medium lead levels, [which] are Population 1, will have a higher mean score than the high lead levels, [which] are Population 2. So the mean of 1 will be greater than the mean of 2, and that's what we see here with Answer option B. The mean of Population 1 is greater than the mean of Population 2, so that's what we'll select for our answer. Good job!
Now the next part of Part A asks us to provide the test statistic, and we can do this inside StatCrunch. Notice when we look at our data here, they're going to make us work a little bit. We've got summary statistics for the high lead level population, or, excuse me, the high lead level sample. But here for the medium lead level sample, we've got actual data. We can get around this pretty easily. All we have to do is calculate summary statistics for the medium lead level group, and then we've got summary stats for both groups. And we can use those summary stats to conduct our hypothesis test. So the first step is to put the data for that first sample here in StatCrunch. I'm going to resize this window so we can better see everything that's going on. Now here in StatCrunch, I'm going to get my summary stats by going to Stat --> Summary Stats --> Columns. I select that column of data. I want to calculate the sample size, the sample mean, and the sample standard deviation. And here are my numbers right here. So I'm going to move this results window down to the bottom.
Now we have everything we need to calculate the test statistic by performing our hypothesis test. So to do that, I go to Stat, I go to T Stats (because we don't know what the population standard deviation is), Two Sample (because we have two independent samples), and With Summary (because we don't have actual data for both samples, but we do have summary statistics for both of them). Sample 1, remember, was defined as the medium lead level group, and those are the summary stats that we just calculated a moment ago. So I'm going to put those numbers here. Actually the sample mean is 90.81. And notice I'm rounding to three decimal places because the equivalent values given in the summary statistics for the other sample are rounded to three decimal places. I do the same thing with the standard deviation.
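If you'd like to see what the Summary Stats step is doing behind the scenes, here's a minimal Python sketch. The scores in the list are made-up placeholders, not the values from the problem's data table, so swap in the real column before comparing against StatCrunch.

```python
import statistics

# Hypothetical IQ scores standing in for the medium lead level sample.
# The real values come from the data table that accompanies the problem.
medium_lead_iq = [72, 90, 92, 71, 86, 79, 83, 114, 100, 93]

n = len(medium_lead_iq)                      # sample size
mean = statistics.mean(medium_lead_iq)       # sample mean
stdev = statistics.stdev(medium_lead_iq)     # sample standard deviation (divides by n - 1)

# Round to three decimal places, matching the summary statistics given for the other sample.
print(f"n = {n}, mean = {mean:.3f}, s = {stdev:.3f}")
```

These three numbers are exactly what gets typed into the Sample 1 boxes of the two-sample T dialog.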
And now I put in the sample statistics for the second sample. Now I scroll down here and make sure this button for Hypothesis test is selected; that is the default, and that is what we want. This area here needs to match what we got over here with our null and alternative hypotheses, so I need to change this inequality sign. And now I'm ready. I hit Compute!, and here's my test statistic, the second to last number there in my results window. I'm asked to round to two decimal places. Good job! Now the next part is asking for the P-value. That's the last value there in the results table, right next door to the test statistic. I'm asked to round to three decimal places. Excellent! OK, now I'm asked to state the conclusion from my test. We have a P-value of over 30%, and we're using a significance level of 10%. 30% is over 10%, so we're outside the region of rejection, which means we fail to reject the null hypothesis. And every time we fail to reject the null hypothesis, there is not sufficient evidence. So we don't want Answer option D because that says there is sufficient evidence. We want Answer option A because we failed to reject the null hypothesis, and whenever we do that, there's not sufficient evidence. Good job!
Part B
Now Part B of this problem asks us to construct a confidence interval, which we can do easily enough. I go back to my options window here, scroll down, and switch this radio button to Confidence interval. And I need to put in a confidence level. Normally we would take our significance level and subtract it from 100%, but here we ran a one-tailed test, so we've got two alpha that we have to subtract: the confidence interval has a tail on each side, and each tail has to match the 10% in the single tail of our test. So we're looking at 20% that we subtract from 100%. That gives us a confidence level of 80%. And there are my lower and upper limits right there in the results window. So all I have to do is transfer those numbers over. Well done!
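For anyone curious about the arithmetic behind the Compute! button, here's a rough Python sketch of a two-sample t test that doesn't pool the standard deviations (Welch's test), followed by the matching 80% confidence interval. Every summary statistic below is a hypothetical placeholder chosen for illustration, not a value from the problem, and the t distribution is handled with a simple numerical integration of my own rather than whatever StatCrunch uses internally.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(upper_tail, df):
    """Bisection search for t* such that P(T > t*) = upper_tail."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail:
            lo = mid   # tail area still too big, so t* is further out
        else:
            hi = mid
    return (lo + hi) / 2

def welch(n1, mean1, s1, n2, mean2, s2):
    """Test statistic, Welch-Satterthwaite df, and standard error (no pooling)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df, se

# Hypothetical summary statistics (placeholders, NOT the values from the problem).
n1, mean1, s1 = 22, 90.81, 15.0   # Sample 1: medium lead level
n2, mean2, s2 = 21, 89.00, 9.0    # Sample 2: high lead level
alpha = 0.10

t_stat, df, se = welch(n1, mean1, s1, n2, mean2, s2)
p_value = 1 - t_cdf(t_stat, df)          # right-tailed, because H1 is mu1 > mu2

# One-tailed test at alpha pairs with a (1 - 2*alpha) confidence interval.
t_star = t_critical(alpha, df)
lower = (mean1 - mean2) - t_star * se
upper = (mean1 - mean2) + t_star * se

print(f"t = {t_stat:.2f}, df = {df:.2f}, P-value = {p_value:.3f}")
print(f"{1 - 2 * alpha:.0%} CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

With these placeholder numbers the P-value lands well above 10% and the interval straddles zero, which mirrors the pattern in the walkthrough. Plug in the real summary statistics to check your StatCrunch output.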
Part C
And now for the final question: Does the confidence interval support the conclusion of the test? Well, let's take a look at where zero is relative to our confidence interval. Is it inside or outside? Zero is inside our confidence interval, and so therefore it's possible that the two population means could be the same. It's possible that the difference here could be zero. And if the difference is zero, that means these two could be the same. Well, if they're the same, then that's exactly what we set up here in our null hypothesis. So if they're the same, then that means the null hypothesis could be true. And if the null hypothesis could be true, we don't want to reject it. So we're going to fail to reject the null hypothesis. And that's exactly the conclusion that we made right here. So yes, the confidence interval does support the conclusion of the test because zero is found inside the confidence interval. Fantastic!
And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.
Intro
Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to choose the best multiple regression model. Here's our problem statement: The accompanying table shows results from regressions performed on data from a random sample of 21 cars. The response variable is CITY (fuel consumption in miles per gallon). The predictor variables are WT (weight in pounds), DISP (engine displacement in liters) and HWY (highway fuel consumption in miles per gallon). If only one predictor variable is used to predict the CITY fuel consumption, which single variable is best and why?
Solution
OK, to solve this problem, we first need to take a look here at the table of regression equations that we have to select from. If you notice, we've got all sorts of different options here, but the ones that we're going to use are the ones here at the bottom that have only one predictor variable in them, because here in the problem statement we're only looking at the ones that have one predictor variable.
So to pick the best model, we're going to have to balance out optimum values for three different items here. The first is the P-value, the second is the adjusted R-squared value, and the third is the number of variables. The number of variables has already been taken care of for us by the restriction here in the problem statement. So all we have to do then is balance [the] P-value and adjusted R-squared value for the three possibilities here that have only one predictor variable. Well, all three of these equations have the same P-value, and it's the best P-value you could possibly have: zero. So we can't use the P-value to determine which model is the best. So we have to look to [the] adjusted R-squared value. And the reason why you want to use the adjusted R-squared value and not the R-squared value is that the adjusted R-squared value is adjusted for the differing numbers of variables in the different models. That tends not to be a big deal with what we're looking at here because we're restricted to just one predictor variable. But normally you don't have that restriction. And so looking at the adjusted R-squared value is always preferred over the R-squared value. So here we've got 0.696 and 0.64, which are kind of in the same ballpark. And then right here, 0.92, [a] much better adjusted R-squared value for this last model here. So this is the one that we're going to select. It has the best combination of a small P-value, which is zero, and a large adjusted R-squared value, which you can see there in the table [is] 0.92. Good job!
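To make the selection rule concrete, here's a small Python sketch. The adjusted R-squared and P-values are the ones read off in the walkthrough, but the model labels are placeholders (the table itself isn't reproduced here), and the 0.05 cutoff for the P-value is my own assumption. The helper at the top is the standard adjusted R-squared formula, included to show why it penalizes extra predictors.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for a model with k predictors fit to n observations."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Values as read off in the walkthrough; the labels are hypothetical placeholders.
one_predictor_models = {
    "model A": {"p_value": 0.000, "adj_r2": 0.696},
    "model B": {"p_value": 0.000, "adj_r2": 0.64},
    "model C": {"p_value": 0.000, "adj_r2": 0.92},
}

def pick_best(models, alpha=0.05):
    """Keep models with a small P-value, then take the largest adjusted R^2.

    alpha = 0.05 is an assumed cutoff, not something stated in the problem.
    """
    significant = {name: m for name, m in models.items() if m["p_value"] <= alpha}
    return max(significant, key=lambda name: significant[name]["adj_r2"])

best = pick_best(one_predictor_models)
print("Best single-predictor model:", best)

# Same raw R^2 of 0.90 on the 21 cars, but more predictors means a lower adjusted value.
print(round(adjusted_r_squared(0.90, n=21, k=1), 4))
print(round(adjusted_r_squared(0.90, n=21, k=3), 4))
```

The last two lines show the penalty at work: with the raw R-squared held fixed, going from one predictor to three pulls the adjusted value down, which is why the adjusted figure is the fair one to compare across models of different sizes.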