Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to create a variance sampling distribution probability distribution table. Here's our problem statement: Three randomly selected households are surveyed. The number of people in the households are 2, 4, and 9. Assume that samples of size n = 2 are randomly selected with replacement from the population of 2, 4, and 9. Listed below are the nine different samples. Complete Parts A through C.
OK, Part A says, "Find the variance of each of the nine samples. Then summarize the sampling distribution of the variances in the format of a table representing the probability distribution of the distinct variance values." Well, when a lot of students first see a problem that looks like this, their first impulse is to freak out because they have no clue what they're supposed to be doing. But let's walk through this. The problem is telling you what to do if you just take everything a step at a time.
So the first thing to do, it says, find the variance of each of the nine samples. Now I prefer to work these types of problems in Excel. But I've had a number of you who've requested to see this in StatCrunch. So I'm going to work this problem in StatCrunch. That means the first thing I need to do here is dump my data into StatCrunch. So let's go ahead and do that. So here's my data here in StatCrunch. And I'll resize this window so we can see a little better what's going on here.
And now again, the first thing we need to do here, it says to find the variance for each of the nine samples. Well, notice here your samples are in rows. So for the first sample, you went to the first house, and there are two people in the household. Then you went to a second house, and there are two people in that household. So there's your first sample. And you did that nine times to get nine samples here.
So the variance for each of these samples: variance is a sample statistic, so we're just going to go to Stat --> Summary Stats --> Rows (because here our sample information is in rows). We don't want to include the sample number in our calculation, so I'm going to select just x1 and x2. And then down here we're looking for the variance. We don't need all the default selections there; the variance is the only one to select. I hit Compute!, et voilà! Here are my sample variances listed in the order in which they appeared in the table.
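If you'd like to double-check the StatCrunch output, here is a minimal Python sketch of the same step. Python is not part of the video; the names `population`, `samples`, and `sample_variances` are just illustrative, and the order of the samples produced here may differ from the order in the homework table.

```python
from itertools import product
from statistics import variance

# Population given in the problem statement
population = [2, 4, 9]

# All samples of size n = 2 drawn with replacement (3 x 3 = 9 samples)
samples = list(product(population, repeat=2))

# Sample variance of each row (statistics.variance divides by n - 1)
sample_variances = [float(variance(s)) for s in samples]

for s, v in zip(samples, sample_variances):
    print(s, v)
```

Each printed row pairs one sample with its variance, which is what the Rows summary gives you one row at a time.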
In order to make the probability distribution table, however, the next step says, "Summarize these variances in the format of a table." So in order to get that probability distribution table out, we need to order these values here. And the easiest way to do that is come up here and just hit on the little button up here. And voila! Notice how it reordered everything from lowest to highest, smallest to greatest. If I hit that button again, it'll toggle to greatest to lowest, and it just keeps toggling back and forth every time I press that button.
So now I've got everything I need to make my table. S-squared is the variance, and these are the different variance values here. So the first variance value is listed as a zero. So that's how I'm going to select from the dropdown here. The next number that appears on the table is a 2. So that's going to be the next number here. And then just so on and so forth on down the line.
Now to get the probabilities for each of those variance values, again, look at my table. Remember probability is the part over the whole. So what's the part that has zero? Well, how many zeros do I have? I've got 3 zeros out of 9 values total. There are nine samples total, so that's the whole. So 3 over 9 is my probability, and 3 over 9 reduces to one third. So that's what I'm going to put in my answer field. Next, how many twos do we have in our table? We've got 2 twos in the table out of 9 total. So that's 2 over 9. And it's the same thing with 12.5 and with 24.5: each appears twice, so each is 2 over 9. Nice work!
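The "part over the whole" counting can also be sketched in Python as a cross-check. This is only an illustration under the same assumptions as the problem statement (population 2, 4, 9; samples of size 2 with replacement); the variable names are made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from statistics import variance

population = [2, 4, 9]

# Variance of each of the nine samples of size 2 (with replacement)
variances = [float(variance(s)) for s in product(population, repeat=2)]

# Count how often each distinct variance occurs, then divide by the total
counts = Counter(variances)
distribution = {
    value: Fraction(count, len(variances))
    for value, count in sorted(counts.items())
}

for value, prob in distribution.items():
    print(f"s^2 = {value}: P = {prob}")
```

Fraction reduces 3/9 to 1/3 automatically, which matches the reduced answer entered in the answer field.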
OK, now Part B is saying that we need to compare the population variance to the mean of the sample variances and then choose the correct answer below. So first, let's get these values here.
The population variance? Well, to get that in StatCrunch, we're going to have to put the population here into StatCrunch because all it loaded was the actual samples. We need the population. So here in the problem statement, it says that the population is 2, 4, and 9. So I'm going to come back here in StatCrunch, and I'm going to label this column population, so I know what I'm looking at, and enter two, four, and nine. And now I can come up here to Stat --> Summary Stats --> Columns, select that population column, get the variance for that column, hit Compute!, and here's the variance StatCrunch reports, which in this case is 13. Keep in mind that this Variance statistic divides by n - 1. A population variance divides by N instead, and with N = 3 that works out to 26/3, or about 8.67.
I need to compare that to the mean of the sample variances. Well, the sample variances are right here, but I can't calculate any statistic on them in StatCrunch unless those numbers are here in the data table. So I could copy these numbers over into the data table, or I could take the much easier route: go back to my options window, come down here to this box, and check it for Store in data table. What this does when you check this box is instead of putting the results in a separate window like we normally see, it'll put the results in the data table. We can then run calculations off of those numbers in StatCrunch.
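To see where the 13 comes from, here is a small Python sketch that computes the variance of 2, 4, and 9 both ways. It is an outside check, not a StatCrunch feature; `variance` and `pvariance` are functions from Python's standard `statistics` module, and the variable names are illustrative.

```python
from statistics import pvariance, variance

population = [2, 4, 9]

# Dividing the sum of squared deviations (26) by n - 1 = 2
divide_by_n_minus_1 = variance(population)

# Dividing the same sum by N = 3, the population formula
divide_by_n = pvariance(population)

print("divide by n - 1:", divide_by_n_minus_1)
print("divide by N:    ", round(divide_by_n, 3))
```

The first value is 13 and the second is 26/3, or about 8.667. The second one is the population variance in the textbook sense.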
So notice here how the values that were in the table have now been put here from that separate window into the data table. Now I can just go to Stat --> Summary Stats --> Columns, and select that new column. I want the mean of the sample variances, so I must select the mean here for my sample statistic, and out comes the value that I'm going to be comparing with.
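As a cross-check on that mean, here is a short Python sketch that averages the nine sample variances using exact fractions. It assumes the same setup as the problem statement; the names are illustrative and nothing here comes from StatCrunch itself.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, variance

# Exact arithmetic avoids rounding when comparing values later
population = [Fraction(x) for x in (2, 4, 9)]

# Sample variance (n - 1 divisor) of each of the nine samples of size 2
sample_variances = [variance(s) for s in product(population, repeat=2)]

# Mean of the nine sample variances
mean_of_variances = mean(sample_variances)

print(mean_of_variances, "=", round(float(mean_of_variances), 3))
```

The result is 78/9, which reduces to 26/3, or about 8.667.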
Now, these two numbers should be the same, because the sample variance is an unbiased estimator. In fact, since we listed every possible sample of size 2, the mean of the sample variances works out to exactly 26/3, or about 8.67, which is the population variance you get when you divide by N = 3. If you see some disparity on your screen, like the 13 from Summary Stats, it's because that Variance statistic divides by n - 1 rather than N. So even though the problem statement is telling you to calculate each of these numbers and then compare them to choose the correct answer, if you read through each of these answer options, you're going to find that none of them refer to these specific numbers. So the best way to answer this question is to have memorized that list of biased and unbiased estimators and then answer accordingly.
Here we're talking about variances. Variance is an unbiased estimator. So the mean of the sampling distribution should be the same as the population parameter. So in this case, we want to select the one where they're the same, so "equal to the mean of the sample variances." Well done!
Part C asks us a similar question but in a different way. "Do the sample variances target the value of the population variance?" And again, you don't want to rely only on the numbers on your screen, because they can mislead you if the wrong divisor was used. You want to have that list of biased and unbiased estimators memorized.
Variance is an unbiased estimator, so the sample variances are going to target the population variance, and therefore the sample variance makes a good estimator. So this is the answer option that best reflects that. Excellent!
And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below, and let us know how good a job we did or how we can improve. And if your stats teacher was boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.