Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to use a nonstandard normal distribution to evaluate pregnancy and premature birth data. Here's our problem statement: The lengths of pregnancies are normally distributed with a mean of 266 days and a standard deviation of 15 days. Part A: Find the probability of a pregnancy lasting 309 days or longer. Part B: If the length of pregnancy is in the lowest 2%, then the baby is premature. Find the length that separates premature babies from those who are not premature. Part A OK, Part A is asking us to find the probability that a pregnancy will last 309 days or longer. To do this, I'm going to use the normal distribution calculator in StatCrunch, and I know I need the normal distribution calculator because here in the problem statement it says my data is normally distributed. So the first thing I need to do is open up StatCrunch. And let's resize this window so we can see everything a little bit better. Now in StatCrunch, I go to Stat --> Calculators --> Normal. Here in my Normal calculator, I need to establish the mean and the standard deviation, because the defaults here are for the standard Normal distribution and we have a nonstandard Normal distribution. Here in the problem statement, it says the mean is 266 days. So I need to put that in here. And the standard deviation is 15. And then I'm asked for the probability that a pregnancy will last 309 days or longer. Well, look at how this is ordered here. Probability is P, x is my random variable --- that's going to be the 309 days --- but it needs to be 309 days or longer, which means that this is greater than or equal to 309. So I got to flip that around. And notice here on the other side of the equals sign is my probability. This is a probability in decimal form. I'm asked to round to four decimal places, so I just type that in here. Fantastic! 
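For anyone who wants to double-check the StatCrunch result for Part A, here is a minimal sketch in Python. Python and its standard-library `statistics.NormalDist` class are not part of the original walkthrough; they are assumed here purely as an independent check, and the variable names are made up for illustration.

```python
from statistics import NormalDist

# Nonstandard normal distribution from the problem statement:
# mean 266 days, standard deviation 15 days.
pregnancy = NormalDist(mu=266, sigma=15)

# P(X >= 309) is the right tail, so subtract the cumulative
# probability up to 309 from 1.
p_309_or_longer = 1 - pregnancy.cdf(309)

# Round to four decimal places, as the problem asks.
print(round(p_309_or_longer, 4))
```

This mirrors what the Normal calculator does once the inequality is flipped to "greater than or equal to."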
Part B Now, Part B asks for the number of days in which a birth would be considered premature. So babies who are born on or before how many days are considered premature? And to do that, I go back to my distribution calculator. And this is the number we're looking for now. So we get rid of that premature birth from the problem statement. We're looking for the lowest 2% of the distribution. That's the left tail of the distribution. So I need to change this to 2%. And now we just switch this around so I can get the left side and not the right side of my distribution. Now we're at the left tail of the distribution. This is the lowest 2%, and this boundary here for the tail is the value we're looking for. Rounding to the nearest integer gives me 235. Nice work!
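The Part B boundary can be checked the same way. This is a minimal sketch assuming Python's standard-library `statistics.NormalDist`, which the original walkthrough does not use; `cutoff` is just an illustrative name.

```python
from statistics import NormalDist

# Same nonstandard normal distribution: mean 266 days, standard deviation 15 days.
pregnancy = NormalDist(mu=266, sigma=15)

# The lowest 2% is the left tail, so ask for the value with
# cumulative probability 0.02 (the inverse of the CDF).
cutoff = pregnancy.inv_cdf(0.02)

# Round to the nearest whole day.
print(round(cutoff))
```

Working backward from a probability to a value is the same thing the calculator does when you type in the 2% and let it solve for x.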
And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.
Constructing & interpreting a standard deviation probability distribution with a 3-number population
7/16/2019
Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to construct and interpret a standard deviation probability distribution with a three-number population. Here's our problem statement: Assume a population of 2, 6, and 7. Assume that samples of size n = 2 are randomly selected with replacement from the population. Listed below are the nine different samples. Complete Parts A through D below. Part A OK, Part A asks us to find the value of the population standard deviation, sigma. Well, that's easy enough to do. Once we put --- first, I'm going to put my samples here into StatCrunch. I don't actually need the samples to calculate the population standard deviation, but they're going to come in handy for the rest of the problem. So let's just go ahead and do that now. And I'm going to resize this window so we can see better what's going on. To calculate the population standard deviation, I need to have the population here in the data table in StatCrunch. The population is these three numbers here listed in the problem statement --- 2, 6, and 7. So I'm going to go to this next free column here in StatCrunch, and I'm going to label this column "Population." And then I'm going to put in the actual numbers for my population: 2, 6, and 7. Now I can take the standard deviation of this column to find the population standard deviation. I do that by going to Stat --> Summary Stats --> Columns. Here in the options window, I'm going to select the column where that data is located. And then many students make the mistake of selecting this first standard deviation option here under the statistics window. The reason why that's a mistake is because this standard deviation is for samples. 
We're asked to calculate the population standard deviation, which means you need the unadjusted standard deviation. And to get that, you have to scroll down towards the bottom of the list here and select Unadjusted standard deviation. This is what you use when calculating standard deviation for a population. So I hit Compute!, and out comes my population standard deviation. I'm asked to round to three decimal places. Good job! Part B Now Part B asks us to develop the probability distribution for the standard deviation of each of the nine samples. To do that, we're going to go back here to StatCrunch. And notice where the samples are actually located. They're located in rows. So the first sample is a 2 and a 2 located here in the first row. The second sample is located here in the second row. The third sample is located here in the third row, and so on and so forth. So to calculate my standard deviation for these samples, I'm going to go to Stat --> Summary Stats --> Rows. Normally we select Columns, but here I'm going to select Rows because my data, the samples, are in rows. Here in the options window, I'm going to select the columns where my sample data is located. And then I'm going to go down here under Statistics, and the statistic we're calculating is standard deviation. This is for samples, so I can select that first standard deviation there in the statistics list. And I press Compute!, and out comes a window with all of my standard deviations for each of the nine samples there in my dataset. What we're looking for is a probability distribution. So we have the numbers we need here. We just need to assemble them into a distribution. The easiest way to do that is to sort these numbers, and I can do that very easily. If I come up here and click on the little arrow here in the title to the column in my results window, notice how everything's now sorted. And the default is to sort first by largest to --- I mean, excuse me, from lowest to highest number. 
If I want to sort the other way, I just click again on that arrow, and now it's sorted from largest to lowest. But here in the problem statement, we see it says, "Use ascending order," which means from lowest to highest. So I'll click this again. Each time I click this, notice it's just toggling back and forth between those two settings. This is the setting I want, from lowest to highest. So to create my distribution, I'm going to look first at the number that is listed here first. So the first number here is 0. So in this dropdown here, I'm going to select 0. And then the probability is the part over the whole. So the part is how many zeros do I have, which is three, and the whole is how many numbers do I have total, which is nine. I've got nine numbers total. Don't look at this last number and think you've got nine, because see here, everything's been sorted so it's in a different order. So I've got nine numbers total. Three is the part that is zero. So I've got three out of nine, or three over nine, which reduces to one over three. The next number in the dataset is the 0.707. So I'm going to select that here. How many do I have? I have two of them, so I've got two out of nine. The next number is 2.828, so I'm going to select that. I've got two of them. So that's two over nine. And then the last number is 3.536, so I'm going to select that. And I have two of those --- two divided by nine. And that's really all there is to it. Well done! Part C Now Part C asks us to find the mean of the sampling distribution of the sample standard deviation. So what that means is to take these numbers that we used to create our sampling distribution and then find the mean of these numbers. Well, I can do that easily in StatCrunch if these numbers were listed in the data table, but they're not. They're here in a separate window. So to put them in the data table, I'm going to go back to my options window, and I'm going to check this box next to Store in data table. 
This will tell StatCrunch to put the results in the data table, where we can then perform further calculations on it, instead of in a separate window. So now I've got my numbers here in the data table. And now I can just calculate the standard deviation. I go up to Stat --> Summary Stats --> Columns, select that newly created column, standard deviation for the samples, and there's my mean --- excuse me, I want this. I'm off in another world here. I want the mean, the mean of the sampling distribution. So there is my mean value for the sampling distribution, 1.571. We were asked to round to three decimal places. Fantastic! Part D And now the last part, Part D asks, "Do the sample standard deviations target the value of the population standard deviation?" Sometimes these problems ask you to compare numbers. So in this case you might have been asked to compare the population value, which here is 2.16, with the mean of the sample values, which is 1.571. They don't equal each other, and therefore the sample doesn't target the population. The statistic doesn't target the parameter. Well, because the samples that we're using here are so small, sometimes the comparison doesn't play out quite that cleanly, and you can get numbers that are actually pretty much the same even when the estimator is biased.
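The numbers used in Parts A through D can be reproduced outside StatCrunch. What follows is a minimal sketch in Python using only the standard library; Python is not part of the original walkthrough, and every name in it (`population`, `sample_sds`, `distribution`, and so on) is an assumption made for illustration. Here `pstdev` plays the role of the Unadjusted standard deviation option and `stdev` plays the role of the sample standard deviation option.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from statistics import pstdev, stdev

population = [2, 6, 7]

# Part A: population (unadjusted) standard deviation.
sigma = pstdev(population)

# All nine samples of size n = 2 drawn with replacement.
samples = list(product(population, repeat=2))

# Part B: sample standard deviation of each row (each sample).
sample_sds = [stdev(sample) for sample in samples]

# Group equal values (rounded to 3 places) and turn counts into
# "part over whole" probabilities, in ascending order.
counts = Counter(round(sd, 3) for sd in sample_sds)
distribution = {
    sd: Fraction(count, len(samples)) for sd, count in sorted(counts.items())
}

# Part C: mean of the sampling distribution of the sample standard deviation.
mean_of_sds = sum(sample_sds) / len(sample_sds)

print(round(sigma, 3))
for sd, probability in distribution.items():
    print(sd, probability)
print(round(mean_of_sds, 3))
```

The mean of the nine sample standard deviations comes out below the population value, which is the comparison Part D is talking about.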
So I always advise students when answering these types of questions about, you know, are your statistics targeting your parameters? Do the samples target the population? Just go by memorizing a list of biased and unbiased estimators. And we talk about those in the lecture video. So sample --- the standard deviation is going to be an unbiased estimator, so it doesn't target the population parameter. So the answer options that say that it's unbiased are going to be incorrect, because standard deviation is a biased estimator. What did I just say? Did I just say it was unbiased? Gee, I am in another world. Standard deviation is a biased estimator. And so we want to select --- here we've got Answer option A and Answer option C. A biased estimator means that the sample does not target the population. So we're going to want to select here Answer option C. Fantastic! And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video. Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to construct and interpret a relative frequency distribution from categorical data. Here's our problem statement: Among fatal plane crashes that occurred during the past 70 years, 620 were due to pilot error, 85 were due to other human error, 308 were due to weather, 267 were due to mechanical problems, and 386 were due to sabotage. Construct the relative frequency distribution. What is the most serious threat to aviation safety, and can anything be done about it? 
Part 1 OK, I actually think this problem is a little bit easier to solve in Excel, but I'm going to solve it in StatCrunch. In StatCrunch, we're actually going to have it do everything for us. Now we can actually calculate this stuff out, and I could take my calculator and do all the old school calculations, and that's a legitimate way to approach this problem. But I'm lazy. I'm going to have StatCrunch do everything for me. So I know it's tempting to say, "OK, we're going to make a bar plot, and we've got data here, so why don't we just press With Data?” Well, this isn't actual data. This is actually a summary of the data. We don't have the actual individual counts. We just have a summary of the counts. So because this is a summary, I'm going to select With Summary after selecting Bar Plot. The categories are the cause, and the counts are in the frequency column. I'm going to select Relative frequency. Or if I wanted to, since these are in percent form here, let's just go ahead and just select Percent. And then all we have to do is just take the numbers straight over. I'm going to tick Value above bar. This is what's going to give me the numbers that I need on my graph to put here into the answer fields in my assignment. And that's all I need to do. Hit Compute!, and out comes this wonderful little bar graph. Now notice how that, first of all, everything's just really small as far as the typeface goes. You can barely see it. And there's actually a zoom feature that we can use. But first I want to make sure that we get this in the right order, which I didn't do previously. So back in my options menu, I need to make sure I order by worksheet. And that way it'll put in the same order that we have here in our assignment. Now I've got this ordered correctly. Now to get the zoom feature, there's these three bars that you see here in the lower left. Go ahead and click on that, and then hit Zoom. 
And now, when I click on the zoom tool, I can zoom in on area, I can move this around, and now there's the number I need to put in for pilot error, 37.2. I hit the X to go back. Whoops, that's not what I wanted to do. I want to go back this way. And now I just go ahead and just do the same thing for each of the following categories. And it really is that simple. Now to do this the old school way, I'd have to take the sum of the numbers that are listed here, and I could do that easily enough in StatCrunch. And then once I have that sum, I go ahead and divide each of these counts by the total sum, and that gives me the same percentages that you'll see here. And I can show you that in a moment as soon as I get done with all of this business. So we've got two more numbers to put in. I'm going to put that in here. And we've got one more. Excellent! Now to illustrate what I was showing you before, just very quickly, if you go up to Stat --> Summary stats --> Columns, at the frequency, we want to get the sum. The sum is 1666. So if I take that 1666 and divide it into each one of these numbers, I'm going to get the same numbers out. So if I took the first number, 620 for pilot error, and I divided by 1666, notice we get the same 37.2% that is the correct answer. And we can do the same thing for each of the others in succession. But like I said, I'm lazy. I just let StatCrunch do all that calculating for me. Part 2 The next part of the problem asks, "What is the most serious threat to aviation safety, and can anything be done about it?" Well, the most serious threat is going to be the category with the largest percentage. And that's going to be the 37.2% related to pilot error. Can you do something about pilot error? Yeah, you could probably train your pilots better. So let's see, we got --- yeah, right here, Answer option D: "Pilot errors are the most serious threat. Pilots could be better trained." So I go ahead and select that one. Excellent!
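The "old school" calculation described above, dividing each count by the total, can be sketched in a few lines. This is a minimal illustration in Python, which the original walkthrough does not use; the dictionary keys are shorthand labels for the causes in the problem statement, and the variable names are assumptions.

```python
# Summary counts from the problem statement, in worksheet order.
crash_counts = {
    "Pilot error": 620,
    "Other human error": 85,
    "Weather": 308,
    "Mechanical problems": 267,
    "Sabotage": 386,
}

# The whole: total number of fatal crashes.
total = sum(crash_counts.values())

# The part over the whole, expressed as a percent rounded to one decimal place.
relative_frequency = {
    cause: round(100 * count / total, 1) for cause, count in crash_counts.items()
}

# The most serious threat is the category with the largest percentage.
most_serious = max(relative_frequency, key=relative_frequency.get)

for cause, percent in relative_frequency.items():
    print(f"{cause}: {percent}%")
print("Most serious threat:", most_serious)
```

Keeping the causes in worksheet order matches the Order by worksheet setting, so the output lines up with the answer fields.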
And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below, and let us know how good a job we did or how we can improve. And if your stats teacher is boring, or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video. Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to determine the appropriate level of measurement for Olympic years. Here's our problem statement: Determine which of the four levels of measurement (nominal, ordinal, interval, ratio) is most appropriate for the data below: Years in which an Olympics was held. Solution - Example 1 OK, we're given four different answer options here, each one corresponding to the different levels of measurement. And what's interesting here is that we've got definitions of each of the different levels to help us select the right answer. So we look at the definitions here --- the ordinal level of measurement --- and this says that "the data can be ordered but differences cannot be found or are meaningless." Well, we can find differences between different years, so that's obviously not going to be the right answer. "The nominal level of measurement is most appropriate because the data cannot be ordered." Well, yeah, the data actually can be ordered. That's the whole point of having years. We can order them from low to high or high to low. That's the interval level of measurement. It "is most appropriate because the data can be ordered" --- that's true. "Differences can be found and are meaningful" --- that's true. And "there's no natural starting zero point" --- that's true. The zero point for years is just an arbitrarily chosen value, so it's just something that's accepted by convention. 
There's no natural point for zero, and so the interval level of measurement is what we have. And when you see years, you need to think interval level of measurement because the two pretty much go together. The final answer --- ratio level of measurement --- would not be correct because it says here "the data can be ordered," which is true. "Differences can be found or are meaningful" --- that's true. "There is a natural starting point," and that's what we don't have with the years. So the correct answer here is the interval level of measurement. Fantastic! Solution - Example 2 Let's go through one more example just to illustrate what we've got here. So now we've got the number of houses that people own. Well, the number of houses people own, would that be the ratio level of measurement? Yeah, probably, because look, the data can be ordered, the differences can be found and are meaningful. I mean, you've got one person who's got two houses, and one person's got one. That extra house --- that's a meaningful difference. There is a natural starting zero point. It's like you got zero houses. That's a natural place to start counting something. So ratio level of measurement is what we would select here. Fantastic!
And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below, and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video. Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to perform mean hypothesis testing on the number of words per page. Here's our problem statement: A simple random sample of 10 pages from a dictionary is obtained. The numbers of words defined on those pages are found, with the results: n = 10, x-bar = 55.3 words, s = 16.6 words. Given that this dictionary has 1,456 pages with defined words, the claim that there are more than 70,000 defined words is equivalent to the claim that the mean number of words per page is greater than 48.1 words. Use a 0.05 significance level to test the claim that the mean number of words per page is greater than 48.1 words. What do the results suggest about the claim that there are more than 70,000 defined words? Identify the null and alternative hypotheses, test statistic, P-value, and state the final conclusion that addresses the original claim. Assume that the population is normally distributed. Part 1 OK, the first part of this problem asks us for the null and alternative hypotheses. The null hypothesis is by definition a statement of equality, so Answer option C is not going to be the right answer because this null hypothesis is not a statement of equality. Of the three answer options that remain, let's look at our alternative hypothesis. And normally we get that from the claim. The claim here from our problem statement --- well, there's two claims: There's an original claim, and then there's a modified claim. 
And it looks like we're going to be using the modified claim to define our null and alternative hypothesis. And that is that the mean number of words per page is greater than 48.1 words. So we take the one that says greater than 48.1. And here it is right here, Answer option B. Nice work! Part 2 OK, the second part of this problem asks us for the test statistic . And to find the test statistic, we're going to pop out StatCrunch. So I'll take this window, and I'm going to resize it so that we can see better what's going on here. OK, inside StatCrunch, I go to Stat --> T Stats (because I don't know the population standard deviation) --> One Sample (because we have just the one sample) --> With Summary (because we don't have any actual data). Here in the options window, I need to put in my sample statistics from the problem statement. We have those right here, so I'm just going to take that information and stick it in here. The sample mean is x-bar; that's the 53 --- excuse me, 55.3. And then the sample standard deviation is going to be s; that's the 16.6. Sample size is going to be n, and that's 10. Check for this radio button for Hypothesis test. That's the default selection. We want to keep that because we're performing a hypothesis test. We want to make sure this area matches what we selected here in the previous part of the problem. So we need to change this claimed value from zero to 48.1. And then I need to make sure that this inequality sign matches what we have over here for our alternative hypothesis. And now I've got everything I need. I press Compute!, and here in my results window, the second to last value is the test statistic. I'm asked to round to two decimal places. Excellent! Part 3 Now the next part of the problem asks for the P-value. We've already done all the work to calculate it. Look back here at the results window. It's that last value there in the table, right next door to our test statistic. We're asked to round to three decimal places. 
Nice work! Part 4 And now the last part of this problem asks us to state the final conclusion. To do this, we're going to compare our P-value with our significance level we have here in the problem statement. It says, "Use a 5% significance level." Here, we've got a P-value of over 10%. 10% is well above 5%, so we're outside the region of rejection. And when you're outside the region of rejection, you fail to reject the null hypothesis. Every time you fail to reject, there is not sufficient evidence. And here our original claim was that there are more than 70,000 defined words, so there is not sufficient evidence to support that claim. Excellent!
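The test statistic, P-value, and decision from Parts 2 through 4 can be checked by hand-coding the one-sample t test. This is a rough sketch in Python using only the `math` module; it is not how StatCrunch computes anything. The helper names `t_pdf` and `upper_tail` are made up for illustration, and the P-value comes from a Simpson's-rule approximation of the area under the t density, which is an assumption swapped in for a proper statistics library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 with Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the distribution lies above 0; remove the part between 0 and t.
    return 0.5 - area * h / 3


# Sample statistics and claimed value from the problem statement.
n, x_bar, s = 10, 55.3, 16.6
mu_0, alpha = 48.1, 0.05

# Test statistic: t = (x-bar - mu_0) / (s / sqrt(n)), with n - 1 degrees of freedom.
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

# Right-tailed test, because the alternative hypothesis is "greater than 48.1".
p_value = upper_tail(t_stat, n - 1)

reject_null = p_value <= alpha

print(round(t_stat, 2))
print(round(p_value, 3))
print("Reject H0" if reject_null else "Fail to reject H0")
```

Because the P-value lands above the 0.05 significance level, the sketch reaches the same fail-to-reject decision described in Part 4.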
And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video. Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to distinguish an observational study from an experiment. Here's our problem statement: Determine whether the description corresponds to an observational study or experiment. Research is conducted to determine if there is a relation between Parkinson's disease and childhood head trauma. Does the description correspond to an observational study or an experiment? Solution OK, the key difference between an observational study and an experiment is that an observational study is just what the name says. You're just looking at what's there. You're just observing. You're not actually . . . you're not inserting anything, any sort of change, into the variables that you're looking at. An experiment, on the other hand, requires a treatment. There's something that you're doing so that you can observe a change in what you're observing here.
The problem statement says that research is being conducted, and a lot of people, when they think about research, they think about that. Especially when you're dealing with medical things, they associate that with experiments because they're thinking about some sort of drug testing, or we're testing out some sort of procedure to treat the disease. But in reality here, look at what's actually being said in the statement. There's nothing in here that says anything about a treatment. It just says research is conducted. Well, for all we know, that could mean all we're doing is simply collecting data from people who have Parkinson's disease and seeing which of them had childhood head trauma. And then we're taking that data and running a statistical analysis to see if there's a correlation between those two variables. That is actual research that could be conducted. So we don't know what's going on here. And there's no indication that there's a treatment going on here. So this doesn't qualify as an experiment. This qualifies as an observational study because, again, there is no treatment here. We're just taking the people who have Parkinson's disease and seeing if they had childhood head trauma. We're just looking to see what's there. We're not actually inserting any sort of treatment to observe any sort of change. Excellent! And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video. |
Author: Frustrated with a particular MyStatLab/MyMathLab homework problem? No worries! I'm Professor Curtis, and I'm here to help.