Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to find the regression equation and best predictive value for earthquake data. Here's our problem statement: Fifty matched pairs of magnitude-depth measurements are randomly selected from 10,594 earthquakes recorded in one year from a location in Southern California. Find the best predicted depth of an earthquake with a magnitude of 1.3. Use a significance level of 5%. Part 1 OK, the first part of this problem asks us to find the regression equation. To do this, I'm going to take the data and dump it into StatCrunch. So here's my data. I'm going to click on this icon and open my data in StatCrunch. I'm going to resize this window so we can see everything. Now, inside StatCrunch, I'm going to go to Stat --> Regression --> Simple Linear. In my options window, I'm asked to identify the columns with my x- and y-variables. Typically in these problems, the x-variable will be the one that's listed first; the y-variable will be the one that's listed second. But you can also tell that this is the way that the variables are supposed to go because, when you look in the problem statement, you're asked to find a prediction for the depth. The variable for which we find the prediction is always the Y, because it's what comes out of our regression equation. So we see that we have the right variables selected. And I just come down here and press Compute!, and here I get my results window. Inside the results window, the regression equation appears here near the top. Now, this regression equation is sandwiched in between a whole bunch of other stuff. So I like to look down here at my parameter estimates table, and if you'll notice the numbers here are the same as the numbers up here. So I like to just look down here to get the numbers to put into my answer fields. I'm asked to round to one decimal place. Well done! 
Part 2 Now the second part of our problem asks us to find the best predicted depth given a magnitude of 1.3. There are two options for making a prediction. One is to use the regression model, and the second is to use the mean value of the y-values. The regression equation is preferred; however, the regression model may not be suitable for use. If it's a bad model because there's no correlation between the variables, then we don't want to use it to make predictions, and in that case, we'll use the mean value.
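To make that choice concrete, here is a minimal Python sketch of the decision, assuming the critical r-value has already been looked up in a table. The function names (`pearson_r`, `best_prediction`) and the two small data sets are made up for illustration; they are not the earthquake data from the problem.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def best_prediction(xs, ys, x_new, critical_r):
    """Use the regression line only when |r| exceeds the critical value;
    otherwise fall back to the mean of the y-values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    if abs(pearson_r(xs, ys)) > critical_r:
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        sxx = sum((x - mean_x) ** 2 for x in xs)
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        return intercept + slope * x_new
    return mean_y

# Hypothetical data with a perfect linear pattern (critical r for n = 5, alpha = 0.05 is 0.878)
good_model = best_prediction([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], 6, critical_r=0.878)

# Hypothetical data with no linear pattern (critical r for n = 4, alpha = 0.05 is 0.950)
bad_model = best_prediction([3, 4, 3, 4], [1, 1, 2, 2], 6, critical_r=0.950)

print(good_model)  # 12.0, from the regression line
print(bad_model)   # 1.5, the mean of the y-values
```

The critical value is passed in as a parameter on purpose, since in these problems it comes from a table rather than from a calculation.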
The way that we know whether or not the model is good is to test the R-value (our correlation coefficient). So here from our data, we see our correlation coefficient is 0.19. That's not very stellar. In fact, it's probable that this is going to be a very bad model. But there's an objective way to identify that. We have to compare with our critical R-value. Here in my problem statement, there's a link to a critical value table. So I click on that link, and here in the table I need to identify how many samples I have. I have 50 pairs of sample data, so I'm going to go down to the line with 50, and we were told to use a significance level of 5%, so as you can see, that's this column right here. So I go down to find the number next to 50, and here we go: 0.279. The R-value from our data (0.19) is less than this critical value, so we haven't exceeded this threshold. That means we are in the Land of No Correlation. And that means we should not be using this regression model to make predictions. If this R-value were greater than the critical R-value, then we could use it to make predictions. But that's not the case here. We have a bad model, so we're not going to use it to make predictions. Instead, we're going to use the mean value of the y-values. So to do that, I'm going to go up to Stat --> Summary Stats --> Columns, select the column with my y-values, select the mean as the statistic to be calculated, hit Compute!, and there's my best predicted value. I'm asked to round to one decimal place. Good job! And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't care to help you learn stats, go to aspiremountainacademy.com, where you can find out more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.
Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to evaluate the correlation of data values that produce box patterns. Here's our problem statement: Refer to the accompanying scatterplot. The four points in the lower left corner are measurements from women, and the four points in the upper right corner are from men. Complete Parts A through E below. Part A OK, Part A says, “Examine the pattern of the four points in the lower left corner (from women) only and subjectively determine whether there appears to be a correlation between x and y for women. Choose the correct answer below.” We look here at these four points that are referenced here. These are the ones for women. We see that they form a box pattern. The correlation we're asked to find is a linear correlation; if you look at all of your answer options, they are all asking about a linear correlation. Well, the points here are not forming a straight line. Therefore, we can subjectively say there's not likely to be a linear correlation. So there does not appear to be a linear correlation because the points do not form a line. Excellent! Part B Part B says, “Examine the pattern of the four points in the upper right corner (from men) only and subjectively determine whether there appears to be a correlation between x and y for men. Choose the correct answer below.” Well, we can see that these data points up here are forming the same pattern as for the women. So we're going to reach the same conclusion that we reached for the women. There does not appear to be a linear correlation because the points do not form a line. Good job! Part C Part C says, “Find the linear correlation coefficient using only the four points in the lower left corner for women. 
Will the four points in the upper right corner for men have the same linear correlation coefficient?” OK, the first part here is to find the linear correlation coefficient for the data points corresponding to the women. That's these four data points here in the lower part of the graph. To find the linear correlation coefficient, I'm going to use StatCrunch. So here I have StatCrunch open. Notice there's no icon in this problem statement that allows me to dump the individual data points into StatCrunch. That means I'm going to have to transfer that information into StatCrunch by hand. I'm going to call this first column X, and I'm going to call this second column Y. And we can blow this graph up a bit so we can get a better look at the ordered pairs for our data points. So this first one here looks to be (3, 1), so here in StatCrunch I'll put 3 1. This is (4, 1), and this looks to be (3, 2), and this one looks to be (4, 2). OK, so now I've got those four data points in StatCrunch. Now I come up to Stat --> Regression --> Simple Linear. I'm going to tell StatCrunch where to find my x- and y-values. And then the other defaults are fine for our purpose, so I'm going to press Compute!, and here comes my results window. Notice our correlation coefficient is zero, so that's what I'm going to put here in my answer field. Excellent! “Do the four points in the upper right corner have the same correlation coefficient?” Well, those four points are forming the same pattern as the four on the bottom. So it's safe to conclude that, yes, because they form the same pattern, they have the same correlation coefficient. Nice work! Part D Now Part D says, “Find the value of the linear correlation coefficient using all eight points. What does that value suggest about the relationship between x and y? Use alpha equals 0.05.” OK, to do this I'm going to have to put in the remaining four points from the men into my data set. 
So back here in StatCrunch, I'm going to put the four points into StatCrunch. This first one looks to be (9, 9), which means this next one is (10, 9). Then we have (9, 10), and this last point is (10, 10).
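If you'd like to check these correlation coefficients outside of StatCrunch, here is a small Python sketch that uses the eight ordered pairs read off the scatterplot above. The helper name `pearson_r` is just an illustrative choice; the formula is the standard one for the linear correlation coefficient.

```python
import math

def pearson_r(points):
    """Linear correlation coefficient r for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

# Ordered pairs as read from the scatterplot in this walkthrough
women = [(3, 1), (4, 1), (3, 2), (4, 2)]
men = [(9, 9), (10, 9), (9, 10), (10, 10)]

r_women = pearson_r(women)      # box pattern on its own
r_all = pearson_r(women + men)  # both boxes together

print(round(r_women, 3))  # 0.0
print(round(r_all, 3))    # 0.979
```

Each box by itself gives no linear correlation, but the two boxes taken together line up along a diagonal, which is why the combined coefficient is so much larger.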
Now to get back to my options window, I simply go to my results window, press the Options button, select Edit, and here I'm in my options window again. I've already told StatCrunch where to find my x- and y- values, so all I need to do is repeat the calculation by pressing Compute!, and now I have a new correlation coefficient. You can see right here I'm asked to round to three decimal places. Well done! Now we're asked, “Using alpha equals 5%, what does R suggest about the relationship between x and y?” Well, to help us understand the answer to this question, I prepared a brief PowerPoint presentation that helps us run through what we need to know in order to evaluate whether or not the correlation is strong enough to be considered useful. We need to evaluate the critical R value. That comes from a table of critical R values, so here I have a sampling of such a table. There's actually one of these tables, I believe, in the insert to your textbook. Be that as it may, the way that we use the table is first we have to identify the number of data points, or pairs of data, that we have in our data set. We've got eight dots here in our scatter plot, so we have eight data points. So the first thing we do is identify the total number of data points in our data set. And then we read the value to the right of that based on our alpha level. So our alpha level from the problem listed here is five percent, so that means we're going to have a critical R value of 0.707. So our critical R value is 0.707. We need to compare that with the R value obtained from the actual data. In this case, that's going to be 0.979. When we compare these two values, we find that our R value is greater than our critical R value, and therefore because the actual R value is greater than the critical R value, we've exceeded that threshold that we need for making a conclusion of correlation. 
And therefore we can reject the null hypothesis that the correlation coefficient is zero and conclude that there is sufficient evidence to support a claim of linear correlation. That's the answer I'm going to select here from my different options. Well done! Part E And now Part E asks, “Based on the preceding results, what can be concluded? Should the data from women and the data for men be considered together, or do they appear to represent two different and distinct populations that should be analyzed separately?” To answer these questions, I'm going to go back here to my scatter plot and look to see where my data lie. All of the data points for the women are congregated down here. However, all of the data points for the men are congregated in a separate grouping up here. There's no mixing of the data points. There's no meshing of them together there on the graph. So they're occupying distinct areas or regions of the graph. This suggests that they are two distinct populations and should therefore be considered separately. I'm going to select that answer from my options here. Nice work! And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below to let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't care to help you learn stats, go to aspiremountainacademy.com, where you can find out more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.
Finding the mean absolute deviation (MAD) for hypothesis testing with the count five method 4/20/2018
Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to find the mean absolute deviation (or MAD) for hypothesis testing with the Count Five Method. Here's our problem statement: The data show the weights of samples of two different sodas. 
Test the claim that the weights of the two sodas have the same standard deviation using a Count Five test. Part A Part A says, “For the first sample find the mean absolute deviation (or MAD) of each value. Do the same for the second sample.” OK, to do this we're gonna need to manipulate the data a little bit because as you see here in the problem statement, to calculate the MAD, I need to take the absolute value of the difference between each sample value and the mean sample value. Any kind of data manipulation is best done in Excel. StatCrunch can accommodate it, but it's really clunky. And because of that, I'm going to use Excel because Excel is just easier for me to use when I have to manipulate data. So I'm gonna dump my data here in Excel. If there's sufficient interest in learning how to do this in StatCrunch, I can certainly make a video to that end. But I don't want to just make the video if there's not a whole lot of interest with it because it is really clunky to use in Excel — excuse me, StatCrunch. (Where are you going? Yeah, we want you right there.) OK, so my window is here where it needs to be. And now the first step is to calculate x-bar, which is the mean value for all of my x-values. The x-values are located here in Column A, so I'm going to click on this first cell right after the last value there in Column A. And I'm going to calculate my average and put it there. To do that, I type in the Average function in Excel, open my parentheses, select my cells that I want to put in, close the parenthesis, press Enter, and there's the average value. Then over here in Column C, this is where I'm going to calculate my MAD values. So this is the MAD for the x, and according to the formula I have to take the absolute value of the difference between each x and x-bar. 
So over here I put equals, here's my Absolute Value function (abs), select my x, minus, x-bar is down here at the bottom, and then I'm gonna press F4 (because I want these dollar signs to appear so that this number stays constant when I copy the formula down for all my other x- values), close this out, hit Enter, and there's my first value. I'm asked to round to four decimal places, so I'm going to put this value and reduce it down to four decimal places (if I can find my number button). OK, there's four decimal places. And I'm gonna copy this on down for each of the different x-values, so I just drag the right corner with the left button on my mouse — drag it down — that copies it for all of my values. And now these are the values I need to put in my answer field here, so I'm going to use the Tab button on my keyboard to move between adjacent answer fields as I put each one of these numbers in. Nice work! And now I'm asked to do the same thing for the y- values, so I just repeat the process that we just went through for the y-values. Here at the bottom of my y-column, which is Column B, I'm going to calculate the average. But instead of typing it in, I could just copy the formula from the adjacent cell over. And then I need to calculate the MADs from the y-variables. To do that, I'm gonna copy this first formula here, and then I'm gonna adjust it because here it's using the average for my x-values (which is the A Column) and the average that I want to use for the Y (which is in the B column) so I have to change that A to a B. And now I can go ahead and copy this formula down for the remainder of my values, and I'm all set. I put these into my answer field. Well done! Part B OK, Part B asks us to find our test statistics, which for the Count Five Method are c1 and c2. The way we acquire the values for the test statistics is given here in the problem statement. 
So we let c1 be the number of MAD values in the first sample that are greater than the largest MAD value in the second sample, and then c2 is the same thing flip-flopped. So over here, we're actually gonna put in the maximum value for my Xs and then the maximum value for the Ys. And to do that, I could actually just do by inspection; there's only ten data values for my x-column, so I could just inspect this and get by eyesight the maximum value. But I don't really want to take any chances, so I'm just gonna use the Max function for Excel. Did you see there? I'm gonna do the same thing with the Y — just copy this over. There's my maximum value for the Y. So c1 is the number of MAD values in the first sample (so that's column C) that are greater than the largest MAD value for the second sample. So we're looking for all the numbers in Column C that are greater than this number in column F. So the first one is not greater, there's the second one, ... not the third one, ... no ... no ... no ... and looking at all the available values none of them are greater. So therefore c1 is going to be zero. I do the same thing with c2. I'm going to come here and use the maximum value for the Xs and I'm going to look at the MAD values for the Y and figure out how many are greater. So this one is not greater, this one's not greater, this one's not greater, this one's greater. So I got 1, 2, ... looks like it’s gonna be two. Nice work! Part C OK, Part C says, “If the sample sizes are equal, use a critical value of 5. If not, use the formula below to find the critical value.” Many students when encountering this problem, they just look at this hairy equation and think “Oh my gosh, I must have to use this.” But the instructions say if the sample sizes are equal, use a critical value of 5. Well we've got 10 x-values, and we've got 10 y-values, so the sample sizes are the same. That means the critical value is going to be five. I don't have to use that formula at all. Well done! 
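For anyone who prefers code to spreadsheets, Parts A through C can be sketched in Python along these lines. This is only an illustration under stated assumptions: the function names and the two small samples are made up (they are not the soda weights from the problem), and the sketch only handles equal sample sizes, where the critical value is 5. The comparison at the end uses the "greater than or equal to the critical value" rule that the problem gives for the final step.

```python
def absolute_deviations(sample):
    """Absolute value of (each value minus the sample mean)."""
    mean = sum(sample) / len(sample)
    return [abs(x - mean) for x in sample]

def count_five(sample1, sample2, critical_value=5):
    """Return (c1, c2, reject) for the Count Five test with equal sample sizes."""
    if len(sample1) != len(sample2):
        # The unequal-size critical value formula is not covered in this sketch.
        raise ValueError("this sketch only handles equal sample sizes")
    dev1 = absolute_deviations(sample1)
    dev2 = absolute_deviations(sample2)
    # c1: deviations in sample 1 larger than the largest deviation in sample 2
    c1 = sum(d > max(dev2) for d in dev1)
    # c2: the same thing flip-flopped
    c2 = sum(d > max(dev1) for d in dev2)
    reject = c1 >= critical_value or c2 >= critical_value
    return c1, c2, reject

# Hypothetical samples, chosen so the arithmetic is easy to follow by hand
sample_x = [10, 12, 14, 16, 18]  # mean 14, deviations 4, 2, 0, 2, 4
sample_y = [13, 14, 15, 16, 17]  # mean 15, deviations 2, 1, 0, 1, 2

c1, c2, reject = count_five(sample_x, sample_y)
print(c1, c2, reject)  # 2 0 False
```

Using `max()` here plays the same role as the Max function in Excel: it avoids picking out the largest deviation by eye.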
Part D Finally, Part D asks us to resolve the hypothesis test using the Count Five Method. And to do that, we compare our critical value with our test statistics. So the first one says if c1 is greater than or equal to the critical value, then we make this conclusion. Well, is c1 greater than or equal to the critical value? c1 is 0, the critical value is 5, and 0 is less than 5, so we're not going to make this conclusion. If c2 is greater than or equal to the critical value, then make this conclusion. Well, c2 is 2, the critical value is 5, and 2 is less than 5, so we're not making that conclusion.
Then it says, “Otherwise, fail to reject the null hypothesis.” Well, that's the conclusion we're gonna make because neither of our test statistics is greater than or equal to the critical value. So we fail to reject. Excellent! And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.
Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to use StatCrunch to construct a confidence interval for two matched pair means of hospital admissions. Here's our problem statement: Data on the numbers of hospital admissions resulting from motor vehicle crashes are given below for Fridays on the 6th of a month and Fridays on the following 13th of the same month. Assume that the paired sample data is a simple random sample and that the differences have a distribution that is approximately normal. Construct a 95% confidence interval estimate of the mean of the population of differences between hospital admissions. Use the confidence interval to test the claim that, when the 13th day of a month falls on a Friday, the numbers of hospital admissions for motor vehicle crashes are not affected. Part 1 OK, here we have our data, and here we have the first part of our problem asking us to construct a confidence interval. To do this, I'm going to dump the data into StatCrunch. I'm going to come here and resize my window so we can see everything. Great! Now I'm going to construct my confidence interval on the data that was given. To do that, I go up to Stat --> T Stats --> Paired. Notice there's no extra menu option here with the Paired option. 
That's because StatCrunch was coded under the assumption that if you want to run hypothesis testing or construct a confidence interval with matching pairs in your data set that you already have actual data to supply. So there's no summary option here. Here in my options window, I found it's best practice when identifying between two samples in the options window that the first sample should always be the first grouping or the grouping that's mentioned first in the problem statement (in this case, that's Friday the 6th). And then Sample 2 will be the second grouping or the grouping mentioned second (in this case, Friday the 13th). You're asked to construct a confidence interval, so I'm going to click the radio button here on confidence interval. And normally in these types of problems we’re given a significance level, or alpha, and then it's assumed then that we'll use that information to compute or calculate the confidence level that is appropriate for a confidence interval. However, in this case we're actually given a confidence level; 95% is specified here in the problem statement. So I'm just going to use that specified confidence level for my interval. That's the default selection, so I don't need to make any changes there. I press Compute!, and here is my results window. If I scroll over here, at the end of the table I can see the lower and upper limits that I'll need to use for my confidence interval. So I'm going to put those in here. Good job! Part 2 Now the second part of the problem asks us to interpret the confidence interval with respect to the claim. The value that we want to check for in our confidence interval is going to be zero. If there's no difference between the two samples, or between the sample means rather, then zero should be inside the confidence interval. And if zero is inside the confidence interval, that means it's potentially the value of the mean difference between the two means. 
If the difference between the means is zero, that means that the two means are the same, and so there's no real difference between the two groupings.
However, if zero is not inside the confidence interval, then that means there is some difference between the groupings. And that means one of those groupings is going to have a greater mean value than the other. And which one of those two groupings it is depends on which side of the confidence interval zero appears, whether it's to the left or to the right. Here we're simply asked about the claim. So we check to see if zero is inside our confidence interval. Here's our confidence interval, and zero is not inside the confidence interval. That means there is going to be some difference between the means. So the claim is that, when the 13th day of a month falls on a Friday, the numbers of hospital admissions for motor vehicle crashes are not affected. Well, the interval tells us the number of hospital admissions is affected, because there's some difference there between the mean values. If they weren't affected, then that would mean that the mean difference would be zero. But zero isn't a plausible value, because zero is not inside the confidence interval. Therefore, there is some difference, and so we can actually reject the claim that they're not affected because we have statistical evidence that they are affected. So among the answer options given here, I want to select this one: Yes, we can reject the claim because zero is not included in the confidence interval. We have statistical evidence that suggests there is some difference here between the mean values for the groupings. Well done! And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video. 
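The matched-pairs interval and the "is zero inside?" check from this walkthrough can be sketched in Python as follows. Treat it as an illustration only: the function names and the two short lists of counts are hypothetical (they are not the hospital admissions data from the problem), and the t critical value is assumed to come from a t table, just as it would in a hand calculation.

```python
import math
import statistics

def paired_mean_ci(sample1, sample2, t_critical):
    """Confidence interval for the mean of the paired differences (sample1 - sample2)."""
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation of the differences
    margin = t_critical * sd_d / math.sqrt(n)
    return mean_d - margin, mean_d + margin

def contains_zero(interval):
    """True when zero is a plausible value for the mean difference."""
    lower, upper = interval
    return lower <= 0 <= upper

# Hypothetical matched pairs (first grouping, second grouping)
first_group = [5, 7, 9, 6, 8]
second_group = [8, 9, 13, 9, 12]

# 95% confidence, df = n - 1 = 4, two-tailed t critical value from a table
T_CRITICAL = 2.776

interval = paired_mean_ci(first_group, second_group, T_CRITICAL)
print(round(interval[0], 2), round(interval[1], 2))  # -4.24 -2.16
print(contains_zero(interval))                       # False
```

Because the whole interval sits below zero in this made-up example, the second grouping has the larger mean, which is the "which side of zero" reading described above.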
Using StatCrunch to perform hypothesis testing on two independent sample means of body temperatures 4/13/2018
Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to use StatCrunch to perform hypothesis testing on two independent sample means of body temperatures. Here's our problem statement: A study was done on body temperatures of men and women. The results are shown in the table. Assume that the two samples are independent simple random samples selected from normally distributed populations and do not assume that the population standard deviations are equal. Complete Parts A and B below. Use a 1% significance level for both problems. Part A OK, Part A says, “Test the claim that men have a higher mean body temperature than women.” We're first asked to determine the null and alternative hypotheses. Remember that in order to determine the null and alternative hypotheses, we must first consider the claim being made. Here it is that men have a higher mean body temperature than women. We notice from our sample statistics table listed here next to the problem statement that the men are being assigned Group 1. So all of the statistics and parameters that have a 1 subscript will be those for the men. The women are assigned a subscript 2, so all of the statistics and parameters that have a subscript 2 will be for the women. If the claim is that the men have a higher mean body temperature than the women, then that means that mu-1 will be greater than mu-2. There's no semblance of equality with that statement, and so we can adopt that claim as our alternative hypothesis. The null hypothesis is by definition a statement of equality. So we're looking for the answer option where the null hypothesis has the two population parameters equal to each other and the alternative hypothesis has mu-1 greater than mu-2. Looking over my answer options, I see that is going to be here answer option C. 
I select that option and check my answer. Fantastic! Now we're asked to identify the test statistic, and to do that I'm going to run a hypothesis test inside StatCrunch. So here's StatCrunch. Inside StatCrunch, I'm going to go to Stat --> T Stats (because we're looking at comparing means without knowing the population standard deviation) --> Two Sample (because we're looking at two samples that are independent; it's very important to distinguish between dependent and independent samples, and these samples here are independent because there's no real relationship between any one man in the first group and any one woman in the second group) --> With Summary (because we're not given actual data, just summary statistics). Here in my options window, I'm going to put in the summary stats that were given there next to the problem statement. So Sample 1 is for the men, and Sample 2 is for the women. Notice this box next to “Pool variances” is unchecked by default. This is what we want, so we're going to leave that alone. The default radio button selection is also for the hypothesis test, so I'm going to leave that. Make sure that this alternative hypothesis matches what we selected previously, and away we go. Hit Compute!, and here we have our results window. The test statistic is always the second-to-last value in that table at the bottom of the results window. Nice work! Now we're asked for the P-value. The P-value is always the last value in that same table in the results window. Excellent! Now we are asked to state the conclusion for the test. The simplest way to do that is compare the P-value with the significance level alpha. We're asked to use a 1% significance level. Our P-value is 8.7%, and it's easy to see that 8.7% is greater than 1%. So the area of the P-value is larger than the area for the significance level and cannot fit inside it. Therefore, we are outside the region of rejection and we fail to reject the null hypothesis. 
Always when we fail to reject the null hypothesis, there is not sufficient evidence, so I select that answer option. Good job! Part B Now, Part B asks us to construct a confidence interval. I can go through the menu options in StatCrunch and input once again all of these summary stats, or I can take a shortcut by using the current results window and clicking on this button in the upper left-hand corner called Options, then in the drop-down that follows click Edit. Here I have all of the same summary stats, just switch this radio button to the confidence interval.
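For reference, the Part A test can be sketched in Python like this. It is a rough sketch under some assumptions: it uses the usual unpooled (Welch) form of the two-sample t statistic, which is what leaving “Pool variances” unchecked corresponds to, and the summary statistics passed in below are hypothetical numbers chosen for easy arithmetic, not the values from the problem's table. Python's standard library has no t distribution, so the P-value is taken as given (8.7% in this walkthrough) rather than computed.

```python
import math

def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Unpooled two-sample t statistic and its approximate degrees of freedom."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def conclusion(p_value, alpha):
    """Reject H0 only when the P-value is no larger than the significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Hypothetical summary statistics: (mean, standard deviation, sample size) per group
t_stat, df = welch_t(10.0, 2.0, 4, 8.0, 2.0, 4)
print(round(t_stat, 3), round(df, 1))  # 1.414 6.0

# P-value and significance level from this walkthrough
print(conclusion(0.087, 0.01))  # fail to reject H0
```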
I need to make sure I have the appropriate confidence level here. I'm asked to use a 1% significance level. Normally in constructing the confidence interval, we would then select the complement of 1%, which is 99%, for our confidence level. However, in this case our hypothesis test is one-tailed (the alternative hypothesis is that mu-1 is greater than mu-2), so the complement we're looking for is not 1 minus alpha but 1 minus 2 alpha. So we take twice the alpha to give us 2%, and we subtract that from 100% to give us 98%, the appropriate level for our confidence interval. We do this because a one-tailed test puts all of alpha in one tail, while a confidence interval splits its leftover area between two tails; a 98% interval leaves 1% in each tail, which matches our 1% one-tailed test. Hit Compute!, and now we have upper and lower limits for a confidence interval. Nice work! Finally, we're asked, “Does the confidence interval support the conclusion of the test?” Well, the conclusion we had from our hypothesis test was that we fail to reject the null hypothesis because there's not sufficient evidence to support the claim that men have a higher mean body temperature than women. If there's not sufficient evidence to support the claim that men have a higher mean body temperature than women, that means it's possible that men could have the same mean body temperature as women. So what we need to look for is in our confidence interval could it be that the difference between these two is zero. If the difference between these two is zero, then that means that they could potentially be the same. So looking at this confidence interval for the difference between these two population parameters, is zero inside the confidence interval? The answer is yes, which means the difference could very well be zero. And if the difference is zero, then the two population parameters are the same. That would mean men and women have the same mean body temperature. Here we see zero is included in the confidence interval. So that means it is possible they're the same. 
So the answer here would be yes, it does support the conclusion of our hypothesis test because the confidence interval contains zero. The confidence interval contains zero. Therefore, they could be the same. Therefore, we have to reject that claim; we don't have sufficient evidence to support the claim that there's a difference between the two groups and that the one group has a higher mean body temperature than the other. I check my answer. Excellent! And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't care to help you learn stats, go to aspiremountainacademy.com, where you can find out more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video. Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to use StatCrunch to perform hypothesis testing on two proportions of referee calls. Here's our problem statement: Since an instant replay system for tennis was introduced at a major tournament, men challenged 1429 referee calls with the result that 417 of the calls were overturned. Women challenged 747 referee calls, and 230 of the calls were overturned. Use a 1% significance level to test the claim that men and women have equal success in challenging calls. Complete Parts A through C below. Part A OK, Part A says, “Test the claim using a hypothesis test,” and the first part asks us to determine the null and alternative hypotheses. To do this, we need to consider the claim that's being made. From the problem statement, the claim is that men and women have equal success in challenging calls. That means that the proportion of men who challenge calls and get them overturned is the same as for the women. 
So normally we would adopt that claim as our alternative hypothesis. However, equality by definition belongs with the null hypothesis. So the null hypothesis has to say that the two proportions are equal. Well, we can't then use the claim for our alternative hypothesis because that is the null hypothesis. So that means we have to take the complement of the claim. The complement of being equal to is being not equal to, and so we're looking for the option where the null hypothesis has the two proportions equal and the alternative hypothesis has the two proportions not equal. And that's going to be answer option A. I check my answer. Nice work! Now we're asked to identify the test statistic. And to do that, I'm going to pull up StatCrunch because it's easiest to get this information by using statistical software like StatCrunch. Inside StatCrunch, I go to Stat –> Proportion Stats (because I'm dealing with proportions) –> Two Sample (because I'm comparing two different samples: one for men and one for women) –> With Summary (because I don't have actual data, just summary statistics). In the options window, there are fields to put in summary statistics for both of the samples. In theory, it doesn't matter which one you label Sample 1 and which you label Sample 2, because as long as you're consistent everything should work out. However, my experience in working these homework problems from Pearson teaches me that it's often best practice to use Sample 1 for whichever of our samples is mentioned first. So here in the problem statement, the men are mentioned first. And so we're going to make that Sample 1. So the number of successes here would be the number of calls that were overturned; that's 417. The total number of observations is the sample size; that's 1429. And I'm going to put in the same information for the women: 230 successes out of 747 observations. My radio button for Hypothesis Test is already selected. Notice we're doing a test on the difference. 
So the hypotheses listed in our answer fields from the first part of the problem aren't written as differences; one proportion is on one side of the equals or inequality sign and the other is on the other side. Here it's actually organized as a difference, and that's OK. If the null hypothesis is that both of the proportions are equal, and I subtract p2 from each side (so I bring it over to the left and get this quantity here on the left side), I'm left with zero on the right. So I'm just gonna leave this first field alone. And then the inequality sign for my alternative hypothesis needs to match, and it does. I press Compute!, and out comes my results window. If I scroll over here to get the right side of this table, the next-to-last value in that table is always my test statistic. So I'm going to put that here in my answer field. Fantastic! The P-value comes from that same table. It's the last value listed. Excellent! Now we're asked to evaluate the P-value with respect to the significance level. We're asked to use a 1% significance level. 43.6% is definitely greater than 1%, so the P-value is going to be greater than the significance level. That means the area of the P-value is larger than the area of the significance level, so we can't fit the P-value area inside the area for the significance level. So we're outside that critical region; therefore, we're outside the region of rejection, and we fail to reject the null hypothesis. Because we fail to reject the null hypothesis, there's not sufficient evidence to warrant rejection of the claim. I check my answer. Well done! Part B Now Part B asks us to “test the claim by constructing an appropriate confidence interval.” Back here in StatCrunch, I could go back through all the menu options, or I could just go up here to the top left-hand corner of my results window and press the Options button. In the drop-down menu that appears, I press Edit, and I'm back to my options window. 
I don't have to re-input all of these values that I put in earlier. All I have to do is come down here, select the radio button for Confidence Interval, make sure my confidence level matches, and I press Compute! Out comes my results window. If I scroll over here to the right, I can see my upper and lower limits for my confidence interval. So I'm going to put those here in my answer field. Excellent! Now, what do we conclude from the confidence interval? Well, when you're evaluating confidence intervals on two samples, typically you're going to be looking to see whether or not zero is included in the confidence interval. Here in the confidence interval we've constructed, zero is inside the confidence interval. Because zero is inside the confidence interval, that means that the two proportions that we're comparing could be the same. If the difference between these two proportions is zero, that means they're equal to each other. So because the confidence interval limits include zero, there does not appear to be a significant difference. If zero were outside the confidence interval, then there would be a significant difference, but zero is inside the confidence interval. Therefore, they could be the same. So then if they could be the same, there's no significant difference between the two proportions, and therefore if we're going to ask about rejecting the claim that men and women have equal success, well, there's not sufficient evidence to warrant rejection of that claim because the interval shows they could be the same. I check my answer. Well done! Part C Now Part C asks, “Based on the results, does it appear that men and women have equal success in challenging calls?” Well, from what we concluded right here with our confidence interval, there's no significant difference, so it appears that they have equal success. So that's what I'm going to select here. Nice work!
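If you'd like to double-check the StatCrunch numbers from Parts A and B by hand, here is a minimal Python sketch. It assumes the standard textbook formulas (pooled proportion for the test, unpooled standard error for the interval); the transcript doesn't show what StatCrunch does internally, and the variable names are just for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the problem statement
x1, n1 = 417, 1429   # men: overturned calls, total challenges
x2, n2 = 230, 747    # women: overturned calls, total challenges
alpha = 0.01         # 1% significance level

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Hypothesis test of H0: p1 - p2 = 0 (assumes the pooled proportion)
p_pool = (x1 + x2) / (n1 + n2)
se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = diff / se_pool
p_value = 2 * NormalDist().cdf(-abs(z))   # two-tailed, because H1 is "not equal"

# Confidence interval for p1 - p2 (assumes the unpooled standard error)
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
lower = diff - z_crit * se_unpooled
upper = diff + z_crit * se_unpooled

print(f"z = {z:.2f}, P-value = {p_value:.3f}")
print(f"99% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The P-value should land near the 43.6% quoted in Part A, and the interval should straddle zero, which is the same conclusion reached above. Because this test is two-tailed, the confidence level is simply 1 minus alpha, or 99%.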
And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.
Using StatCrunch to perform hypothesis testing on standard deviations of aircraft altimeters
4/7/2018
Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to use StatCrunch to perform hypothesis testing on standard deviations of aircraft altimeters. Here's our problem statement: Test the given claim. Assume that a simple random sample is selected from a normally distributed population. Use either the P-value method or the traditional method of testing hypotheses. Company A uses a new production method to manufacture aircraft altimeters. A simple random sample of new altimeters resulted in the errors listed below. Use a 5% level of significance to test the claim that the new production method has errors with a standard deviation greater than 32.2 feet, which was the standard deviation for the old production method. If it appears that the standard deviation is greater, does the new production method appear to be better or worse than the old method? Should the company take any action? Part 1 OK, the first part of this problem is asking us to identify the null and alternative hypotheses. Remember that the null hypothesis is by definition a statement of equality. So right off the bat, we can eliminate answer options B, C, and F. Now we have to choose between answer options A, D, and E. To do that, we need to form the alternative hypothesis. Typically, the alternative hypothesis reflects the claim that's being made. What is the claim being made here? 
Well, in our problem statement, we see that the claim is that the new production method has errors with a standard deviation greater than 32.2. So we want the answer option where sigma is greater than 32.2. That's going to be answer option A. I check my answer. Nice work! Part 2 Now we're asked to find the test statistic. To do that, I'm going to use StatCrunch. I'm going to select this icon right here and open my data set that's given to me in StatCrunch. Now that my data set is here in StatCrunch, I'm going to go into Stat –> Variance Stats (because this is the only option StatCrunch has for hypothesis testing on standard deviations) –> One Sample (because I have only one sample) –> With Data (because I have an actual data set). I select the column where my data is located. In the hypothesis test area, I'm going to make sure these fields match the hypotheses that I selected from the previous part of the problem. But remember that because we're testing on variance, this is sigma squared, and our hypotheses have just sigma. So we need to take this value of 32.2 and square it to put here into the null hypothesis field. So I take out my calculator, and 32.2 squared is 1036.84. And now I make sure that the inequality sign for my alternative hypothesis matches. Now I'm all ready to hit Compute!, and here's my results window. Earlier when doing confidence intervals, you needed to take the square root of the upper and lower limits that come out of the results window here. For hypothesis testing, that is not necessary. Just take the test statistic. It's the second-to-last value here in the table, and I put that in my answer field. Fantastic! Part 3 The next part of our problem asks us to find the critical values. To do that, I'm going to pull up the chi-square calculator. So I go to Stat –> Calculator –> Chi-square. I'm using the chi-square calculator because this is the distribution for hypothesis testing on standard deviations. 
My degrees of freedom is 1 less than my sample size. I have 12 values in my sample data set, so this is going to be 11. I want to find the critical value, so I need to clear out this default value in this field. My inequality sign needs to match my alternative hypothesis, which was “greater than.” And then here in the probability field, I need to insert the significance level that I'm using to test, because the area for the critical region is the significance level for your hypothesis test. Here in the problem statement, we see we are instructed to use a 5% level of significance. So here in this probability field, I'm going to put 0.05. I press Compute!, and here is my critical value, which is this value here bounding the critical region. Notice there's only one critical value because I have a one-tailed test, a right tailed test. Nice work! Part 4 Now we're asked to evaluate or resolve the hypothesis test. Our test statistic is 35.84, so here on my distribution curve a value of 35.84 would put me somewhere around here. So I'm definitely inside the critical region, which means I'm inside the rejection region. And therefore I'm going to reject the null hypothesis. So my test statistic is greater than the critical value; that's the only way to get inside the critical region for a right-tailed test. Therefore, I'm going to reject the null hypothesis. Because I'm rejecting the null hypothesis, there is sufficient evidence to support the claim. I check my answer. Nice work! Part 5 And now, the final part of the problem asks us to translate the results of the hypothesis test into real-world terms. We've rejected the null hypothesis. That means we're supporting the claim that our standard deviation is greater than 32.2. 32.2, as we learned from the problem statement, is the standard deviation for the old production method. 
So if we have sufficient evidence to support the claim that our standard deviation is now greater than what it was before, that means we have more variation in our process than we did before.
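For anyone who wants to check Parts 2 through 4 of the altimeter problem without the chi-square calculator, here is a rough Python sketch. The sample errors aren't reproduced in this transcript, so it takes the reported test statistic of 35.84 as given rather than recomputing it. The helper functions `chi2_cdf` and `chi2_right_critical` are made up for this illustration (a series for the chi-square CDF plus bisection); they are not how StatCrunch finds the value.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_right_critical(alpha, df, lo=0.0, hi=200.0):
    """Bisection search for the value whose right-tail area equals alpha."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if 1 - chi2_cdf(mid, df) > alpha:
            lo = mid   # tail area still too big, so move right
        else:
            hi = mid
    return (lo + hi) / 2

sigma0 = 32.2        # standard deviation of the old method (feet)
n = 12               # sample size stated in the transcript
alpha = 0.05         # 5% significance level
test_stat = 35.84    # test statistic reported in the transcript

sigma0_sq = sigma0 ** 2                      # value for the null hypothesis field
df = n - 1                                   # degrees of freedom
critical = chi2_right_critical(alpha, df)    # right-tailed critical value
reject = test_stat > critical

print(f"sigma0 squared = {sigma0_sq:.2f}, df = {df}")
print(f"critical value = {critical:.3f}")
print("Reject H0" if reject else "Fail to reject H0")
```

The comparison at the end is the same one made on the distribution curve: for a right-tailed test, the test statistic has to exceed the critical value to land in the rejection region.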
So the variation appears to be greater than in the past, so the new method appears to be worse. The new production method is worse than the old one because you have more variation. More variation means there's more room for error to creep in, and you're going to be producing more defective products. So there will be more altimeters that have errors. And because your process is now worse, the company, yes, should take immediate action to reduce the variation. I check my answer. Nice work! Well done! All that good stuff! And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video. Intro Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to use StatCrunch to perform hypothesis testing on means of course evaluation scores. Here's our problem statement: A data set includes data from student evaluations of courses. The summary statistics are sample size n = 94, sample mean x-bar = 3.57, and sample standard deviation = 0.55. Use a 5% significance level to test the claim that the population of student course evaluations has a mean equal to 3.75. Assume that a simple random sample has been selected. Identify the null and alternative hypotheses, test statistic, P-value, and state the final conclusion that addresses the original claim. Part 1 OK, so the first part of the problem asks us to identify the null and alternative hypotheses. To do this, we first need to think about the claim that's being made. 
If we go back to the problem statement, we can see that the claim is that the population of student course evaluations has a mean equal to 3.75. So down here in our answer options, we're going to first form our null hypothesis. By definition, this is a statement of equality. So right off the bat, we can eliminate answer option B, because the null hypothesis here says “not equal to.” Now we need to choose between answer options A, C, and D. To do that, we're going to have to select the correct alternative hypothesis. Generally, the alternative hypothesis reflects the claim being made. However, in this case, the claim is itself a statement of equality (it says the mean is equal to 3.75), and equality by definition belongs with the null hypothesis, so we can't take the claim and turn it into our alternative hypothesis. So what we have to do is take the complement of the claim as our alternative hypothesis. Here we say the mean is equal to 3.75. The complement of being equal to is being not equal to, so that's going to be our alternative hypothesis: Mu is not equal to 3.75. This is answer option C. I check my answer. Well done! Part 2 Now the second part of the problem asks us to determine the test statistic. To do that, I'm gonna pull up StatCrunch, because statistical software like StatCrunch makes hypothesis testing really easy. To get the test statistic, I'm first going to go into Stat –> T Stats (because I don't know what the population standard deviation is; I have a sample standard deviation but not the population standard deviation, so that means I'm using a Student-t distribution) –> One Sample (because I'm only given one sample) –> With Summary (because I don't have actual data, just summary statistics). In the options window, I'm going to put the summary statistics that were given to me in the problem statement. 
Then in the field for hypothesis testing, I'm going to make sure that this matches the null and alternative hypothesis that I previously selected. I press Compute!, and out comes my results window with my results. Here in the table, the test statistic is always the second-to-last item in that table. So I'm just going to put that here in my answer field. Nice work! Part 3 The third part of our problem asks us to determine the P-value. Again, I go to my results window, and in that table the P-value is always the last value listed in that table. Well done! Part 4 Now the last part of our problem asks us to “state the final conclusion that addresses the original claim.” To do that, we can either use the test statistic or the P-value. It's easier to use the P-value, so that's the route I'm going to take.
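As a cross-check on the test statistic and P-value for the course evaluation problem, here is a small Python sketch built from the summary statistics in the problem statement. The test statistic is the usual one-sample t formula. The P-value is approximated by integrating the Student-t density with Simpson's rule, which is my stand-in for StatCrunch's calculation; the function names `t_pdf` and `two_tailed_p` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the Student-t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P-value using Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3      # area between 0 and |t|
    return 2 * (0.5 - central_half)   # both tails beyond |t|

# Summary statistics from the problem statement
n, xbar, s = 94, 3.57, 0.55
mu0 = 3.75       # claimed population mean
alpha = 0.05     # 5% significance level

t_stat = (xbar - mu0) / (s / math.sqrt(n))
p_value = two_tailed_p(t_stat, n - 1)

print(f"t = {t_stat:.2f}, P-value = {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The degrees of freedom are n minus 1, or 93, and the test is two-tailed because the alternative hypothesis is "not equal to."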
To use the P-value to state our final conclusion from a hypothesis test, we need to compare the P-value with the significance level for the hypothesis test. In the problem statement, we were instructed to use a 5% significance level. So I compare the P-value, 0.2%, with 5%. Because 0.002 is less than 0.05, I'm inside the rejection region, and therefore I'm going to reject the null hypothesis. Because I reject the null hypothesis, there is sufficient evidence to warrant rejection of the claim. This final field needs to match the alternative hypothesis that we selected earlier. Here the alternative hypothesis says that the mean for the population is not equal to 3.75, so I select that here. Now I check my answer. Excellent! And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.
Author: Frustrated with a particular MyStatLab/MyMathLab homework problem? No worries! I'm Professor Curtis, and I'm here to help.