Blog Posts

Evaluating reported research results for falsified data

3/22/2019

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to evaluate reported research results for falsified data. Here's our problem statement: A researcher was once criticized for falsifying data. Among his data were figures obtained from six groups of mice with 25 individual mice in each group. These values were given for the percentage of successes in each group: 53%, 58%, 63%, 46%, 48%, 67%. What's wrong with those values?

Solution

OK, so we're looking at six different groups, and each of these groups has 25 members. So what would the percentage be if we're looking at, say, one out of the 25? Well, if I just do that in my calculator, 1 out of 25 gives me 4%. If I had 2 out of the 25, then I take 2 divided by 25 and get 8%. If I had 3 out of the 25, I get 12%. So we can see that we're looking at multiples of 4 here. And of the reported percentages listed here, only 48% is a multiple of 4; the other five are not. So therefore they cannot all possibly be correct.

I could also look at this backwards and say, "Yeah, you know what? What part of the 25 corresponds with 53%?" So I'd take my 53%, and I multiply that by the 25. See, I've got 13 and a quarter. Well, I understand the 13, but what are you doing with a fourth of a mouse? It doesn't make sense. And you get the same thing looking at the other numbers.
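The same check can be sketched in a few lines of Python. This is only an illustration of the reasoning above; the names `reported`, `group_size`, and `implied_mouse_count` are my own labels, not anything from the homework system.

```python
# Reported success percentages and group size, taken from the problem statement.
reported = [53, 58, 63, 46, 48, 67]
group_size = 25


def implied_mouse_count(percent, n):
    """Number of mice that a reported percentage would imply in a group of n."""
    return percent * n / 100


# A percentage is achievable only if it implies a whole number of mice.
# Integer arithmetic is used for the test to avoid floating-point surprises.
impossible = [p for p in reported if (p * group_size) % 100 != 0]

for p in reported:
    print(f"{p}% of {group_size} mice = {implied_mouse_count(p, group_size)} mice")
print("Cannot come from a group of 25:", impossible)
```

Any percentage that implies a fractional mouse lands in the `impossible` list, which is the "fourth of a mouse" argument in code form.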

So all the percentages of success should be multiples of 4. So we're going to take that as our answer. I check my answer. Nice work!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

0 Comments

Performing linear regression analysis of automobile weights and highway fuel consumption

3/19/2019

0 Comments

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to perform linear regression analysis of automobile weights and highway fuel consumption. Here's our problem statement: The table lists weights (in lbs) and highway mileage amounts (in mpg) for seven automobiles. Use the sample data to construct a scatterplot. Use the first variable for the x axis. Based on the scatter plot, what do you conclude about a linear correlation?

Part 1

OK, so here we've got our data. And the first part asks us to construct a plot. We can work this in StatCrunch or Excel, and I'll actually work it both ways and then let you decide which way you think is easier for you.

So first we're going to look at working this problem in StatCrunch. So to construct my scatter plot in StatCrunch, I'm first going to select the data and dump it into StatCrunch. And now here I've got my data in StatCrunch, so I'm going to select Stat --> Regression --> Simple linear. The problem statement says we want the first variable for the x axis. Typically that's what you would select anyway, so we're going to select that. And the y variable will of course be the second variable. I come down here and hit Compute!.

And here in my results window, the scatter plot is actually on the second page. Notice how up here at the top it says "1 of 2." This is the first of two pages to my results window. So if I click down here on this button at the bottom right, I can get to the second page. There's my scatter plot along with a line of best fit. So it's really easy to select where my line of best fit is going to be located. I check my answer. Fantastic!

Part 2

And now the second part of this problem asks, “Is there a linear relationship between weight and highway mileage?” Well, to look at that, I need to go back to this first page and identify the linear correlation coefficient. My R value, which you see is located right up here at the top, is -0.9979. So I would need to take the absolute value of this and then compare that with the critical R value. So the critical R value is going to come from the table that's listed there in your textbook. It's also in the insert to your textbook. There's no actual link here in the problem statement to the critical R value table.

And that's OK, because look at the value that we have --- 0.9979. I mean, this is almost 1. So this is an excellent R value, and we don't need to check it against a critical value because with seven pairs of data like we have here, anytime you get an R value whose absolute value is greater than 0.97, you're going to have linear correlation. You don't need to check it with the R table. You can check it if you want, but it's going to come out to be the same thing. You're going to conclude there is a linear relationship.

So the answer here we're going to click is going to be yes, there is a linear relationship. But notice there's two options here for yes. So let's see what the rest or second part says. "As the weight increases, highway mileage decreases." "As the weight increases, the highway mileage increases." Well, we have a negative correlation here. So as the weight goes up, what's --- what direction is the line going? It's going down. So as the weight goes up, the highway miles per gallon goes down. So we're going to want this answer option here. Good job! And that's how we do it in StatCrunch.

Now to get the same thing out of Excel, I come back here and let's say I'm just going to dump my data in Excel. Probably the easiest way for me to do this is I could open it in Excel, but I'm just gonna copy and paste here. It's much easier for me to go through here. So here's my data in Excel.

And now to actually do this in Excel, first I want to get the scatter plot. So to do that, I'm going to select my data here, and then I'm going to come up to Insert and then here under Charts I'm going to select the scatter plot. I want this first option for my scatter plot, and boom! There's your scatter plot.

To check for the linear relationship, what I'm gonna do is I'm going to left click on one of these data points, and then while my cursor is still over the data point, I'm going to right click on my mouse so I get this wonderful little menu. And I'm going to select Add trendline. Notice in the trendline option that the default selected is the linear, and that's what we want, so we're going to leave that alone. I'm going to come down here, and I'm going to select Display equation, and Display R squared values is what I really want. So notice we've got this little area here. And I'm going to move this over. If I double click inside, select everything, I can actually increase the font size so I can make it more readable.

So here we've got an R squared value of 0.996. The value I need to check for linear correlation is actually R, but I get R squared. So if I take the square root of R squared, I'm left with the size of R. So if I take the square root of 0.996 --- I can do it with my calculator here --- there's my R value: about 0.998, and it's negative because the trendline slopes downward, which again is what we saw previously. And because the size of this value is greater than 0.97, we don't need to check with the R critical value. We're going to conclude that there's linear correlation here.
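Here is a minimal Python sketch of getting R both ways: directly, and from R squared plus the sign of the slope. The weights and mileages below are hypothetical stand-ins (the actual table isn't reproduced in this post), and `CRITICAL_R` assumes the commonly tabulated critical value for seven pairs at a 0.05 significance level, which the problem statement doesn't actually give.

```python
import math


def slope_and_r(xs, ys):
    """Least-squares slope and Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


def r_from_r_squared(r_squared, slope):
    """The square root only gives the size of R; the slope supplies the sign."""
    return math.copysign(math.sqrt(r_squared), slope)


# Hypothetical data for seven cars (NOT the values from the homework table).
weights_lb = [2500, 2800, 3100, 3400, 3700, 4000, 4300]
highway_mpg = [36, 34, 32, 30, 29, 27, 25]

slope, r = slope_and_r(weights_lb, highway_mpg)
recovered_r = r_from_r_squared(r ** 2, slope)

# Assumed critical value for n = 7 at a 0.05 significance level.
CRITICAL_R = 0.754

print(f"slope = {slope:.5f}, r = {r:.4f}, r from R squared = {recovered_r:.4f}")
print("Linear correlation supported:", abs(r) > CRITICAL_R)
```

Using `math.copysign` is just one way to put the sign back; the point is that the Excel trendline's R squared loses the direction of the relationship.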

So that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below. Let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Identifying statistical values related to hypothesis testing of two population proportions

3/15/2019

4 Comments

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to identify statistical values related to hypothesis testing of two population proportions. Here's our problem statement: In a large clinical trial, 390,245 children are randomly assigned to two groups. The treatment group consisted of 193,395 children given a vaccine for a certain disease, and 34 of those children developed the disease. The other 196,850 children were given a placebo, and 136 of those children developed the disease. Consider the vaccine treatment group to be the first sample. Identify the values of n1, p-hat1, q-hat1, n2, p-hat2, q-hat2, p-bar, and q-bar.

Part 1

OK, the first thing we're asked to provide is the sample size for the first sample. The problem statement says that we're considering the vaccine treatment group as the first sample, so the sample size is given there in the problem statement. That's the total number in that first group that were given the vaccine, which here is the 193,395. So I'm just going to put that here in my answer field. Well done!

Part 2

Next we're asked to provide p-hat1. p-hat1 is the proportion of those in the first sample --- that's the subscript 1 --- that count as a quote-unquote success. And here success is going to be defined as actually developing the disease. I know that sounds backwards. We're developing a vaccine, right? You'd want to have success be not getting the disease, but it actually is easier to work the problem statistically if you just flip flop your definition of success. So here we're going to say success is going to be actually developing the disease.

Well, the probability --- that's all we're looking at here is a proportion or probability, which would then be the part over the whole. The part that developed the disease is 34 in that first sample, and over the whole, which is the 193,395. So I'm going to whip out my calculator here and just do a simple little division here. Take the 34, which is the part, divided by the whole, 193395.

And notice here the scientific notation that's given here in my calculator, this "e minus 4." This is really just a notation here. It's saying this number that's listed out here is being multiplied by 10 to the -4. So if you were to write this out in decimal form, this decimal point would really need to be moved 4 places to the left. That's the -4 here.

So it's asking here in the problem statement for us to provide an answer rounded to eight decimal places. So I'm going to put in one, two, three zeros, and then that first digit --- see, my decimal point has been moved over four places. And then I finish putting the number in. This is going to get tricky. Let's count the numbers here to make sure we've got --- one, two, three, four, five, six, seven, eight. Oh, look at that. Actually I need to come up one because the zero here, which is my eighth character, the 5 next to it means I'm going to be rounding up. So I'll put that 1 there on the end. I check my answer. Well done!
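If the "e minus 4" display trips you up, here is a small Python sketch that shows the same number both ways. The variable name `p_hat_1` is just my label for the value being computed.

```python
# The part (34 who developed the disease) over the whole (193,395 vaccinated).
p_hat_1 = 34 / 193395

# Scientific notation, like the calculator display: the exponent -04 means
# "move the decimal point four places to the left".
print(f"{p_hat_1:e}")

# Fixed-point notation rounded to eight decimal places, as the problem asks.
print(f"{p_hat_1:.8f}")
```

Letting the formatter do the rounding avoids the digit-counting that makes this step error-prone by hand.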

Part 3

q-hat is just the complement of p-hat. So to get that, we're just going to subtract this value from 1. I could say 1 minus and then type this number in again, but I don't want to have to do the typing. I'm a little lazy. So I'm just going to make that number negative and then add it to 1. It gives me the same effect. So now I need to take this number. That's eight decimal places.

Part 4

And now I need to do the same thing with the second sample. So the size of the second sample, those who were given the placebo, is listed here in the problem statement. That's the 196850.

Part 5

And then the proportion of success, which again is going to be developing the disease --- so I got to take the 136 that actually develop the disease there in the second sample and divide by that sample size. So again I get a really small number. Well done!

Part 6

And now we're looking for q-hat2, which of course is going to be the complement of p-hat2. So again, I'm just going to make this number, which is p-hat2 --- make that negative. Add it to 1. Boom. There's my q-hat2. Excellent!

Part 7

Now I'm looking for p-bar. p-bar is going to be the pooled proportion. So remember when we were doing the individual p-hats. For p-hat1, I just took the part that developed the disease, the 34, divided by the whole. I did the same thing with p-hat2; I took the part that developed the disease, the 136, and divided by its whole. And now we're going to have to combine those together. So now I'm going to have to take both the parts and add them together to get a pooled part. And then I take that pooled part and divide it by a pooled whole. So I'm going to take both of these wholes and add them together. And here I get my pooled proportion, the p-bar. Well done!

Part 8

And last but certainly not least, I'm asked for q-bar, which again, is just going to be the complement of p-bar. I'm going to take this value here for p-bar in my calculator and make it negative, add the 1, and there's my value for q-bar. I'll put that here in my answer field. And it's easy to get lost with the number. It's got eight decimal places, so it's really easy to get lost. I got eight. I check my answer. Oh, it did not like that! I mistyped it somewhere. Where did I mistype this? Nine, nine, six, four, three, eight. That's what I hate about these, this eight decimal place thing, is it's hard to keep track of all these little numbers. Excellent!
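All eight values can be sketched in Python from the counts in the problem statement. The variable names (`x1`, `x2`, and so on) are my own shorthand for the number of children who developed the disease in each sample; treat this as an illustration of the arithmetic, not as the homework system's method.

```python
# Counts from the problem statement.
n1, x1 = 193_395, 34    # vaccine group: sample size, children who developed the disease
n2, x2 = 196_850, 136   # placebo group: sample size, children who developed the disease

# Sample proportions and their complements.
p_hat_1 = x1 / n1
q_hat_1 = 1 - p_hat_1
p_hat_2 = x2 / n2
q_hat_2 = 1 - p_hat_2

# Pooled proportion: pooled part over pooled whole.
p_bar = (x1 + x2) / (n1 + n2)
q_bar = 1 - p_bar

for name, value in [
    ("p-hat1", p_hat_1), ("q-hat1", q_hat_1),
    ("p-hat2", p_hat_2), ("q-hat2", q_hat_2),
    ("p-bar", p_bar), ("q-bar", q_bar),
]:
    # Eight decimal places, as the problem requests.
    print(f"{name} = {value:.8f}")
```

Note that p-bar is not the average of the two p-hats; it weights each sample by its size, which is why the parts and wholes are added before dividing.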

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below. Let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Testing a probability table for distribution properties

3/12/2019

0 Comments

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to test a probability table for distribution properties. Here's our problem statement: Groups of adults are randomly selected and arranged in groups of three. The random variable x is the number in the group who say they would feel comfortable in a self driving vehicle. Determine whether a probability distribution is given. If a probability distribution is given, find its mean and standard deviation. If a probability distribution is not given, identify the requirements that are not satisfied.

Part 1

OK, so this first part of the problem is asking us whether we have a probability distribution listed in the table. In order for these, uh, this distribution table to qualify as a probability distribution, three requirements need to be met. First is that we're actually listing actual probability values. And to test that, we look at each of the values that are listed here in the probability column. And the most important property of the probability that we're going to be looking for here is that it needs to be a value between 0 and 1, inclusive. So we look at each one of these values here, and they're all between 0 and 1. So therefore the first requirement is met.

The second requirement is that there must be a matched value for each possible random variable. So if we look at our random variable, which in this case is the x, we have a probability value listed for each one of those random variable values. So the second requirement is met.

The third requirement is that the probabilities listed in the probability column must add up to equal 1. And by 1, I mean 1 exactly. So you're not getting like 0.99 or 1.01; it's gotta be 1 exactly. So there's two ways to do this. I find it easier to do this in Excel. So I'm just going to dump the data here in Excel. And it always brings up whatever other copy of Excel I have open. When it does this, I'm just gonna make this easy. We're just going to paste it. So notice here when I actually have the data here in Excel, if I just select the column with the probabilities, down here it sums it for me automatically. This is why I say it's easiest to do this in Excel. All I have to do is dump the data in, then get the column selected, and then here I get the sum, and it's 1 exactly. So the third requirement is met.

Now if I wanted to do this in StatCrunch --- StatCrunch sometimes gets a little clunky. This particular operation is not that bad. I can actually put my window here and see what's going on. So here in StatCrunch, I'm going to go to Stat --> Summary Stats --> Columns and then sum up that probability column. We see we get the same thing. It's equal to 1 exactly. So we do have a probability distribution here. So I'm going to select that answer here from among my choices. Well done!
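The three requirements can also be sketched as a small Python check. The table below is hypothetical (the actual probabilities aren't reproduced in this post), and `is_probability_distribution` is a name I made up for illustration. Storing the table as a dictionary covers the second requirement automatically, since every x value has to carry a probability.

```python
import math


def is_probability_distribution(table):
    """Check a {x: P(x)} table: each P(x) in [0, 1] and the total equal to 1."""
    probabilities = list(table.values())
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    # isclose guards against floating-point round-off in the sum.
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=1e-9)
    return each_in_range and sums_to_one


# Hypothetical table for x = 0, 1, 2, 3 (NOT the values from the homework).
table = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

print("Probability distribution given:", is_probability_distribution(table))
```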

Part 2

And now the second part of this problem asks me to find the mean of the random variable x. This is not where you're just adding up 0, 1, 2, 3 and then dividing by 4. That's not going to give it to you. What will give it to you is when you actually calculate the --- what it's asking for is the mean of the distribution. And so that's easiest to do in StatCrunch.

The way we're going to do this, just go to Stat --> Calculators --> Custom. In the options window, our values are going to be our random variable, which in this case is the x, and the weights are going to be our probabilities. I press Compute!, and out comes this wonderful little results window here with the mean listed right here for you. It's already calculated for you. So I'm going to go ahead and stick that in my answer field. It says to round to one decimal place. Nice work!

The last part of this problem is asking us for standard deviation. And again that's listed right over here in our results window. So I'm going to go ahead and stick that value here in the answer field. Nice work!
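For anyone curious what the Custom calculator is doing behind the scenes, here is a sketch in Python using the standard formulas for a discrete distribution: the mean is the sum of x times P(x), and the standard deviation is the square root of the sum of (x minus the mean) squared times P(x). The table is again hypothetical, and the function names are my own.

```python
import math


def distribution_mean(table):
    """Mean of a discrete distribution: each x weighted by its probability."""
    return sum(x * p for x, p in table.items())


def distribution_std(table):
    """Standard deviation of a discrete distribution."""
    mu = distribution_mean(table)
    return math.sqrt(sum((x - mu) ** 2 * p for x, p in table.items()))


# Hypothetical table for x = 0, 1, 2, 3 (NOT the values from the homework).
table = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

# The tempting-but-wrong shortcut: add up 0, 1, 2, 3 and divide by 4.
unweighted = sum(table) / len(table)

print(f"mean = {distribution_mean(table):.1f}")
print(f"standard deviation = {distribution_std(table):.1f}")
print(f"unweighted average of x values (not the mean) = {unweighted:.1f}")
```

The unweighted average ignores the probabilities entirely, which is why it generally disagrees with the distribution's mean.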

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Understanding the assumptions underlying one-way ANOVA hypothesis testing

3/8/2019

1 Comment

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to understand the assumptions underlying one-way ANOVA hypothesis testing. Here's our problem statement: The accompanying data table contains chest deceleration measurements in g (where g is the force of gravity) from samples of small, midsize, and large cars. Shown are the technology results for analysis of variance of this data table. Assume that a researcher plans to use a 5% significance level to test the claim that the different size categories have the same mean chest deceleration in the standard crash test. Complete Parts A and B below.

Part A

OK, Part A says, "What characteristic of the data specifically indicates that one-way analysis of variance should be used?" Well, let's go ahead and take a look at our data. I click on this icon here; it shows us our data. Here's our population of data, or rather the sampling that they took. And we see that there's three sizes here --- small, midsize, and large. So everything's being categorized according to the one category, which we would say is size. But there's three different options to choose from. And that's really the key characteristic here, that you've got different categories of the same type that you're trying to compare. Of course there's more than two of them. So that combination calls for one-way ANOVA testing.

So what are our options here? "The measurements are characterized according to the one characteristic of size." Yep. That's pretty much it. But let's check the other options just to make sure. "There are three samples of measurements." Well, yeah, three lends to one-way ANOVA testing, but just having three samples alone is not enough. The key characteristic is categorization according to one characteristic. "The population means are approximately normal." Well, we hope they are, but they may not be. "Nothing specifically indicates that one-way analysis of variance should be used." Yeah, that's not true. So we're going to go with Answer option A here. Good job!

Part B

Now Part B says, "If the objective is to test the claim that the three size categories have the same mean chest deceleration, why is the method referred to as analysis of variance?" Well, it's because we're analyzing a variance. And the variance that we're analyzing is a common population variance. So we're estimating the population variance, usually from two different directions. And that's what the method is actually based on.

So let's see what our answer options say here. Yeah, right here. "Estimates of common population variance." Boom. That's what we're looking for. Excellent!
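To make the "two different directions" idea concrete, here is a sketch in Python of the two variance estimates that one-way ANOVA compares. The measurements are hypothetical (the actual data table isn't reproduced in this post), and `one_way_anova` is my own illustrative function, not the technology output shown in the problem.

```python
from statistics import mean, variance


def one_way_anova(groups):
    """Return (variance between samples, variance within samples, F statistic)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Direction 1: estimate the common variance from how far the sample
    # means sit from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ms_between = ss_between / (k - 1)

    # Direction 2: estimate the common variance by pooling the variances
    # inside each sample.
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    ms_within = ss_within / (n_total - k)

    return ms_between, ms_within, ms_between / ms_within


# Hypothetical chest deceleration values in g (NOT the homework data).
small = [44, 39, 37, 54]
midsize = [36, 53, 43, 42]
large = [32, 45, 41, 38]

between, within, f_stat = one_way_anova([small, midsize, large])
print(f"between = {between:.2f}, within = {within:.2f}, F = {f_stat:.2f}")
```

If the population means really are equal, both estimates target the same common variance, so F should land near 1; a large F suggests the means differ.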

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Finding the coefficient of variation for parking meters

3/5/2019

1 Comment

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to find the coefficient of variation for parking meters. Here's our problem statement: Listed below are amounts in millions of dollars --- millions of dollars, I like the sound of that! --- collected from parking meters by a security service company and other companies during similar time periods. Do the limited data listed here show evidence of stealing by the security service company's employees?

Part 1

OK, the first part of the problem asks us for the coefficient of variation for the amount collected by the security service company. Of course, the next part's going to ask for all the other companies, so let's go ahead and dump our data into StatCrunch. I can resize this window so we can get a better look at everything. Excellent.

Now I'm going to go into Stat --> Summary Stats --> Columns. I know I'm going to have to do both my columns, so let's just go ahead and select both of them. And notice how I selected both of them, by holding down the Ctrl key on my keyboard while I clicked the other option that I wanted. And then the statistic I want is the coefficient of variation. So I select that, hit Compute!, and boom! Here are my coefficients of variation.

Notice that coefficients of variation are listed as percentages. So there's no need to adjust your decimal point to change the number to its proper form. It already is in the proper form. It's asking for a percent here. So we're asked to round to one decimal place, so I'm going to do that. Good job!

Part 2

Now the second part asks for the coefficient of variation for the amount collected by the other companies. Of course, we've already calculated it, as you can see here. Again, one decimal place, so I need to round my number to [that]. Well done!

Part 3

Now the last part of this problem asks, "Do the limited data listed here show evidence of stealing by the security service company's employees? Consider a difference of greater than 1% to be significant." Well, what's the difference between our coefficients of variation here? It's about three percentage points. So that's definitely greater than one percentage point. So yes, there is a significant difference in the variation. Excellent!
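The coefficient of variation is just the sample standard deviation divided by the mean, expressed as a percent. Here is a Python sketch of the comparison; the dollar amounts are hypothetical (the actual lists aren't reproduced in this post), and the function name is my own.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100


# Hypothetical amounts in millions of dollars (NOT the homework data).
security_company = [1.3, 1.5, 1.3, 1.5, 1.4, 1.7, 1.8, 1.7, 1.7, 1.6]
other_companies = [1.6, 1.7, 1.6, 1.6, 1.8, 1.9, 1.6, 1.5, 1.6, 1.8]

cv_security = coefficient_of_variation(security_company)
cv_others = coefficient_of_variation(other_companies)
difference = abs(cv_security - cv_others)

print(f"security company CV = {cv_security:.1f}%")
print(f"other companies CV  = {cv_others:.1f}%")
# The problem treats a gap of more than one percentage point as significant.
print("Significant difference:", difference > 1)
```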

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Evaluating graphics for data distortion

3/1/2019

1 Comment

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to evaluate graphics for data distortion. Here's our problem statement: The graph to the right uses cylinders to represent barrels of oil consumed by two countries. Does the graph distort the data, or does it depict the data fairly? Why or why not? If the graph distorts the data, construct a graph that depicts the data fairly.

Part 1

OK, the first part of this problem is asking, "Does the graph distort the data? Why or why not?" Our answer options here are either yes or no. And "yes, because the graph incorrectly uses objects of volume to represent the data." The other yes option says, "Because 3D objects always distort the data in graphs." Well, 3D objects generally distort the data in graphs, but that's not always the case. There are some exceptions to the rule.

But the rule is that, yeah, typically 3D objects are going to distort the data. They represent a 3D object on a 2D surface, so some of the proportions get, I guess you would say, out of proportion. And that's why it's often deceptive to use 3D objects to represent what is actually a one-dimensional value. Like the number of barrels, that's just one dimension. But you've got three dimensions when you're looking at volume. So that's where the distortion comes from. You're using multiple dimensions to represent only one dimension, which is along a single number line. So yes, the graph does distort the data because it incorrectly uses objects of volume to represent the data. Nice work!
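To see how big the distortion can get, here is a short Python sketch using the two values quoted for the countries in Part 2 (21.4 and 5.5). It assumes the larger cylinder is drawn by scaling every dimension by the data ratio; that is my assumption about how such pictographs are typically drawn, since the graph itself isn't reproduced here.

```python
# Values quoted for the two countries in the answer options.
barrels_a = 21.4
barrels_b = 5.5

# The honest comparison: one dimension, like the height of a bar.
linear_ratio = barrels_a / barrels_b

# If the picture scales height, width, and depth by that same ratio,
# the apparent volume grows with the cube of the ratio.
volume_ratio = linear_ratio ** 3

print(f"data ratio:            {linear_ratio:.1f} to 1")
print(f"apparent volume ratio: {volume_ratio:.1f} to 1")
```

A bar graph keeps the comparison in one dimension, so the visual ratio matches the data ratio.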

Part 2

Now the second part of this problem says, "If the graph does not depict the data fairly (which it doesn't), which graph below does?" OK, a bar graph is much more fair in its comparison. Look at the answer options we get here.

Answer option D is obviously wrong because the graph does not depict the data fairly. So if we scroll back up here, look at our Answer options A, B, and C, and notice Answer option C --- the graph does not start at zero. This is another technique that people use to distort the data to make it appear like the difference between, uh, you know, multiple options is different than what it actually is. So we don't want the non-zero axis here.

So it's going to be A or B. And if you just look at the values here, 21.4 for A, 5.5 for B. So A should be much larger than B. And we see that here in Answer option A. Answer option B, the values are flip flopped. Nice work!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Constructing and using a proportion sampling distribution table

2/26/2019

1 Comment

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to construct and use a proportion sampling distribution table. Here's our problem statement: A genetics experiment involves a population of fruit flies consisting of three males named Alex, Bart, and Christopher and one female named Debbie. Assume that two fruit flies are randomly selected with replacement.

Part A

OK, Part A of this problem says, "After listing the possible samples and finding the proportion of females in the sample, use a table to describe a sampling distribution of the proportion of females." OK, so the first thing we need to do here is make a table of the possible outcomes in our sample space and then look at the proportions of females in each of those individual outcomes. We can then sort that information in order to produce our probability table. It would probably be easier to do this in Excel because Excel has much better sorting functionality than StatCrunch. So let's just go ahead and use Excel for that.

And let's just list the different possible outcomes in our sample space. So we're selecting two fruit flies from among the four that are in the population, and we're selecting with replacement. So the first one we select goes back in to possibly be selected again. So let's just say the first one we pick out is Alex, we put him back in, and then we pick him again. Or we could pick Alex for the first one and then Bart for the second one, or Alex for the first one and then Christopher for the second one, or we could pick Alex for the first one and Debbie for the second one.

That's the fourth one here in the series. So that's the possible subset that starts with Alex. And I look at the pattern here. So I've got the first one repeated four times, and then I've got each one listed in sequence. So if I wanted to, I could just repeat that pattern three more times. Whoops. One for each of the individual fruit flies. And then here I'm just going to put in B four times, and then C four times, and then D four times. This is the pattern that we've got established here, right?

So now I look at each row. Each row is a sample, and I'm going to say, "OK, what's the proportion of females?" That's what we're looking for here, the proportion of females in each one of these samples. Well, Alex is male. In fact, the only female we've got here is Debbie. So the proportion of females here is going to be 0%. Zero. Zero. Here 50% is female --- Alex is male, Debbie's female, one half is 50%.

And I just go through and mark the others the same fashion. So whoops, that's 100%. So now I've got all my, oh, I've got all my proportions out. So now I'm going to select everything and I'm going to come up here to Data and select Sort. I'm going to sort on column C, smallest to largest. Boom, baby! So now it's really easy to get what I need because all I got to do is --- notice zero is the first number, zero here, the first number in our table. We've got nine zeros out of 16 total. So the probability is the part over the whole; the part is 9, the whole is 16. I do the same thing for each of the numbers in sequence. So I've got six .. for 50%. 6 over 16 can actually reduce to 3 over 8. And then of course there's only one value for that last option there. So I check my answer. Fantastic!
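The listing-and-sorting work done in Excel above can be sketched in Python. This is only an illustration; `population`, `proportion_female`, and `distribution` are names I chose, and exact fractions are used so the probabilities come out as 9/16, 3/8, and 1/16 rather than decimals.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four fruit flies from the problem statement and their sexes.
population = {"Alex": "M", "Bart": "M", "Christopher": "M", "Debbie": "F"}

# Sampling two flies WITH replacement: every ordered pair, repeats allowed.
samples = list(product(population, repeat=2))


def proportion_female(sample):
    """Fraction of the flies in one sample that are female."""
    females = sum(1 for name in sample if population[name] == "F")
    return Fraction(females, len(sample))


# Count how many samples give each proportion, then divide by the total.
counts = Counter(proportion_female(s) for s in samples)
distribution = {
    prop: Fraction(count, len(samples)) for prop, count in sorted(counts.items())
}

print("number of samples:", len(samples))
for prop, prob in distribution.items():
    print(f"proportion of females = {prop}, probability = {prob}")
```

Sorting the counts plays the same role as the smallest-to-largest sort on column C in the spreadsheet.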

Part B

Now Part B says, "Find the mean of the sampling distribution." The mean is actually best found in StatCrunch. We could actually do it here in Excel, but it's just that I'm lazy. So let me go ahead and open up StatCrunch, pop the window out so we can actually move around and do something with it. Alright, so here in StatCrunch, I'm going to actually transfer these numbers over here. So it's 0, 0.5, 1, and then I'm going to actually label these. We could label these Proportion --- oh, I got my caps lock on --- Proportion and Probability. Right. The probability here is 9 over 16, so ... will this calculate it for me? Oh damn. Ugh! Stupid computer. Where's my calculator? Hey, there's my calculator. The actual number here --- 0.5625. And three eighths? I should probably know that one. But again, I'm lazy. And of course, one sixteenth. Gotcha.

OK, now I've got my probability table here in StatCrunch, so now I can just go up to Stat --> Calculators --> Custom. The values are the proportions, and the weights are the probabilities. Et voilà! 25%. It's so easy. Nice work!

Of course, you know, I can actually get the same thing in Excel. I mean, if you really wanted, I can come back here in Excel, I could actually put in that probability. This would actually calculate it for me. Ooh, yeah, I like that, 9 divided by 16. Oh yeah, baby! 3 divided by 8, and 1 divided by 16. Oh yeah, baby! So here in Excel, what I'm going to do is multiply each of those proportions by its corresponding probability, and then copy that down so I can get the same thing there. Then I'm going to sum that up. And there's my mean. Again, a little calculation intensive. I'm lazy. I like StatCrunch to kind of do everything for me. But if you wanted to stay in Excel, there's the way you would actually get the same answer.
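The Excel route (multiply each proportion by its probability, then sum) can be sketched in a few lines. The table values are the ones from Part A; `weighted_mean` is just an illustrative name, not anything StatCrunch or Excel provides.

```python
from fractions import Fraction

# Probability table from Part A: proportion of females -> probability.
table = {
    Fraction(0): Fraction(9, 16),
    Fraction(1, 2): Fraction(3, 8),
    Fraction(1): Fraction(1, 16),
}

def weighted_mean(table):
    """Sum of each value times its probability (probabilities must total 1)."""
    assert sum(table.values()) == 1
    return sum(value * prob for value, prob in table.items())

mean = weighted_mean(table)
print(float(mean))  # 0.25
```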

Part C

Now Part C asks, "Is the mean of the sampling distribution from Part B equal to the population proportion of females? If so, does the mean of the sampling distribution of proportions always equal the population proportion?" Well, you should recall this from lecture. If you're watching the Aspire Mountain Academy lecture videos, you should know this. If you're not, well, I hope that your instructor pointed this out to you. It is in the textbook.

There is, you know, a listing of sample statistics that are biased estimators and a listing of those that are unbiased. And the sample proportion is one of those statistics that is an unbiased estimator of the population parameter. So we would expect that the mean of the sampling distribution would be the same as the population parameter, in this case, proportion. And we see that that's so; the population has only four members to it. There are four fruit flies in the population. Three of them are male, one of them is female. 1 out of 4 is 25%. So yeah, the mean of the sampling distribution does equal the population proportion, and that's because the sample proportion is an unbiased estimator. So it looks like this answer option is the one we want. Excellent!
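To get a feel for the "always" part of the question, here is a rough sketch that averages the sample proportion over every equally likely sample (drawn with replacement) and compares it with the population proportion. Only the first population matches the fruit flies; the other two are made-up populations added purely for illustration, and a finite check like this is a sanity check rather than a proof.

```python
from fractions import Fraction
from itertools import product

def mean_of_sample_proportions(population, sample_size):
    """Average the sample proportion of 1s over all samples drawn with replacement."""
    samples = list(product(population, repeat=sample_size))
    total = sum(Fraction(sum(s), sample_size) for s in samples)
    return total / len(samples)

# 1 marks a female, 0 a male. The first list is the fruit fly population;
# the other two are hypothetical populations used only to probe "always".
populations = [[0, 0, 0, 1], [1, 1, 0], [0, 1, 1, 1, 0]]
checks = []
for pop in populations:
    true_p = Fraction(sum(pop), len(pop))
    for n in (1, 2, 3):
        checks.append(mean_of_sample_proportions(pop, n) == true_p)

print(all(checks))
```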

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below. Let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Performing mean hypothesis testing on a politician's claim of survey results

2/22/2019

3 Comments

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to perform mean hypothesis testing on a politician's claim of survey results. Here's our problem statement: Assume that adults were randomly selected for a poll. They were asked if they favor or oppose using federal tax dollars to fund medical research using stem cells obtained from human embryos. Of those polled, 483 were in favor, 395 were opposed, and 120 were not sure. A politician claims that people don't really understand the stem cell issue and their responses to such questions are random responses equivalent to a coin toss. Exclude the 120 subjects who said that they were unsure, and use a 1% significance level to test the claim that the proportion of subjects who respond in favor is equal to 50%. What does the result suggest about the politician's claim?

Part 1

OK, that was a mouthful. Let's get into this. So the first part of our problem asks us to identify the null and alternative hypotheses. The null hypothesis is always a statement of equality, so we're not going to select Answer option C. And then among the three answer options that remain, we can select the correct one by choosing the correct alternative hypothesis.

The alternative hypothesis typically reflects the claim unless the claim has some sort of semblance of equality to it, in which case we'll take the complement. Here the claim --- we look at the problem statement --- it says we're testing the claim that the proportion of subjects who respond in favor is equal to 50%. So the proportion is equal to 50% is the claim. But that is the null hypothesis, because equality by definition belongs to the null hypothesis. So we have to take the complement of that. The complement of being equal to is being not equal to. So we want Answer option A. Nice work!

Part 2

Alright, the next part of this problem asks us to identify the test statistic. To do that, we're going to have to run a hypothesis test. And the easiest way to do that, for me anyway, is to go into StatCrunch. So I open up StatCrunch, and I'm going to click this pop-out button here so that StatCrunch pops out of the window with the problem statement. And now I can move this around. I can resize the window, and I can do all sorts of wonderful little things with this.

OK, so to do our hypothesis test, we're going to go to Stat --> Proportion Stats (because we're dealing with proportions) --> One Sample (because we have only one sample) --> With Summary (because we don't have actual data, just summary stats). Number of successes --- well, what is the success? We're testing the claim that the proportion of subjects who respond in favor is equal to 50%, so responding in favor is going to be our definition of success. How many were in favor? Well, here it says 483 were in favor. So I put that up here.

Number of observations --- this is the total. So I'm going to pull out my --- hey, where did my calculator go? Guess I'll have to --- let me look for my calculator. Oh, there it is. It magically appeared. OK, so here's the calculator. We're going to take the 483 who were in favor and add it to the 395 who were opposed, and we don't include the 120 because the problem said to exclude the 120 who said they were unsure. So just those two together. That gives me my total, which doesn't look right. 483 plus 395 is 1273? Uh, I don't think so. Let's try this again. 483 plus 395 is 878. That looks better. I don't know what happened with that. I'll have to check this out.

OK! So that's all we need there. And I'm running a hypothesis test, and these fields match what we selected over here for our null and alternative hypotheses. So I'm all set to go. And here in the table, the second to last number, as always, is my test statistic. And I'm asked to round to two decimal places. So that is what I'm going to do. Excellent!
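If you want to see what is going on behind the button, here is a minimal sketch of the test statistic by hand. It assumes the usual normal-approximation z statistic for a one-sample proportion test; the function name `one_proportion_z` is made up for this example, and this is not StatCrunch's own code.

```python
import math

def one_proportion_z(successes, n, p0):
    """z test statistic for H0: p = p0, using the normal approximation."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

in_favor, opposed = 483, 395
n = in_favor + opposed  # 878; the 120 unsure responses are excluded
z = one_proportion_z(in_favor, n, 0.5)
print(round(z, 2))  # 2.97
```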

Part 3

Now the next part asks for the P-value. The P-value as always is next to the test statistic, so the last value there in my results window. Fantastic!
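The P-value can also be sketched by hand. This assumes a two-tailed test (because the alternative hypothesis is "not equal to") and a standard normal test statistic; `two_tailed_p_value` is a hypothetical helper name, and the z value is recomputed from the counts in the problem statement so the block runs on its own.

```python
import math

def two_tailed_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))

# z recomputed from the counts in the problem: 483 in favor out of 878.
z = (483 / 878 - 0.5) / math.sqrt(0.5 * 0.5 / 878)
p_value = two_tailed_p_value(z)
print(round(p_value, 3))  # 0.003
```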

Part 4

"Identify the correct conclusion." Well, our significance level is 1%. The P-value we have is three tenths of a percent. So the P-value is less than the significance level. That means we're inside the region of rejection, and that means we're going to reject the null hypothesis. Every time you reject the null hypothesis, there is always sufficient evidence. So this is the answer option we want. Well done!
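The decision rule itself is short enough to write down as a sketch. The 0.003 is the P-value quoted in the walkthrough and 0.01 is the stated significance level; `decide` is just an illustrative name.

```python
def decide(p_value, alpha):
    """Return the standard decision for a hypothesis test."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# P-value of about three tenths of a percent against a 1% significance level.
decision = decide(p_value=0.003, alpha=0.01)
print(decision)
```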

Part 5

And now the last part of this problem asks, "What does the result suggest about the politician's claim?" Well, look here. We rejected the null hypothesis. What's the null hypothesis? The null hypothesis says that the proportion is equal to 50%. Well, we're rejecting that, meaning we're saying this is not true. It's something other than 50%. But the politician --- what did the politician claim? The politician claimed that responses to such questions are random responses equivalent to a coin toss. A coin toss is 50/50 heads or tails. So the politician is claiming that the proportion is 50%, but our hypothesis test resulted in a rejection of that claim. And so therefore we're saying the politician doesn't know diddly squat, which for most politicians is actually pretty spot on.

So let's see here. What are our answer options? "The results suggest the politician is doing his best?" Eh, I don't think so! "The results suggest the politician is correct." I don't think so. "The results suggest the politician is wrong." Oh, yes, I love it! It's ... it's ... it's so right to be so wrong. Yes. And finally "the results are inconclusive." No, they're very conclusive. So here we're going to check our answer. Well done!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Finding the regression equation and best predicted value for bear chest size

2/19/2019

1 Comment

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to find the regression equation and best predicted value for bear chest size. Here's our problem statement: The data show the chest size and weight of several bears. Find the regression equation, letting chest size be the independent (or x) variable. Then find the best predicted weight of a bear with a chest size of 48 inches. Is the result close to the actual weight of 438 pounds? Use a significance level of 5%.

Part 1

OK, the first part of the problem asks us for the regression equation. To do this, I'm going to go ahead and take this data, and I'm going to dump it into StatCrunch. StatCrunch! Yes! We love StatCrunch. Yes. Alright, here we go. I'm going to resize this window so we can see everything just a little bit better. Actually, let's do it this way. Right. I'll get you first because I don't really see you as much as I need to see you.

Alright, here we go. Regression equation --- so here's my data. I'm going to click on Stat --> Regression --> Simple linear. The x variable is usually the one that's mentioned first. But just to make sure, let's check out the problem statement. It says to make chest size the independent (or x) variable, and so that is the one that's mentioned first. So we're going to select that. The y variable, or the dependent variable, is of course the other variable that we have to select from, and that's all I need to do. Hit Compute!, and StatCrunch will do everything for me as far as the heavy lifting goes.

Here is my regression equation, right up here at the top. It's kind of jumbled among lots of stuff, so it's a little harder for me to see the numbers. So I go through and look at the parameter estimates table down here. Notice these numbers here are the same numbers that you see up here, so I just go ahead and use the numbers here from the parameter estimates table. Don't forget your negative signs. If you have a negative sign there, don't forget to put that in. And we want to round to one decimal place.

Oh, it did not like that! What did I do wrong? Oh, I typed in the wrong number. Looking at the results, I'm looking at the wrong number. It would help to look at the right number! Waking up, making sure that you people out there in YouTube Land are awake! Alright, that gives it to us. Nice work!
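For anyone curious about what Simple linear is doing under the hood, here is a minimal least-squares sketch. The bear data is not reproduced in this post, so the points below are made up for illustration (they lie exactly on y = 1 + 2x), and `fit_line` is a hypothetical helper rather than anything from StatCrunch.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up illustrative points lying exactly on y = 1 + 2x (NOT the bear data).
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
intercept, slope = fit_line(xs, ys)
print(f"y-hat = {intercept:.1f} + {slope:.1f}x")
```

The intercept and slope here play the same role as the two numbers in the parameter estimates table.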

Part 2

Alright, now the next part of this problem asks, "What is the best predicted weight of a bear with a chest size of 48 inches?" Well, the first question we need to ask when looking for a prediction is "Can we use the regression equation? Will the regression equation give us a reliable estimate or prediction?" So to do that, we need to compare the strength of our correlation with the critical R value.

So if I click on this link right here, I have a table of critical R values. Our sample size here is 6, so that's the row we're going to be looking at. And I believe --- yes, right here, a significance level of 5%. So I want to look at the value in this first column for the row where we've got 6, and I get 0.811. That's the bar that we have to clear in order to use the regression equation.

Over here in the regression results, my R squared value is 0.995. That's outstanding! That's practically 1. It's hard to get much better than that. Strictly speaking, the critical value table is for R, not R squared, but the square root of 0.995 is about 0.997, so either way, when we compare with the critical R value, 0.811, we're greater than that. So we're clearing the bar. It's like a hurdle or a pole vault jump, and we're trying to clear the bar. And we've cleared the bar because 0.99 is greater than 0.81. So that means the regression equation is good to be using for predictions.
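The critical value table is stated in terms of R rather than R squared, so a careful version of this comparison takes the square root first. Here is a small sketch, assuming 0.995 really is R squared as read off in the walkthrough; `clears_critical_value` is an illustrative name only.

```python
import math

def clears_critical_value(r, critical_r):
    """True when |r| exceeds the critical value, i.e. the correlation is significant."""
    return abs(r) > critical_r

r_squared = 0.995   # R squared as read off in the walkthrough
critical_r = 0.811  # table value for n = 6 at the 5% significance level
r = math.sqrt(r_squared)  # magnitude only; the sign of r does not matter for this check
usable = clears_critical_value(r, critical_r)
print(usable)
```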

To use this for a prediction, I can either plug it in myself and do it old school style with my calculator, or I can come up here to Options, click on Edit, and scroll down here to Prediction of y. I put in my x value for the prediction (which here is a chest size of 48) and check that the level of 95% confidence matches our significance level of 5%. Then I hit Compute!, and I'm going to expand this out. And if you scroll down to the bottom, look at this! It actually calculated it out for me. So there's my predicted value right there, which is a whopping 468 pounds. Wow. Round to one decimal place. And that's a --- that's a big bear! That's a big anybody! Jeez. Look at that! Fantastic!

Part 3

Alright, is the result close to the actual weight of 438 pounds? Well, I don't know. Anything over 400 is pretty much all in the same category, I would think. But the difference is a good 30 pounds. 30 pounds! So they're probably not in the same neighborhood. So let's look at the answer options. "The result is close"? No. "The result is not very close." I like that one. "The result is very close"? No. "The result is exactly the same"? No. Let's go with --- let's go with Answer option B. Excellent!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below, let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.
