Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to use selection with replacement to approximate selection without replacement. Here's our problem statement: In a study of helicopter usage and patient survival, among the 54,115 patients transported by helicopter, 172 of them left the treatment center against medical advice, and the other 53,943 did not leave against medical advice. If 40 of the subjects transported by helicopter are randomly selected without replacement, what is the probability that none of them left the treatment center against medical advice?

Solution

OK, we're asked for a probability, and a probability is a part over the whole. But we're looking at 40 subjects here and the probability that none of them left the treatment center. So this is going to be the probability that the first one didn't leave, and the probability that the second one didn't leave, and the probability that the third one didn't leave, and so on until we get to the 40th subject. And when we're dealing with "and" in probability, that means multiplication. So we're taking the probability that a randomly selected person did not leave the treatment center against medical advice and multiplying it by itself 40 times.
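Here's a quick Python sketch of that calculation (mine, not part of the video), showing both the repeated multiplication and, for comparison, the exact without-replacement product discussed next:

```python
# Probability that none of 40 randomly selected patients left against medical advice
total = 54115    # all patients transported by helicopter
stayed = 53943   # patients who did not leave against medical advice

# Selection WITH replacement: the same probability multiplied by itself 40 times
p_with = (stayed / total) ** 40

# Selection WITHOUT replacement: the part and the whole each shrink by 1 per pick
p_without = 1.0
for i in range(40):
    p_without *= (stayed - i) / (total - i)

print(round(p_with, 6))     # approximately 0.880435
print(round(p_without, 6))  # approximately 0.880395
```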
But that's when you're selecting with replacement. Technically we would need to look at selecting without replacement, because once a person leaves, it's not like they come back into the hospital to get selected again. So if we want to do this technically correctly, we will be selecting without replacement. Well, that means we're going to have to take the 53,943 and divide it by 54,115 --- there's your first probability. But then we've got to multiply that by 53,942 over 54,114 to account for the one person that is no longer available to be selected. And we'd have to continue that on for 40 different numbers that we're going to multiply together. That's kind of a pain.

What we're going to do instead is use selection with replacement to approximate selection without replacement. What allows us to do this is that the number of people we're selecting is within 5% of the whole. If we take the 40 subjects that we're looking at and divide by the total that we can select from, we get a number that's well under 1%, so it's definitely less than 5%, which means we can use selection with replacement to approximate selection without replacement. So again, with selection with replacement, it's just going to be the part over the whole. The 172 that left the treatment center is the part that left, but we want the probability that none of them left. So we want the 53,943 that did not leave divided by the whole. There's the probability that one randomly selected patient is not going to leave the hospital. I want this multiplied by itself 40 times for the 40 subjects that we're selecting from the total pool of candidates. So then there is our probability. I round it to three decimal places. Fantastic!

Now this value is not that different from what we would get if we actually went and did the full calculation in Excel. And it wouldn't take long to set up in Excel, so we can run through it very quickly. I've got the total number of patients, the 54,115, and the 53,943 who did not leave. The probability is the part over the whole, and that's my probability for the first selection. For the next selection, I subtract one from both the part and the whole and compute the same kind of probability. And we want to do that for the 40 patients that we're looking to select. So if I just bring this down, I get the 40, and there are my individual probabilities. Now to get the final probability, I've got to multiply all of those probabilities together, and look at the number that we get. It's not appreciably different from what we saw earlier: this is 0.880395, and this is 0.880435. So we can see that, even though it's not the same number, it's close enough that we can use it as an approximation. Again, the sample has to be within 5% of the whole for that approximation to work, because once you leave that 5% range, the difference between your two numbers becomes great enough to be significant. So anyway, that's why we can use selection with replacement to approximate selection without replacement: it makes the calculation much, much easier.

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve.
And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.
Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to perform hypothesis testing on means of right and left hand reaction times. Here's our problem statement: Several students were tested for reaction times (in thousandths of a second) using their right and left hands. Each value is the elapsed time between the release of a strip of paper and the instant that it is caught by the subject. Results from five of the students are included in the graph to the right. Use a 5% significance level to test the claim that there is no difference between the reaction times of the right and left hands.

Part 1

OK, the first part of this problem is asking us for the hypotheses for the test. In the first dropdown blank here, it says let μd be the blank of the right and left hand reaction times. μd is going to be the mean of the differences; that's how μd is defined. The null hypothesis will always carry an equal sign, so we select that there. And then for the alternative hypothesis, we look to our claim. The claim that we're testing here says, "Test the claim that there is no difference" --- in other words, that they're the same. Well, we can't really set the alternative hypothesis to be equal, because equality by definition belongs to the null hypothesis. So we have to take the complement of that, and the complement of being equal to is being not equal to. So now we check our answer. Good job!

Part 2

Now the next part asks us for the test statistic. To do that, we're going to access the data and dump it into StatCrunch. I resize the window here so we can get a better look at what's going on. Now, with the data in StatCrunch, I go up to Stat --> T Stats --> Paired (because for every left hand value there's a corresponding right hand value for the same student). Here in my options window, the first sample is just going to be the variable that's listed first, and the second sample is going to be the variable that's listed second. I look down here, and the default selection is for hypothesis testing, so I don't have to change that. Then I make sure these fields match what we got earlier for our null and alternative hypotheses, and they do. So now I press Compute!, and here in my results window is my test statistic, the second to last value listed there. So I'm just going to go ahead and type that in. It says round to three decimal places --- -2.418. Well done!

Part 3

Now the next part of the problem asks for the critical values. Notice that the critical values are not listed in the results window from the hypothesis test we just ran, so we're going to have to go to our distribution calculator in order to calculate them. The distribution we want is the t distribution, because look at the values listed here: our test statistic was a t score, and the critical values they're asking for are t scores. So we go up to Stat --> Calculators --> T. Here in my distribution calculator, the first thing I want to look at is the alternative hypothesis, because this tells me whether I have a one-tail test or a two-tail test. The alternative hypothesis is not equal to, and not equal to means we have a two-tail test. So I'm going to click the Between option, and I'm going to have two critical values: one on the left, one on the right.
Now to get the correct critical values, I first need to enter the corresponding value for degrees of freedom. We've got five data points, as you can see here --- or, if you wanted to, you could go back and count them in your StatCrunch window. So we've got five pairs, and degrees of freedom is one less than that, so I'm going to put 4 here for my degrees of freedom. Then remember that the Between option here in StatCrunch takes the area in between the tails. The area in the tails is the significance level, which here is 5%, so the area in between has to be the complement of 5%. I subtract that from 1 and get 95%. So there's 95% in between the tails. Then I just press Compute!, and here are my critical values, which I can now put in my answer field. It wants three decimal places. Nice work!

Part 4

Now this last part of the problem is asking us for a conclusion. To conclude our hypothesis test, notice there's nothing here with P-values, so we're going to have to do it straight with the test statistic and the critical values, which is easy enough to do here in StatCrunch in the results window. Notice we have our critical values listed here. Those values are marking the boundaries of our tails in our distribution, and what we're going to compare with them is the test statistic.
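If you want to double-check those critical values outside StatCrunch, here's a small Python sketch (mine, using SciPy) for a two-tailed test at the 5% level with 4 degrees of freedom:

```python
from scipy import stats

alpha = 0.05   # 5% significance level, split between two tails
df = 4         # five paired differences minus one

t_crit = stats.t.ppf(1 - alpha / 2, df)   # right-tail critical value
print(-t_crit, t_crit)                    # approximately -2.776 and 2.776
```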
Here's the test statistic: -2.4. So -2.4 is going to put me in the central region, because notice the boundary here is about negative 2.78. So -2.4 puts me just inside that central region in between the tails. Since I'm not in the tail, I am therefore not in the critical region, the region of rejection, and therefore we're going to fail to reject the null hypothesis. And whenever you fail to reject the null hypothesis, there's not sufficient evidence to reject the claim. Of course, the claim here is that there is no difference between the reaction times. Excellent!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to evaluate reported research results for falsified data. Here's our problem statement: A researcher was once criticized for falsifying data. Among his data were figures obtained from six groups of mice with 25 individual mice in each group. These values were given for the percentage of successes in each group: 53%, 58%, 63%, 46%, 48%, 67%. What's wrong with those values?

Solution

OK, so we're looking at six different groups, and each of these groups has 25 members. So what would the percentage be if, say, we were looking at one success out of the 25? Well, if I just do that in my calculator, 1 out of 25 gives me 4%. If I had 2 out of the 25, I take 2 divided by 25 and get 8%. If I had 3 out of the 25, I get 12%. So we can see that we're looking at multiples of 4 here. And the numbers listed as the reported percentages are not all multiples of 4, so they cannot all be correct.
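Here's a quick sketch of that reasoning in Python (mine, not the video's): with 25 mice per group, a count of k successes gives a percentage of 4k, so only multiples of 4 are possible.

```python
# Every achievable percentage of successes for a group of 25 mice
possible = [100 * k / 25 for k in range(26)]
print(possible)   # 0.0, 4.0, 8.0, ..., 100.0 (multiples of 4 only)
```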
I could also look at this backwards and ask, "What part of the 25 corresponds with 53%?" So I take my 53% and multiply that by the 25, and I get 13 and a quarter. Well, I understand the 13, but what are you doing with a fourth of a mouse? It doesn't make sense. And you get the same thing looking at the other numbers. So all the percentages in this situation should be multiples of 4. We're going to take that as our answer. I check my answer. Nice work!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to perform linear regression analysis of automobile weights and highway fuel consumption. Here's our problem statement: The table lists weights (in lbs) and highway mileage amounts (in mpg) for seven automobiles. Use the sample data to construct a scatterplot. Use the first variable for the x-axis. Based on the scatterplot, what do you conclude about a linear correlation?

Part 1

OK, so here we've got our data, and the first part asks us to construct a scatterplot. We can work this in StatCrunch or in Excel, and I'll actually work it both ways and let you decide which way is easier for you. First we're going to work this problem in StatCrunch. To construct my scatterplot in StatCrunch, I first select the data and dump it into StatCrunch. Now that I've got my data in StatCrunch, I select Stat --> Regression --> Simple linear. The problem statement says we want the first variable for the x-axis. Typically that's what you would select anyway, so we select that, and the y variable will of course be the second variable. I come down here and hit Compute!. Here in my results window, the scatterplot is actually on the second page. Notice how up here at the top it says "1 of 2" --- this is the first of two pages in my results window. So if I click this button at the bottom right, I can get to the second page. There's my scatterplot along with a line of best fit, so it's really easy to pick out the matching scatterplot from the answer options. I check my answer. Fantastic!

Part 2

Now the second part of this problem asks, "Is there a linear relationship between weight and highway mileage?" Well, to look at that, I need to go back to the first page and identify the linear correlation coefficient, my R value, which is located right up here at the top: -0.9979. I would need to take the absolute value of this and then compare it with the critical R value. The critical R value comes from the table listed in your textbook; it's also in the insert to your textbook. There's no actual link here in the problem statement to the critical R value table.
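As a quick aside, you could also compute r directly in Python. The numbers below are placeholders (the transcript doesn't reproduce the actual seven data pairs), so the printed value won't exactly match the problem's r of about -0.998:

```python
import numpy as np

# Placeholder (weight, highway mpg) data; substitute the problem's seven pairs
weight = np.array([2800, 3100, 3300, 3600, 3900, 4100, 4400])  # lbs
mpg = np.array([36, 33, 31, 29, 27, 26, 24])                   # highway mpg

r = np.corrcoef(weight, mpg)[0, 1]  # linear correlation coefficient
print(r)  # strongly negative: mileage drops as weight increases
```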
Not having that table link is OK, though, because look at the value that we have --- 0.9979. This is almost 1. This is an excellent R value, and we don't need to check it against a critical value, because anytime you get an R value greater than 0.97, you're going to have linear correlation. You don't need to check it with the R table. You can check it if you want, but it's going to come out the same: you're going to conclude there is a linear relationship. So the answer we're going to click is yes, there is a linear relationship. But notice there are two options here for yes, so let's see what the second part of each option says. "As the weight increases, highway mileage decreases." "As the weight increases, the highway mileage increases." Well, we have a negative correlation here. So as the weight goes up, what direction is the line going? It's going down. So as the weight goes up, the highway miles per gallon goes down. So we want this answer option here. Good job!

And that's how we do it in StatCrunch. Now to get the same thing out of Excel, I come back here and dump my data into Excel. I could open the data file in Excel, but it's much easier to just copy and paste here. So here's my data in Excel. Now to actually do this in Excel, first I want to get the scatterplot. To do that, I select my data, then I come up to Insert, and here under Charts I select the scatter plot. I want this first option for my scatterplot, and boom! There's your scatterplot. To check for the linear relationship, I left click on one of these data points, and then, while my cursor is still over the data point, I right click on my mouse so I get this wonderful little menu. I select Add Trendline. Notice in the trendline options that the default selection is Linear, and that's what we want, so we leave that alone. I come down here and select Display Equation and Display R-squared value; the R-squared value is what I really want. So notice we've got this little area here. I'm going to move it over, and if I double click inside and select everything, I can increase the font size to make it more readable. Here we've got an R-squared value of 0.996. The value I need to check for linear correlation is actually R, and I've got R squared. So if I take the square root of R squared, I'm left with the magnitude of R. If I take the square root of 0.996 --- I can do it with my calculator here --- there's my R value: about 0.998, which matches what we saw previously. And because this value is greater than 0.97, we don't need to check it against the critical R value; we conclude that there's linear correlation here.

So that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help.
Today we're going to learn how to identify statistical values related to hypothesis testing of two population proportions. Here's our problem statement: In a large clinical trial, 390,245 children are randomly assigned to two groups. The treatment group consisted of 193,395 children given a vaccine for a certain disease, and 34 of those children developed the disease. The other 196,850 children were given a placebo, and 136 of those children developed the disease. Consider the vaccine treatment group to be the first sample. Identify the values of n1, p-hat1, q-hat1, n2, p-hat2, q-hat2, p-bar, and q-bar.

Part 1

OK, the first thing we're asked to provide is the sample size for the first sample. The problem statement says that we're considering the vaccine treatment group as the first sample, so the sample size is given there in the problem statement. That's the total number in that first group that were given the vaccine, which here is the 193,395. So I'm just going to put that in my answer field. Well done!

Part 2

Next we're asked to provide p-hat1. p-hat1 is the proportion of those in the first sample --- that's the subscript 1 --- who count as a "success." And here success is going to be defined as actually developing the disease. I know that sounds backwards: we're developing a vaccine, so you'd want success to be not getting the disease. But it's actually easier to work the problem statistically if you just flip-flop your definition of success, so here we say success is actually developing the disease. All we're looking at here is a proportion or probability, which is the part over the whole. The part that developed the disease is the 34 in that first sample, and the whole is the 193,395. So I whip out my calculator and do a simple little division: 34, which is the part, divided by the whole, 193,395. Notice the scientific notation in my calculator, this "e minus 4." It's saying the number listed here is being multiplied by 10 to the -4. So if you were to write this out in decimal form, the decimal point would need to be moved 4 places to the left. That's the -4 here. The problem asks us to provide an answer rounded to eight decimal places. So I put in one, two, three zeros, and then that first digit --- see, now my decimal point has been moved over four places --- and then I finish putting the number in. This is going to get tricky. Let's count the digits to make sure we've got eight: one, two, three, four, five, six, seven, eight. Oh, look at that. Actually I need to come up one, because the zero here, which is my eighth digit, has a 5 next to it, which means I'm going to be rounding up. So I'll put that 1 there on the end. I check my answer. Well done!

Part 3

q-hat1 is just the complement of p-hat1, so to get that, we just subtract this value from 1. Instead of typing the number in again --- I'm a little lazy --- I just make the number in my calculator negative and then add it to 1, which gives the same result. So now I take this number out to eight decimal places.

Part 4

And now I need to do the same thing with the second sample.
The size of the second sample, those who were given the placebo, is listed in the problem statement: that's the 196,850.

Part 5

Then the proportion of success, which again is going to be developing the disease: I take the 136 that actually developed the disease in the second sample and divide by that sample size. So again I get a really small number. Well done!

Part 6

And now we're looking for q-hat2, which of course is going to be the complement of p-hat2. So again, I just make the value of p-hat2 negative and add it to 1. Boom. There's my q-hat2. Excellent!

Part 7

Now I'm looking for p-bar. p-bar is going to be the pooled proportion. Remember how we did the individual p-hats: for p-hat1, I took the part which developed the disease, the 34, and divided by the whole. I did the same thing with p-hat2; I took the part that developed the disease, the 136, and divided by its whole. Now we're going to combine those together. I take both of the parts and add them together to get a pooled part, and then I divide that pooled part by a pooled whole. So I take both of the wholes and add them together, and here I get my pooled proportion, the p-bar. Well done!

Part 8

And last but certainly not least, I'm asked for q-bar, which again is just the complement of p-bar. I take this value for p-bar in my calculator, make it negative, add 1, and there's my value for q-bar. I'll put that here in my answer field. It's easy to get lost with a number like this; it's got eight decimal places. I check my answer. Oh, it did not like that! I mistyped it somewhere. Where did I mistype this? Nine, nine, six, four, three, eight. That's what I hate about this eight-decimal-place business: it's hard to keep track of all these little numbers. Excellent!
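If you'd rather let Python keep track of all those decimal places, here's a small sketch (mine) of the same calculations:

```python
n1, x1 = 193395, 34    # vaccine group: sample size and number who developed the disease
n2, x2 = 196850, 136   # placebo group

p_hat1 = x1 / n1               # sample proportion of "successes" in group 1
q_hat1 = 1 - p_hat1
p_hat2 = x2 / n2
q_hat2 = 1 - p_hat2

p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion: pooled part over pooled whole
q_bar = 1 - p_bar

for name, value in [("p-hat1", p_hat1), ("q-hat1", q_hat1),
                    ("p-hat2", p_hat2), ("q-hat2", q_hat2),
                    ("p-bar", p_bar), ("q-bar", q_bar)]:
    print(name, round(value, 8))
```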
And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to test a probability table for distribution properties. Here's our problem statement: Groups of adults are randomly selected and arranged in groups of three. The random variable x is the number in the group who say they would feel comfortable in a self-driving vehicle. Determine whether a probability distribution is given. If a probability distribution is given, find its mean and standard deviation. If a probability distribution is not given, identify the requirements that are not satisfied.

Part 1

OK, this first part of the problem is asking whether we have a probability distribution listed in the table. In order for this table to qualify as a probability distribution, three requirements need to be met. The first is that we're actually listing probability values. To test that, we look at each of the values in the probability column; the most important property we're looking for is that each needs to be a value between 0 and 1, inclusive. We look at each one of these values, and they're all between 0 and 1, so the first requirement is met. The second requirement is that there must be a matching probability value for each possible value of the random variable. If we look at our random variable, which in this case is x, we have a probability value listed for each one of those random variable values, so the second requirement is met. The third requirement is that the probabilities listed in the probability column must add up to 1. And by 1, I mean 1 exactly. You're not getting, like, 0.99 or 1.01; it's got to be 1 exactly.

There are two ways to do this. I find it easier to do in Excel, so I'm just going to dump the data into Excel. It always brings up whatever other copy of Excel I have open; when it does that, I just make it easy and paste the data there. Notice that when I have the data in Excel, if I just select the column with the probabilities, down here it sums it for me automatically. This is why I say it's easiest to do this in Excel: all I have to do is dump the data in, get the column selected, and here I get the sum, and it's 1 exactly. So the third requirement is met. Now if I wanted to do this in StatCrunch --- StatCrunch sometimes gets a little clunky, but this particular operation is not that bad. I can resize my window here and see what's going on. Here in StatCrunch, I go to Stat --> Summary Stats --> Columns and then sum up that probability column. We see we get the same thing: it's equal to 1 exactly. So we do have a probability distribution here, and I select that answer from among my choices. Well done!

Part 2

And now the second part of this problem asks me to find the mean of the random variable x. This is not where you just add up 0, 1, 2, 3 and then divide by 4. That's not going to give it to you.
What will give it to you is calculating what the problem is actually asking for: the mean of the distribution. And that's easiest to do in StatCrunch.
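If you'd rather compute it directly, the mean is the probability-weighted sum of the x values, and the standard deviation follows from the weighted squared deviations. Here's a sketch; the probabilities below are placeholders, since the transcript doesn't reproduce the table:

```python
import math

x = [0, 1, 2, 3]
p = [0.358, 0.439, 0.179, 0.024]   # placeholder probabilities; use the table's values

mean = sum(xi * pi for xi, pi in zip(x, p))                 # weighted mean of the distribution
var = sum((xi - mean) ** 2 * pi for xi, pi in zip(x, p))    # weighted variance
sd = math.sqrt(var)
print(round(mean, 1), round(sd, 1))
```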
Back in StatCrunch, the way we do this is to go to Stat --> Calculators --> Custom. In the options window, our Values are going to be our random variable, which in this case is x, and the Weights are going to be our probabilities. I press Compute!, and out comes this wonderful little results window with the mean listed right there, already calculated for you. So I'm going to go ahead and stick that in my answer field. It says to round to one decimal place. Nice work!

The last part of this problem asks us for the standard deviation. Again, that's listed right over here in our results window, so I go ahead and stick that value in the answer field. Nice work!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to understand the assumptions underlying one-way ANOVA hypothesis testing. Here's our problem statement: The accompanying data table contains chest deceleration measurements in g (where g is the force of gravity) from samples of small, midsize, and large cars. Shown are the technology results for analysis of variance of this data table. Assume that a researcher plans to use a 5% significance level to test the claim that the different size categories have the same mean chest deceleration in the standard crash test. Complete Parts A and B below.

Part A

OK, Part A says, "What characteristic of the data specifically indicates that one-way analysis of variance should be used?" Well, let's go ahead and take a look at our data. I click on this icon here, and it shows us our data, or rather the samples that were taken. We see that there are three sizes here: small, midsize, and large. So everything is being categorized according to one characteristic, which we would say is size, with three different options to choose from. And that's really the key here: you've got groups categorized by a single characteristic that you're trying to compare, and there are more than two of them. That combination calls for one-way ANOVA testing.

So what are our options here? "The measurements are characterized according to the one characteristic of size." Yep, that's pretty much it. But let's check the other options just to make sure. "There are three samples of measurements." Well, yes, having three samples lends itself to one-way ANOVA testing, but just having three samples alone is not enough; the key is categorization according to one characteristic. "The population means are approximately normal." Well, we hope they are, but they may not be. "Nothing specifically indicates that one-way analysis of variance should be used." That's not true. So we're going to go with answer option A here. Good job!

Part B

Now Part B says, "If the objective is to test the claim that the three size categories have the same mean chest deceleration, why is the method referred to as analysis of variance?" Well, it's because we're analyzing variance, and the variance that we're analyzing is a common population variance.
So we're estimating that common population variance from two different directions: one estimate comes from the variation between the sample means, and the other comes from the variation within the samples. That's what the method is actually based on.
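In other words, the F statistic is the ratio of those two variance estimates. Here's a small sketch of the calculation; the measurements below are placeholders, since the transcript doesn't reproduce the data table:

```python
from scipy import stats

# Placeholder chest deceleration measurements (g); substitute the actual samples
small = [44, 39, 37, 54, 39, 44, 42]
midsize = [36, 53, 43, 42, 52, 49, 41]
large = [32, 45, 41, 38, 37, 38, 33]

# One-way ANOVA: F compares the between-group variance estimate
# to the within-group variance estimate
f_stat, p_value = stats.f_oneway(small, midsize, large)
print(f_stat, p_value)
```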
So, back to the answer options: let's see what they say. Yeah, right here: "estimates of common population variance." Boom. That's what we're looking for. Excellent!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to find the coefficient of variation for parking meters. Here's our problem statement: Listed below are amounts in millions of dollars --- millions of dollars, I like the sound of that! --- collected from parking meters by a security service company and other companies during similar time periods. Do the limited data listed here show evidence of stealing by the security service company's employees?

Part 1

OK, the first part of the problem asks us for the coefficient of variation for the amounts collected by the security service company. Of course, the next part is going to ask for the other companies, so let's go ahead and dump our data into StatCrunch. I resize this window so we can get a better look at everything. Excellent. Now I go to Stat --> Summary Stats --> Columns. I know I'm going to have to do both of my columns, so let's just select both of them; notice how I selected both by holding down the Ctrl key while I clicked the other column that I wanted. Then the statistic I want is the coefficient of variation. I select that, hit Compute!, and boom! Here are my coefficients of variation. Notice that coefficients of variation are listed as percentages, so there's no need to adjust your decimal point to change the number into its proper form. It's already in the percent form the problem asks for. We're asked to round to one decimal place, so I do that. Good job!

Part 2

Now the second part asks for the coefficient of variation for the amounts collected by the other companies. Of course, we've already calculated it, as you can see here. Again, I round to one decimal place. Well done!

Part 3

Now the last part of this problem asks, "Do the limited data listed here show evidence of stealing by the security service company's employees? Consider a difference of greater than 1% to be significant." Well, what's the difference between our coefficients of variation here? It's about 3 percentage points, which is definitely greater than 1 percentage point. So yes, there is a significant difference in the variation. Excellent!
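For reference, the coefficient of variation is just the standard deviation divided by the mean, expressed as a percentage. Here's a sketch with placeholder amounts, since the transcript doesn't reproduce the data:

```python
import statistics

# Placeholder collections in millions of dollars; substitute the problem's values
security = [1.3, 1.5, 1.4, 1.7, 1.6]
other = [1.3, 1.6, 1.9, 1.1, 2.2]

def cv(data):
    # Coefficient of variation: sample standard deviation over mean, as a percent
    return 100 * statistics.stdev(data) / statistics.mean(data)

print(round(cv(security), 1), round(cv(other), 1))
```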
And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.

Intro

Howdy! I'm Professor Curtis of Aspire Mountain Academy here with more statistics homework help. Today we're going to learn how to evaluate graphics for data distortion. Here's our problem statement: The graph to the right uses cylinders to represent barrels of oil consumed by two countries. Does the graph distort the data, or does it depict the data fairly? Why or why not? If the graph distorts the data, construct a graph that depicts the data fairly.

Part 1

OK, the first part of this problem is asking, "Does the graph distort the data? Why or why not?" Our answer options here are either yes or no. One yes option says, "because the graph incorrectly uses objects of volume to represent the data." The other yes option says, "because 3D objects always distort the data in graphs." Well, 3D objects generally distort the data in graphs, but that's not always the case; there are some exceptions to the rule. But the rule is that, typically, 3D objects are going to distort the data. They incorrectly represent a 3D object on a 2D surface, so some of the proportions get, I guess you would say, out of proportion. And that's why it's often deceptive to use 3D objects to represent what is really a one-dimensional value. The number of barrels is just one dimension, but you've got three dimensions when you're looking at volume. That's where the distortion comes from: you're using multiple dimensions to represent only one dimension, which lies along a single number line. So yes, the graph does distort the data, because it incorrectly uses objects of volume to represent the data. Nice work!

Part 2

Now the second part of this problem says, "If the graph does not depict the data fairly (which it doesn't), which graph below does?" OK, a bar graph is much more fair in its comparison. Look at the answer options we get here.
Answer option D is obviously wrong, because the graph does not depict the data fairly. So if we scroll back up and look at answer options A, B, and C, notice that in answer option C the graph does not start at zero. That's another technique people use to distort data, making the difference between categories appear different than it actually is. So we don't want the option with the non-zero axis. It's going to be A or B. And if you just look at the values here, 21.4 for the one country and 5.5 for the other, the first bar should be much larger than the second. We see that in answer option A; in answer option B, the values are flip-flopped. Nice work!

And that's how we do it at Aspire Mountain Academy. Be sure to leave your comments below and let us know how good a job we did or how we can improve. And if your stats teacher is boring or just doesn't want to help you learn stats, go to aspiremountainacademy.com, where you can learn more about accessing our lecture videos or provide feedback on what you'd like to see. Thanks for watching! We'll see you in the next video.