Question
(b) The male award winner who was 74 years old was identified as a potential outlier. If this award winner had been 54 instead of 74 years old, what effect would this decrease have on the following statistics? Justify your answers. The interquartile range of male ages: The standard deviation of male ages: The mean of male ages: The median of male ages:
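The award data are not reproduced here, so the following is only a sketch on made-up ages. It shows one way to check how replacing 74 with 54 moves each statistic. The list `ages` and the helper `summarize` are hypothetical, and the result for the interquartile range and the median depends on whether 54 still lies above the third quartile in the real data.

```python
import statistics

def summarize(ages):
    """Return the four statistics named in the question for a list of ages."""
    q1, median, q3 = statistics.quantiles(ages, n=4)
    return {
        "iqr": q3 - q1,
        "stdev": statistics.stdev(ages),
        "mean": statistics.mean(ages),
        "median": median,
    }

# Hypothetical ages, NOT the award data from the question.
ages = [31, 35, 38, 40, 42, 44, 45, 47, 50, 52, 56, 74]
# Same list with the 74-year-old replaced by a 54-year-old.
changed = [54 if a == 74 else a for a in ages]

before = summarize(ages)
after = summarize(changed)

for name in before:
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

In this made-up list the changed value stays in the top quarter of the data, so the quartiles and the median do not move, while the mean and the standard deviation both drop because they use every value.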


Answers
The following data represent the male and female population by age of the United States for residents under 100 years old in July 2003. (TABLE CANNOT COPY) (a) Approximate the population mean and standard deviation of age for males. (b) Approximate the population mean and standard deviation of age for females. (c) Which gender has the higher mean age? (d) Which gender has more dispersion in age?
All right. In this problem, the variable X is normally distributed with mean mu = 64 and standard deviation sigma = 2. Based on this information alone, we want to answer parts (a) through (d). This question tests our understanding of how to answer questions about normally distributed random variables, so we'll use the following information. On the left, we have how to map a z-score onto a probability, that is, an area under the standard normal curve. On the right, we have how to map a nonstandard normal variable onto a standard normal one. Because our variable has a nonzero mean and a standard deviation other than one, we have to use the conversion z = (x - mu) / sigma. We can use this to answer (a) through (d).
(a) What is the probability that X is between 58 and 70? This is equivalent to the probability that Z is between the z-scores (58 - 64)/2 = -3 and (70 - 64)/2 = 3. From a z-table, this is 0.9973.
(b) Find the quartiles. We look for the z-scores that cut off tails of our normal distribution with 0.25 area each; these are z = ±0.67. Inverting our formula above, Q1 = z1 × sigma + mu = 62.66 and Q3 = z2 × sigma + mu = 65.34.
(c) We want to determine the interval, symmetric about the mean, that includes 90% of the population. That is, we're looking for the z0 for which the probability that Z falls between -z0 and z0 is 0.9. This is z0 = 1.64, so the bounds are at z = ±1.64. Plugging again into our inverted formula for X gives 60.72 to 67.28.
(d) Finally, what is the probability that five individuals selected at random all have X greater than 68? First we need the probability that one selected at random has X greater than 68. The z-score is (68 - 64)/2 = 2, and the probability that Z is greater than 2 is 0.0228. Because we have five randomly and independently selected individuals, we raise this to the 5th power: 0.0228^5 is about 6.1 × 10^-9, an incredibly small probability.
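The four parts above can be checked without a z-table. This is a minimal sketch using `NormalDist` from Python's standard library; the variable names are only illustrative. Because it does not round z to two decimals, the results differ slightly from the table-based values in the transcript.

```python
from statistics import NormalDist

# X ~ Normal(mu = 64, sigma = 2), as stated in the problem.
X = NormalDist(mu=64, sigma=2)

# (a) P(58 < X < 70), i.e. within three standard deviations of the mean.
p_a = X.cdf(70) - X.cdf(58)

# (b) Quartiles: the values with 25% and 75% of the area to their left.
q1, q3 = X.inv_cdf(0.25), X.inv_cdf(0.75)

# (c) Interval symmetric about the mean containing 90% of the population:
# 5% of the area is left in each tail.
lo, hi = X.inv_cdf(0.05), X.inv_cdf(0.95)

# (d) P(all five independent draws exceed 68) = P(X > 68) ** 5.
p_one = 1 - X.cdf(68)
p_five = p_one ** 5

print(f"(a) {p_a:.4f}")
print(f"(b) Q1 = {q1:.2f}, Q3 = {q3:.2f}")
print(f"(c) {lo:.2f} to {hi:.2f}")
print(f"(d) {p_five:.2e}")
```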
Hello, everyone. In this video, we're going to use the five-number summary and a box plot to look at the statistics of 14 California counties. First we'll look for the median of our 14 counties. It falls between the 7th and 8th positions, so the average of those two numbers is the number in the middle, which is 5. Let's write that down: Q2, or the median, is equal to 5.
Next we'll find quartile one, which is the median of the first half. We have seven numbers on the left side, so we look for the 4th position. That gives Q1 = 4.4. To find quartile three, we take the median of the second half, again the 4th position, so Q3 = 5.8. The IQR, or interquartile range, is quartile three minus quartile one: 5.8 - 4.4 = 1.4. To interpret an interquartile range of 1.4: the middle 50% of our data lies within a span of 1.4 units. We can show this visually by drawing a line at each quartile. From 4 to 4.4 is 25% of our data, and from 4 to 5 is 50% of our data, because each section holds 25%. So the interquartile range, between 4.4 and 5.8, covers 50% of our data.
Next, we'll move on to the five-number summary: our minimum, or lowest number, 4; quartile one, 4.4; the median, 5; quartile three, 5.8; and our maximum, 6.5.
Next, we'll move on to the upper and lower limits. The lower limit is a formula that helps us find low outliers, and the upper limit helps us find high outliers. The lower limit is Q1 minus 1.5 times the interquartile range: 4.4 - 1.5 × 1.4 = 2.3. The upper limit is Q3 plus 1.5 times the interquartile range: 5.8 + 1.5 × 1.4 = 7.9.
We'll start with the lower limit. Any number below 2.3 is considered a low outlier. Looking at our data, our lowest value is 4, so there are no low outliers. Next, the upper limit of 7.9: any number above 7.9 is considered a high outlier. Our highest value is 6.5, so we have no outliers at all.
Next, we'll look at the box-and-whisker plot to visualize both our outliers, of which we have none, and our quartiles. Here's our graph, with the numbers shown. If we trace the lines, the first blue line, at 4.4, is quartile one. The red line represents quartile two at 5. Quartile three is at 5.8. The two ends are our two whiskers: the left whisker represents our minimum of 4, and the right whisker represents our maximum of 6.5, with no outliers indicated by any red stars or crosses. So our box plot matches our five-number summary.
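The outlier check above can be sketched in a few lines. The county values themselves are not listed in the transcript, so this example starts from the five-number summary it reports; the function name `tukey_fences` is just an illustrative choice.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier limits Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Five-number summary quoted in the transcript.
minimum, q1, median, q3, maximum = 4.0, 4.4, 5.0, 5.8, 6.5

lower, upper = tukey_fences(q1, q3)

# If the extremes are inside the limits, every other value is too.
has_low_outlier = minimum < lower
has_high_outlier = maximum > upper

print(f"limits: {lower:.1f} to {upper:.1f}")
print(f"low outliers: {has_low_outlier}, high outliers: {has_high_outlier}")
```

Checking only the minimum and maximum is enough to decide whether any outliers exist, which is why the transcript compares 4 with 2.3 and 6.5 with 7.9.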
This question calls back to a sample of census data that we were given in question one. The population we're talking about here is census data, that is, all of the ages of people in the US. In the problem we're given a small sample of that: 100 different ages taken from this population. Now, when it comes to samples, there are a couple of different ways of describing them, namely summary statistics, ways to sum up a sample of data in just a number or two. An important one that we can use here is the mean. The mean of a sample, which we call x-bar, is just the average of the numbers in that sample. Here, x-bar is equal to 39.31. We can also talk about the spread: how spread out is this data? To do that, we use the standard deviation, which we call s_x, with the x in a subscript. The standard deviation of this sample is 25.16.
Notice how I drew arrows from x-bar and s_x towards the sample and not towards the census. The entire population has a different mean and standard deviation. The mean of the population, which I'll write as mu to distinguish it from x-bar, is 36.5. That's told to us in the problem. The standard deviation, which I'll write as sigma, again to distinguish the population from the sample, is 22.5. So clearly the population mean is different from the sample mean, and the population standard deviation is different from the sample standard deviation.
This is a first look at the very important idea of sampling variability. It asks: if we took another sample of 100 different ages, what would we expect? Would we expect to have the same x-bar and s_x? Well, we would not. Whenever we take samples of random data, there's no guarantee that we'll have the same data in the second sample. We could take another sample, again of 100, and instead of the sample mean being about three higher than the population mean, it could be, for example, three lower. We could have an x-bar of 33.5 in another sample. We could also have a different standard deviation. Instead of the sample standard deviation being about three higher than the population standard deviation, it could be about three lower, so we could get a standard deviation of something like 19.1. The key to sampling variability is that when sampling from a population, there's no guarantee that one sample will look like the next.
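Sampling variability is easy to see in a small simulation. The sketch below does not use the census data; it builds a made-up population of 10,000 uniformly distributed "ages" (an assumption purely for illustration) and draws repeated samples of 100 from it. The names `population` and `sample_stats` are hypothetical.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population, NOT the census ages from the problem.
population = [rng.uniform(0, 100) for _ in range(10_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)  # population standard deviation

def sample_stats(n=100):
    """Draw one sample of size n and return its (x-bar, s_x)."""
    sample = rng.sample(population, n)
    return statistics.mean(sample), statistics.stdev(sample)

# Repeat the sampling many times and collect the sample means.
results = [sample_stats() for _ in range(200)]
means = [m for m, _ in results]

print(f"population: mean = {pop_mean:.2f}, sd = {pop_sd:.2f}")
for x_bar, s_x in results[:5]:
    print(f"sample:     x-bar = {x_bar:.2f}, s_x = {s_x:.2f}")
print(f"sample means range from {min(means):.2f} to {max(means):.2f}")
```

Each sample gives its own x-bar and s_x, some above and some below the population values, which is the behavior the transcript describes.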