The following is a solution to number 46. We're going to look at how outliers affect data sets when the sample sizes are different. We're given a simple data set where the population mean is 50 and the population standard deviation is 10, and we're asked to find a 95% confidence interval for that data set. I'm using technology here just to save some time. I'm using a TI-84, but you can use whatever technology you like, or the formulas if you want.
If you go to STAT and then Edit, I'm going to use both columns here, L1 and L2. L1 is the first data set, and its 12 data values are exactly the same as the first 12 values in L2. This is the small data set that we originally start with, so L1 only has 12 data values. There they are; I went ahead and punched them in.
Now we go to STAT, TESTS, and it's the seventh option here, the ZInterval, since we know the population standard deviation. Keep the input on Data, since we have an actual data set. Sigma is 10. The list in this case is going to be L1; we'll change that later when we get to the second part of this question. The frequency is 1, always, and the C-level is 0.95 because it asks for a 95% confidence interval. Then we calculate. Keep this x-bar in mind, 50.25; I'll leave it up here. Here's our interval: 44.592 and 55.908.
Let's go and write that down. That's really just the first part, a simple confidence interval. We don't even need to interpret it, because I don't think there's a word problem associated with this. It's 44.592 on the lower bound, and the upper bound is 55.908. So that's our confidence interval.
Now it says, let's pretend the 41 in that data set is changed to a 14, and we need to confirm that it is in fact an outlier. Okay, let's see what they're talking about.
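For anyone following along without a calculator, the first interval can be reproduced from the summary values quoted in the walkthrough (x-bar 50.25, sigma 10, n 12). This is a minimal sketch, not the calculator's own routine; the function name `z_interval` is made up for illustration, and the raw data values are not used because the walkthrough does not list them.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when the population sigma is known."""
    # Critical value z* that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Summary values quoted in the walkthrough for the 12-value data set.
low, high = z_interval(x_bar=50.25, sigma=10, n=12)
print(f"({low:.3f}, {high:.3f})")  # (44.592, 55.908)
```

The result matches the calculator's output because the ZInterval only needs the sample mean, the known sigma, and the sample size.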
So STAT, Edit. It's saying that this 41 here is changed to a 14, so let's go and change that and see what happens. I'm going to go ahead and calculate the 1-Var Stats here.
Now, there are a couple of ways you can determine whether something is an outlier. You can see if it's more than two standard deviations above or below the mean. If you were to do that, remember the standard deviation was 10. The calculator gives you something a little different here for the sample, but we're assuming that it's 10. So two standard deviations would be 2 times 10, which is 20, and 48 minus 20 is 28. 14 is significantly less than that, so by that rule it's an outlier.
But there's a better way: the interquartile range method. We take the interquartile range, which is Q3 minus Q1, multiply it by 1.5, and that gives us a number to work with. So let's go and do that first. The interquartile range, remember, is Q3 minus Q1. That's 53 minus 43.5 in this case, which is 9.5. Then we take 1.5 times 9.5, and that gives us 14.25. This is a little more robust than the two-standard-deviation rule, because we don't actually know whether this data is normal. Since 14 is on the low end, we take Q1, which is 43.5, and subtract that 14.25, which gives us 29.25. And since 14 is less than 29.25, it is in fact an outlier.
So it's an outlier in both cases. It's more than two standard deviations below the mean, and it's more than 1.5 times the interquartile range below the first quartile. That's the confirmation.
Now we're going to find the new 95% confidence interval. Let's go back to STAT, TESTS, and it's that seventh option, the ZInterval. Everything again is exactly the same, but remember that the 41 is now 14, so it should change a little bit, and it does: 42.342 and 53.658.
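Both outlier checks described above come down to a couple of lines of arithmetic. Here is a small sketch using the quartiles and mean quoted in the walkthrough (Q1 43.5, Q3 53, mean 48, sigma 10); the function names are hypothetical and chosen only for this example.

```python
def lower_fence(q1, q3, k=1.5):
    """Lower cutoff of the IQR rule: Q1 minus k times the interquartile range."""
    return q1 - k * (q3 - q1)


def is_low_outlier_iqr(value, q1, q3):
    """True if value falls below the 1.5 * IQR lower fence."""
    return value < lower_fence(q1, q3)


def is_low_outlier_2sd(value, mean, sd):
    """True if value is more than two standard deviations below the mean."""
    return value < mean - 2 * sd


# Values quoted in the walkthrough for the 12-value data set with 41 changed to 14.
print(lower_fence(43.5, 53))                      # 29.25
print(is_low_outlier_iqr(14, 43.5, 53))           # True
print(is_low_outlier_2sd(14, mean=48, sd=10))     # True
```

The IQR version depends only on the quartiles, which is why it holds up even when the data may not be normal.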
Let's go and write that down, and then let's compare. So 42.342 all the way up to 53.658. And it asks, how do these compare? Well, if you look at the first confidence interval next to the second one, everything's been shifted down a couple of units. The confidence interval gets pulled toward that outlier, just like the mean would be. So the confidence interval with the outlier is shifted down, pulled toward the outlier. The width stays the same, because the ZInterval is still using sigma equals 10 and n equals 12, so the margin of error doesn't change. The main thing is that it's shifted down, because the mean is not robust. It's pulled toward that outlier, and the mean, remember, is the point estimate. Okay, so that's part c.
Now we go back and change that value back. So STAT, Edit, and let's change this back to 41.
For the next part, we're given another data set. Let's go ahead and write this down first: the x-bar, the mean for the smaller data set, where it was just 12 data values, was 50.25. That's what it asks for here. Then the large data set is what I put in L2. It's already populated in there; I already punched it in. It's significantly bigger, 36 data values. So that's the difference. Let's just make sure it has the same mean, so let's change the list to L2. Okay, 50.25. So the sample mean is exactly the same, and that's all we do there.
The sample mean is the same, and you may think the confidence interval needs to be the same now too, but we're going to see in a second that it's not. So we're going to make a 95% confidence interval for the larger data set. We go back to STAT, TESTS, and again it's that seventh option, the ZInterval. This is all good, except this time the list is L2.
And then we calculate, and it gives us 46.983 to 53.517. So what does that mean? Well, if you look at this confidence interval of about 46 to 53 compared to that first confidence interval, about 44 to 55, it is a narrower confidence interval. Now, it's the same confidence level; we're still 95% confident. But what changed? Well, n went from 12 to 36. So what does that mean about the width of the confidence interval? As n increases, the width of the confidence interval decreases, because the margin of error decreases: you're dividing by a bigger number, the square root of n. So as n increases, the margin of error decreases, which means the width of the confidence interval will also decrease, and that's shown there. The x-bar did not change, right? So nothing changed there with the point estimate. The only thing that changed was the margin of error.
Okay, so for these last few parts, we basically just do what we just did. We're working with the second data set, and we're going to change that 41 to 14 in the second data set. And again, we're going to confirm that it is in fact an outlier. So let's go back and calculate the 1-Var Stats. Okay, so 49.5 is the mean. If you think of two standard deviations, that would be 29.5, and 14 is less than 29.5. Or you can do the interquartile range, which is a little more robust. The interquartile range is Q3 minus Q1, which is 56 minus 43, which is 13. I take 1.5 times 13 to give me 19.5, and then I take Q1, which is 43, minus that 19.5. That gives me 23.5. And since 14 is less than 23.5, it is in fact an outlier.
So, again, it's an outlier in both cases. It's more than two standard deviations below the mean, which in this case is fine because the sample size is big enough, but the interquartile range is usually the preferred method.
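To see why the interval narrowed when n went from 12 to 36, it helps to compute the margin of error on its own. The sketch below assumes the usual known-sigma formula, z* times sigma over the square root of n; `margin_of_error` is an illustrative name, not something from the calculator.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """Margin of error for a z-interval: z* times sigma / sqrt(n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / sqrt(n)


# Same sigma and confidence level, only the sample size changes.
for n in (12, 36):
    e = margin_of_error(sigma=10, n=n)
    print(f"n={n}: margin={e:.3f}, width={2 * e:.3f}")
# n=12: margin=5.658, width=11.316
# n=36: margin=3.267, width=6.533
```

With x-bar at 50.25 in both cases, subtracting and adding these margins gives the two intervals from the walkthrough, 44.592 to 55.908 and 46.983 to 53.517.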
Now we're going to make a new 95% confidence interval. We go back to STAT, TESTS, and it's that seventh option. All of this is good. Remember, I had that 41 changed to 14. Calculate, and I get 46.233 to 52.767. So let's write that down, and then we can compare real quick. So 46.233 all the way up to 52.767. That's the new 95% confidence interval.
So how did that change things? Well, the lower bound was 46.9 and it got pulled down to 46.2, and the upper bound was 53.5 and it got pulled down to 52.7. So it did get pulled down a little bit, but not nearly as much as when the sample size was 12. Remember, that one got pulled down a full 2.25 units, whereas here it only got pulled down a little over half a unit, 0.75.
The reason it didn't get pulled down as much is that the sample size is larger. The larger the sample size, the less influential those outliers can be. So what happened here? The outlier pulled the confidence interval down a bit, but not nearly as much as when n equals 12. The larger the sample size, the less influential those outliers will be.
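The size of each shift can be checked directly: replacing one value changes the sum by the difference between the new and old values, so the mean (and with it the whole z-interval) moves by that difference divided by n. A short sketch, with `mean_shift` as a hypothetical helper name:

```python
def mean_shift(old_value, new_value, n):
    """Change in the sample mean when one of n values is replaced."""
    return (new_value - old_value) / n


# Replacing 41 with 14 in each data set from the walkthrough.
for n in (12, 36):
    print(n, mean_shift(41, 14, n))
# 12 -2.25
# 36 -0.75
```

The same 27-point change is spread over three times as many values in the larger data set, so the interval moves only a third as far.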