What do we have in this question? Snoop Incorporated is a firm that does marketing surveys and marketing research. The Rolling Sound Company has hired this firm to study the age distribution of people who stream music. To check the Snoop report, Rolling Sound used a random sample of 519 customers, and their data is given to us.

So let's look at the data we have. The first column is customer age in years. The next column is the percent of customers from the Snoop report. After that, we have the number of customers from the sample, which we will call the observed values.

What are the different categories? The first one is less than 14 years. The next one is 14 to 18. After that, we have 19 to 23, then 24 to 28, then 29 to 33, and finally older than 33. The percent of customers from the Snoop report is 12%, 29%, 11%, 10%, 14%, and 24%, in that order. The observed values are 88, 135, 52, 40, 76, and 128. That is the data given to us.

Now, what is the question? They are saying: use a 1% level of significance to test the claim that the distribution of customer ages in the Snoop report agrees with that of the sample report. A 1% level of significance means that alpha is 0.01.

So what is going to be the null hypothesis? The null hypothesis is that the distribution of customer ages in the Snoop report fits the distribution of the sample report.
All right, what is going to be the alternative hypothesis? The alternative hypothesis is that the distribution of customer ages in the Snoop report does not fit the distribution of the sample report. These two lines are very important: they are the first part of the answer, and they are what you will write as the null and the alternative hypotheses.

Now we have to test the claim. To do this, we are going to compute a chi-square statistic. What is the first step in computing a chi-square statistic? We find the expected values for all the categories. The expected value is given by the sample size multiplied by the probability, or proportion, of each category.

Let's apply this formula to the first category, less than 14 years of age. Our sample size is 519, so the expected value for the first category is 12% of 519, or 0.12 times 519, which is 62.28. For the second one, it is 0.29 times 519, which is 150.51. Then 0.11 times 519 is 57.09. After that, we have 10% of 519, which is 51.9. After that, we have 14%, or 0.14, of 519, which is 72.66. And then we have 24%, or 0.24 times 519, which is 124.56.

All right, what do we need after this? We are going to calculate the individual chi-square values for all the categories. What is the formula for that? The formula we are going to apply here is the same for every category.
We find the difference between the observed and the expected value, square it, and divide it by the expected value. Then, at the end, we add up the values for all of the categories, and we get the overall chi-square statistic for our problem.

Let us look at how to apply this formula. For the first category, the difference between 88 and 62.28 is 25.72. We square this and divide by 62.28, which gives 10.62. Then we have the difference between 150.51 and 135. We square this and divide by 150.51. This is 1.598, so let's write this as 1.6. Then we have the difference between 57.09 and 52. We square this and divide by 57.09. This is 0.4538, so this is going to be 0.45. Then we have the difference between 51.9 and 40. We square this and divide by 51.9. This is 2.728, or let me write this as 2.73. Next is the difference between 76 and 72.66. We square this and divide by 72.66. This is 0.153. Then we have the difference between 128 and 124.56, which is 3.44. We square this and divide by 124.56. This is 0.095, or let me write this as 0.1.

Now I'm going to add all of these values up: 10.62 plus 1.6 plus 0.45 plus 2.73 plus 0.153 plus 0.1. I'm getting this as 15.653. So my chi-square statistic is coming out to be 15.653.

Now I need the degrees of freedom, df, in order to find the p-value. What is the formula for degrees of freedom? It is the number of categories minus one. How many categories do we have? We have 1, 2, 3, 4, 5, 6 different categories. So this is going to be six minus one, or five.
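The arithmetic above can be checked with a short Python sketch. It uses only the percentages and sample counts read out in the problem; the variable names are my own, chosen just for illustration. Without rounding each term first, the sum comes out to about 15.651, slightly below the 15.653 we get by adding the rounded terms. That small gap does not change the conclusion.

```python
# Snoop report percentages and sample counts for the six age groups,
# in the order they are read out in the problem.
snoop_percent = [12, 29, 11, 10, 14, 24]
observed = [88, 135, 52, 40, 76, 128]

n = sum(observed)  # sample size, 519

# Expected count = sample size * proportion for that category.
expected = [n * p / 100 for p in snoop_percent]

# Per-category contribution: (observed - expected)^2 / expected.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

chi_square = sum(contributions)
df = len(observed) - 1  # number of categories minus one

for e, c in zip(expected, contributions):
    print(f"expected = {e:7.2f}   contribution = {c:.4f}")
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

Keeping the expected counts and contributions in lists makes it easy to see which age group drives the statistic. Here it is the youngest group, which contributes about 10.62 of the total.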
Now I have my chi-square statistic and my degrees of freedom. I could use a chi-square table, but it will just give me a range of values for the p-value. If I use statistical software like Python or SPSS, or any statistical package, it will give me the exact value. What I'm doing here is using an online tool. My chi-square value is 15.653 and my degrees of freedom is five. What is my alpha? My alpha in this case is 0.01. I hit calculate, and I get a p-value of 0.007908, so about 0.008.

What was my alpha? My alpha was 0.01. So I can see that my p-value is less than alpha, which means that I reject my null hypothesis, H0.

So how do I frame this? Let's look at the wording of the question. I say that I have enough statistical evidence to suggest that the distribution of customer ages in the Snoop report and the distribution of the sample report are different. This is going to be our answer, and that's how we go about doing this question.
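The p-value from the online tool can be reproduced with a minimal Python sketch that uses only the standard library. It relies on the closed-form upper-tail formula for a chi-square distribution with an odd number of degrees of freedom, so it is a check for this problem rather than a general-purpose routine. The function name chi2_sf_odd_df is hypothetical, made up for this example.

```python
import math

def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X > x) for chi-square with odd df (x > 0)."""
    if df % 2 == 0 or df < 1:
        raise ValueError("this sketch only handles odd degrees of freedom")
    root = math.sqrt(x)
    # Two-sided standard normal tail: 2 * (1 - Phi(sqrt(x))).
    tail = math.erfc(root / math.sqrt(2))
    # Standard normal density evaluated at sqrt(x).
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    # Finite series: sqrt(x)/1 + x^(3/2)/(1*3) + x^(5/2)/(1*3*5) + ...
    series, term = 0.0, root
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)
    return tail + 2 * density * series

chi_square = 15.653  # statistic from the worked example
df = 5               # six categories minus one
alpha = 0.01         # 1% level of significance

p_value = chi2_sf_odd_df(chi_square, df)
reject_null = p_value < alpha
print(f"p-value = {p_value:.6f}, reject H0: {reject_null}")
```

If SciPy is installed, scipy.stats.chi2.sf(15.653, 5) should give the same tail probability and works for any degrees of freedom.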