Part one. In this part, we're going to define a new variable, ecobuy. Ecobuy takes a value of one if ecolbs is greater than zero, and takes a value of zero if ecolbs is zero. Ecobuy indicates whether, at the prices given, a family would buy any eco-labeled apples. To get the fraction of families who claim they would buy eco-labeled apples, you take the number of cases where ecobuy equals one, which is 412, divided by the total number of observations, which is 660, and you get 0.624.

These are the OLS estimates of the linear probability model. We will interpret the price variables, which are the eco price and the regular price. The coefficient on the eco price is minus 0.8, and the coefficient on the regular price is 0.72. So it means that if the eco price increases by, say, 10 cents, or $0.10, then the probability of buying eco-labeled apples falls by about 0.08. For the regular price, if the regular price increases by 10 cents, then the probability of buying eco-labeled apples increases by about 0.072.

We will evaluate the joint significance of the non-price variables, which are family income, household size, education level, and age. We will do an F test. The F statistic is 4.43, and the two degrees of freedom are 4 and 653. The p-value of the test is 0.0015, so we are able to reject the null hypothesis. In other words, these factors are jointly significant at the 1% level. Among the four non-price variables, education appears to have the most important effect. The coefficient on education is 0.025, with a very small standard error. A difference of four years of education increases the probability of buying eco-labeled apples by 0.1, so we can expect that more highly educated people are more open to buying produce that is environmentally friendly. Another variable that is also important is household size. The coefficient on household size is 0.024, with a small standard error. Compare a couple with two children to one that has no children.
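The interpretations above all follow the same rule: in a linear probability model, the change in the probability equals the coefficient times the change in the variable. Here is a minimal sketch of that arithmetic, using the rounded coefficients quoted in the walkthrough. The dictionary keys and the helper function name are illustrative assumptions, not names from the data set.

```python
# Rounded coefficients as quoted in the walkthrough.
# The key names are illustrative labels, not the data set's variable names.
coef = {
    "eco_price": -0.8,
    "regular_price": 0.72,
    "education": 0.025,
    "household_size": 0.024,
}


def change_in_probability(variable, delta):
    """Linear probability model: change in P(ecobuy = 1) = coefficient * change in x."""
    return coef[variable] * delta


# Fraction of families who say they would buy eco-labeled apples.
fraction_buying = 412 / 660

print(round(fraction_buying, 3))                                # 0.624
print(round(change_in_probability("eco_price", 0.10), 3))       # -0.08
print(round(change_in_probability("regular_price", 0.10), 3))   # 0.072
print(round(change_in_probability("education", 4), 3))          # 0.1
print(round(change_in_probability("household_size", 2), 3))     # 0.048
```

The last line is the two-child comparison: two extra household members times 0.024.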
Other factors equal, the couple with two children has a 0.048 higher probability of buying eco-labeled apples.

We now compare two models: the original model in part two, and a new model where you replace family income with its log value. The first thing we can notice is the R-squared. The R-squared has increased from 0.110 in the original model to 0.112, so the new model fits the data slightly better. The second thing you can notice is the coefficient on family income and on log family income. The coefficient on family income is about 0.001, but for the log of family income it is 0.045. We can interpret this coefficient as follows: if the log of family income increases by 0.1, which means almost a 10% increase in family income, then the probability of buying an eco-labeled product is estimated to increase by about 0.0045, a small effect.

After we estimate the model, we can get the fitted probability, which we denote ecobuy-hat. We will count the number of cases where ecobuy-hat is greater than one, and the number of cases where ecobuy-hat is less than zero. Ecobuy-hat is the estimated probability of buying eco-labeled apples, so it should be between zero and one inclusive. But we have just estimated a linear probability model, and one limitation of this kind of model is that it can produce predicted values that fall outside the range of zero to one. In this part, we're going to see if this model produces strange results. You find that there are two fitted probabilities above one, and there are no cases where the fitted probability is smaller than zero. The number of observations is 660, so given that we have only two out-of-range values, this is not a problem.

For part six, we will define a new binary variable, ecobuy-tilde. Ecobuy-tilde takes a value of one if the fitted value of ecobuy is greater than or equal to 0.5, and a value of zero otherwise.
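The out-of-range check and the 0.5 cutoff can be sketched as follows. The fitted probabilities in the list are made-up values chosen only for illustration; they are not the fitted values from the 660 observations, and the variable names are assumptions.

```python
# Hypothetical fitted probabilities, NOT the actual fitted values from the data.
fitted = [0.35, 0.62, 1.04, 0.50, 0.48, 0.91, 1.01, 0.12]

# Count fitted probabilities outside the [0, 1] range.
above_one = sum(1 for p in fitted if p > 1)
below_zero = sum(1 for p in fitted if p < 0)

# Predicted outcome: 1 if the fitted probability is at least 0.5, else 0.
predicted = [1 if p >= 0.5 else 0 for p in fitted]

print(above_one, below_zero)  # 2 0
print(predicted)              # [0, 1, 1, 1, 0, 1, 1, 0]
```

Note that a fitted value of exactly 0.5 is classified as a predicted purchase, matching the "greater than or equal to" rule.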
That is, ecobuy-tilde is zero if the fitted value of ecobuy is less than 0.5. In the next step, we tabulate the values of ecobuy-tilde against ecobuy, meaning not the fitted value but the actual decision to purchase eco-labeled apples.

This tabulation result is a confusion matrix. We have two rows, predicted not buy and predicted buy; these rows correspond to the two values of ecobuy-tilde. And we have two columns, actual not purchase and actual purchase of eco-labeled products; these two columns relate to the two values of the original binary dependent variable, ecobuy. The first cell is 102, the one next to it is 72, and then we have 146 and then 340. The two highlighted cells, 102 and 340, are the cases where the model predicts correctly. There are 248 cases where people did not purchase eco-labeled apples, and 412 cases where people actually purchased eco-labeled apples. The model correctly predicts 41% of the cases where people did not purchase eco-labeled products and 82.5% of the cases where people purchased eco-labeled products. We can calculate the overall percent correctly predicted by taking the number of cases that the model predicts correctly divided by the number of observations. So we have 102 plus 340, all divided by 660, and we get 67%. The model does pretty well, and it does a much better job predicting the decision to buy eco-labeled apples than the decision not to buy.
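The percent correctly predicted figures can be reproduced from the four cell counts read out above. This is a minimal sketch; the dictionary layout and variable names are assumptions made for illustration.

```python
# Cell counts from the walkthrough, keyed by (predicted, actual).
table = {
    (0, 0): 102,  # predicted not buy, actually did not buy
    (0, 1): 72,   # predicted not buy, actually bought
    (1, 0): 146,  # predicted buy, actually did not buy
    (1, 1): 340,  # predicted buy, actually bought
}

# Column totals: actual outcomes.
actual_not_buy = table[(0, 0)] + table[(1, 0)]
actual_buy = table[(0, 1)] + table[(1, 1)]
total = actual_not_buy + actual_buy

# Percent correctly predicted for each outcome and overall.
pct_correct_not_buy = 100 * table[(0, 0)] / actual_not_buy
pct_correct_buy = 100 * table[(1, 1)] / actual_buy
pct_correct_overall = 100 * (table[(0, 0)] + table[(1, 1)]) / total

print(actual_not_buy, actual_buy, total)  # 248 412 660
print(round(pct_correct_not_buy, 1))      # 41.1
print(round(pct_correct_buy, 1))          # 82.5
print(round(pct_correct_overall, 1))      # 67.0
```

The overall figure is a weighted average of the two outcome-specific rates, which is why it sits closer to 82.5% than to 41%: buyers make up 412 of the 660 observations.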