This problem is still focused on statistical inference in multiple regression. We're going to do a couple of things: drop a variable from the regression and see what happens to the significance of the other variables, and then run a joint significance test.

For the first part, I wrote out the model from Equation 4.31 in your book to get us started. The question asks you to drop the last variable, RBIs per year, and estimate the equation with the rest of the variables in there. You are then asked to look at the statistical significance of home runs per year and at the size of its coefficient.

Underneath the equation I'll write out the coefficients you should get, then the standard errors, and we'll check the significance. Beta-zero, the intercept, should be 11.02. Beta-one (years) should be 0.0677, beta-two (games per year) should be 0.0158, beta-three (batting average) should be 0.0014, and beta-four (home runs per year) should be 0.0359.

Now let's put the standard errors beneath them. The intercept is, as usual, statistically significant at a very high level. Beta-one, with that standard error, is very statistically significant and positive, which makes sense: the more years you have played, the higher your predicted salary. For games per year, the standard error is also much smaller than the coefficient, so playing more games per year, which suggests you are a consistent player, predicts a higher salary. For batting average, the standard error is 0.0011 against a coefficient of 0.0014, so no, it is not statistically significant; we'll come back to that later. Finally, home runs per year: with this standard error, the coefficient is very statistically significant.
So what we can conclude about home runs per year is that it is now very statistically significant: the t statistic is about five, so I'll circle that. You are also asked to look at the size of the coefficient on home runs per year compared to the original estimation that included RBIs per year. Comparing against that initial estimate, you should see that the coefficient on home runs per year is now about 2.5 times larger. To summarize: dropping RBIs per year from the equation made home runs per year very statistically significant and gave it a much larger estimated effect. That's something to keep in mind about why you might want to include relevant variables, and what happens when you drop one, in this case RBIs per year.

Part two asks you to add three more variables to the equation from part one. I'll write the new equation as equation one plus the added variables: runs per year, fielding percentage, and stolen bases per year. You would expect all three to have a positive effect on the log of salary, so we'll see what actually happens. Part two asks which of these three new variables are individually significant. I'll write out the coefficients and standard errors you should get for these three variables. First there's runs per year, with a standard error of 0.0051.
For fielding percentage, you should get a coefficient around this value, and then there's its standard error. Finally, the coefficient on stolen bases per year. I made a little mistake here: this is not supposed to be positive. The coefficient for stolen bases is negative, -0.0064, and this is its standard error. Sorry about that.

So what can we conclude? Going down the line, runs per year is definitely statistically significant: dividing the coefficient by the standard error gives a t statistic of about 3.4. That means that if your runs scored per year increase, your salary is predicted to go up as well, with a high level of significance. Fielding percentage is definitely not statistically significant; the t statistic is about 0.5, so I'll put a little X here to mark it as not significant. Stolen bases is a little closer, but it still does not reach statistical significance. It is interesting that we get a negative coefficient here. We would typically expect that a player who steals more bases during a season earns a higher salary, so that is a slightly strange result.

Finally, part three asks you to test the joint significance of batting average, fielding percentage, and stolen bases per year. Hopefully you have seen a joint significance test already, and if not, here is a good formula to help you get there. To test joint significance, you want to use an F test.
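As a quick check on the individual results in parts one and two, a t statistic is just the coefficient divided by its standard error. The sketch below is illustrative only: the function names and the 1.96 cutoff (the large-sample two-sided 5% critical value) are assumptions that the walkthrough does not state, and the runs-per-year coefficient is backed out from the quoted t statistic and standard error rather than read from regression output.

```python
def t_statistic(coef: float, std_err: float) -> float:
    """Return the t statistic for the null hypothesis that the coefficient is zero."""
    return coef / std_err


def is_significant(t_stat: float, critical_value: float = 1.96) -> bool:
    """Two-sided test; 1.96 is an assumed large-sample 5% critical value."""
    return abs(t_stat) > critical_value


# Runs per year: a standard error of about 0.0051 and a t statistic of
# about 3.4 imply a coefficient of roughly t * se.
runs_se = 0.0051
runs_t = 3.4
implied_runs_coef = runs_t * runs_se

print(round(implied_runs_coef, 4))   # implied coefficient on runs per year
print(is_significant(runs_t))        # runs per year
print(is_significant(0.5))           # fielding percentage, t of about 0.5
```

With these inputs, runs per year clears the cutoff and fielding percentage does not, which matches the conclusions above.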
So we need to create an F statistic, which has the following formula. I'll write it all out and then explain what each component is. It involves running a couple of regressions with different variables in each, and then plugging certain values into the formula.

To test the joint significance of these three variables, it doesn't really matter which regression you run first, but let's start with the unrestricted regression, which includes all three of these variables in addition to the other ones from part two. So the equation from part two of this problem is the unrestricted regression, and we'll label it "ur". After you run that regression, save the sum of squared residuals (SSR), which should be part of the output from your statistical software, and plug that number into the formula as the unrestricted SSR. Then run the restricted regression, which means you do not include these three variables. We'll label that one "r"; save its sum of squared residuals and plug it in as the restricted SSR. The numerator of the F statistic is the restricted SSR minus the unrestricted SSR, divided by q, the number of restrictions, which is three because we're testing the joint significance of three variables. The denominator is the unrestricted SSR divided by n - k - 1, where n is the number of observations and k is the number of independent variables in the unrestricted model, that is, with all three variables of interest included. And that is the formula for the F statistic.
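Written out, the formula described above is the standard F statistic for testing q exclusion restrictions. The symbol SSR for the sum of squared residuals and the subscripts r and ur are notational assumptions chosen to match the restricted and unrestricted labels in the explanation.

```latex
F = \frac{(SSR_{r} - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)}
```

Here q = 3, and n - k - 1 is the degrees of freedom of the unrestricted model.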
So you have to run both regressions, the unrestricted and the restricted, plug all those values into the formula as I explained, and check what pops out. What you should get is an F statistic of about 0.69, which is not statistically significant. I'll write this down: the p-value is about 0.56. So we can conclude that these three variables, batting average, fielding percentage, and stolen bases, taken together are jointly insignificant. That is not too surprising. If you remember, batting average was not individually significant in part one, and neither fielding percentage nor stolen bases was individually significant in part two. So it makes sense that joint significance is not there either. So there you have it. The problem is completed.
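The calculation can be sketched in a few lines of Python. This is a minimal sketch, not the textbook's or any software package's procedure: the function names are hypothetical, and the p-value uses a large-sample shortcut (3 times F is treated as chi-square with 3 degrees of freedom) instead of the exact F distribution, an assumption that is reasonable only when the sample is large.

```python
import math


def f_statistic(ssr_r: float, ssr_ur: float, q: int, n: int, k: int) -> float:
    """F statistic for q exclusion restrictions.

    ssr_r, ssr_ur: sums of squared residuals from the restricted and
    unrestricted regressions; n: number of observations; k: number of
    independent variables in the unrestricted model.
    """
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))


def approx_p_value_q3(f_stat: float) -> float:
    """Large-sample p-value for q = 3 restrictions.

    Assumes 3 * F is approximately chi-square with 3 degrees of freedom,
    whose survival function has the closed form used below.
    """
    x = 3.0 * f_stat
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# F statistic of about 0.69, as quoted in the walkthrough.
p = approx_p_value_q3(0.69)
print(round(p, 2))
```

The approximation gives a p-value of roughly 0.56, in line with the value quoted above. With your own regression output, `f_statistic` takes the two SSR values directly; statistical software will also report the exact p-value from the F distribution.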