Hello, everybody, and welcome to an econometrics tutorial using RStudio. In this video, we're going to be looking into endogenous variables and how you can use instrumental variables to fix that problem. So let's begin.
First of all, you need to download the data package that we're going to be using, which is called wooldridge. Go ahead and open that up, and then open up the data 401ksubs. That is the data we're going to be using, and it is data about retirement funds, age, income and that kind of thing. If you open it up in help, you can see what all the notations mean in the 401ksubs data file.
For this example, we want to see what effect a bunch of these variables have on pira, which is whether or not you have an individual retirement arrangement. We'll be relating it to income, age, and whether or not you have a 401(k), which is a retirement savings plan some firms offer their employees.
We're first going to start with an ordinary least squares model to come up with some inferences about the data. So you're going to want to do a linear regression on this formula here: we regress pira on p401k, income, income squared, age and age squared, and then, of course, an error term. So let's get to that, shall we?
Actually, before we get to that: as you can see in here, quite a few of these are factors. marr is a factor, zero or one; e401k is a factor; male is a factor; that one's not a factor; p401k is a factor, and pira is a factor too. So it is pretty good practice to change these variables into actual factors in R, because right now, as you can see down here, pira is not being recognized as a factor. Neither is male, marr, e401k or p401k.
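The loading and factor-conversion steps described above could look roughly like the sketch below. It assumes the wooldridge package is installed and that the data set is exposed under the name k401ksubs (R object names cannot start with a digit). The object name k401_factors and the decision to convert a copy are illustrative choices, not something shown in the video. pira is left numeric here because lm() expects a numeric response.

```r
# Assumption: install.packages("wooldridge") has already been run.
library(wooldridge)

# Load the 401(k) data; the package names it k401ksubs.
data("k401ksubs")

# Inspect how R currently stores the 0/1 indicator columns.
str(k401ksubs[, c("pira", "p401k", "e401k", "marr", "male")])

# Convert the binary regressors to factors on a copy (hypothetical name),
# leaving the original numeric columns untouched for the regressions.
binary_vars <- c("p401k", "e401k", "marr", "male")
k401_factors <- k401ksubs
k401_factors[binary_vars] <- lapply(k401_factors[binary_vars], factor)

# Check that the conversion took effect.
is.factor(k401_factors$p401k)
```

Working on a copy keeps the original 0/1 columns available, which matters later because the coefficient on p401k is easiest to read when the variable stays numeric.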
So it is good practice to convert binary variables into factors in your data sets: if you have binary variables, go ahead and switch them. Again, you don't technically need to do this. The conclusions you can draw from the model are more or less the same, so it's just good practice.
All right, so now is a good time to run the regression. Let's go ahead and name it model one. Very original. Let's run the model and see what happens. Boom. Creating the factors didn't seem to work. So yes, it is good to create factors, but it's not imperative, and we will skip it here.
So here we have our model. For this example, we're going to be mainly focusing on the effect of p401k on pira, which is right over here. The estimate appears to be about 0.05, or 5 percentage points, and the p-value is very small, so this is really significant. This looks really good.
Also, because both pira and p401k are binary, b1 is actually represented by this equation here: pira-hat-one minus pira-hat-zero. The first is the predicted probability that you have an IRA given that you also have a 401(k) account, and the second is the predicted probability that you have an IRA given that you don't have a 401(k) account. The difference is your b1. In practical terms, that means that, according to this model, a person who has a 401(k) account is 5 percentage points more likely to also have an IRA, everything else being equal.
But you might be asking yourself: is that it? Can we stop there? Is that the effect that p401k has on pira? No. If you think about it for just a couple of seconds, you might realize there are some potential problems here: both of these variables, but mainly this one, since it's supposed to be independent, may actually be endogenous.
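The on-screen formulas are not part of this transcript, so the following is a reconstruction from the spoken description rather than a copy of what the video shows. The coefficient written here as \beta_1 is what the video calls b1; the remaining coefficient labels and the shorthand x for the other regressors are assumptions made for illustration.

```latex
% OLS model (a linear probability model, since pira is 0/1)
\mathit{pira} = \beta_0 + \beta_1\,\mathit{p401k} + \beta_2\,\mathit{inc}
              + \beta_3\,\mathit{inc}^2 + \beta_4\,\mathit{age}
              + \beta_5\,\mathit{age}^2 + u

% Interpretation of the slope on the binary regressor, holding x fixed
\hat{\beta}_1 = \widehat{\mathit{pira}}_1 - \widehat{\mathit{pira}}_0
  = \hat{P}(\mathit{pira}=1 \mid \mathit{p401k}=1,\, x)
  - \hat{P}(\mathit{pira}=1 \mid \mathit{p401k}=0,\, x)
```

Read this way, an estimate of about 0.05 is a difference of about 5 percentage points in the predicted probability of holding an IRA.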
That would mean you're missing some kind of variable that could explain the change in pira and isn't captured by the model. Say somebody doesn't like saving for retirement through the government or through their company. Instead they put their money into houses: they buy a house and then rent it out or sell it later for retirement money. They still have a penchant for saving, but it's not captured in this model. Checking whether the variable is endogenous is always good, so let's do that.
You're going to need a package called AER, so just load that. Once you have it, you can make your IV models. We'll call it model B. It's essentially the same thing, except you also include your instrument, and in our case the data set already has one for us: e401k. So rather than whether they participate in a 401(k) or not, it's whether they're eligible for one or not. Theoretically, you get the same sort of benefit, in that it's related to your endogenous variable, but it ditches the baggage, so to speak.
Once you get to this point, what you want to do to your IV formula is put a vertical bar after it and then repeat the regressors down here, so R knows that you're trying to fit an instrumental variable formula. So you have income, income squared, age and age squared again, and you're basically replacing your p401k here with e401k.
Another important thing: we're going to want to do a couple of tests to see whether this instrument is actually strong and required, so we're going to have R give us the diagnostics.
Remember that in our first formula, b1 was 5% and very significant. With our IV formula, b1 is 2%, and it's not significant at all, really.
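A minimal sketch of the two fits being compared above, assuming the wooldridge and AER packages are installed. The object names model1 and model_b follow the names spoken in the video, but the exact code typed on screen is not in the transcript, so treat the calls below as one plausible version rather than the author's script.

```r
# Assumption: install.packages(c("wooldridge", "AER")) has already been run.
library(wooldridge)  # provides the k401ksubs data
library(AER)         # provides ivreg()

data("k401ksubs")

# Ordinary least squares: pira on participation, income and age terms.
model1 <- lm(pira ~ p401k + inc + incsq + age + agesq, data = k401ksubs)

# Instrumental variables: everything after the vertical bar is the instrument
# set. The exogenous regressors are repeated and p401k is replaced by e401k.
model_b <- ivreg(
  pira ~ p401k + inc + incsq + age + agesq |
    e401k + inc + incsq + age + agesq,
  data = k401ksubs
)

summary(model1)

# diagnostics = TRUE adds the weak instruments and Wu-Hausman tests.
summary(model_b, diagnostics = TRUE)
```

With a single instrument for a single endogenous regressor the model is exactly identified, so the Sargan line in the diagnostics output is not informative here.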
So we can't really draw any conclusions about the relationship between p401k and pira. So is our instrument e401k actually valuable? That's where these tests come into play.
First off, the weak instruments test tells you whether your instrument is actually good or not. The formula that R is using for the weak instruments test is this one. It takes what you think is an endogenous variable and puts it in this formula, with Z being your instrument and sigma being the effect it has. The weak instruments test is basically testing whether sigma is zero or different from zero. If sigma is zero, this whole term is zero and your instrument is meaningless, so it's a weak instrument. But if it's different from zero, you reject the null hypothesis and you actually have a worthwhile instrument. If we go back to R and check the weak instruments test, well, we do reject the null hypothesis. So the instrument e401k is actually worthwhile.
Then the Wu-Hausman test is a generic test to check whether your variable is endogenous at all. The formula R is using for the Hausman test is this one down here. It tests the null hypothesis that the covariance between what you think is your endogenous variable and your error term is zero. That null hypothesis actually means the variable is exogenous, so you're testing whether it's exogenous or not. If you can reject this null hypothesis, then you have an endogenous variable. So let's see how we did. It's significant, so we can reject the null hypothesis here, which means that p401k is indeed endogenous.
Now that you know it's endogenous, you look at your IV results here instead, and, as we went over already, b1 is 2% but not very significant.
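The two hypotheses described above can be written out as follows. The formulas shown in the video are not in the transcript, so this is a reconstruction from the spoken description. The symbol \sigma for the instrument's coefficient follows the narration, while the \pi coefficients, the error term v, and the inclusion of the exogenous regressors in the first stage are assumptions.

```latex
% First stage: the suspected endogenous regressor on the instrument z = e401k
\mathit{p401k} = \pi_0 + \sigma\, z + \pi_1\,\mathit{inc} + \pi_2\,\mathit{inc}^2
               + \pi_3\,\mathit{age} + \pi_4\,\mathit{age}^2 + v

% Weak instruments test
H_0 : \sigma = 0 \qquad \text{vs.} \qquad H_1 : \sigma \neq 0

% Wu-Hausman test: under the null, p401k is exogenous
H_0 : \operatorname{Cov}(\mathit{p401k},\, u) = 0
```

Rejecting the first null says the instrument is relevant. Rejecting the second says OLS and IV disagree by more than chance, which is the evidence of endogeneity used in the video.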
So based on what we've done here, we cannot come to much of a conclusion on whether having a 401(k) account with your employer and having an individual retirement account, or rather an individual retirement arrangement, are related at all.
That was a quick look at endogenous variables and how you can clear up that problem with instrumental variables. Thank you, and I'll see you in the next one. Farewell.