
1. Suppose you have one continuous predictor X and a binary categorical response Y which can take values 1 or 2. Suppose you collected training data from the two classes...

Question


1. Suppose you have one continuous predictor X and a binary categorical response Y which can take values 1 or 2. Suppose you collected training data from the two classes and obtained class-specific sample means $\hat{\mu}_1$ and $\hat{\mu}_2$ along with the pooled variance estimate $\hat{\sigma}^2$ over the two classes. (40 pt total, 8 pt for each question)
(a) Assume equal class priors and derive the LDA classification rule for this problem. Sketch the estimated class-conditional densities and show your decision boundary on the plot. Make sure you label the axes and indicate the numerical value for the boundary.
(b) Suppose the estimates were in fact obtained from 100 training points, among which 40 were from class 1 and 60 were from class 2. Suppose now you estimate the class priors from data, repeat all the calculations in part (a), and obtain a new boundary value. Without actually doing this, would you be able to tell whether the new boundary will be the same as, less than, or greater than the boundary from part (a), or is there no way to tell? Explain your answer without calculating the new boundary. Note: It's OK to recheck your answer once you have actually calculated the new boundary in part (c), but your explanation must not involve its numerical value.
(c) Now calculate the new boundary value described in part (b).
(d) Suppose in addition to the pooled covariance value $\hat{\sigma}^2$ I now tell you the individual class-specific covariances were estimated as $\hat{\sigma}_1^2 = 0.25$ and $\hat{\sigma}_2^2 = 15$. Based on this new information, would you recommend using LDA or QDA, and why?
(e) Derive the QDA rule for part (d), assuming equal class priors.
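As a rough numerical companion to parts (a) through (c), the one-dimensional LDA boundary can be sketched in Python. The function name `lda_boundary` and the example means and variance are hypothetical placeholders, since the question's own numeric estimates are not reproduced above; only the 40/60 class split comes from the question.

```python
import math

def lda_boundary(mu1, mu2, sigma2, prior1=0.5, prior2=0.5):
    """Point where the two LDA discriminant scores are equal (1-D, shared variance).

    Solves x*(mu1 - mu2)/sigma2 - (mu1**2 - mu2**2)/(2*sigma2)
           + log(prior1/prior2) = 0 for x.
    """
    if mu1 == mu2:
        raise ValueError("class means must differ")
    midpoint = (mu1 + mu2) / 2
    shift = sigma2 * math.log(prior2 / prior1) / (mu1 - mu2)
    return midpoint + shift

# Hypothetical estimates, chosen only for illustration (not from the question).
mu1, mu2, sigma2 = 0.0, 2.0, 1.0

equal = lda_boundary(mu1, mu2, sigma2)                            # equal priors
unequal = lda_boundary(mu1, mu2, sigma2, prior1=0.4, prior2=0.6)  # priors from 40/60 split
print(equal, unequal)
```

Under these assumed values, equal priors put the boundary at the midpoint of the two means, and the estimated priors move it toward the mean of the less frequent class.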



Answers

To complete this exercise you need a software package that allows you to generate data from the uniform and normal distributions.
(i) Start by generating 500 observations on $x_i$, the explanatory variable, from the uniform distribution with range $[0,10]$. (Most statistical packages have a command for the Uniform$(0,1)$ distribution; just multiply those observations by 10.) What are the sample mean and sample standard deviation of the $x_i$?
(ii) Randomly generate 500 errors, $u_i$, from the Normal$(0,36)$ distribution. (If you generate a Normal$(0,1)$, as is commonly available, simply multiply the outcomes by six.) Is the sample average of the $u_i$ exactly zero? Why or why not? What is the sample standard deviation of the $u_i$?
(iii) Now generate the $y_i$ as
$y_i = 1 + 2x_i + u_i \equiv \beta_0 + \beta_1 x_i + u_i$;
that is, the population intercept is one and the population slope is two. Use the data to run the regression of $y_i$ on $x_i$. What are your estimates of the intercept and slope? Are they equal to the population values in the above equation? Explain.
(iv) Obtain the OLS residuals, $\hat{u}_i$, and verify that equation (2.60) holds (subject to rounding error).
(v) Compute the same quantities in equation (2.60) but use the errors $u_i$ in place of the residuals. Now what do you conclude?
(vi) Repeat parts (i), (ii), and (iii) with a new sample of data, starting with generating the $x_i$. Now what do you obtain for $\hat{\beta}_0$ and $\hat{\beta}_1$? Why are these different from what you obtained in part (iii)?
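One way to carry out steps (i) through (v) is sketched below using only Python's standard library. The seed is an arbitrary choice, and the sketch assumes that equation (2.60) refers to the OLS first-order conditions (the residuals sum to zero and are uncorrelated with $x_i$ in the sample); that reading is not confirmed by the text above.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, used only for reproducibility
n = 500

x = [10 * random.random() for _ in range(n)]      # Uniform[0, 10]
u = [6 * random.gauss(0, 1) for _ in range(n)]    # Normal(0, 36), i.e. sd = 6
y = [1 + 2 * xi + ui for xi, ui in zip(x, u)]

# OLS slope and intercept from the usual closed-form expressions
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

print("x mean, sd:", x_bar, statistics.stdev(x))
print("u mean, sd:", statistics.mean(u), statistics.stdev(u))
print("intercept, slope:", b0, b1)
# Assumed content of (2.60): both sums are zero up to rounding for residuals...
print("sum resid, sum x*resid:", sum(resid), sum(xi * ri for xi, ri in zip(x, resid)))
# ...but generally not for the true errors, which OLS does not constrain.
print("sum u, sum x*u:", sum(u), sum(xi * ui for xi, ui in zip(x, u)))
```

Rerunning with a different seed corresponds to part (vi): the estimates change because they depend on the particular random sample.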

Part one. The pooled OLS estimate of $\beta_1$ is 0.360. If the change in concentration is 0.10, the change in the log of airfare is $\hat{\beta}_1$ times the change in concentration, which is 0.360 times 0.10, or 0.036. That implies airfare is estimated to be about 3.6% higher.
Part two. The 95% confidence interval obtained using the usual OLS standard error is 0.301 to 0.419. If we use the fully robust standard errors, we get 0.245 to 0.475, which is wider than the one above. The wider confidence interval is appropriate because the neglected serial correlation introduces uncertainty into our parameter estimation.
Part three. The quadratic has a U shape, and the turning point is found by taking the partial derivative of the log of airfare with respect to the log of distance and setting that derivative equal to zero. This gives the value of the log of distance at which the slope becomes positive. At the turning point, the log of distance is 0.902 divided by (2 times 0.103), which is about 4.38. Converting back, the distance is exp(4.38), about 80 miles. The shortest distance in the sample is 95 miles, so the turning point is outside the range of the data, which is a good thing in this case. What is being captured is an increasing elasticity of fare with respect to distance as fare increases.
Part four. The random effects estimate of $\beta_1$ is 0.209, which is a bit smaller than the pooled OLS estimate. This estimate still implies a positive relationship between fare and concentration. The estimate is also very significant, with a t statistic of 7.88.
Part five. The fixed effects estimate of $\beta_1$ is 0.169, which is lower but not so different from the random effects estimate. This is because the value of the parameter in equation 14.11, call it $\hat{\theta}$, is about 0.9, so the random effects and fixed effects estimates are fairly similar. Remember, random effects uses a quasi-demeaning that depends on the estimate of this $\theta$, as given in equation 14.11.
Part six. The unobserved effect $a_i$ could capture two types of factors that might be correlated with the concentration variable. First, there could be factors about the cities near the two airports, for example population, education level, and types of employers. These factors could affect the demand for air travel. The second set could be factors related to geographical features and infrastructure conditions, such as highway quality and whether the city is located near a river. These factors can change over time, but over a short period, such as the length of the study in this sample, they are roughly time-constant, and so they can be captured by $a_i$. There are various factors like that, and it is better if we are able to control for them. So in part seven, it is more appropriate to choose the fixed effects estimate.
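The arithmetic in parts one and three can be checked with a short Python sketch. The variable names are hypothetical, and the sign of the coefficient on the log of distance is assumed to be negative (with a positive coefficient on its square), which is what a U shape requires; the transcript only gives the magnitudes.

```python
import math

# Part one: effect of a 0.10 change in concentration on log(fare)
beta1_pooled = 0.360
pct_change = 100 * beta1_pooled * 0.10   # approximate percentage change in fare

# Part three: turning point of b_l*log(dist) + b_sq*log(dist)**2
b_l = -0.902    # assumed negative sign (U shape)
b_sq = 0.103
turn_log = -b_l / (2 * b_sq)             # where the derivative b_l + 2*b_sq*log(dist) is zero
turn_miles = math.exp(turn_log)

print(round(pct_change, 1), round(turn_log, 2), round(turn_miles))
```

Because the turning point in miles falls below the 95-mile minimum mentioned in the transcript, the fitted relationship is increasing over the whole sample range.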

We are given the data points (x, y) listed at the top of this whiteboard, and we want to use that information to answer the following six questions, A through F. First, in part A on the left, we want to produce a scatter plot of the data. We've already done so, with the scatter plot provided right below and the data points marked with crosses. Next, we want to compute the sums and the correlation coefficient r. On the right I've listed the sums; they are computed simply by following the formulas: the sum of all x values, the sum of all y values, and so on. The correlation coefficient r is given by this formula, which makes use of the sample size n and the sums we just computed. Plugging in these values, we get r = 0.998. Next, in part C, we want to find the x mean, the y mean, and the constants for the equation of the line of best fit. $\bar{x}$ and $\bar{y}$ are simply given as follows. The values b and a are given by the following formulas; b takes as input n and the sums, very much like the correlation coefficient r. Plugging in, we get b = 4.509, and plugging $\bar{y}$, b, and $\bar{x}$ into the formula for a gives 33.696. Thus we have our equation for the line of best fit: $\hat{y} = 33.696 + 4.509x$. Next, we want to plot this $\hat{y}$ onto our scatter plot, making sure to include $\bar{x}$ and $\bar{y}$. We do so, as is observable here. Next, let's calculate $R^2$ and interpret it. $R^2$ is simply 0.9954. That means approximately 99.54% of the variation in y can be explained by the least-squares line, and roughly half a percent cannot be explained. Finally, for F, we want to predict y for x = 12 using our $\hat{y}$ equation. Doing so, we obtain 87.804.
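The sum-based formulas the transcript refers to can be sketched as follows. The helper name `fit_from_sums` and the four data points are hypothetical (the whiteboard data are not reproduced in the transcript); the points were chosen to lie exactly on y = 1 + 2x so the result is easy to verify by hand.

```python
import math

def fit_from_sums(xs, ys):
    """Intercept a, slope b, and correlation r computed from the raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n          # a = y_bar - b * x_bar
    r = (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )
    return a, b, r

# Hypothetical points lying exactly on y = 1 + 2x
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b, r = fit_from_sums(xs, ys)
print(a, b, r)
```

The slope and the correlation share the same numerator, which is why the transcript describes the two formulas as very similar.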

This is the result for Part one. We use the full sample, which has 177 observations. From this estimation, we obtain the studentized residuals, which we call $str_i$. The number of studentized residuals above 1.96 in absolute value is nine. If the studentized residuals were independently drawn from a standard normal distribution, we would expect about five percent of our sample to exceed that cutoff: 177 times 5% is 8.85, so we would expect between eight and nine studentized residuals to be above two in absolute value. This is because in a standard normal distribution, where the standard deviation is one, about 95% of the observations are within roughly two standard deviations of the mean. Equivalently, about 5% of the observations are either above two or below minus two. That is why we have the 5% number here. You can check this statement, and you will find that there are eight observations with studentized residuals above two in absolute value.
For Part three, the studentized residuals are used to detect outliers. We drop the outliers, defined as observations with studentized residuals above 1.96 in absolute value, so we drop the cases found in Part two and re-estimate the model from Part one using 169 observations. This is the result. Compared with the regression in Part one, we find that the main coefficients become significant at the 1% level. Let me come back to Part one: the first one, log of sales, is significant at the 1% level, which I denote with three stars. Log of mktval is significant at the 5% level, so two stars. ceoten is significant at the 1% level, and ceoten squared is significant at the 5% level.
Back to Part three: log of sales is still significant at the 1% level. Log of mktval was significant at the 5% level before; now it is significant at the 1% level. Nothing changed in terms of significance level for ceoten or for ceoten squared, which is still significant at the 5% level. So the estimates on log of sales and ceoten have the same level of significance; the exact values are different, but not by enough to change their significance level. The estimate on log of mktval increases in magnitude and significance level. You may also notice that the magnitudes of the estimates on sales and ceoten decrease, but not by much, and the coefficient on ceoten squared does not change in magnitude.
Now we use least absolute deviations (LAD) to estimate the regression from Part one again; this is Part four. We reuse all the data. Here is the result. LAD is estimated with a different methodology, so it doesn't have an R-squared. It is estimated by maximum likelihood, so to measure the fit of the model you need to look at the results generated by the statistical software and at the log-likelihood value. I do not report that here because in this problem we care about the estimates on the explanatory variables, not the fit of the model. Comparing this regression with the previous regressions, where we used OLS, we see that $\hat{\beta}_1$, the coefficient on log of sales, is closer to that of the restricted sample, which is the regression in Part three where we dropped the outlying observations. We don't have the same finding for $\hat{\beta}_3$: the coefficient on ceoten is actually closer to the estimate in the full sample.
Given these results, in Part five we can evaluate this statement: "Dropping outliers based on extreme values of studentized residuals makes the resulting OLS estimates closer to the LAD estimates on the full sample." This statement is not always true. It is not true for every estimate.
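The expected number of large studentized residuals in Part two can be checked with a short Python sketch. The helper name `two_sided_tail` is hypothetical, and the calculation relies on the same assumption the transcript makes, namely independent standard normal draws.

```python
import math

def two_sided_tail(z):
    """P(|Z| > z) for a standard normal Z, using the error function."""
    return 1 - math.erf(z / math.sqrt(2))

n = 177                      # full-sample size from Part one
p = two_sided_tail(1.96)     # close to 0.05
expected = n * p             # expected count above 1.96 in absolute value

print(round(p, 4), round(expected, 2))
```

The expected count sits between eight and nine, which is the range the transcript compares the observed count against.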

We want to use the sample of data points (x, y) listed at the top of this whiteboard to answer the following questions, A through F. We go through them one by one. To start, on the left for part A, we want to produce a scatter plot of these data points. You'll see that I already included a scatter plot right below, where the data points are marked by black crosses. Next, to the right for part B, we want to compute the relevant sums and the correlation coefficient r. I've already included the values of the sums here; they are determined simply by following the formulas exactly. So sum x is the sum of the individual x values, sum x squared is the sum of the individual x values squared, and so on. r is given by the formula that follows. It takes as input the sample size n and the sums we just computed. Plugging in, we get r = 0.8351. Next, we want to find the equation of the line of best fit for the sample data. To do so we have to find the means of our x and y values as well as the parameters of the best-fit line. First, the x and y means are the sums of the values over n: $\bar{x} = 16.65$, and likewise for y, $\bar{y} = 80$. Next, we find the parameters of the best-fit line. The slope b is given by the equation here, which is very similar to our equation for r. It takes as input the sample size n and most of the sums we just computed. Plugging in, we get b = 3.291 for the slope, and plugging b, $\bar{y}$, and $\bar{x}$ into the equation on the right gives the intercept 25.232, meaning we have the line of best fit $\hat{y} = 25.232 + 3.291x$. Next, we return to the scatter plot to complete part D, which is plotting $\hat{y}$. We make sure that our plot of $\hat{y}$ includes $\bar{x}$ and $\bar{y}$, as we've done here. Next, in part E at the bottom right, let's calculate the coefficient of determination $R^2$, which is simply the square of the correlation coefficient, and interpret it. $R^2$ equals 0.6974. We interpret this to mean that roughly 70% of the variation in y can be explained by the corresponding variation in x and the least-squares line. Accordingly, about 30% of the variation in our data remains unexplained. Finally, at the bottom, we want to predict y for x = 19. Plugging into $\hat{y}$ simply gives y = 87.761.
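The last two steps of this answer are plain arithmetic and can be reproduced in a few lines of Python, using the rounded values quoted in the transcript (the variable names are hypothetical).

```python
r = 0.8351               # correlation coefficient from part B
a, b = 25.232, 3.291     # intercept and slope of the fitted line

r_squared = r ** 2       # coefficient of determination for simple regression
y_hat_19 = a + b * 19    # predicted y at x = 19

print(round(r_squared, 4), round(y_hat_19, 3))
```

Squaring r is enough here only because there is a single predictor; with several predictors $R^2$ would need to be computed from the fitted values.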

