Linear regression and OLS: summary notes

OLS in Matrix Form

The True Model

• Let X be an n × k matrix in which we have observations on k independent variables for n observations. Since our model will usually contain a constant term, one of the columns in the X matrix will contain only ones. This column should be treated exactly the same as any other column in the X matrix.
• Let y be an n × 1 vector of observations on the dependent variable.
• Let ε be an n × 1 vector of disturbances or errors.
• Let β be a k × 1 vector of unknown population parameters that we want to estimate.

Our statistical model will essentially look something like the following:

$$
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}}_{n \times 1}
=
\underbrace{\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k} \\
x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}}_{n \times k}
\underbrace{\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}}_{k \times 1}
+
\underbrace{\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}}_{n \times 1}
$$

This can be rewritten more simply as:

$$
y = X\beta + \varepsilon
$$

This is assumed to be an accurate reflection of the real world. The model has a systematic component (Xβ) and a stochastic component (ε). Our goal is to obtain estimates of the population parameters in the β vector.

Criteria for Estimates

Our estimates of the population parameters are referred to as β̂. Recall that the criterion we use for obtaining our estimates is to find the estimator β̂ that minimizes the sum of squared residuals (∑ eᵢ² in scalar notation). Why this criterion? Where does it come from?

The vector of residuals e is given by:

$$
e = y - X\hat{\beta}
$$

Make sure that you are always careful about distinguishing between disturbances (ε), which refer to things that cannot be observed, and residuals (e), which can be observed. It is important to remember that ε ≠ e.

The sum of squared residuals (RSS) is e′e:

$$
e'e =
\underbrace{\begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}}_{1 \times n}
\underbrace{\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}}_{n \times 1}
= e_1^2 + e_2^2 + \cdots + e_n^2
$$

(Note that the scalar e′e is very different from ee′, the variance-covariance matrix of residuals.)

It should be obvious that we can write the sum of squared residuals as:

$$
\begin{aligned}
e'e &= (y - X\hat{\beta})'(y - X\hat{\beta}) \\
    &= y'y - \hat{\beta}'X'y - y'X\hat{\beta} + \hat{\beta}'X'X\hat{\beta} \\
    &= y'y - 2\hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta}
\end{aligned}
$$

where this development uses the fact that the transpose of a scalar is the scalar, i.e. y′Xβ̂ = (y′Xβ̂)′ = β̂′X′y.

To find the β̂ that minimizes the sum of squared residuals, we need to take the derivative of this expression with respect to β̂ and set it equal to zero:

$$
\frac{\partial e'e}{\partial \hat{\beta}} = -2X'y + 2X'X\hat{\beta} = 0
$$

To check that this is a minimum, we would take the derivative with respect to β̂ again; this gives us 2X′X. It is easy to see that, so long as X has full rank, this is a positive definite matrix (analogous to a positive real number) and hence a minimum.

(Here is a brief overview of matrix differentiation:

$$
\frac{\partial a'b}{\partial b} = \frac{\partial b'a}{\partial b} = a
$$

when a and b are K × 1 vectors, and

$$
\frac{\partial b'Ab}{\partial b} = 2Ab = 2b'A
$$

when A is any symmetric matrix. Note that you can write the derivative as either 2Ab or 2b′A. Applied here,

$$
\frac{\partial \hat{\beta}'X'y}{\partial \hat{\beta}} = \frac{\partial \hat{\beta}'(X'y)}{\partial \hat{\beta}} = X'y
\qquad \text{and} \qquad
\frac{\partial \hat{\beta}'X'X\hat{\beta}}{\partial \hat{\beta}} = \frac{\partial \hat{\beta}'A\hat{\beta}}{\partial \hat{\beta}} = 2A\hat{\beta} = 2X'X\hat{\beta}
$$

since X′X is a symmetric K × K matrix. For more information, see Greene and Gujarati.)

From the first-order condition we get what are called the 'normal equations':

$$
(X'X)\hat{\beta} = X'y
$$

Two things to note about the (X′X) matrix. First, it is always square since it is k × k. Second, it is always symmetric.

Recall that (X′X) and X′y are known from our data but β̂ is unknown. If the inverse of (X′X) exists, i.e. (X′X)⁻¹, then pre-multiplying both sides by this inverse gives us the following equation:

$$
(X'X)^{-1}(X'X)\hat{\beta} = (X'X)^{-1}X'y
$$

We know that by definition (X′X)⁻¹(X′X) = I, where I in this case is a k × k identity matrix. This gives us:

$$
I\hat{\beta} = (X'X)^{-1}X'y
\quad\Longrightarrow\quad
\hat{\beta} = (X'X)^{-1}X'y
$$

(The inverse of (X′X) may not exist. If this is the case, the matrix is called non-invertible or singular and is said to be of less than full rank. There are two possible reasons why this matrix might be non-invertible. One, based on a trivial theorem about rank, is that n < k, i.e. we have more independent variables than observations. This is unlikely to be a problem for us in practice. The other is that one or more of the independent variables are a linear combination of the other variables, i.e. perfect multicollinearity.)

Note that we have not had to make any assumptions to get this far! Since the OLS estimators in the β̂ vector are a linear combination of existing random variables (X and y), they themselves are random variables with certain straightforward properties.

Properties of the OLS Estimators

The primary property of OLS estimators is that they satisfy the criterion of minimizing the sum of squared residuals. However, there are other properties. These properties do not depend on any assumptions: they will always be true so long as we compute the estimates in the manner just shown.

Recall the normal equations from above:

$$
(X'X)\hat{\beta} = X'y
$$

Now substitute in y = Xβ̂ + e to get:

$$
\begin{aligned}
(X'X)\hat{\beta} &= X'(X\hat{\beta} + e) \\
(X'X)\hat{\beta} &= (X'X)\hat{\beta} + X'e \\
X'e &= 0
\end{aligned}
$$
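Before unpacking what X′e looks like, here is a minimal numerical sketch of the estimator just derived. It is not part of the original handout: it assumes NumPy, and the simulated data, seed, and variable names are illustrative choices. It computes β̂ from the normal equations and checks that X′e is numerically zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative): n observations, k regressors, first column a constant.
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 2.0, -0.5])   # population parameters (chosen arbitrarily)
eps = rng.normal(size=n)                 # disturbances: never observed in practice
y = X @ beta_true + eps                  # y = X beta + eps

# Normal equations: (X'X) beta_hat = X'y, solved without forming the inverse explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals e = y - X beta_hat; the algebra above implies X'e = 0.
e = y - X @ beta_hat
print("beta_hat:", beta_hat)
print("X'e (should be numerically zero):", X.T @ e)
```

If (X′X) were exactly singular (the rank-deficient case noted above), np.linalg.solve would raise an error; np.linalg.lstsq(X, y, rcond=None) would still return a minimum-norm least-squares solution even though (X′X)⁻¹ does not exist.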
What does X′e look like?

$$
X'e =
\begin{bmatrix}
x_{11} & x_{21} & \cdots & x_{n1} \\
x_{12} & x_{22} & \cdots & x_{n2} \\
\vdots & \vdots &        & \vdots \\
x_{1k} & x_{2k} & \cdots & x_{nk}
\end{bmatrix}
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
=
\begin{bmatrix}
x_{11}e_1 + x_{21}e_2 + \cdots + x_{n1}e_n \\
x_{12}e_1 + x_{22}e_2 + \cdots + x_{n2}e_n \\
\vdots \\
x_{1k}e_1 + x_{2k}e_2 + \cdots + x_{nk}e_n
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
$$

From X′e = 0, we can derive a number of properties.

1. The observed values of X are uncorrelated with the residuals. X′e = 0 implies that for every column xⱼ of X, xⱼ′e = 0. In other words, each regressor has zero sample correlation with the residuals. Note that this does not mean that X is uncorrelated with the disturbances; we will have to assume this.

If our regression includes a constant, then the following properties also hold.

2. The sum of the residuals is zero. If there is a constant, then the first column of X will be a column of ones. This means that for the first element of the X′e vector (i.e. x₁₁e₁ + x₂₁e₂ + ⋯ + xₙ₁eₙ) to be zero, it must be the case that ∑ eᵢ = 0.

3. The sample mean of the residuals is zero. This follows straightforwardly from the previous property, i.e. ē = ∑ eᵢ / n = 0.

4. The regression hyperplane passes through the means of the observed values (x̄ and ȳ). This follows from the fact that ē = 0. Recall that e = y − Xβ̂. Dividing by the number of observations, we get ē = ȳ − x̄β̂ = 0. This implies that ȳ = x̄β̂, which shows that the regression hyperplane goes through the point of means of the data.

5. The predicted values of y are uncorrelated with the residuals. The predicted values of y are equal to Xβ̂, i.e. ŷ = Xβ̂. From this we have

$$
\hat{y}'e = (X\hat{\beta})'e = \hat{\beta}'X'e = 0
$$

This last development takes account of the fact that X′e = 0.

6. The mean of the predicted y's for the sample will equal the mean of the observed y's, i.e. the average of ŷ equals the average of y.

These properties always hold true. You should be careful not to infer anything from the residuals about the disturbances. For example, you cannot infer that the sum of the disturbances is zero or that the mean of the disturbances is zero just because this is true of the residuals; it is true of the residuals only because we decided to minimize the sum of squared residuals.

Note that we know nothing about β̂ except that it satisfies all of the properties discussed above. We need to make some assumptions about the true model in order to make any inferences regarding β (the true population parameters) from β̂ (our estimator of the true parameters). Recall that β̂ comes from our sample, but we want to learn about the true parameters.

The Gauss-Markov Assumptions

1. y = Xβ + ε.
This assumption states that there is a linear relationship between y and X.

2. X is an n × k matrix of full rank.
This assumption states that there is no perfect multicollinearity. In other words, the columns of X are linearly independent. This assumption is known as the identification condition.

3. E[ε | X] = 0, i.e.

$$
E[\varepsilon \mid X] =
\begin{bmatrix} E[\varepsilon_1 \mid X] \\ E[\varepsilon_2 \mid X] \\ \vdots \\ E[\varepsilon_n \mid X] \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
$$

This assumption, the zero conditional mean assumption, states that the disturbances average out to 0 for any value of X. Put differently, no observations of the independent variables convey any information about the expected value of the disturbance. The assumption implies that E(y) = Xβ. This is important since it essentially says that we get the mean function right.

4. E(εε′ | X) = σ²I.
This captures the familiar assumptions of homoskedasticity and no autocorrelation. To see why, start with the following:

$$
E(\varepsilon\varepsilon' \mid X) =
E\!\left[
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
\begin{bmatrix} \varepsilon_1 & \varepsilon_2 & \cdots & \varepsilon_n \end{bmatrix}
\,\middle|\, X \right]
$$

which is the same as:

$$
E(\varepsilon\varepsilon' \mid X) =
E\!\left[
\begin{bmatrix}
\varepsilon_1^2            & \varepsilon_1\varepsilon_2 & \cdots & \varepsilon_1\varepsilon_n \\
\varepsilon_2\varepsilon_1 & \varepsilon_2^2            & \cdots & \varepsilon_2\varepsilon_n \\
\vdots                     & \vdots                     & \ddots & \vdots \\
\varepsilon_n\varepsilon_1 & \varepsilon_n\varepsilon_2 & \cdots & \varepsilon_n^2
\end{bmatrix}
\,\middle|\, X \right]
$$

which is the same as:

$$
E(\varepsilon\varepsilon' \mid X) =
\begin{bmatrix}
E[\varepsilon_1^2 \mid X]            & E[\varepsilon_1\varepsilon_2 \mid X] & \cdots & E[\varepsilon_1\varepsilon_n \mid X] \\
E[\varepsilon_2\varepsilon_1 \mid X] & E[\varepsilon_2^2 \mid X]            & \cdots & E[\varepsilon_2\varepsilon_n \mid X] \\
\vdots                               & \vdots                               & \ddots & \vdots \\
E[\varepsilon_n\varepsilon_1 \mid X] & E[\varepsilon_n\varepsilon_2 \mid X] & \cdots & E[\varepsilon_n^2 \mid X]
\end{bmatrix}
$$

The assumption of homoskedasticity states that the variance of εᵢ is the same (σ²) for all i, i.e. var(εᵢ | X) = σ² for all i. The assumption of no autocorrelation (uncorrelated errors) means that cov(εᵢ, εⱼ | X) = 0 for all i ≠ j, i.e. knowing something about the disturbance term for one observation tells us nothing about the disturbance term for any other observation. With these assumptions, we have:

$$
E(\varepsilon\varepsilon' \mid X) =
\begin{bmatrix}
\sigma^2 & 0        & \cdots & 0 \\
0        & \sigma^2 & \cdots & 0 \\
\vdots   & \vdots   & \ddots & \vdots \\
0        & 0        & \cdots & \sigma^2
\end{bmatrix}
$$

Finally, this can be rewritten as:

$$
E(\varepsilon\varepsilon' \mid X) = \sigma^2 I
$$
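A short simulation can make the σ²I structure concrete. This sketch is not from the handout: it assumes NumPy, and the choices of n, σ, and the number of replications are arbitrary. It averages the outer products εε′ over many draws of disturbances that satisfy Assumption 4 and shows that the result is close to σ²I.

```python
import numpy as np

rng = np.random.default_rng(1)

n, sigma, reps = 4, 2.0, 200_000   # small n so the n x n matrix is easy to read

# Draw many independent disturbance vectors eps ~ N(0, sigma^2 I) and average
# the outer products eps eps' to approximate E[eps eps'].
eps = rng.normal(scale=sigma, size=(reps, n))
E_outer = eps.T @ eps / reps       # (1/reps) * sum over replications of eps eps'

print(np.round(E_outer, 2))
# The main diagonal is close to sigma^2 = 4 (homoskedasticity);
# the off-diagonal entries are close to 0 (no autocorrelation).
```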
Disturbances that meet the two assumptions of homoskedasticity and no autocorrelation are referred to as spherical disturbances. We can compactly write the Gauss-Markov assumption about the disturbances as:

$$
\Omega = \sigma^2 I
$$

where Ω is the variance-covariance matrix of the disturbances, i.e. Ω = E[εε′].

5. X may be fixed or random, but must be generated by a mechanism that is unrelated to ε.

6. ε | X ∼ N(0, σ²I).
This assumption is not actually required for the Gauss-Markov Theorem. However, we often assume it to make hypothesis testing easier. The Central Limit Theorem is typically invoked to justify this assumption.

The Gauss-Markov Theorem

The Gauss-Markov Theorem states that, conditional on the assumptions above, there will be no other linear and unbiased estimator of the β coefficients that has a smaller sampling variance. In other words, the OLS estimator is the Best Linear, Unbiased and Efficient estimator (BLUE). How do we know this?

Proof that β̂ is an unbiased estimator of β. We know from earlier that β̂ = (X′X)⁻¹X′y and that y = Xβ + ε. This means that

$$
\begin{aligned}
\hat{\beta} &= (X'X)^{-1}X'(X\beta + \varepsilon) \\
\hat{\beta} &= \beta + (X'X)^{-1}X'\varepsilon
\end{aligned}
$$

since (X′X)⁻¹X′X = I. This shows immediately that OLS is unbiased so long as either (i) X is fixed (non-stochastic), so that we have:

$$
E[\hat{\beta}] = E[\beta] + E[(X'X)^{-1}X'\varepsilon] = \beta + (X'X)^{-1}X'E[\varepsilon]
$$

where E[ε] = 0 by assumption, or (ii) X is stochastic but independent of ε, so that we have:

$$
E[\hat{\beta}] = E[\beta] + E[(X'X)^{-1}X'\varepsilon] = \beta + (X'X)^{-1}E[X'\varepsilon]
$$

where E(X′ε) = 0.

Proof that β̂ is a linear estimator of β. From the expression above, we have:

$$
\hat{\beta} = \beta + (X'X)^{-1}X'\varepsilon
$$

Since we can write β̂ = β + Aε where A = (X′X)⁻¹X′, we can see that β̂ is a linear function of the disturbances. By the definition that we use, this makes it a linear estimator. (See Greene.)

Proof that β̂ has minimal variance among all linear and unbiased estimators: see Greene.

The Variance-Covariance Matrix of the OLS Estimates

We can derive the variance-covariance matrix of the OLS estimator β̂:

$$
\begin{aligned}
E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] &= E\!\left[\big((X'X)^{-1}X'\varepsilon\big)\big((X'X)^{-1}X'\varepsilon\big)'\right] \\
&= E\!\left[(X'X)^{-1}X'\varepsilon\varepsilon'X(X'X)^{-1}\right]
\end{aligned}
$$

where we take advantage of the fact that (AB)′ = B′A′, i.e. we can rewrite ((X′X)⁻¹X′ε)′ as ε′X(X′X)⁻¹. If we assume that X is non-stochastic, we get:

$$
E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] = (X'X)^{-1}X'E[\varepsilon\varepsilon']X(X'X)^{-1}
$$

From the Gauss-Markov assumptions, E[εε′] = σ²I. Thus, we have:

$$
\begin{aligned}
E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] &= (X'X)^{-1}X'(\sigma^2 I)X(X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1}X'X(X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1}
\end{aligned}
$$

We estimate σ² with σ̂², where:

$$
\hat{\sigma}^2 = \frac{e'e}{n - k}
$$

To see the derivation of this, see Greene.

What does the variance-covariance matrix of the OLS estimator look like?

$$
E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] =
\begin{bmatrix}
\mathrm{var}(\hat{\beta}_1)              & \mathrm{cov}(\hat{\beta}_1, \hat{\beta}_2) & \cdots & \mathrm{cov}(\hat{\beta}_1, \hat{\beta}_k) \\
\mathrm{cov}(\hat{\beta}_2, \hat{\beta}_1) & \mathrm{var}(\hat{\beta}_2)              & \cdots & \mathrm{cov}(\hat{\beta}_2, \hat{\beta}_k) \\
\vdots                                   & \vdots                                    & \ddots & \vdots \\
\mathrm{cov}(\hat{\beta}_k, \hat{\beta}_1) & \mathrm{cov}(\hat{\beta}_k, \hat{\beta}_2) & \cdots & \mathrm{var}(\hat{\beta}_k)
\end{bmatrix}
$$

As you can see, the standard errors of the β̂ are given by the square roots of the elements along the main diagonal of this matrix.

Hypothesis Testing

Recall Assumption 6 from earlier, which stated that ε | X ∼ N(0, σ²I). I had stated that this assumption was not necessary for the Gauss-Markov Theorem but was crucial for testing inferences about β̂. Why? Without this assumption, we know nothing about the distribution of β̂. How does this assumption about the distribution of the disturbances tell us anything about the distribution of β̂? Well, we just saw that the OLS estimator is just a linear function of the disturbances. By assuming that the disturbances have a multivariate normal distribution, i.e.

$$
\varepsilon \sim N(0, \sigma^2 I)
$$

we are also saying that the OLS estimator is distributed multivariate normal, i.e.

$$
\hat{\beta} \sim N\big(\beta, \sigma^2 (X'X)^{-1}\big)
$$

where the mean is β and the variance is σ²(X′X)⁻¹. It is this that allows us to conduct the normal hypothesis tests that we are familiar with.
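The following sketch ties the last two sections together: it computes σ̂² = e′e/(n − k), the estimated variance-covariance matrix σ̂²(X′X)⁻¹, the standard errors from its main diagonal, and the usual t statistics. It is illustrative only, assumes NumPy, and the simulated data and variable names are not from the handout.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative).
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e = y - X @ beta_hat

# sigma^2 is estimated by e'e / (n - k).
sigma2_hat = (e @ e) / (n - k)

# Estimated variance-covariance matrix of beta_hat and its standard errors.
vcov = sigma2_hat * XtX_inv
se = np.sqrt(np.diag(vcov))

# Under Assumption 6, beta_hat_j / se_j is the usual t statistic for H0: beta_j = 0.
print("beta_hat:    ", beta_hat)
print("std. errors: ", se)
print("t statistics:", beta_hat / se)
```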
Robust (Huber or White) Standard Errors

Recall from the variance derivation above that we have:

$$
\text{var-cov}(\hat{\beta}) = (X'X)^{-1}X'E[\varepsilon\varepsilon']X(X'X)^{-1} = (X'X)^{-1}(X'\Omega X)(X'X)^{-1}
$$

This helps us to make sense of White's heteroskedasticity-consistent standard errors. (As we'll see later in the semester, it also helps us make sense of Beck and Katz's panel-corrected standard errors.)

Recall that heteroskedasticity does not cause problems for estimating the coefficients; it only causes problems for getting the 'correct' standard errors. We can compute β̂ without making any assumptions about the disturbances, i.e. β̂_OLS = (X′X)⁻¹X′y. However, to get the results of the Gauss-Markov Theorem (things like E[β̂] = β, etc.) and to be able to conduct hypothesis tests (β̂ ∼ N(β, σ²(X′X)⁻¹)), we need to make assumptions about the disturbances. One of these assumptions is that E(εε′) = σ²I. This assumption includes the assumption of homoskedasticity, var(εᵢ | X) = σ² for all i. However, it is not always the case that the variance will be the same for all observations, i.e. we may have σᵢ² instead of σ². Basically, there may be many reasons why we are better at predicting some observations than others.

Recall the variance-covariance matrix of the disturbance terms from earlier:

$$
E(\varepsilon\varepsilon' \mid X) = \Omega =
\begin{bmatrix}
E[\varepsilon_1^2 \mid X]            & E[\varepsilon_1\varepsilon_2 \mid X] & \cdots & E[\varepsilon_1\varepsilon_n \mid X] \\
E[\varepsilon_2\varepsilon_1 \mid X] & E[\varepsilon_2^2 \mid X]            & \cdots & E[\varepsilon_2\varepsilon_n \mid X] \\
\vdots                               & \vdots                               & \ddots & \vdots \\
E[\varepsilon_n\varepsilon_1 \mid X] & E[\varepsilon_n\varepsilon_2 \mid X] & \cdots & E[\varepsilon_n^2 \mid X]
\end{bmatrix}
$$

If we retain the assumption of no autocorrelation, this can be rewritten as:

$$
E(\varepsilon\varepsilon' \mid X) = \Omega =
\begin{bmatrix}
\sigma_1^2 & 0          & \cdots & 0 \\
0          & \sigma_2^2 & \cdots & 0 \\
\vdots     & \vdots     & \ddots & \vdots \\
0          & 0          & \cdots & \sigma_n^2
\end{bmatrix}
$$

Basically, the main diagonal contains the n variances of the εᵢ. The assumption of homoskedasticity states that each of these n variances is the same, i.e. σᵢ² = σ². But this is not always an appropriate assumption to make. Our OLS standard errors will be incorrect insofar as:

$$
X'E[\varepsilon\varepsilon']X \neq \sigma^2 (X'X)
$$

Note that our OLS standard errors may be too big or too small. So, what can we do if we suspect that there is heteroskedasticity? Essentially, there are two options.

1. Weighted Least Squares. To solve the problem, we just need to find something that is proportional to the variance. We might not know the variance for each observation, but if we know something about where it comes from, then we might know something that is proportional to it. In effect, we try to model the variance. Note that this only solves the problem of heteroskedasticity if we assume that we have modelled the variance correctly; we never know if this is true or not.

2. Robust standard errors (White). This method treats heteroskedasticity as a nuisance rather than as something to be modelled. How do robust standard errors work? We never observe the disturbances (ε), but we do observe the residuals (e). While each individual residual eᵢ is not going to be a very good estimator of the corresponding disturbance εᵢ, White showed that using the squared residuals gives a consistent (but not unbiased) estimator of X′E[εε′]X, namely X′ diag(e₁², …, eₙ²) X. Thus, the variance-covariance matrix of the coefficient vector from the White estimator is:

$$
\text{var-cov}(\hat{\beta}) = (X'X)^{-1}\, X'\,\mathrm{diag}(e_1^2, \ldots, e_n^2)\, X\, (X'X)^{-1}
$$

rather than:

$$
\text{var-cov}(\hat{\beta}) = (X'X)^{-1}X'E[\varepsilon\varepsilon']X(X'X)^{-1} = (X'X)^{-1}X'(\sigma^2 I)X(X'X)^{-1}
$$

from the normal OLS estimator.

White suggested that we could test for the presence of heteroskedasticity by examining the extent to which the OLS estimator diverges from his own estimator. White's test is to regress the squared residuals eᵢ² on the terms in X′X, i.e. on the squares and the cross-products of the independent variables. If the R² exceeds a critical value (nR² has a χ² distribution), then heteroskedasticity causes problems; at that point use the White estimator (assuming your sample is sufficiently large). Neal Beck suggests that, by and large, using the White estimator can do little harm and some good. It is worth remembering that the White estimator is consistent but not unbiased, which means that robust standard errors are only appropriate when the sample is relatively large.
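Below is a minimal sketch of the White 'sandwich' variance estimator described above, shown next to the classical σ̂²(X′X)⁻¹ estimate. It is not from the handout: it assumes NumPy, and the heteroskedastic error scale in the simulated data is an assumption chosen so the two sets of standard errors visibly disagree.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data with heteroskedastic disturbances (illustrative): var(eps_i) grows with x_i.
n = 500
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 0.5 * x + rng.normal(scale=x, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e = y - X @ beta_hat

# Classical estimate: sigma^2_hat (X'X)^{-1}.
sigma2_hat = (e @ e) / (n - X.shape[1])
vcov_classical = sigma2_hat * XtX_inv

# White sandwich estimate: (X'X)^{-1} X' diag(e_i^2) X (X'X)^{-1}.
# The middle term is built without forming the n x n diagonal matrix.
middle = X.T @ (e[:, None] ** 2 * X)
vcov_white = XtX_inv @ middle @ XtX_inv

print("classical s.e.:", np.sqrt(np.diag(vcov_classical)))
print("robust s.e.:   ", np.sqrt(np.diag(vcov_white)))
```

In this particular design the robust standard error for the slope will typically be larger than the classical one; in other designs the classical standard errors can instead be too large, which is the 'too big or too small' point above.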
Partitioned Regression and the Frisch-Waugh-Lovell Theorem

Imagine that our true model is:

$$
y = X_1\beta_1 + X_2\beta_2 + \varepsilon
$$

In other words, there are two sets of independent variables. For example, X₁ might contain some independent variables (perhaps also the constant) whereas X₂ contains some other independent variables. The point is that X₁ and X₂ need not be two single variables only. We will estimate:

$$
y = X_1\hat{\beta}_1 + X_2\hat{\beta}_2 + e
$$

Say we wanted to isolate the coefficients associated with X₂, i.e. β̂₂. The normal equations will be:

$$
\begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}
\begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix}
=
\begin{bmatrix} X_1'y \\ X_2'y \end{bmatrix}
$$

First, let's solve for β̂₁:

$$
\begin{aligned}
(X_1'X_1)\hat{\beta}_1 + (X_1'X_2)\hat{\beta}_2 &= X_1'y \\
(X_1'X_1)\hat{\beta}_1 &= X_1'y - (X_1'X_2)\hat{\beta}_2 \\
\hat{\beta}_1 &= (X_1'X_1)^{-1}X_1'y - (X_1'X_1)^{-1}X_1'X_2\hat{\beta}_2 \\
\hat{\beta}_1 &= (X_1'X_1)^{-1}X_1'(y - X_2\hat{\beta}_2)
\end{aligned}
$$

Omitted Variable Bias

The solution for β̂₁ just shown is the set of OLS coefficients from the regression of y on X₁ alone, i.e. (X₁′X₁)⁻¹X₁′y, minus a correction vector (X₁′X₁)⁻¹X₁′X₂β̂₂. This correction vector is the equation for omitted variable bias. The first part of the correction vector up to β̂₂, i.e. (X₁′X₁)⁻¹X₁′X₂, is just the regression of the variables in X₂ (done separately and then put together into a matrix) on all the variables in X₁. This will only be zero if the variables in X₂ are linearly unrelated (uncorrelated or orthogonal) to the variables in X₁. The correction vector will also be zero if β̂₂ = 0, i.e. if the X₂ variables have no impact on y. Thus, you can ignore all potential omitted variables that are either (i) unrelated to the included variables or (ii) unrelated to the dependent variable. Any omitted variables that do not meet these conditions will change your estimates of β̂₁ if they were to be included.

Greene writes the omitted variable formula slightly differently. He has

$$
E[b_1] = \beta_1 + P\beta_2
$$

where P = (X₁′X₁)⁻¹X₁′X₂ and b₁ is the coefficient vector from a regression that omits the X₂ matrix. To see this, compare with the expression for β̂₁ above.
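Finally, a numerical check of the correction-vector algebra above (again NumPy with simulated, illustrative data; the correlation between X₁ and X₂ is an assumption chosen so the bias term is clearly non-zero). Regressing y on X₁ alone reproduces β̂₁ plus the correction (X₁′X₁)⁻¹X₁′X₂β̂₂ exactly.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data in which X1 and X2 are correlated (illustrative).
n = 1000
z = rng.normal(size=n)
X1 = np.column_stack([np.ones(n), z + rng.normal(scale=0.5, size=n)])
X2 = (z + rng.normal(scale=0.5, size=n)).reshape(-1, 1)
y = X1 @ np.array([1.0, 2.0]) + X2 @ np.array([1.5]) + rng.normal(size=n)

# 'Long' regression of y on [X1 X2] gives beta1_hat and beta2_hat.
X = np.hstack([X1, X2])
b_long = np.linalg.solve(X.T @ X, X.T @ y)
b1_hat, b2_hat = b_long[:2], b_long[2:]

# 'Short' regression of y on X1 alone.
b_short = np.linalg.solve(X1.T @ X1, X1.T @ y)

# Correction vector: (X1'X1)^{-1} X1'X2 beta2_hat.
P = np.linalg.solve(X1.T @ X1, X1.T @ X2)
print("short regression coefficients:", b_short)
print("beta1_hat + P beta2_hat:      ", b1_hat + P @ b2_hat)   # identical up to rounding
```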
