
robustregression.pdf


Uploaded by: 笨博士86, 2013-04-30

Summary: This document, "robustregressionpdf", is suitable for higher education. Its content includes Robust Regression Modeling with STATA lecture notes, Robert A. Yaffee, Ph.D., Statistics, S…, and so on.

Robust Regression Modeling with STATA: lecture notes

Robert A. Yaffee, Ph.D.
Statistics, Social Science, and Mapping Group
Academic Computing Services
Office: Third Avenue, Level C
Phone: [...]
Email: yaffee@nyu.edu

What does "robust" mean?

• Definitions differ in scope and content.
• In the most general construction: robust models are stable and reliable models.
• Strictly speaking: threats to stability and reliability include influential outliers. Influential outliers play havoc with statistical estimation. Since [...], many robust estimation techniques have been developed that resist the effects of such outliers.
  - SAS: Proc Robustreg in Version [...] deals with these
  - S-Plus: robust library
  - Stata: rreg, prais, and arima models
• Broadly speaking:
  - Heteroskedasticity: heteroskedasticity-consistent variance estimators. Stata: regress y x1 x2, robust
  - Nonnormal residuals: nonparametric regression models (Stata: qreg, rreg) and bootstrapped regression (bstrap, bsqreg)

Outline

1. Regression modeling preliminaries
   - Tests for misspecification
   - Outlier influence
   - Testing for normality
   - Testing for heteroskedasticity
   - Autocorrelation of residuals
2. Robust techniques
   - Robust regression
   - Median or quantile regression
   - Regression with robust standard errors
   - Robust autoregression models
3. Validation and cross-validation
   - Resampling
   - Sample splitting
4. Comparison of STATA with S-PLUS and SAS

Preliminary Testing

Prior to linear regression modeling, use a matrix graph to confirm linearity of relationships:

    graph y x1 x2, matrix

[Figure: matrix graph of y, x1, and x2]

The independent variables appear to be linearly related with y. We try to keep the models simple. If the relationships are linear, we model them with linear models. If the relationships are nonlinear, we model them with nonlinear or nonparametric models.

Theory of Regression Analysis

What is linear regression analysis? It is finding the relationship between a dependent and an independent variable. Graphically, this can be done with a simple Cartesian graph of

$Y = a + bx + e$

The Multiple Regression Formula

$Y = a + bx + e$

• Y is the dependent variable
• a is the intercept
• b is the regression coefficient
• x is the predictor variable

Graphical Decomposition of Effects

[Figure: scatterplot of Y against X with the fitted line $\hat{y} = a + bx$, marking for one observation the error $y_i - \hat{y}_i$, the regression effect $\hat{y}_i - \bar{y}$, and the total effect $y_i - \bar{y}$]

Decomposition of Effects

The total effect is the sum of the regression effect and the error:

$(y_i - \bar{y}) = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)$

Derivation of the Intercept

$y_i = a + b x_i + e_i$
$e_i = y_i - a - b x_i$
$\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} a - b \sum_{i=1}^{n} x_i$

Because by definition $\sum_{i=1}^{n} e_i = 0$:

$0 = \sum_{i=1}^{n} y_i - na - b \sum_{i=1}^{n} x_i$
$na = \sum_{i=1}^{n} y_i - b \sum_{i=1}^{n} x_i$
$a = \bar{y} - b\bar{x}$

Derivation of the Regression Coefficient

Given $y_i = a + b x_i + e_i$:

$e_i = y_i - a - b x_i$
$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2$
$\dfrac{\partial \sum_{i=1}^{n} e_i^2}{\partial b} = -2 \sum_{i=1}^{n} x_i (y_i - a - b x_i) = 0$

With x and y expressed as deviations from their means (so that a = 0):

$\sum_{i=1}^{n} x_i y_i - b \sum_{i=1}^{n} x_i^2 = 0$
$b = \dfrac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$

• Recall that the formula for the correlation coefficient can be expressed as

$r = \dfrac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2}}$, where $x = x_i - \bar{x}$ and $y = y_i - \bar{y}$,

from which it can be seen that the regression coefficient b is a function of r:

$b_j = r \cdot \dfrac{sd_y}{sd_x}$

Extending the bivariate to the multivariate case

$\beta_{yx_1 \cdot x_2} = \dfrac{r_{yx_1} - r_{yx_2} r_{x_1x_2}}{1 - r_{x_1x_2}^2} \cdot \dfrac{sd_y}{sd_{x_1}}$

$\beta_{yx_2 \cdot x_1} = \dfrac{r_{yx_2} - r_{yx_1} r_{x_1x_2}}{1 - r_{x_1x_2}^2} \cdot \dfrac{sd_y}{sd_{x_2}}$

It is also easy to extend the bivariate intercept to the multivariate case:

$a = \bar{Y} - b_1 \bar{x}_1 - b_2 \bar{x}_2$

Linear Multiple Regression

• Suppose that we have the following data set.

[Data listing not recoverable]

Stata OLS regression model syntax

[Stata output not recoverable]

We now see that the significance levels reveal that x1 and x2 are both statistically significant. The R² and adjusted R² have not been significantly reduced, indicating that this model still fits well. Therefore, we leave the interaction term pruned from the model.

What are the assumptions of multiple linear regression analysis?

Regression modeling and the assumptions

1. Linearity
2. Homoskedasticity (no heteroskedasticity)
3. No influential outliers in small samples
4. No multicollinearity
5. No autocorrelation of residuals
6. Fixed independent variables, with no measurement error
7. Normality of residuals

Testing the model for misspecification and robustness

• Linearity: matrix graphs, shown above
• Multicollinearity: vif
• Misspecification tests
  - heteroskedasticity tests: rvfplot, hettest
  - residual autocorrelation tests: corrgram
  - outlier detection: tabulation of standardized residuals
  - influence assessment
  - residual normality tests: sktest
• Specification tests (not covered in this lecture)

Misspecification tests

• We need to test the residuals for normality.
• We can save the residuals in STATA by issuing a command that creates them after we have run the regression command.
• The command to generate the residuals is:

    predict resid, residuals

Generation of the regression residuals

[Stata output not recoverable]

Generation of standardized residuals

    predict rstd, rstandard

Generation of studentized residuals

    predict rstud, rstudent

Testing the Residuals for Normality

1. The command for the test is: sktest resid
2. The notes describe this as a Smirnov-Kolmogorov test. In Stata, sktest is the skewness and kurtosis test of normality (the Kolmogorov-Smirnov test is ksmirnov). It compares the skewness and kurtosis of the residuals with those of the theoretical normal distribution, using a chi-square test to determine whether there is a statistically significant difference.
3. The hypothesis is that there is no difference. When the probability is less than [...], we must reject the hypothesis and infer that the residuals are non-normally distributed.

Testing the Residuals for heteroskedasticity

1. We may graph the standardized or studentized residuals against the predicted scores to obtain a graphical indication of heteroskedasticity.
2. The Cook-Weisberg test is used to test the residuals for heteroskedasticity.

A graphical test of heteroskedasticity

    rvfplot, border yline(0)

This displays any problematic patterns that might suggest heteroskedasticity, but it does not tell us which residuals are outliers.

Cook-Weisberg Test

$\mathrm{Var}(e_i) = \sigma^2 \exp(z_i t)$

where $e_i$ = error in the regression model, and $z = x\hat{\beta}$ or a variable list supplied by the user. The test is whether t = 0. hettest estimates the model

$e_i^2 = \alpha + z_i t + \nu_i$

and forms a score test S from the model sum of squares, $S = SS_{model} / 2$. Under $H_0$, $S \sim \chi^2$ with df = p, where p = number of parameters.

Cook-Weisberg test syntax

1. The command for this test is: hettest resid
2. An insignificant result indicates a lack of heteroskedasticity. That is, such a result indicates equal variance of the residuals along the predicted line. This condition is otherwise known as homoskedasticity.

Testing the residuals for Autocorrelation

1. One can use the command dwstat after the regression to obtain the Durbin-Watson d statistic to test for first-order autocorrelation.
2. There is a better way. Generate a casenum variable:

    gen casenum = _n

Create a time-dependent series

[Stata command and output not recoverable]

Run the Ljung-Box Q statistic, which tests previous lags for autocorrelation and partial autocorrelation

The STATA command is: corrgram resid

The significance of the AC (autocorrelation) and PAC (partial autocorrelation) is shown in the Prob column. None of these residuals has any significant autocorrelation.

One can run autoregression in the event of autocorrelation. This can be done with

    newey y x1 x2 x3, lag(#) t(time)
    prais y x1 x2 x3

Outlier detection

• Outlier detection involves determining whether the residual (error = actual - predicted) is an extreme negative or positive value.
• We may plot the residuals against the fitted values to determine which errors are large, after running the regression.
• The command syntax was already demonstrated with the graph above: rvfplot, border yline(0)

Create Standardized Residuals

• A standardized residual is a residual divided by its standard deviation:

$\text{resid}_{standardized} = \dfrac{y_i - \hat{y}_i}{s}$, where s = standard deviation of the residuals

Standardized residuals

    predict residstd, rstandard
    list residstd
    tabulate residstd

Limits of Standardized Residuals

If the standardized residuals have values in excess of [...] and -[...], they are outliers. If the absolute values are less than [...], as these are, then there are no outliers. While outliers by themselves only distort mean prediction when the sample size is small enough, it is important to gauge the influence of outliers.

Outlier Influence

• Suppose we had a different data set with two outliers.
• We tabulate the standardized residuals and obtain the following output:

[Stata output not recoverable]

Outlier a does not distort the regression line but outlier b does

[Figure: regression line Y = a + bx with outliers a and b marked]

Outlier b has bad leverage and outlier a does not.

In this data set, we have two outliers. One is negative and the other is positive.

Studentized Residuals

• Alternatively, we could form studentized residuals. These are distributed as a t distribution with df = n - p - 1, though they are not quite independent. Therefore, we can approximately determine whether they are statistically significant.
• Belsley et al. recommended the use of studentized residuals.

Studentized Residual

$e_i^{s} = \dfrac{e_i}{s_{(i)} \sqrt{1 - h_i}}$

where $e_i^{s}$ = studentized residual, $s_{(i)}$ = standard deviation with the ith observation deleted, and $h_i$ = leverage statistic.

These are useful in estimating the statistical significance of a particular observation, for which a dummy variable indicator is formed. The t value of the studentized residual indicates whether or not that observation is a significant outlier.

The command to generate studentized residuals, here called rstudt, is:

    predict rstudt, rstudent

Influence of Outliers

1. Leverage is measured by the diagonal components of the hat matrix.
2. The hat matrix comes from the formula for the regression of Y:

$\hat{Y} = X\hat{\beta} = X(X'X)^{-1}X'Y$, where $X(X'X)^{-1}X'$ = the hat matrix H. Therefore $\hat{Y} = HY$.

Leverage and the Hat matrix

1. The hat matrix transforms Y into the predicted scores.
2. The diagonals of the hat matrix indicate which values will be outliers or not.
3. The diagonals are therefore measures of leverage.
4. Leverage is bounded by two limits: 1/n and 1. The closer the leverage is to unity, the more leverage the value has.
5. The trace of the hat matrix = the number of parameters p in the model.
6. When the leverage > 2p/n, there is high leverage according to Belsley et al., cited in Long, J. F., Modern Methods of Data Analysis (p. [...]). For smaller samples, Velleman and Welsch suggested 3p/n as the criterion.

Cook's D

1. Another measure of influence.
2. This is a popular one. The formula for it is:

$\text{Cook's } D_i = \dfrac{1}{p} \cdot \dfrac{h_i}{1 - h_i} \cdot \dfrac{e_i^2}{s^2 (1 - h_i)}$

Cook and Weisberg suggested that values of D that exceed the [...] point of the F distribution (df = p, n - p) are large.

Using Cook's D in STATA

• predict cook, cooksd
• Finding the influential outliers:
• list cook if cook > 4/n
• Belsley suggests 4/(n - k - 1) as a cutoff

Graphical Exploration of Outlier Influence

• graph cook residstd, xlab ylab

[Figure: Cook's D plotted against standardized residuals]

The two influential outliers can be found easily here in the upper right.

DFbeta

• One can use the DFbetas to ascertain the magnitude of influence that an observation has on a particular parameter estimate if that observation is deleted. DFbeta_j is the scaled difference $b_j - b_{j(i)}$ between the estimate of coefficient j with and without observation i. The scaling uses the leverage h and $u_j$, where $u_j$ = residuals of the regression of $x_j$ on the remaining x's.

Obtaining DFbetas in STATA

[Stata output not recoverable]

Robust statistical options when assumptions are violated

1. Nonlinearity
   - Transformation to linearity
   - Nonlinear regression
2. Influential outliers
   - Robust regression with robust weight functions: rreg y x1 x2
3. Heteroskedasticity of residuals
   - Regression with Huber/White/sandwich variance-covariance estimators: regress y x1 x2, robust
4. Residual autocorrelation correction
   - Autoregression with prais y x1 x2, robust
   - Newey-West regression
5. Nonnormality of residuals
   - Quantile regression: qreg y x1 x2
   - Bootstrapping the regression coefficients

Nonlinearity: Transformations to linearity

1. When the equation is not intrinsically nonlinear, the dependent variable or independent variable may be transformed to linearize the relationship.
2. Semi-log, translog, Box-Cox, or power transformations may be used for these purposes.
3. Box-Cox regression determines the optimal parameters for many of these transformations.

Fix for Nonlinear functional form: Nonlinear Regression Analysis

    nl exp2 y x    estimates  $Y = b_1 b_2^{x}$
    nl exp3 y x    estimates  $y = b_0 + b_1 b_2^{x}$

These are examples of two exponential growth curve models, the first of which we estimate with our data.

Nonlinear Regression in Stata

[Stata output of nl exp2 y x: iteration log of the residual SS, ANOVA table, and coefficient estimates for b1 and b2 of the 2-parameter exponential growth curve y = b1*b2^x; the numeric values are not recoverable. Stata notes that SEs, P values, CIs, and correlations are asymptotic approximations.]

Heteroskedasticity correction

1. Prof. Halbert White showed that heteroskedasticity could be handled in a regression with a heteroskedasticity-consistent covariance matrix estimator (Davidson and MacKinnon, Estimation and Inference in Econometrics, Oxford U. Press, p. [...]).
2. This variance-covariance matrix under ordinary least squares is shown next.

OLS Covariance Matrix Estimator

$(X'X)^{-1} (X'\Sigma X) (X'X)^{-1}$, where under OLS $X'\Sigma X = s^2 (X'X)$, so the estimator reduces to $s^2 (X'X)^{-1}$.

White's heteroskedasticity-consistent (HC) estimator

1. White's estimator is for large samples.
2. White's heteroskedasticity-corrected variances and standard errors can be larger or smaller than the OLS variances and standard errors.

Heteroskedasticity-consistent covariance matrix "Sandwich" estimator (H. White)

$\dfrac{1}{n} \left(\dfrac{X'X}{n}\right)^{-1} \left(\dfrac{X'\hat{\Omega}X}{n}\right) \left(\dfrac{X'X}{n}\right)^{-1}$

The outer terms are the bread and the middle term is the meat (or tofu). However, there are different versions of $\hat{\Omega}$, where $h_t$ is the leverage of observation t:

    HC0: $\hat{\Omega}_t = e_t^2$
    HC1: $\hat{\Omega}_t = \dfrac{n}{n-k} e_t^2$
    HC2: $\hat{\Omega}_t = \dfrac{e_t^2}{1 - h_t}$
    HC3: $\hat{\Omega}_t = \dfrac{e_t^2}{(1 - h_t)^2}$

Regression with robust standard errors for heteroskedasticity

    regress y x1 x2, robust

Options other than robust are hc2 and hc3, referring to the versions mentioned by Davidson and MacKinnon above.

Robust options for the VCV matrix in Stata

• regress y x1 x2, hc2
• regress y x1 x2, hc3
• These correspond to Davidson and MacKinnon's versions 2 and 3 of the heteroskedasticity-consistent VCV options.

Problems with Autoregressive Errors

1. Problems in estimation with OLS
2. When there is first-order autocorrelation of the residuals, $e_t = \rho e_{t-1} + v_t$

Effect on the Variance

$e_t = \rho e_{t-1} + v_t$

Sources of Autocorrelation

1. Lagged endogenous variables
2. Misspecification of the model
3. Simultaneity, feedback, or reciprocal relationships
4. Seasonality or trend in the model

Prais-Winsten Transformation, cont'd

With L denoting the lag operator:

$v_t = (1 - \rho L) e_t$, therefore $e_t = \dfrac{v_t}{1 - \rho L}$

$Y_t = a + b x_t + \dfrac{v_t}{1 - \rho L}$

It follows that

$(1 - \rho L) Y_t = (1 - \rho L) a + (1 - \rho L) b x_t + v_t$

$Y_t^{*} = a^{*} + b x_t^{*} + v_t$

Autocorrelation of the residuals: prais and newey regression

To test whether the variable is autocorrelated:

• tsset time
• corrgram y
• prais y x1 x2, robust
• newey y x1 x2, lag(#) t(time)

Testing for autocorrelation of residuals

    regress mna lsumprc
    predict resid, residual
    corrgram resid

Prais-Winsten Regression for AR(1) errors

[Stata output not recoverable]

Using the robust option here guarantees that the White heteroskedasticity-consistent sandwich variance-covariance estimator will be used in the autoregression procedure.

Newey-West Robust Standard errors

• An autocorrelation correction is added by Newey and West to the meat (or tofu) of the White sandwich estimator given above, with the same HC0 to HC3 versions of $\hat{\Omega}$.

Central Part of the Newey-West Sandwich estimator

$X'\hat{\Omega}X_{newey\text{-}west} = X'\hat{\Omega}X_{white} + \dfrac{n}{n-k} \sum_{l=1}^{m} \left(1 - \dfrac{l}{m+1}\right) \sum_{i=l+1}^{n} e_i e_{i-l} \left(x_i' x_{i-l} + x_{i-l}' x_i\right)$

where k = number of predictors, l = time lag, and m = maximum time lag.

Newey-West Robust Standard errors

1. Newey-West standard errors are robust to autocorrelation and heteroskedasticity in time series regression models.

Assume OLS regression

• We regress y on x1 x2 x3.
• We obtain the following output:

[Stata output not recoverable]

Next we examine the residuals

Residual Assessment

[Stata output not recoverable]

The data set is too small to drop case [...], so I use robust regression.

Robust regression algorithm: rreg

1. A regression is performed and absolute residuals are computed.
2. These residuals are computed and scaled:

$r_i = |y_i - x_i b|$

$u_i = \dfrac{r_i}{s} = \dfrac{y_i - x_i b}{s}$

Scaling the residuals

$s = \dfrac{M}{0.6745}$, where $M = \mathrm{med}(|r_i - \mathrm{med}(r_i)|)$

The residuals are scaled by the median absolute deviation of the residuals from their median.

Essential Algorithm

• The estimator of the parameter b minimizes the sum of a less rapidly increasing function of the residuals (SAS Institute, The Robustreg Procedure, draft copy, p. [...], forthcoming):

$Q(b) = \sum_{i=1}^{n} \rho\!\left(\dfrac{r_i}{\sigma}\right)$, where $r_i = y_i - x_i b$ and $\sigma$ is estimated by s.

Essential algorithm, cont'd

• If this were OLS, ρ would be a quadratic function.
• If we can ascertain s, we can, by taking the derivatives with respect to b, find a first-order solution to

$\sum_{i=1}^{n} \psi\!\left(\dfrac{r_i}{s}\right) x_{ij} = 0, \quad j = 1, \ldots, p$, where $\psi = \rho'$.

Case weights are developed from weight functions

1. Case weights are formed based on those residuals.
2. The weight functions for those case weights are first the Huber weights and then the Tukey bisquare weights.
3. A weighted regression is rerun with the case weights.

Iteratively reweighted least squares

• The case weight w(x) is defined as:

$w(x) = \dfrac{\psi(x)}{x}$

• It is updated at each iteration until it converges on a value and the change from iteration to iteration declines below a criterion.

Weight functions for reducing outlier influence

c is the tuning constant used in determining the case weights. For the Huber weights, c = 1.345 by default.

Weight Functions

• Tukey biweight (bisquare)

C is also the biweight tuning constant. C is set at 4.685 for the biweight.

Tuning C
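The reweighting loop described in the notes (scale the residuals by s = M/0.6745, form case weights w(u) = ψ(u)/u, rerun the weighted regression until the estimates stop changing) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Stata's rreg: it uses Huber weights only (no biweight stage and no initial screening), a single predictor, and made-up data; the function names `weighted_line_fit`, `huber_weight`, and `huber_irls` are hypothetical.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b


def huber_weight(u, c=1.345):
    """Huber case weight w(u) = psi(u)/u: 1 inside [-c, c], c/|u| outside."""
    return 1.0 if abs(u) <= c else c / abs(u)


def huber_irls(x, y, c=1.345, tol=1e-8, max_iter=100):
    """Iteratively reweighted least squares with Huber weights (sketch)."""
    w = [1.0] * len(x)
    a, b = weighted_line_fit(x, y, w)  # start from the OLS fit
    for _ in range(max_iter):
        r = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        med = median(r)
        # scale: median absolute deviation from the median, divided by 0.6745
        s = median(abs(ri - med) for ri in r) / 0.6745
        if s == 0:
            break  # residuals have no spread left; keep the current fit
        w = [huber_weight(ri / s, c) for ri in r]
        a_new, b_new = weighted_line_fit(x, y, w)
        change = max(abs(a_new - a), abs(b_new - b))
        a, b = a_new, b_new
        if change < tol:
            break
    return a, b, w


# Hypothetical data: y = 2 + 3x exactly, except for one outlier at the last point.
x = [float(i) for i in range(1, 11)]
y = [2.0 + 3.0 * xi for xi in x]
y[-1] = 100.0

a_ols, b_ols = weighted_line_fit(x, y, [1.0] * len(x))
a_rob, b_rob, weights = huber_irls(x, y)
print("OLS slope:   ", round(b_ols, 3))
print("Huber slope: ", round(b_rob, 3))
print("outlier weight:", weights[-1])
```

In this toy data set the outlier sits at the largest x value, so it pulls the OLS slope well away from 3, while the reweighted fit gives that case a small weight and moves the slope back toward the line followed by the other nine points.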
