NONPARAMETRIC AND SEMIPARAMETRIC METHODS IN R

JEFFREY S. RACINE

Date: November

Abstract

The R environment for statistical computing and graphics (R Development Core Team ()) offers practitioners a rich set of statistical methods ranging from random number generation and optimization methods through regression, panel data, and time series methods, by way of illustration. The standard R distribution (‘base R’) comes preloaded with a rich variety of functionality useful for applied econometricians. This functionality is enhanced by user-supplied packages made available via R servers that are mirrored around the world. Of interest in this chapter are methods for estimating nonparametric and semiparametric models. We summarize many of the facilities in R and consider some tools that might be of interest to those wishing to work with nonparametric methods who want to avoid resorting to programming in C or Fortran but need the speed of compiled code as opposed to interpreted code such as Gauss or Matlab, by way of example. We encourage those working in the field to strongly consider implementing their methods in the R environment, thereby making their work accessible to the widest possible audience via an open collaborative forum.

Introduction

Unlike their more established parametric counterparts, many nonparametric and semiparametric methods that have received widespread theoretical treatment have not yet found their way into mainstream commercial packages. This has hindered their adoption by applied researchers, and it is safe to describe the availability of modern nonparametric methods as fragmented at best, which can be frustrating for users who wish to assess whether or not such methods can add value to their application. Thus, one frequently heard complaint about the state of nonparametric kernel methods concerns the lack of software, along with the fact that implementations in interpreted environments such as Gauss are orders of magnitude slower than compiled implementations written in C or Fortran. Though many researchers may code their methods, often using interpreted environments such as Gauss, it is fair to characterize much of this code as neither designed nor suited for general-purpose use, as it is typically written solely to demonstrate ‘proof of concept’. Even though many authors are more than happy to circulate such code (which is of course appreciated!), this often imposes certain hardships on the user, including 1) having to purchase a (closed and proprietary) commercial software package and 2) having to modify the code substantially in order to use it for their application.

The R environment for statistical computing and graphics (R Development Core Team ()) offers practitioners a range of tools for estimating nonparametric, semiparametric, and of course parametric models. Unlike many commercial programs, which must first be purchased in order to evaluate them, you can adopt R with minimal effort and with no financial outlay required. Many nonparametric methods are well documented, tested, and suitable for general use via a common interface structure (such as the ‘formula’ interface), making it easy for users familiar with R to deploy these tools for their particular application. Furthermore, one of the strengths of R is the ability to call compiled C or Fortran code via a common interface structure, thereby delivering the speed of compiled code in a flexible, easy-to-use environment. In addition, there exist a number of R ‘packages’ (often called ‘libraries’ or ‘modules’ in other environments) that implement a variety of kernel methods, albeit with varying degrees of functionality (e.g., univariate versus multivariate, the ability/inability to handle numerical and categorical data, and so forth). Finally, R delivers a rich framework for implementing and making code available to the community.

In this chapter we outline many of the functions and packages available in R that might be of interest to practitioners, and consider some illustrative applications along with code fragments that might be of interest. Before proceeding further, we first begin with an introduction to the R environment itself.

The R Environment

What is R? Perhaps it is best to begin with the question “what is S?” S is a language and environment designed for statistical computing and graphics which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies). S has grown to become the de facto standard among econometricians and statisticians, and there are two main implementations, the commercial implementation called ‘S-PLUS’, and the free, open-source implementation called ‘R’. R delivers a rich array of statistical methods, and one of its strengths is the ease with which ‘packages’ can be developed and made available to users for free. R is a mature open platform that is ideally suited to the task of making one's method available to the widest possible user base free of charge.

In this section we briefly describe a handful of resources available to those interested in using R, introduce the user to the R environment, and introduce the user to the foreign package that facilitates importation of data from packages such as SAS, SPSS, Stata, and Minitab, among others.

Web sites

A number of sites are devoted to helping R users, and we briefly mention a few of them below.

http://www.R-project.org: This is the R home page from which you can download the program itself and many R packages. There are also manuals, other links, and facilities for joining various R mailing lists.

http://CRAN.R-project.org: This is the ‘Comprehensive R Archive Network,’ “a network of ftp and web servers around the world that store identical, up-to-date, versions of code and documentation for the R statistical package.” Packages are only put on CRAN when they pass a rather stringent collection of quality assurance checks, and in particular are guaranteed to build and run on standard platforms.

http://cran.r-project.org/web/views/Econometrics.html: This is the CRAN ‘task view’ for computational econometrics. “Base R ships with a lot of functionality useful for computational econometrics, in particular in the stats package. This functionality is complemented by many packages on CRAN, a brief overview is given below.” This provides an excellent summary of both parametric and nonparametric packages that exist for the R environment.

http://pj.freefaculty.org/R/Rtips.html: This site provides a large and excellent collection of R tips.

Getting started with R

A number of well-written manuals exist for R and can be located at the R web site. This section is clearly not intended to be a substitute for these resources. It simply provides a minimal set of commands which will aid those who have never used R before.

Having installed and run R, you will find yourself at the > prompt. To quit the program, simply type q(). To get help, you can either enter a command preceded by a question mark, as in ?help, or type help.start() at the > prompt. The latter will spawn your web browser (it reads files from your hard drive, so you do not have to be connected to the Internet to use this feature).

You can enter commands interactively at the R prompt, or you can create a text file containing the commands and execute all commands in the file from the R prompt by typing source("commands.R"), where commands.R is the text file containing your commands. Many editors recognize the .R extension, providing a useful interface for the development of R code. For example, GNU Emacs is a powerful editor that works well with R and also LaTeX (http://www.gnu.org/software/emacs/emacs.html).

When you quit by entering the q() command, you will be asked whether or not you wish to save the current session. If you enter Y, then the next time you run R in the same directory it will load all of the objects created in the previous session. If you do so, typing the command ls() will list all of the objects. For this reason, it is wise to use different directories for different projects. To remove objects that have been loaded, you can use the command rm(objectname), while rm(list=ls()) will remove all objects in memory.

Importing data from other formats

The foreign package allows you to read data created by different popular programs. To load it, simply type library(foreign) from within R. Supported formats include

read.arff: Read Data from ARFF Files
read.dbf: Read a DBF File
read.dta: Read Stata Binary Files
read.epiinfo: Read Epi Info Data Files
read.mtp: Read a Minitab Portable Worksheet
read.octave: Read Octave Text Data Files
read.S: Read an S3 Binary or data.dump File
read.spss: Read an SPSS Data File
read.ssd: Obtain a Data Frame from a SAS Permanent Dataset, via read.xport
read.systat: Obtain a Data Frame from a Systat File
read.xport: Read a SAS XPORT Format Library

The following code snippet reads the Stata file ‘wage1.dta’ (Wooldridge ()) and lists the names of variables in the data frame.

R> library(foreign)
R> mydat <- read.dta(file="wage1.dta")
R> names(mydat)
"wage" "educ" "exper" "tenure" "nonwhite" "female"
"married" "numdep" "smsa" "northcen" "south" "west"
"construc" "ndurman" "trcommpu" "trade" "services" "profserv"
"profocc" "clerocc" "servocc" "lwage" "expersq" "tenursq"

Clearly R makes it simple to migrate data from one environment to another.

Having installed R and having read in data from a text file or supported format such as a Stata binary file, you can then install packages via the install.packages() command, as in install.packages("np"), which will install the np package (Hayfield & Racine ()).

Some Nonparametric and Semiparametric Routines Available in R

Table 1 summarizes some of the nonparametric and semiparametric routines available to users of R. As can be seen, there appears to be a rich range of nonparametric implementations available to the practitioner. However, upon closer inspection many are limited in one way or another in ways that might frustrate applied econometricians. For instance, some nonparametric regression methods admit only one regressor, while others admit only numerical data types and cannot admit categorical data that is often found in applied settings. Table 1 is not intended to be exhaustive; rather, it ought to serve to orient the reader to a subset of the rich array of nonparametric methods that currently exist in the R environment. To see a routine in action, you can type example("funcname", package="pkgname"), where funcname is the name of a routine and pkgname is the associated package, and this will run an example contained in the help file for that function. For instance, example("npreg", package="np") will run a kernel regression example from the package np.

Table 1. An illustrative summary of R packages that implement nonparametric methods.

Package          Function         Description
ash              ash1             Computes univariate averaged shifted histograms
                 ash2             Computes bivariate averaged shifted histograms
car              n.bins           Computes number of bins for histograms with different rules
gam              gam              Computes generalized additive models using the method described in Hastie & Tibshirani ()
GenKern          KernSec          Computes univariate kernel density estimates
                 KernSur          Computes bivariate kernel density estimates
Graphics (base)  boxplot          Produces box-and-whisker plot(s)
                 nclass.Sturges   Computes the number of classes for a histogram
                 nclass.scott     Computes the number of classes for a histogram
                 nclass.FD        Computes the number of classes for a histogram
KernSmooth       bkde             Computes a univariate binned kernel density estimate using the fast Fourier transform as described in Silverman ()
                 bkde2D           Computes a bivariate binned kernel density estimate as described in Wand ()
                 dpik             Computes a bandwidth for a univariate kernel density estimate using the method described in Sheather & Jones ()
                 dpill            Computes a bandwidth for univariate local linear regression using the method described in Ruppert, Sheather & Wand ()
                 locpoly          Computes a univariate probability density function, bivariate regression function or their derivatives using local polynomials
ks               kde              Computes a multivariate kernel density estimate for 1- to 6-dimensional numerical data
locfit           locfit           Computes univariate local regression and likelihood models
                 sjpi             Computes a bandwidth via the plug-in Sheather & Jones () method
                 kdeb             Computes univariate kernel density estimate bandwidths
MASS             bandwidth.nrd    Computes Silverman's rule of thumb for choosing the bandwidth of a univariate Gaussian kernel density estimator
                 hist.scott       Plots a histogram with automatic bin width selection (Scott)
                 hist.FD          Plots a histogram with automatic bin width selection (Freedman-Diaconis)
                 kde2d            Computes a bivariate kernel density estimate
                 width.SJ         Computes the Sheather & Jones () bandwidth for a univariate Gaussian kernel density estimator
                 bcv              Computes biased cross-validation bandwidth selection for a univariate Gaussian kernel density estimator
                 ucv              Computes unbiased cross-validation bandwidth selection for a univariate Gaussian kernel density estimator
np               npcdens          Computes a multivariate conditional density as described in Hall, Racine & Li ()
                 npcdist          Computes a multivariate conditional distribution as described in Li & Racine (forthcoming)
                 npcmstest        Conducts a parametric model specification test as described in Hsiao, Li & Racine ()
                 npconmode        Conducts multivariate modal regression
                 npindex          Computes a multivariate single index model as described in Ichimura (), Klein & Spady ()
                 npksum           Computes multivariate kernel sums with numeric and categorical data types
                 npplot           Conducts general purpose plotting of nonparametric objects
                 npplreg          Computes a multivariate partially linear model as described in Robinson (), Racine & Liu ()
                 npqcmstest       Conducts a parametric quantile regression model specification test as described in Zheng (), Racine ()
                 npqreg           Computes multivariate quantile regression as described in Li & Racine (forthcoming)
                 npreg            Computes multivariate regression as described in Racine & Li (), Li & Racine ()
                 npscoef          Computes multivariate smooth coefficient models as described in Li & Racine (b)
                 npsigtest        Computes the significance test as described in Racine (), Racine, Hart & Li ()
                 npudens          Computes multivariate density estimation as described in Parzen (), Rosenblatt (), Li & Racine ()
                 npudist          Computes multivariate distribution functions as described in Parzen (), Rosenblatt (), Li & Racine ()
stats (base)     bw.nrd           Univariate bandwidth selectors for Gaussian windows in density
                 density          Computes a univariate kernel density estimate
                 hist             Computes a univariate histogram
                 smooth.spline    Computes a univariate cubic smoothing spline as described in Chambers & Hastie ()
                 ksmooth          Computes a univariate Nadaraya-Watson kernel regression estimate described in Wand & Jones ()
                 loess            Computes a smooth curve fitted by the loess method described in Cleveland, Grosse & Shyu () (numeric predictors)

Nonparametric Density Estimation in R

Univariate density estimation is one of the most popular exploratory nonparametric methods in use today. Readers will no doubt be intimately familiar with two popular nonparametric estimators, namely the univariate histogram and kernel estimators. For an in-depth treatment of kernel density estimation we direct the interested reader to the wonderful monographs by Silverman () and Scott (), while for mixed-data density estimation we direct the reader to Li & Racine () and the references therein.

We shall begin with an illustrative parametric example. Consider any random variable X having probability density function f(x), and let f(·) be the object of interest. Suppose one is presented with a series of independent and identically distributed draws from the unknown distribution and asked to model the density of the data, f(x).

For this example we shall simulate n =  draws but immediately discard knowledge of the true data generating process (DGP), pretending that we are unaware that the data is drawn from a mixture of normals (N(,) and N(,) with equal probability). The following code snippet demonstrates one way to draw random samples from a mixture of normals.

R> library(np)
Nonparametric Kernel Methods for Mixed Datatypes (version )
R> set.seed()
R> n <-
R> x <- sort(c(rnorm(n, mean=, sd=), rnorm(n, mean=, sd=)))

The following figure plots the true DGP evaluated on an equally spaced grid of points.

R> xseq <- seq(, , length=)
R> plot(xseq, *dnorm(xseq, mean=, sd=) + *dnorm(xseq, mean=, sd=), xlab="X", ylab="Mixture of Normal Densities", type="l", main="", col="blue", lty=)

[Figure: the true mixture density; horizontal axis X, vertical axis Mixture of Normal Densities]

Suppose one naïvely presumed that the data is drawn from, say, the normal parametric family (not a mixture thereof), then tested this assumption using the Shapiro-Wilk test. The following code snippet demonstrates how this is done in R.

R> shapiro.test(x)
Shapiro-Wilk normality test
data: x
W = , p-value < e

Given that this popular parametric model is flatly rejected by this dataset, we have two choices, namely 1) search for a more appropriate parametric model or 2) use more flexible estimators.

For what follows, we shall presume that the reader has found themselves in just such a situation. That is, they have faithfully applied a parametric method and conducted a series of tests of model adequacy that indicate that the parametric model is not consistent with the underlying DGP. They then turn to more flexible methods of density estimation. Note that though we are considering density estimation at the moment, it could be virtually any parametric approach that we have been discussing, for instance, regression analysis and so forth.

If one wished to examine a histogram one could use the following code snippet,

R> hist(x, prob=TRUE, main="")

[Figure: histogram of x; horizontal axis x, vertical axis Density]

Of course, though consistent, the histogram suffers from a number of drawbacks; hence one might instead consider a smooth nonparametric density estimator such as the univariate Parzen kernel estimator (Parzen ()). A univariate kernel estimator can be obtained using the density command that is part of R base. This function supports a range of bandwidth methods (see ?bw.nrd for details) and kernels (see ?density for details). The default bandwidth method is Silverman's ‘rule of thumb’ (Silverman (, page , eqn ())), and for this data we obtain the following:

R> plot(density(x), main="")

[Figure: kernel density estimate of x; horizontal axis label N = , Bandwidth = ; vertical axis Density]

The density function in R has a number of virtues. It is extremely fast computationally speaking, as the algorithm disperses the mass of the empirical distribution function over a regular grid, then uses the fast Fourier transform to convolve this approximation with a discretized version of the kernel, and then uses a linear approximation to evaluate the density at the specified points. If one wishes to obtain a univariate kernel estimate for a large sample of data then this is definitely the function of choice. However, for a bivariate (or higher dimensional) density estimate one would requ
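
The simulate, test, and estimate workflow described in this section can be sketched end to end with base R alone. This is a minimal sketch under stated assumptions: the seed, sample size, mixture means and standard deviations, and grid limits below are illustrative choices rather than the values used in the text, and the bandwidth comparison at the end is an added illustration using the selectors documented under ?bw.nrd.

```r
# Illustrative sketch only: every numeric setting here is an assumption
# chosen for demonstration, not a value taken from the chapter.
set.seed(42)                      # assumed seed
n <- 500                          # assumed total sample size

# One way to sample an equal-weight mixture of two normals: a fixed
# half/half split between the (assumed) components N(-2, 1) and N(3, 1).
x <- sort(c(rnorm(n / 2, mean = -2, sd = 1),
            rnorm(n / 2, mean = 3, sd = 1)))

# True mixture density on an equally spaced grid (assumed range and size).
xseq <- seq(-6, 7, length.out = 1000)
ftrue <- 0.5 * dnorm(xseq, mean = -2, sd = 1) +
         0.5 * dnorm(xseq, mean = 3, sd = 1)

# Shapiro-Wilk test of the (false) null hypothesis that x is normal.
sw <- shapiro.test(x)

# Kernel density estimate using density()'s default bandwidth rule ("nrd0").
fhat <- density(x)

# Alternative univariate bandwidth selectors shipped in the stats package.
bws <- c(nrd0 = bw.nrd0(x), nrd = bw.nrd(x), SJ = bw.SJ(x))

# Overlay the histogram, the kernel estimate, and the true density.
hist(x, prob = TRUE, main = "", ylim = c(0, max(ftrue, fhat$y)))
lines(fhat, col = "red")
lines(xseq, ftrue, col = "blue", lty = 2)

print(sw$p.value)
print(bws)
```

With components this well separated, the normality test should reject decisively, and comparing the entries of bws gives a quick sense of how much the choice of bandwidth rule matters for bimodal data.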
