
GDC2003_Memory_Optimization_18Mar03

MEMORY OPTIMIZATION
Christer Ericson
Sony Computer Entertainment, Santa Monica
(christer_ericson@playstation.sony.com)

Talk contents 1/2
► Problem statement
  - Why "memory optimization?"
► Brief architecture overview
  - The memory hierarchy
► Optimizing for (code and) data cache
  - General suggestions
  - Data structures
      ► Prefetching and preloading
      ► Structure layout
      ► Tree structures
      ► Linearization caching
      ► …

Talk contents 2/2
► …
► Aliasing
  - Abstraction penalty problem
  - Alias analysis (type-based)
  - 'restrict' pointers
  - Tips for reducing aliasing

Problem statement
► For the last 20-something years…
  - CPU speeds have increased ~60%/year
  - Memory access times have only decreased ~10%/year
► Gap covered by use of cache memory
► Cache is under-exploited
  - Diminishing returns for larger caches
► Inefficient cache use = lower performance
  - How to increase cache utilization? Cache-awareness!

Need more justification? 1/3
Instruction parallelism: SIMD instructions consume data at 2-8 times the rate of normal instructions!

Need more justification? 2/3
Proebsting's law: Improvements to compiler technology double program performance every ~18 years!
Corollary: Don't expect the compiler to do it for you!

Need more justification? 3/3
On Moore's law:
► Consoles don't follow it (as such)
  - Fixed hardware
  - 2nd/3rd generation titles must get improvements from somewhere

Brief cache review
► Caches
  - Code cache for instructions, data cache for data
  - Forms a memory hierarchy
► Cache lines
  - Cache divided into cache lines of ~32/64 bytes each
  - Correct unit in which to count memory accesses
► Direct-mapped
  - For n KB cache, bytes at k, k+n, k+2n, … map to same cache line
► N-way set-associative
  - Logical cache line corresponds to N physical lines
  - Helps minimize cache line thrashing

The memory hierarchy
Roughly:
    CPU            1 cycle
    L1 cache       ~1-5 cycles
    L2 cache       ~5-20 cycles
    Main memory    ~40-100 cycles

Some cache specs
                L1 cache (I/D)       L2 cache
    PS2         16K/8K† 2-way        N/A
    GameCube    32K/32K‡ 8-way       256K 2-way unified
    XBOX        16K/16K 4-way        128K 8-way unified
    PC          ~32-64K              ~128-512K
► † 16K data scratchpad important part of design
► ‡ configurable as 16K 4-way + 16K scratchpad

Foes: 3 C's of cache misses
► Compulsory misses
  - Unavoidable misses when data read for first time
► Capacity misses
  - Not enough cache space to hold all active data
  - Too much data accessed in between successive use
► Conflict misses
  - Cache thrashing due to data mapping to same cache lines

Friends: Introducing the 3 R's
► Rearrange (code, data)
  - Change layout to increase spatial locality
► Reduce (size, # cache lines read)
  - Smaller/smarter formats, compression
► Reuse (cache lines)
  - Increase temporal (and spatial) locality

                 Compulsory   Capacity   Conflict
    Rearrange    X            (x)        X
    Reduce       X            X          (x)
    Reuse                     (x)        X

Measuring cache utilization
► Profile
  - CPU performance/event counters
      ► Give memory access statistics
      ► But not access patterns (e.g. stride)
  - Commercial products
      ► SN Systems' Tuner, Metrowerks' CATS, Intel's VTune
  - Roll your own
      ► In gcc '-p' option + define _mcount()
      ► Instrument code with calls to logging class
  - Do back-of-the-envelope comparison
► Study the generated code

Code cache optimization 1/2
► Locality
  - Reorder functions
      ► Manually within file
      ► Reorder object files during linking (order in makefile)
      ► __attribute__((section("xxx"))) in gcc
  - Adapt coding style
      ► Monolithic functions
      ► Encapsulation/OOP is less code cache friendly
  - Moving target
  - Beware various implicit functions (e.g. fptodp)

Code cache optimization 2/2
► Size
  - Beware: inlining, unrolling, large macros
  - KISS
      ► Avoid featuritis
      ► Provide multiple copies (also helps locality)
  - Loop splitting and loop fusion
  - Compile for size ('-Os' in gcc)
  - Rewrite in asm (where it counts)
► Again, study generated code
  - Build intuition about code generated

Data cache optimization
► Lots and lots of stuff…
  - "Compressing" data
  - Blocking and strip mining
  - Padding data to align to cache lines
  - Plus other things I won't go into
► What I will talk about…
  - Prefetching and preloading data into cache
  - Cache-conscious structure layout
  - Tree data structures
  - Linearization caching
  - Memory allocation
  - Aliasing and "anti-aliasing"

Prefetching and preloading
► Software prefetching
  - Not too early: data may be evicted before use
  - Not too late: data not fetched in time for use
  - Greedy
► Preloading (pseudo-prefetching)
  - Hit-under-miss processing

Software prefetching

    // Loop through and process all 4n elements
    for (int i = 0; i < 4 * n; i++)
        Process(elem[i]);

    const int kLookAhead = 4; // Some elements ahead
    for (int i = 0; i < 4 * n; i += 4) {
        Prefetch(elem[i + kLookAhead]);
        Process(elem[i + 0]);
        Process(elem[i + 1]);
        Process(elem[i + 2]);
        Process(elem[i + 3]);
    }

Greedy prefetching

    void PreorderTraversal(Node *pNode) {
        // Greedily prefetch left traversal path
        Prefetch(pNode->left);
        // Process the current node
        Process(pNode);
        // Greedily prefetch right traversal path
        Prefetch(pNode->right);
        // Recursively visit left then right subtree
        PreorderTraversal(pNode->left);
        PreorderTraversal(pNode->right);
    }

Preloading (pseudo-prefetch)

    Elem a = elem[0];
    for (int i = 0; i < 4 * n; i += 4) {
        Elem e = elem[i + 4]; // Cache miss, non-blocking
        Elem b = elem[i + 1]; // Cache hit
        Elem c = elem[i + 2]; // Cache hit
        Elem d = elem[i + 3]; // Cache hit
        Process(a);
        Process(b);
        Process(c);
        Process(d);
        a = e;
    }

(NB: This code reads one element beyond the end of the elem array.)

Structures
► Cache-conscious layout
  - Field reordering (usually grouped conceptually)
  - Hot/cold splitting
► Let use decide format
  - Array of structures
  - Structures of arrays
► Little compiler support
  - Easier for non-pointer languages (Java)
  - C/C++: do it yourself

Field reordering

    // Before: key and pNext are separated by the count array
    struct S {
        void *key;
        int count[20];
        S *pNext;
    };

    // After: the fields touched on every step of the search are adjacent
    struct S {
        void *key;
        S *pNext;
        int count[20];
    };

    void Foo(S *p, void *key, int k) {
        while (p) {
            if (p->key == key) {
                p->count[k]++;
                break;
            }
            p = p->pNext;
        }
    }

► Likely accessed together so store them together!

Hot/cold splitting

Hot fields:

    struct S {
        void *key;
        S *pNext;
        S2 *pCold;
    };

Cold fields:

    struct S2 {
        int count[10];
    };

► Allocate all 'struct S' from a memory pool
  - Increases coherence
► Prefer array-style allocation
  - No need for actual pointer to cold fields

Hot/cold splitting
[Figure: hot/cold splitting]

Beware compiler padding

    struct X {
        int8 a;
        int64 b;
        int8 c;
        int16 d;
        int64 e;
        float f;
    };

    // X with the compiler-inserted padding made explicit
    struct Y {
        int8 a, pad_a[7];
        int64 b;
        int8 c, pad_c[1];
        int16 d, pad_d[2];
        int64 e;
        float f, pad_f[1];
    };

    // Fields sorted by decreasing size
    struct Z {
        int64 b;
        int64 e;
        float f;
        int16 d;
        int8 a;
        int8 c;
    };

Assuming 4-byte floats, for most compilers sizeof(X) == 40, sizeof(Y) == 40, and sizeof(Z) == 24.

Cache performance analysis
► Usage patterns
  - Activity: indicates hot or cold field
  - Correlation: basis for field reordering
► Logging tool
  - Access all class members through accessor functions
  - Manually instrument functions to call Log() function
  - Log() function…
      ► takes object type + member field as arguments
      ► hash-maps current args to count field accesses
      ► hash-maps current + previous args to track pairwise accesses

Tree data structures
► Rearrange nodes
  - Increase spatial locality
  - Cache-aware vs. cache-oblivious layouts
► Reduce size
  - Pointer elimination (using implicit pointers)
  - "Compression"
      ► Quantize values
      ► Store data relative to parent node

Breadth-first order
► Pointer-less: Left(n) = 2n, Right(n) = 2n+1
► Requires storage for complete tree of height H

Depth-first order
► Left(n) = n+1, Right(n) = stored index
► Only stores existing nodes

van Emde Boas layout
► "Cache-oblivious"
► Recursive construction

A compact static k-d tree

    union KDNode {
        // leaf, type 11
        int32 leafIndex_type;
        // non-leaf, type 00 = x,
        // 01 = y, 10 = z-split
        float splitVal_type;
    };

Linearization caching
► Nothing better than linear data
  - Best possible spatial locality
  - Easily prefetchable
► So linearize data at runtime!
  - Fetch data, store linearized in a custom cache
  - Use it to linearize…
      ► hierarchy traversals
      ► indexed data
      ► other random-access stuff

Memory allocation policy
► Don't allocate from heap, use pools
  - No block overhead
  - Keeps data together
  - Faster too, and no fragmentation
► Free ASAP, reuse immediately
  - Block is likely in cache so reuse its cache lines
  - First fit, using free list

The curse of aliasing
What is aliasing?

    int n;
    int *p1 = &n;
    int *p2 = &n;

Aliasing is multiple references to the same storage location.

    int Foo(int *a, int *b) {
        *a = 1;
        *b = 2;
        return *a;
    }

Aliasing is also missed opportunities for optimization.
What value is returned here? Who knows!

The curse of aliasing
► What is causing aliasing?
  - Pointers
  - Global variables/class members make it worse
► What is the problem with aliasing?
  - Hinders reordering/elimination of loads/stores
      ► Poisoning data cache
      ► Negatively affects instruction scheduling
      ► Hinders common subexpression elimination (CSE), loop-invariant code motion, constant/copy propagation, etc.

How do we do 'anti-aliasing'?
► What can be done about aliasing?
  - Better languages
      ► Less aliasing, lower abstraction penalty†
  - Better compilers
      ► Alias analysis such as type-based alias analysis†
  - Better programmers (aiding the compiler)
      ► That's you, after the next 20 slides!
  - Leap of faith
      ► -fno-aliasing
† To be defined

Matrix multiplication 1/3
Consider optimizing a 2x2 matrix multiplication:

    void Mat22mul(float a[2][2], float b[2][2], float c[2][2]) {
        for (int i = 0; i < 2; i++) {
            for (int j = 0; j < 2; j++) {
                a[i][j] = 0.0f;
                for (int k = 0; k < 2; k++)
                    a[i][j] += b[i][k] * c[k][j];
            }
        }
    }

How do we typically optimize it? Right, unrolling!

Matrix multiplication 2/3
Straightforward unrolling results in this:

    // 16 memory reads, 4 writes
    void Mat22mul(float a[2][2], float b[2][2], float c[2][2]) {
        a[0][0] = b[0][0]*c[0][0] + b[0][1]*c[1][0];
        a[0][1] = b[0][0]*c[0][1] + b[0][1]*c[1][1]; // (1)
        a[1][0] = b[1][0]*c[0][0] + b[1][1]*c[1][0]; // (2)
        a[1][1] = b[1][0]*c[0][1] + b[1][1]*c[1][1]; // (3)
    }

► But wait! There's a hidden assumption! a is not b or c!
► Compiler doesn't (cannot) know this!
  - (1) Must refetch b[0][0] and b[0][1]
  - (2) Must refetch c[0][0] and c[1][0]
  - (3) Must refetch b[0][0], b[0][1], c[0][0] and c[1][0]

Matrix multiplication 3/3
A correct approach is instead writing it as:

    // 8 memory reads, 4 writes
    void Mat22mul(float a[2][2], float b[2][2], float c[2][2]) {
        // Consume inputs…
        float b00 = b[0][0], b01 = b[0][1];
        float b10 = b[1][0], b11 = b[1][1];
        float c00 = c[0][0], c01 = c[0][1];
        float c10 = c[1][0], c11 = c[1][1];
        // …before producing outputs
        a[0][0] = b00*c00 + b01*c10;
        a[0][1] = b00*c01 + b01*c11;
        a[1][0] = b10*c00 + b11*c10;
        a[1][1] = b10*c01 + b11*c11;
    }

Abstraction penalty problem
► Higher levels of abstraction have a negative effect on optimization
  - Code broken into smaller generic subunits
  - Data and operation hiding
      ► Cannot make local copy of e.g. internal pointers
      ► Cannot hoist constant expressions out of loops
► Especially because of aliasing issues

C++ abstraction penalty
► Lots of (temporary) objects around
  - Iterators
  - Matrix/vector classes
► Objects live in heap/stack
  - Thus subject to aliasing
  - Makes tracking of current member value very difficult
  - But tracking required to keep values in registers!
► Implicit aliasing through the this pointer
  - Class members are virtually as bad as global variables

C++ abstraction penalty
Pointer members in classes may alias other members:

    class Buf {
    public:
        void Clear() {
            for (int i = 0; i < numVals; i++)
                pBuf[i] = 0;
        }
    private:
        int numVals, *pBuf;
    };

Code likely to refetch numVals each iteration!
numVals not a local variable! May be aliased by pBuf!

C++ abstraction penalty
We know that aliasing won't happen, and can manually solve the aliasing issue by writing code as:

    class Buf {
    public:
        void Clear() {
            // Copy the member into a local so it can stay in a register
            for (int i = 0, n = numVals; i < n; i++)
                pBuf[i] = 0;
        }
    private:
        int numVals, *pBuf;
    };

C++ abstraction penalty
Since pBuf[i] can only alias numVals in the first iteration, a quality compiler can fix this problem by peeling the loop once, turning it into:

    void Clear() {
        if (numVals >= 1) {
            pBuf[0] = 0;
            for (int i = 1, n = numVals; i < n; i++)
                pBuf[i] = 0;
        }
    }

Q: Does your compiler do this optimization?!

Type-based alias analysis
► Some aliasing the compiler can catch
  - A powerful tool is type-based alias analysis
Use language types to disambiguate memory references!

Type-based alias analysis
► ANSI C/C++ states that…
  - Each area of memory can only be associated with one type during its lifetime
  - Aliasing may only occur between references of compatible type
► Enables compiler to rule out aliasing between references of non-compatible type
  - Turned on with -fstrict-aliasing in gcc

Compatibility of C/C++ types
► In short…
  - Types compatible if differing by signed, unsigned, const or volatile
  - char and unsigned char compatible with any type
  - Otherwise not compatible
► (See standard for full details.)

What TBAA can do for you
It can turn this:

    // Possible aliasing between v[i] and *n
    void Foo(float *v, int *n) {
        for (int i = 0; i < *n; i++)
            v[i] += 1.0f;
    }

into this:

    // No aliasing possible so fetch *n once!
    void Foo(float *v, int *n) {
        int t = *n;
        for (int i = 0; i < t; i++)
            v[i] += 1.0f;
    }

What TBAA can also do
► Cause obscure bugs in non-conforming code!
  - Beware especially so-called "type punning"

Illegal C/C++ code!

    uint32 i;
    float f;
    i = *((uint32 *)&f);

Allowed by gcc:

    uint32 i;
    union {
        float f;
        uint32 i;
    } u;
    u.f = f;
    i = u.i;

Required by standard:

    uint32 i;
    union {
        float f;
        uchar8 c[4];
    } u;
    u.f = f;
    i = (u.c[3] << 24L) + (u.c[2] << 16L) + ...;

Restrict-qualified pointers
► restrict keyword
  - New to 1999 ANSI/ISO C standard
  - Not in C++ standard yet, but supported by many C++ compilers
  - A hint only, so may do nothing and still be conforming
► A restrict-qualified pointer (or reference)…
  - …is basically a promise to the compiler that for the scope of the pointer, the target of the pointer will only be accessed through that pointer (and pointers copied from it).
  - (See standard for full details.)
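
The restrict promise described above can be illustrated with a minimal C99 sketch. It is not taken from the talk: the function names scale_plain and scale_restrict are hypothetical, and whether a given compiler actually hoists the load is an assumption to check in the generated code. Both pointers refer to float, so type-based alias analysis alone cannot separate them. Only the restrict version tells the compiler that stores to dst[i] never change *scale.

```c
#include <stddef.h>

/* Hypothetical helper, not from the talk. dst and scale both point to
   float, so the compiler must assume a store to dst[i] could modify
   *scale and may reload *scale on every iteration. */
void scale_plain(float *dst, const float *scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] *= *scale;
}

/* Same loop with C99 restrict. The caller promises that, for the
   duration of the call, the floats reached through dst are not
   reached through scale, so *scale may be loaded once and kept
   in a register. */
void scale_restrict(float *restrict dst, const float *restrict scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] *= *scale;
}
```

Both functions give the same result whenever the promise holds. A call such as scale_restrict(v, &v[0], n) breaks the promise and has undefined behavior, which is why restrict is a contract with the caller and not a checked property.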