Advanced FPGA Design:architecture.implementation.optimzition.pdf

Uploaded by: 只为你, 2012-05-08

Summary: This document, "Advanced FPGA Design:architecture.implementation.optimzitionpdf", is relevant to the IT/computing field. It contains Advanced FPGA Design: Architecture, Implementation, and Optimization by Steve Kilts, among other content.

Advanced FPGA Design: Architecture, Implementation, and Optimization
Steve Kilts
Spectrum Design Solutions, Minneapolis, Minnesota

Copyright by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under the United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., Rosewood Drive, Danvers, MA, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., River Street, Hoboken, NJ.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer Care Department. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.

Library of Congress Cataloging-in-Publication Data: Kilts, Steve. Advanced FPGA design: architecture, implementation, and optimization / by Steve Kilts. Includes index. Field programmable gate arrays: Design and construction.
Printed in the United States of America.

To my wife, Teri, who felt that the subject matter was rather dry

Flowchart of Contents (figure)

Contents

Preface
Acknowledgments
Architecting Speed
    High Throughput; Low Latency; Timing; Add Register Layers; Parallel Structures; Flatten Logic Structures; Register Balancing; Reorder Paths; Summary of Key Points
Architecting Area
    Rolling Up the Pipeline; Control-Based Logic Reuse; Resource Sharing; Impact of Reset on Area; Resources Without Reset; Resources Without Set; Resources Without Asynchronous Reset; Resetting RAM; Utilizing Set/Reset Flip-Flop Pins; Summary of Key Points
Architecting Power
    Clock Control; Clock Skew; Managing Skew; Input Control; Reducing the Voltage Supply; Dual-Edge Triggered Flip-Flops; Modifying Terminations; Summary of Key Points
Example Design: The Advanced Encryption Standard
    AES Architectures; One Stage for Sub-bytes; Zero Stages for Shift Rows; Two Pipeline Stages for Mix-Column; One Stage for Add Round Key; Compact Architecture; Partially Pipelined Architecture; Fully Pipelined Architecture; Performance Versus Area; Other Optimizations
High-Level Design
    Abstract Design Techniques; Graphical State Machines; DSP Design; Software/Hardware Codesign; Summary of Key Points
Clock Domains
    Crossing Clock Domains; Metastability; Solution: Phase Control; Solution: Double Flopping; Solution: FIFO Structure; Partitioning Synchronizer Blocks; Gated Clocks in ASIC Prototypes; Clocks Module; Gating Removal; Summary of Key Points
Example Design: I2S Versus SPDIF
    I2S; Protocol; Hardware Architecture; Analysis; SPDIF; Protocol; Hardware Architecture; Analysis
Implementing Math Functions
    Hardware Division; Multiply and Shift; Iterative Division; The Goldschmidt Method; Taylor and Maclaurin Series Expansion; The CORDIC Algorithm; Summary of Key Points
Example Design: Floating-Point Unit
    Floating-Point Formats; Pipelined Architecture; Verilog Implementation; Resources and Performance
Reset Circuits
    Asynchronous Versus Synchronous; Problems with Fully Asynchronous Resets; Fully Synchronized Resets; Asynchronous Assertion, Synchronous Deassertion; Mixing Reset Types; Nonresetable Flip-Flops; Internally Generated Resets; Multiple Clock Domains; Summary of Key Points
Advanced Simulation
    Testbench Architecture;
    Testbench Components; Testbench Flow; Main Thread; Clocks and Resets; Test Cases; System Stimulus; MATLAB; Bus-Functional Models; Code Coverage; Gate-Level Simulations; Toggle Coverage; Run-Time Traps; Timescale; Glitch Rejection; Combinatorial Delay Modeling; Summary of Key Points
Coding for Synthesis
    Decision Trees; Priority Versus Parallel; Full Conditions; Multiple Control Branches; Traps; Blocking Versus Nonblocking; For-Loops; Combinatorial Loops; Inferred Latches; Design Organization; Partitioning; Data Path Versus Control; Clock and Reset Structures; Multiple Instantiations; Parameterization; Definitions; Parameters; Parameters in Verilog; Summary of Key Points
Example Design: The Secure Hash Algorithm
    SHA Architecture; Implementation Results
Synthesis Optimization
    Speed Versus Area; Resource Sharing; Pipelining, Retiming, and Register Balancing; The Effect of Reset on Register Balancing; Resynchronization Registers; FSM Compilation; Removal of Unreachable States; Black Boxes; Physical Synthesis; Forward Annotation Versus Back-Annotation; Graph-Based Physical Synthesis; Summary of Key Points
Floorplanning
    Design Partitioning; Critical-Path Floorplanning; Floorplanning Dangers; Optimal Floorplanning; Data Path; High Fan-Out; Device Structure; Reusability; Reducing Power Dissipation; Summary of Key Points
Place and Route Optimization
    Optimal Constraints; Relationship between Placement and Routing; Logic Replication; Optimization across Hierarchy; I/O Registers; Pack Factor; Mapping Logic into RAM; Register Ordering; Placement Seed; Guided Place and Route; Summary of Key Points
Example Design: Microprocessor
    SRC Architecture; Synthesis Optimizations; Speed Versus Area; Pipelining; Physical Synthesis; Floorplan Optimizations; Partitioned Floorplan; Critical-Path Floorplan: Abstraction; Critical-Path Floorplan: Abstraction
Static Timing Analysis
    Standard Analysis; Latches; Asynchronous Circuits; Combinatorial Feedback; Summary of Key Points
PCB Issues
    Power Supply; Supply Requirements; Regulation; Decoupling Capacitors; Concept; Calculating Values; Capacitor Placement; Summary of Key Points
Appendix A
Appendix B
Bibliography
Index

Preface

In the design consulting business, I have been exposed to countless FPGA (Field Programmable Gate Array) designs, methodologies, and design techniques. Whether my client is on the Fortune list or is just a start-up company, they will inevitably do some things right and many things wrong. After having been exposed to a wide variety of designs in a wide range of industries, I began developing my own arsenal of techniques and heuristics from the combined knowledge of these experiences. When mentoring new FPGA design engineers, I draw my suggestions and recommendations from this experience. Up until now, many of these recommendations have referenced specific white papers and application notes (app notes) that discuss specific practical aspects of FPGA design. The purpose of this book is to condense years of experience spread across numerous companies and teams of engineers, as well as much of the wisdom gathered from technology-specific white papers and app notes, into a single book that can be used to refine a designer's knowledge and aid in becoming an advanced FPGA designer.

There are a number of books on FPGA design, but few of these truly address advanced real-world topics in detail. This book attempts to cut out the fat of unnecessary theory, speculation on future technologies, and the details of outdated technologies. It is written in a terse, concise format that addresses the various topics without wasting the reader's time. Many sections in this book assume that certain fundamentals are understood, and for the sake of brevity, background information and/or theoretical frameworks are not always covered in detail. Instead, this book covers in-depth topics that have been encountered in real-world designs. In some ways, this book replaces a limited amount of industry experience and access to an experienced mentor and will hopefully prevent the reader from learning a few things the hard way. It is the advanced, practical approach that makes this book unique.

One thing to note about this book is that it will not flow from cover to cover like a novel. For a set of advanced topics that are not intrinsically tied to one another, this type of flow is impossible without blatantly filling it with fluff. Instead, to organize this book, I have ordered the chapters in such a way that they follow a typical design flow. The first chapters discuss architecture, then simulation, then synthesis, then floorplanning, and so on. This is illustrated in the Flowchart of Contents provided at the beginning of the book. To provide accessibility for future reference, the chapters are listed side-by-side with the relevant block in the flow diagram.

The remaining chapters in this book are heavy with examples. For brevity, I have selected Verilog as the default HDL (Hardware Description Language), Xilinx as the default FPGA vendor, and Synplicity as the default synthesis and floorplanning tool. Most of the topics covered in this book can easily be mapped to VHDL, Altera, Mentor Graphics, and so forth, but to include all of these for completeness would only serve to cloud the important points. Even if the reader of this book uses these other technologies, this book will still deliver its value. If you have any feedback, good or bad, feel free to email me at steve.kilts@spectrumdsi.com.

STEVE KILTS
Minneapolis, Minnesota
March

Acknowledgments

During the course of my career, I have had the privilege to work with many excellent digital design engineers. My exposure to these talented engineers began at Medtronic and continued over the years through my work as a consultant for companies such as Honeywell, Guidant, Teradyne, Telex, Unisys, AMD, ADC, and a number of smaller start-up companies involved with a wide variety of FPGA applications. I also owe much of my knowledge to the app notes and white papers published by the major FPGA vendors. These resources contain invaluable real-world heuristics that are not included in a standard engineering curriculum.

Specific to this book, I owe a great deal to Xilinx and Synplicity, both of which provided the FPGA design tools used throughout the book, as well as a number of key reviewers. Reviewers of note also include Peter Calabrese of Synplicity, Cliff Cummins of Sunburst Design, Pete Danile of Synplicity, Anders Enggaard of Axcon, Mike Fette of Spectrum Design Solutions, Philip Freidin of Fliptronics, Paul Fuchs of NuHorizons, Don Hodapp of Xilinx, Ashok Kulkarni of Synplicity, Rod Landers of Spectrum Design Solutions, Ryan Link of Logic, Dave Matthews of Verein, Lance Roman of Roman-Jones, B. Joshua Rosen of Polybus, Gary Stevens of iSine, Jim Torgerson, and Larry Weegman of Xilinx.

S.K.

Chapter: Architecting Speed

Sophisticated tool optimizations are often not good enough to meet most design constraints if an arbitrary coding style is used. This chapter discusses the first of three primary physical characteristics of a digital design: speed. This chapter also discusses methods for architectural optimization in an FPGA.

There are three primary definitions of speed depending on the context of the problem: throughput, latency, and timing. In the context of processing data in an FPGA, throughput refers to the amount of data that is processed per clock cycle. A common metric for throughput is bits per second. Latency refers to the time between data input and processed data output. The typical metric for latency will be time or clock cycles. Timing refers to the logic delays between sequential elements. When we say a design does not "meet timing," we mean that the delay of the critical path, that is, the largest delay between flip-flops (composed of combinatorial delay, clk-to-out delay, routing delay, setup timing, clock skew, and so on) is greater than the target clock period. The standard metrics for timing are clock period and frequency.

During the course of this chapter, we will discuss the following topics in detail:

- High-throughput architectures for maximizing the number of bits per second that can be processed by the design.
- Low-latency architectures for minimizing the delay from the input of a module to the output.
- Timing optimizations to reduce the combinatorial delay of the critical path:
    - Adding register layers to divide combinatorial logic structures.
    - Parallel structures for separating sequentially executed operations into parallel operations.
    - Flattening logic structures specific to priority encoded signals.
    - Register balancing to redistribute combinatorial logic around pipelined registers.
    - Reordering paths to divert operations in a critical path to a noncritical path.

HIGH THROUGHPUT

A high-throughput design is one that is concerned with the steady-state data rate but less concerned about the time any specific piece of data requires to propagate through the design (latency). The idea with a high-throughput design is the same idea Ford came up with to manufacture automobiles in great quantities: an assembly line. In the world of digital design where data is processed, we refer to this under a more abstract term: pipeline. A pipelined design conceptually works very similar to an assembly line in that the raw material
or data input enters the front end, is passed through various stages of manipulation and processing, and then exits as a finished product or data output. The beauty of a pipelined design is that new data can begin processing before the prior data has finished, much like cars are processed on an assembly line. Pipelines are used in nearly all very-high-performance devices, and the variety of specific architectures is unlimited. Examples include CPU instruction sets, network protocol stacks, encryption engines, and so on.

From an algorithmic perspective, an important concept in a pipelined design is that of "unrolling the loop." As an example, consider the following piece of code that would most likely be used in a software implementation for finding the third power of X. Note that the term "software" here refers to code that is targeted at a set of procedural instructions that will be executed on a microprocessor.

    /* Loop constants were lost in extraction; restored here for a third power. */
    XPower = 1;
    for (i = 0; i < 3; i++)
        XPower = X * XPower;

Note that the above code is an iterative algorithm. The same variables and addresses are accessed until the computation is complete. There is no use for parallelism because a microprocessor only executes one instruction at a time (for the purpose of argument, just consider a single-core processor). A similar implementation can be created in hardware. Consider the following Verilog implementation of the same algorithm (output scaling not considered):

    // Bus widths and numeric constants were lost in extraction.
    // 8-bit buses are assumed; the count values follow from computing X^3.
    module power3(
        output [7:0] XPower,
        output       finished,
        input  [7:0] X,
        input        clk, start);  // the duration of start is a single clock

        reg [7:0] ncount;
        reg [7:0] XPower;

        assign finished = (ncount == 0);

        always @(posedge clk)
            if (start) begin
                XPower <= X;       // load X^1
                ncount <= 2;       // two multiplies remain
            end
            else if (!finished) begin
                ncount <= ncount - 1;
                XPower <= XPower * X;
            end
    endmodule

In the above example, the same register and computational resources are reused until the computation is finished, as shown in the figure. With this type of iterative implementation, no new computations can begin until the previous computation has completed. This iterative scheme is very similar to a software implementation. Also note that certain handshaking signals are required to indicate the beginning and completion of a computation. An external module must also use the handshaking to pass new data to the module and receive a completed calculation. The performance of this implementation is:

    Throughput: one result every 3 clocks (input width / 3, in bits/clock)
    Latency: 3 clocks
    Timing: one multiplier delay in the critical path

Figure: Iterative implementation.

Contrast this with a pipelined version of the same algorithm:

    // Register name suffixes and bus widths were lost in extraction.
    // 8-bit buses and the names X1, X2, XPower1, XPower2 are assumed.
    module power3(
        output reg [7:0] XPower,
        input            clk,
        input      [7:0] X);

        reg [7:0] XPower1, XPower2;
        reg [7:0] X1, X2;

        always @(posedge clk) begin
            // Pipeline stage 1
            X1      <= X;
            XPower1 <= X;
            // Pipeline stage 2
            X2      <= X1;
            XPower2 <= XPower1 * X1;
            // Pipeline stage 3
            XPower  <= XPower2 * X2;
        end
    endmodule

In the above implementation, the value of X is passed to both pipeline stages where independent resources compute the corresponding multiply operation. Note that while X is being used to calculate the final power of 3 in the second pipeline stage, the next value of X can be sent to the first pipeline stage, as shown in the figure. Both the final calculation of X (XPower resourc
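The difference between the iterative and pipelined third-power designs described above can be checked with a small cycle-level software model. The sketch below is illustrative only and is not taken from the book: the class names, the helper functions `run_iterative` and `run_pipelined`, and the 8-bit register width are all assumptions. Each `clock()` call stands for one rising clock edge, and every next-state value is computed from the current state before any register is updated, mimicking nonblocking assignments.

```python
"""Cycle-level models of the iterative and pipelined X^3 designs (illustrative)."""

MASK = 0xFF  # assumed 8-bit registers; products wrap (output scaling not considered)


class IterativePower3:
    """One shared multiplier; new data is accepted only once `finished` is true."""

    def __init__(self):
        self.xpower = 0
        self.ncount = 0

    @property
    def finished(self):
        return self.ncount == 0

    def clock(self, x, start):
        """Apply one rising clock edge. `x` must be held stable while busy."""
        if start:
            self.xpower = x & MASK      # load X^1
            self.ncount = 2             # two multiplies remain
        elif not self.finished:
            self.ncount -= 1
            self.xpower = (self.xpower * x) & MASK


class PipelinedPower3:
    """Three register stages; a new X can enter on every clock."""

    def __init__(self):
        self.x1 = self.x2 = 0
        self.xpower1 = self.xpower2 = self.xpower = 0

    def clock(self, x):
        """Apply one rising clock edge with nonblocking-style updates."""
        # Compute every next value from the current state first.
        next_xpower = (self.xpower2 * self.x2) & MASK    # stage 3
        next_x2 = self.x1                                # stage 2
        next_xpower2 = (self.xpower1 * self.x1) & MASK
        next_x1 = x & MASK                               # stage 1
        next_xpower1 = x & MASK
        # Then update all registers together.
        self.x1, self.x2 = next_x1, next_x2
        self.xpower1, self.xpower2 = next_xpower1, next_xpower2
        self.xpower = next_xpower


def run_iterative(values):
    """Return (results, total clocks) for the iterative design."""
    dut, results, clocks = IterativePower3(), [], 0
    for x in values:
        dut.clock(x, start=True)
        clocks += 1
        while not dut.finished:
            dut.clock(x, start=False)
            clocks += 1
        results.append(dut.xpower)
    return results, clocks


def run_pipelined(values):
    """Return (results, total clocks) for the pipelined design."""
    dut, results, clocks = PipelinedPower3(), [], 0
    stream = list(values) + [0, 0]      # two extra clocks flush the pipeline
    for i, x in enumerate(stream):
        dut.clock(x)
        clocks += 1
        if i >= 2:                      # first result appears after 3 clocks
            results.append(dut.xpower)
    return results, clocks


if __name__ == "__main__":
    data = [2, 3, 4, 5]
    print("iterative:", run_iterative(data))
    print("pipelined:", run_pipelined(data))
```

For four inputs, the iterative model needs 12 clocks (3 per value), while the pipelined model needs 6 (3 clocks of latency for the first result, then one result per clock). Both have the same latency; only the steady-state rate differs, which is the point of the high-throughput discussion.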

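The chapter's definition of "meeting timing" (critical-path delay compared with the target clock period) can also be written as a short calculation. This is a minimal sketch under stated assumptions: the `PathDelays` class, its field names, and all delay values are hypothetical and chosen only for illustration. Clock skew is treated here as reducing the available time, although in a real design skew can either help or hurt a given path.

```python
from dataclasses import dataclass


@dataclass
class PathDelays:
    """Delay components of one register-to-register path, in nanoseconds."""
    clk_to_out: float   # source flip-flop clock-to-output delay
    logic: float        # combinatorial logic delay
    routing: float      # interconnect delay
    setup: float        # destination flip-flop setup time
    skew: float = 0.0   # clock skew, assumed to work against the path

    def total(self) -> float:
        return self.clk_to_out + self.logic + self.routing + self.setup + self.skew


def slack_ns(path: PathDelays, clock_period_ns: float) -> float:
    """Positive slack means the path fits within the clock period."""
    return clock_period_ns - path.total()


def meets_timing(path: PathDelays, clock_period_ns: float) -> bool:
    return slack_ns(path, clock_period_ns) >= 0


def max_frequency_mhz(path: PathDelays) -> float:
    """Highest clock frequency this path supports (period in ns -> MHz)."""
    return 1000.0 / path.total()


if __name__ == "__main__":
    # Hypothetical critical path: 0.5 + 4.0 + 2.5 + 0.5 + 0.5 = 8.0 ns
    critical = PathDelays(clk_to_out=0.5, logic=4.0, routing=2.5, setup=0.5, skew=0.5)
    print(meets_timing(critical, 10.0), slack_ns(critical, 10.0))   # fits at 100 MHz
    print(meets_timing(critical, 7.5), slack_ns(critical, 7.5))     # fails at ~133 MHz
    print(max_frequency_mhz(critical))
```

With these example numbers the path totals 8.0 ns, so it meets a 10 ns period with 2.0 ns of slack, fails a 7.5 ns period by 0.5 ns, and supports at most 125 MHz. Shortening the `logic` term is what the register-layer and flattening techniques listed in the chapter aim at.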