CHAPTER

Developing and Evaluating Conversational Agents

Dominic W. Massaro, Michael M. Cohen, Sharon Daniel, Ronald A. Cole

Human Performance and Ergonomics. Copyright © by Academic Press. All rights of reproduction in any form reserved.

The working hypothesis of this chapter is that computer users will benefit from interaction with conversational agents and access to the many sources of information they can provide. For example, there is valuable and effective information for the perception and recognition of speech when viewing a speaker's face. We have developed a completely animated synthetic talking head with which we can control and study the informative aspects and psychological processes in face-to-face dialogues. Our talking head communicates paralinguistic as well as linguistic information, and is controlled by a text-to-speech system. The goal of our research is to advance the development of our talking head, its design and its accompanying technology, and to create a human-computer interface centered on a virtual, conversational agent. Such agents will interact with human users in the most natural manner possible, including the ability to listen and understand, as well as speak fluently. Agents will facilitate and enrich interaction between humans and machines. Communication among humans will also be enhanced when mediated by virtual agents (either personal avatars or autonomous agents).

The work involves the development of the conversational agent, the design of the agent interface and its environment, and the psychological evaluation of its contribution to human language acquisition, communication, and productivity. Faculty, postdoctoral and associate researchers, and graduate and undergraduate students from psychology, computer science, linguistics, and art departments are engaged in the research in a variety of ways. New developments are tested at various stages in many different contexts. For example, experiments are carried out to develop and assess the psychological influence of facial expressions and nonverbal utterances in human-machine interactions. The conversational agent will also be the interface for a series of public, interactive art installations. We will also expand the use of the agent in educational and therapeutic environments, as in the learning of nonnative languages and in learning to read.

I. RELEVANCE

Technology advocates have always hoped that spoken language would be the primary medium of communication between people and machines. Our talking head, as a conversational agent, takes us one step closer to that realization. Each of us could have our own agent (in our own image, if we wish) to handle our communications. A conversational agent does not get tired or bored, isn't waylaid by a sore throat, and (as of yet) belongs to no union; in short, it's a perpetual "talking and understanding" machine. As we develop the technology, talking heads will be able to speak in any language, at any rate of speed or level of complexity, with the appropriate emotional effect.

In addition to our current use of visible speech to facilitate language acquisition for the hearing impaired, we envision applications of this technology in a variety of domains, including, but not limited to, education, entertainment, and human-machine interaction. For example, our talking head could serve as a useful aid in second language acquisition and in improving the phonological and reading skills of dyslexic children. As we continue our research, the talking head will play an important role in the enhancement of auditory synthetic speech and as an educational tool in linguistics and speech science.

Our conversational agent could influence language acquisition by taking on the role of a child's virtual reading tutor. Children who are just learning to read are attentive and competent listeners. A child needs to be able to recognize written words and to comprehend spoken language. The ability to recognize written words is a sufficient platform from which the child can bootstrap into reading with understanding. To cross the bridge from spoken language to written language, the child needs to learn to decode written words. A child will not be able to read until she can recognize words automatically. As every parent knows, the beginning reader must have a support person on hand to provide the pronunciation of words not already in the child's reading vocabulary. Our talking head can function as a perpetual support person. Our conversational agent will never tire of repetition, and can be embodied in any form the child desires. If the child encounters a difficult word in the electronic text, she can simply click on it to hear and see its spoken language counterpart. This allows the child to process written language as easily as spoken language, and learn in the appropriate zone of proximal development (Vygotsky, [...]).

Our conversational agent could pursue a career as a synthetic actor in the entertainment industry, a virtual museum guide, or a salesclerk interface. It might also play an important role in business and science by integrating communications, computing, and network environments, or assisting in the discovery, interpretation, and representation of data. In addition to autonomous conversational agent interfaces, talking heads might also be used as personal, educational, or entertainment avatars. As an avatar or synthetic actor, the head could be controlled through visual and/or acoustic analysis of a human actor. In this case, the synthetic talking head would be synchronized to real human speech, emotion, and action. This would allow users to meet in cyberspace chat rooms, appearing as themselves or assuming other identities. Virtual environments such as chat rooms offer a model for the development of virtual libraries, virtual schools, and virtual game playing.

This technology could also be used for audiovisual dubbing of movies for foreign audiences. For example, we might create a synthetic Robin Williams that could speak German. The virtual, multilingual actor could reproduce one of the real actor's performances in German, and the synthetic performance could be composited into the background of the original film. This technique would put an end to the disturbing asynchronous dialogues in dubbed films. Perhaps by the end of the coming decade, we will be able to use machine translation for instantaneous audiovisual crosslinguistic communication. For example, while I speak English, my associate in Japan could see me speaking Japanese.

II. BACKGROUND AND SIGNIFICANCE

Access to the Web and other sources of digital information requires not only a computer, but also the ability to interact with computers in a competent manner. Language and literacy barriers, as well as physical disabilities, provide significant obstacles to many of our citizens. "Today, every aspect of computers, from the out-of-the-box experience to surfing the Internet, is a joy to 'technoguys' and an unpleasant challenge to ordinary citizens" (Tognazinni, [...], p. [...]). We are developing and testing a computer-animated conversational agent that will increase the use of digital information and qualitatively enhance human-machine interaction. Our research is guided by the hypothesis that a conversational agent will increase access to communication and information technology, and encourage and support learning through digital media. The primary function of the conversational agent is to make the machine more humanlike and thus to empower the user, who is able to interact more naturally and productively with other humans than with machines.

In order to support the development of conversational agents for universal access and learning, research is necessary on a number of core technologies and their integration. These are: dialogue modeling, natural language processing, speech recognition, speech synthesis, and facial animation. Many of the key research challenges and potential advancements to the state of the art lie at the boundaries where these disciplines meet. This work will be supported by the experimental investigation of the functional value of conversational systems.

III. THE IMPORTANCE OF TALKING FACES IN DIALOGUE

Why haven't engineers and computer scientists been able to program a computer to recognize and understand speech as well as a [...]-year-old child? One reason is that speech recognition systems use just one or only a few sources of information. People, on the other hand, seem to use many sources of information and are able to combine several of them in an optimal fashion. Computers are programmed to process clear-cut categories. Interpreting ambiguous or fuzzy data, however, is natural for humans. This is best seen in face-to-face communication. Experiments have revealed conclusively that our perception and understanding are influenced by a speaker's face and accompanying gestures, as well as the actual sound of the speech (Massaro, [...]). For example, if the ambiguous auditory sentence, "My bab pop me poo brive," is paired with the visible sentence, "My gag kok me koo grive," the perceiver is likely to hear, "My dad taught me to drive." Two ambiguous sources of information are combined to create a meaningful interpretation (see Massaro and Stork, [...], for a detailed analysis of this example).

Information in the face is particularly effective when the auditory speech is degraded because of noise, limited bandwidth, or hearing impairment; however, the strong influence of visible speech is not limited to situations with degraded auditory input. A perceiver's recognition of an auditory-visual syllable reflects the contribution of both sound and sight. Our most recent findings show that speechreading, or the ability to obtain speech information from the face, is not compromised by partial indirection, partial obstruction, or visual distance. Humans are fairly good at speechreading even if they are not looking directly at the talker's lips. Furthermore, accuracy is not dramatically reduced when the facial image is blurred (because of poor vision, for example); when the face is viewed from above, below, or in profile; or when there is a large distance between the talker and the viewer.

With our completely animated synthetic talking head we can control the parameters of visible speech and study its informative aspects. Figure [...] shows the talking head, called Baldi. As can be seen in the figure, there is not much behind Baldi's artificial exterior. His existence and functionality depend on computer animation and text-to-speech synthesis. His speech is controlled by parameters, including: jaw rotation and thrust; horizontal mouth width; lip corner and protrusion controls; lower lip tuck; vertical lip position; horizontal and vertical teeth offset; and tongue angle, width, and length. Figure [...] illustrates Baldi's articulation at onset for the syllables ba, va, tha, da, cha, and wa. Our experiments (Cohen, Walker, and Massaro, [...]; Massaro, [...]) have shown that visible speech produced by the synthetic head, even in its adumbrated form, is almost comparable to that of a real human (see http://mambo.ucsc.edu/psl/pslfan.html).

FIGURE [...]. The talking head, called Baldi. As can be seen in the figure, there is not much behind his attractive exterior.

A primary objective of our research is to identify the informative properties of the human face by evaluating the effectiveness of various properties in our synthetic face. The value of facial animation in the development of synthetic speech is analogous to the important contribution of auditory speech synthesis in speech perception research. The development of a realistic, high-quality facial display has provided a powerful tool to continue the investigation of a number of questions in auditory-visual speech perception.
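The "My bab pop me poo brive" example earlier in this section, where two ambiguous sources yield one clear percept, can be illustrated with a small numerical sketch. The text does not state a combination rule at this point, so the rule used here (multiply each candidate's support from the two sources, then normalize) is an assumption made for illustration. The support values are invented, and `integrate`, `auditory_support`, and `visual_support` are hypothetical names.

```python
from typing import Dict


def integrate(auditory: Dict[str, float], visual: Dict[str, float]) -> Dict[str, float]:
    """Combine per-candidate support from two sources.

    Assumed rule (not taken from the chapter text): multiply the support
    each source gives a candidate, then normalize so the results sum to 1.
    """
    candidates = sorted(auditory.keys() & visual.keys())
    products = {c: auditory[c] * visual[c] for c in candidates}
    total = sum(products.values())
    if total == 0:
        raise ValueError("no candidate is supported by both sources")
    return {c: p / total for c, p in products.items()}


# Invented support values for one ambiguous consonant.
# By ear, the sound is equally consistent with /b/ and /d/.
auditory_support = {"b": 0.45, "d": 0.45, "g": 0.10}
# By eye, the articulation is consistent with /d/ or /g/, but hardly with /b/.
visual_support = {"b": 0.05, "d": 0.50, "g": 0.45}

combined = integrate(auditory_support, visual_support)
best = max(combined, key=combined.get)
print(best, combined)
```

With these made-up values neither source is decisive on its own, yet /d/ dominates the combined estimate, because it is the only candidate that both sources support reasonably well.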
This visible speech synthesis permits the type of experimentation necessary to determine: 1) what properties of visible speech are used, 2) how they are processed, and 3) how this information is integrated with auditory and other contextual sources of information in speech perception. In our research we systematically manipulate audible and visible speech independent of one another (as in the example of "My bab pop me poo brive," + "My gag kok me koo grive," = "My dad taught me to drive," given previously). We present these test stimuli to human perceivers for identification and discrimination. This experimental and theoretical framework has already established several facts concerning speech perception by eye and ear.

Methods of analysis of real speech articulation have guided our research in visible speech synthesis. Perception experiments have indicated how well the synthesis simulates real speakers. An understanding of visible speech perception derived from these experiments has assisted in our development of visible speech.

FIGURE [...]. Baldi's articulation at onset for the syllables ba, va, tha, da, cha, and wa.

Our goal at the Perceptual Science Laboratory (PSL) has been to create a talking head whose facial motions look realistic, not to duplicate the musculature of the face. We have chosen to develop visible speech synthesis in the same manner that has proven most successful with audible speech synthesis. We call this technique terminal analogue synthesis. Its goal is, simply, to mimic the final speech product rather than the physiological mechanisms that produce it. One advantage of terminal analogue synthesis is that calculations for changing the surface shapes of the polygon models can be carried out much more rapidly than calculations for muscle and tissue simulations. It also may be easier to achieve the desired facial shapes directly rather than in terms of the constituent muscle actions.

The real-time animation of the synthetic head (at up to [...] frames per second) was developed on a Silicon Graphics Crimson Reality Engine with [...] megabytes of RAM and a [...] MHz R[...] microprocessor using the IrisGL graphics library. Because our implementation is efficient, we have been able to port the animation algorithm to a PC platform (an Intel Pentium Pro [...] MHz Windows NT platform, using the newer OpenGL graphics library), for use in language training with profoundly deaf children at the Tucker Maxon Oral School in Portland, Oregon (Cole et al., [...]).

The talking head is made of polygons that have been joined together and smooth shaded. The polygon topology and animation are controlled through a set of parameters (Cohen and Massaro, [...]). The head is made up of approximately [...] surfaces connected at the edges to create the 3D head with eyes, pupil, iris, sclera, eyebrows, nose, skin, lips, tongue, teeth, and neck (see Figure [...]). Baldi has teeth and a tongue, as every good speaker should. The teeth are wireframe squares. The tongue is implemented as a shaded surface made of a polygon mesh, controlled by four parameters: length, angle, width, and thickness. The synthetic head does not yet have ears, hair, shoulders, arms, or hands.

Our implementation of visible speech synthesis has progressed over the last [...] years to include additional and modified control parameters, two generations of tongues, a visual speech synthesis control strategy, text-to-speech synthesis, bimodal (auditory-visual) synthesis, and controls for paralinguistic information and affect in the face. Figure [...] illustrates Baldi's facial expressions for happiness, anger, surprise, fear, sadness, and disgust.

Most of our current parameters move points on the face geometrically by rotation (e.g., jaw rotation) or translation, in one or more dimensions (e.g., lower and upper lip height, mouth widening). Other parameters are changed by interpolation between alternate faces. Many of the face shape parameters, such as cheek shape, neck shape, and forehead shape, as well as some affect parameters, such as smiling, are controlled by this algorithm. The synthesis program, which consists of about [...] lines of C code, runs in real time on both SGI and PC platforms.

FIGURE [...]. Baldi's expression of happiness, anger, surprise, fear, sadness, and disgust.

While developing the animation and speech synthesis, we have continuously evaluated and improved the synthetic speech. It is a sobering fact that auditory speech synthesis still falls far short of natural speech after [...] years of intensive research and development (Cohen et al., [...]; Massaro, [...], Chapter [...]). We have not allowed ourselves to be swayed by the subjective comments of many viewers who claim, "How natural Baldi seems." Instead, we systematically perform experiments to compare the quality of the synthetic speech to natural speech. The relative realism of Baldi's visible speech is measured in terms of its intelligibility to speechreaders. The experiments measure comparative intelligibility to determine where and how Baldi falls short of natural speakers. The synthesis is then modified accordingly, bringing it more in line with natural visible speech. After a series of over a dozen such evaluation experiments, with appropriate adjustments to the synthesis, the quality of the visible speech is almost as good as that of a very good natural speaker (Massaro, [...]).

In addition to this experimental evidence we have collected expert testimonials. A deaf speech scientist claims that Baldi's is the only existing synthetic visible speech that he can accurately speechread. Also, a deaf man, Daniel Solcher, from San Antonio, Texas, has independently downloaded Baldi for use in speech production training. He has formally presented this technology to his former school, Sunshine Cottage, with the hope of making it accessible to all of the students there.

IV. CONTROL MECHANISM

The problem of designing control mechanisms for synthesis is one aspect of the much broader problem of communication coding. The problem to be solved in communication coding is how one can most economically code information such as the visual (and auditory) appearance of a talking face, transmit the information, and reconstruct the image at another location. There are many levels of coding that differ in terms of how much of the output information is a specific reproduction of the input information, and how much is reconstructed on the basis of abstract, repeatable tokens. In general, the more abstract the coding, the more control the user has over the resulting image. We have chosen to use higher levels of visual and auditory speech coding, which involve the identification of abstract linguistic objects such as segments, words, and ideas. The use of these linguistic objects allows the most parsimonious transmission of information, as well as allowing synthesis to take place independently of analysis. When considering such higher levels of coding, a preliminary question that must be answered is what the appropriate units of representation are for analysis and synthesis.

V. UNITS OF SPEECH SYNTHESIS

Paralleling much of the work in auditory speech synthesis, we use phonemes as the basic unit of speech synthesis. Phoneme units are attractive because of their relatively small number. Because many utterances are completely novel, one of the most important decisions is to choose a unit of synthesis that allows the nece
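Section III describes two kinds of control parameter: those that move points geometrically (rotation or translation) and those that interpolate between alternate faces. The following is a minimal sketch of both ideas on a toy point list, not the authors' C implementation. The function names, the choice of the x axis as the jaw hinge, and all coordinates are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def rotate_about_x(points: List[Point], pivot: Point, angle_deg: float) -> List[Point]:
    """Rotate points about an axis parallel to x that passes through pivot.

    Stands in for a geometric parameter such as jaw rotation; the hinge
    axis and pivot location are hypothetical.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    rotated = []
    for x, y, z in points:
        dy, dz = y - pivot[1], z - pivot[2]
        rotated.append((x,
                        pivot[1] + dy * cos_a - dz * sin_a,
                        pivot[2] + dy * sin_a + dz * cos_a))
    return rotated


def interpolate_faces(neutral: List[Point], target: List[Point], weight: float) -> List[Point]:
    """Linearly blend two face shapes that share the same vertex order.

    weight = 0 returns the neutral shape, weight = 1 the target shape.
    """
    if len(neutral) != len(target):
        raise ValueError("both faces must have the same number of vertices")
    return [tuple(n + weight * (t - n) for n, t in zip(pn, pt))
            for pn, pt in zip(neutral, target)]


# Toy data: two "jaw" vertices and a two-vertex "face" with a smiling variant.
jaw_points = [(0.0, -1.0, 1.0), (0.5, -1.2, 0.8)]
opened = rotate_about_x(jaw_points, pivot=(0.0, 0.0, 0.0), angle_deg=10.0)

neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
smile = [(0.0, 0.2, 0.0), (1.0, 0.4, 0.1)]
half_smile = interpolate_faces(neutral, smile, 0.5)
print(opened, half_smile)
```

Both operations act directly on surface points, which is in the spirit of terminal analogue synthesis: the shape is set directly, with no muscle or tissue model in between.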
