Visyllable Based Speech Animation
Abstract
Visemes are the visual counterparts of phonemes. Traditionally, the speech animation of 3D synthetic faces involves extraction of visemes from input speech followed by the application of co-articulation rules to generate realistic animation. In this paper, we take a novel approach to speech animation: using visyllables, the visual counterparts of syllables. The approach results in a concatenative visyllable-based speech animation system. The key contribution of this paper lies in two main areas. Firstly, we define a set of visyllable units for spoken English along with the associated phonological rules for valid syllables. Based on these rules, we have implemented a syllabification algorithm that allows segmentation of a given phoneme stream into syllables and subsequently visyllables. Secondly, we have recorded the database of visyllables using a facial motion capture system. The recorded visyllable units are post-processed semi-automatically to ensure continuity at the vowel boundaries of the visyllables. We define each visyllable in terms of the Facial Movement Parameters (FMP). The FMPs are obtained as a result of the statistical analysis of the facial motion capture data. The FMPs allow a compact representation of the visyllables. Further, the FMPs also facilitate the formulation of rules for boundary matching and smoothing after concatenating the visyllable units. Ours is the first visyllable-based speech animation system. The proposed technique is easy to implement, effective for real-time as well as non-real-time applications, and results in realistic speech animation.
Categories and Subject Descriptors (according to ACM CCS): I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism
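The segmentation of a phoneme stream into syllables mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's phonological rules, so this example substitutes the common maximal-onset heuristic. The phoneme labels (ARPAbet-style), the `VOWELS` and `VALID_ONSETS` tables, and the `syllabify` function are all assumptions made for illustration, not the authors' algorithm.

```python
# Illustrative only: a tiny ARPAbet-style inventory, not the paper's rule set.
VOWELS = {"AA", "AE", "AH", "AO", "EH", "ER", "IH", "IY",
          "OW", "UH", "UW", "AY", "EY", "OY", "AW"}

# Hypothetical table of multi-consonant clusters allowed to start a syllable.
VALID_ONSETS = {
    ("S", "T"), ("S", "P"), ("S", "K"), ("S", "T", "R"), ("S", "P", "R"),
    ("T", "R"), ("P", "R"), ("K", "R"), ("P", "L"), ("K", "L"), ("B", "R"),
}


def syllabify(phonemes):
    """Split a list of phoneme labels into syllables (maximal-onset heuristic)."""
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nuclei:
        # No vowel: return the whole stream as one unit (or nothing if empty).
        return [list(phonemes)] if phonemes else []

    boundaries = [0]
    for prev, nxt in zip(nuclei, nuclei[1:]):
        cluster = phonemes[prev + 1:nxt]  # consonants between two nuclei
        # Give the next syllable the longest valid onset; k counts the
        # consonants left behind as the coda of the previous syllable.
        for k in range(len(cluster) + 1):
            onset = tuple(cluster[k:])
            # Simplification: any single consonant is accepted as an onset.
            if len(onset) <= 1 or onset in VALID_ONSETS:
                boundaries.append(prev + 1 + k)
                break
    boundaries.append(len(phonemes))
    return [list(phonemes[a:b]) for a, b in zip(boundaries, boundaries[1:])]


if __name__ == "__main__":
    print(syllabify(["EH", "K", "S", "T", "R", "AH"]))
```

Each resulting syllable would then be mapped to a recorded visyllable unit. That lookup depends on the unit inventory defined in the paper and is not sketched here.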
BibTeX
@article {10.1111:1467-8659.t01-2-00711,
journal = {Computer Graphics Forum},
title = {{Visyllable Based Speech Animation}},
author = {Kshirsagar, Sumedha and Magnenat-Thalmann, Nadia},
year = {2003},
publisher = {Blackwell Publishers, Inc and the Eurographics Association},
ISSN = {1467-8659},
DOI = {10.1111/1467-8659.t01-2-00711}
}