| Paper: | SP-P9.4 |
| Session: | Topics in Speech Synthesis |
| Time: | Wednesday, May 19, 15:30 - 17:30 |
| Presentation: | Poster |
| Topic: | Speech Processing: Speech Synthesis (including TTS) |
| Title: | REFINING SEGMENTAL BOUNDARIES FOR TTS DATABASE USING FINE CONTEXTUAL-DEPENDENT BOUNDARY MODELS |
| Authors: | Lijuan Wang; Tsinghua University |
| | Yong Zhao; Microsoft Research Asia |
| | Min Chu; Microsoft Research Asia |
| | Jian-Lai Zhou; Microsoft Research Asia |
| | Zhigang Cao; Tsinghua University |
| Abstract: | This paper proposes a post-refining method for the automatic segmentation task that uses fine context-dependent GMMs. A GMM trained on a super feature vector, extracted from multiple evenly spaced frames near the boundary, describes the waveform evolution across that boundary. CART is used to cluster acoustically similar boundaries, so that the GMM for each leaf node can be trained reliably from a limited number of manually labeled boundaries. An accuracy of 90% is thus achieved when only about 250 manually labeled sentences are provided to train the refining models. |
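
The super feature vector mentioned in the abstract can be illustrated with a minimal sketch. It only shows the general idea of concatenating frames taken at evenly spaced offsets around a candidate boundary. The function name `super_vector`, the number of frames per side, the spacing, and the clamping at utterance edges are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import List, Sequence


def super_vector(frames: Sequence[Sequence[float]], boundary: int,
                 n_each_side: int = 2, step: int = 1) -> List[float]:
    """Concatenate frames at evenly spaced offsets around `boundary`.

    `frames` holds one feature vector per analysis frame. The values of
    `n_each_side` and `step` are illustrative assumptions, not taken
    from the paper.
    """
    # Offsets such as -2, -1, 0, 1, 2 (scaled by `step`) around the boundary.
    offsets = [k * step for k in range(-n_each_side, n_each_side + 1)]
    last = len(frames) - 1
    vec: List[float] = []
    for off in offsets:
        # Clamp to the valid range so boundaries near the edges still work
        # (an assumption; other edge policies are possible).
        idx = min(max(boundary + off, 0), last)
        vec.extend(frames[idx])
    return vec


# Toy data: 6 frames of 2-dimensional features.
frames = [[float(t), float(t) * 10.0] for t in range(6)]
sv = super_vector(frames, boundary=3, n_each_side=1, step=2)
print(sv)  # [1.0, 10.0, 3.0, 30.0, 5.0, 50.0]
```

In a pipeline like the one the abstract describes, vectors of this kind would be collected at manually labeled boundaries, grouped by the CART leaf of each boundary's context, and used to train one GMM per leaf.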