INTERSPEECH 2007:
Antwerp, Belgium
INTERSPEECH 2007, 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007.
ISCA 2007
Keynotes 1-4
Discriminative and Large Margin Techniques in Acoustic Modeling
- Jinyu Li, Chin-Hui Lee:
Soft margin feature extraction for automatic speech recognition.
30-33

- Yan Yin, Hui Jiang:
A fast optimization method for large margin estimation of HMMs based on second order cone programming.
34-37

- Hao-Zheng Li, Douglas D. O'Shaughnessy:
Frame margin probability discriminative training algorithm for noisy speech recognition.
38-41

- Fabio Valente, Jithendra Vepa, Christian Plahl, Christian Gollan, Hynek Hermansky, Ralf Schlüter:
Hierarchical neural networks feature extraction for LVCSR system.
42-45

- Peder A. Olsen, John R. Hershey:
Bhattacharyya error and divergence using variational importance sampling.
46-49

- Tingyao Wu, Jacques Duchateau, Dirk Van Compernolle:
Phoneme dependent frame selection preference.
50-53

Speech Production I, II
- Xinhui Zhou, Carol Y. Espy-Wilson, Mark Tiede, Suzanne Boyce:
An articulatory and acoustic study of "retroflex" and "bunched" American English rhotic sound based on MRI.
54-57

- Paula Martins, Inês Carbone, Augusto Silva, António J. S. Teixeira:
An MRI study of European Portuguese nasals.
58-61

- Sayoko Takano, Hiroki Matsuzaki, Kunitoshi Motoki:
A four-cube FEM model of the extrinsic and intrinsic tongue muscles to simulate the production of vowel /i/.
62-65

- Juan F. Torres, Elliot Moore:
Performance evaluation of glottal quality measures from the perspective of vocal tract filter consistency.
66-69

- Veena D. Singampalli, Philip J. B. Jackson:
Statistical identification of critical, dependent and redundant articulators.
70-73

- Chao Qin, Miguel Á. Carreira-Perpiñán:
An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping.
74-77

Phonetic Segmentation and Classification I, II
- Peter Karsmakers, Kristiaan Pelckmans, Johan A. K. Suykens, Hugo Van hamme:
Fixed-size kernel logistic regression for phoneme classification.
78-81

- Seung Seop Park, Jong Won Shin, Jong Kyu Kim, Nam Soo Kim:
A multiple-model based framework for automatic speech segmentation.
82-85

- Aren Jansen, Partha Niyogi:
Semi-supervised learning of speech sounds.
86-89

- Abhinav Parate, Ashish Verma, Jayanta Basak:
Evaluation of syllable stress using single class classifier.
90-93

- Mohammad Nurul Huda, Muhammad Ghulam, Junsei Horikawa, Tsuneo Nitta:
Distinctive phonetic feature (DPF) based phone segmentation using hybrid neural networks.
94-97

- Jean Philippe Goldman, Mathieu Avanzi, Anne-Catherine Simon, Anne Lacheret, Antoine Auchlin:
A methodology for the automatic detection of perceived prominent syllables in spoken French.
98-101

Discourse, Dialog and Conversation
Spoken Dialog Systems I, II
- Craig Wootton, Michael F. McTear, Terry Anderson:
Utilizing online content as domain knowledge in a multi-domain dynamic dialogue system.
122-125

- Boris W. van Schooten, Sophie Rosset, Olivier Galibert, Aurélien Max, Rieks op den Akker, Gabriel Illouz:
Handling speech input in the Ritel QA dialogue system.
126-129

- Woosung Kim:
Online call quality monitoring for automating agent-based call centers.
130-133

- Sebastian Möller, Klaus-Peter Engelbrecht, Antti Oulasvirta:
Analysis of communication failures for spoken dialogue systems.
134-137

- Sandra Mann, André Berton, Ute Ehrlich:
How to access audio files of large data bases using in-car speech dialogue systems.
138-141

- Kazunori Komatani, Tatsuya Kawahara, Hiroshi G. Okuno:
Analyzing temporal transition of real user's behaviors in a spoken dialogue system.
142-145

- J. Sherwani, Dong Yu, Tim Paek, Mary Czerwinski, Yun-Cheng Ju, Alex Acero:
Voicepedia: towards speech-based access to unstructured information.
146-149

- Vivek Kumar Rangarajan Sridhar, Srinivas Bangalore, Shrikanth S. Narayanan:
Exploiting prosodic features for dialog act tagging in a discriminative modeling framework.
150-153

- Hua Ai, Antonio Roque, Anton Leuski, David R. Traum:
Using information state to improve dialogue move identification in a spoken dialogue system.
154-157

- Shiu-Wah Chu, Ian M. O'Neill, Philip Hanna:
Using multiple strategies to manage spoken dialogue.
158-161

- Marcelo Quinderé, Luís Seabra Lopes, António J. S. Teixeira:
An information state based dialogue manager for a mobile robot.
162-165

Accent and Language Identification I, II
- Josef G. Bauer, Bernt Andrassy, Ekaterina Timoshenko:
Discriminative optimization of language adapted HMMs for a language identification system based on parallel phoneme recognizers.
166-169

- Khe Chai Sim, Haizhou Li:
Fusion of contrastive acoustic models for parallel phonotactic spoken language identification.
170-173

- Liang Wang, Eliathamby Ambikairajah, Eric H. C. Choi:
Multi-layer Kohonen self-organizing feature map for language identification.
174-177

- Bo Yin, Eliathamby Ambikairajah, Fang Chen:
Hierarchical language identification based on automatic language clustering.
178-181

- Ekaterina Timoshenko, Harald Höge:
Using speech rhythm for acoustic language identification.
182-185

- Kakeung Wong, Man-Hung Siu, Brian Mak:
A model-based estimation of phonotactic language verification performance.
186-189

- Mike Rosner, Paulseph-John Farrugia:
A tagging algorithm for mixed language identification in a noisy domain.
190-193

- Doroteo Torre Toledano, Javier Gonzalez-Dominguez, Alejandro Abejón-Gonzalez, Danilo Spada, Ismael Mateos-Garcia, Joaquin Gonzalez-Rodriguez:
Improved language recognition using better phonetic decoders and fusion with MFCC and SDC features.
194-197

Education and Training
- Daniel Bolaños, Wayne Ward, Sarel van Vuuren, Javier Garrido:
Syllable lattices as a basis for a children's speech reading tracker.
198-201

- Fuping Pan, Qingwei Zhao, Yonghong Yan:
Mandarin vowel pronunciation quality evaluation by using formant pattern recognition.
202-205

- Matthew Black, Joseph Tepperman, Sungbok Lee, Patti Price, Shrikanth S. Narayanan:
Automatic detection and classification of disfluent reading miscues in young children's speech for the purpose of assessment.
206-209

- Nobuaki Minematsu, K. Kamata, Satoshi Asakawa, T. Makino, Tazuko Nishimura, Keikichi Hirose:
Structural assessment of language learners' pronunciation.
210-213

- Abdurrahman Samir, Sherif Mahdy Abdou, Ahmed Husien Khalil, Mohsen Rashwan:
Enhancing usability of CAPL system for Qur'an recitation learning.
214-217

- Febe de Wet, Christa van der Walt, Thomas Niesler:
Automatic large-scale oral language proficiency assessment.
218-221

Robust ASR I, II
- Yuki Denda, Takamasa Tanaka, Masato Nakayama, Takanobu Nishiura, Yoichi Yamashita:
Noise-robust hands-free voice activity detection with adaptive zero crossing detection using talker direction estimation.
222-225

- Agustín Álvarez Marquina, Rafael Martínez, Pedro Gómez Vilda, Victor Nieto Lluis, V. Rodellar:
A robust mel-scale subband voice activity detector for a car platform.
226-229

- Kentaro Ishizuka, Tomohiro Nakatani, Masakiyo Fujimoto, Noboru Miyazaki:
Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio.
230-233

- A. M. Toh, Roberto Togneri, Sven Nordholm:
Feature and distribution normalization schemes for statistical mismatch reduction in reverberant speech recognition.
234-237

- Matthew Gibson, Thomas Hain:
Temporal masking for unsupervised minimum Bayes risk speaker adaptation.
238-241

- Tsung-hsueh Hsieh, Jeih-Weih Hung:
Speech feature compensation based on pseudo stereo codebooks for robust speech recognition in additive noise environments.
242-245

- Dimitrios Dimitriadis, Petros Maragos, Stamatios Lefkimmiatis:
Multiband, multisensor robust features for noisy speech recognition.
246-249

- Akira Sasou, Hiroaki Kojima:
Noise robust speech recognition for voice driven wheelchair.
250-253

Adaptation in ASR I, II
- Yun Tang, Richard C. Rose:
Clustered maximum likelihood linear basis for rapid speaker adaptation.
254-257

- Wen Xuan Teng, Guillaume Gravier, Frédéric Bimbot, Frédéric Soufflet:
Rapid speaker adaptation by reference model interpolation.
258-261

- Randy Gomez, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano:
Rapid unsupervised speaker adaptation using single utterance based on MLLR and speaker selection.
262-265

- Brian Kan-Wing Mak, Roger Wend-Huu Hsiao:
Robustness of several kernel-based fast adaptation methods on noisy LVCSR.
266-269

- Janne Pylkkönen:
Estimating VTLN warping factors by distribution matching.
270-273

- Ming Liu, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang, Zhengyou Zhang:
Frequency domain correspondence for speaker normalization.
274-277

- Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:
Unsupervised training of adaptation rate using Q-learning in large vocabulary continuous speech recognition.
278-281

- Martin Karafiát, Lukás Burget, Jan Cernocký, Thomas Hain:
Application of CMLLR in narrow band wide band adapted systems.
282-285

- Christophe Lévy, Georges Linarès, Jean-François Bonastre:
Fast adaptation of GMM-based compact models.
286-289

Speaker Verification & Identification I-IV
- Zahi N. Karam, William M. Campbell:
A new kernel for SVM MLLR based speaker recognition.
290-293

- Kong-Aik Lee, Changhuai You, Haizhou Li, Tomi Kinnunen:
A GMM-based probabilistic sequence kernel for speaker verification.
294-297

- Hagai Aronowitz:
Speaker recognition using kernel-PCA and intersession variability modeling.
298-301

- Réda Dehak, Najim Dehak, Patrick Kenny, Pierre Dumouchel:
Linear and non linear kernel GMM supervector machines for speaker verification.
302-305

- Ignacio Lopez-Moreno, Ismael Mateos-Garcia, Daniel Ramos, Joaquin Gonzalez-Rodriguez:
Support vector regression for speaker verification.
306-309

- Chris Longworth, Mark J. F. Gales:
Derivative and parametric kernels for speaker verification.
310-313

Spoken Data Retrieval I, II
- David R. H. Miller, Michael Kleber, Chia-Lin Kao, Owen Kimball, Thomas Colthurst, Stephen A. Lowe, Richard M. Schwartz, Herbert Gish:
Rapid and accurate spoken term detection.
314-317

- Yi-Cheng Pan, Hung-lin Chang, Berlin Chen, Lin-Shan Lee:
Subword-based position specific posterior lattices (s-PSPL) for indexing speech information.
318-321

- Andreas Merkel, Dietrich Klakow:
Improved methods for language model based question classification.
322-325

- Tomoyosi Akiba, Hirofumi Tsujimura:
Error-tolerant question answering for spoken documents.
326-329

- Dilek Z. Hakkani-Tür, Gökhan Tür, Michael Levit:
Exploiting information extraction annotations for document retrieval in distillation tasks.
330-333

- Kishan Thambiratnam, Frank Seide:
Learning spoken document similarity and recommendation using supervised probabilistic latent semantic analysis.
334-337

Accent and Language Identification I, II
- David A. van Leeuwen, Khiet P. Truong:
An open-set detection evaluation methodology applied to language and emotion recognition.
338-341

- Xi Yang, Man-Hung Siu, Herbert Gish, Brian Mak:
Boosting with anti-models for automatic language identification.
342-345

- Fabio Castaldo, Daniele Colibro, Emanuele Dalmasso, Pietro Laface, Claudio Vair:
Acoustic language identification using fast discriminative training.
346-349

- Ming Li, Hongbin Suo, Xiao Wu, Ping Lu, Yonghong Yan:
Spoken language identification using score vector modeling and support vector machine.
350-353

- Ricardo de Córdoba, Luis Fernando D'Haro, Fernando Fernández-Martínez, Javier Macías Guarasa, Javier Ferreiros:
Language identification based on n-gram frequency ranking.
354-357

- Wade Shen, Douglas A. Reynolds:
Improving phonotactic language recognition with acoustic adaptation.
358-361

Speech Perception I, II
- Michael C. W. Yip:
Spoken word recognition of Chinese homophones: a further investigation.
362-365

- Maria Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, David Owens:
The role of outer hair cell function in the perception of synthetic versus natural speech.
366-369

- Akiko Kusumoto, Alexander Kain, John-Paul Hosom, Jan P. H. van Santen:
Hybridizing conversational and clear speech.
370-373

- Sophie Dufour, Ulrich H. Frauenfelder:
Neighborhood density and neighborhood frequency effects in French spoken word recognition.
374-377

- Toshio Irino, Yoshie Aoki, Yoshie Hayashi, Hideki Kawahara, Roy D. Patterson:
Discrimination and recognition of scaled word sounds.
378-381

- László Tóth:
Benchmarking human performance on the acoustic and linguistic subtasks of ASR systems.
382-385

- Lin Yang, Jianping Zhang, Yonghong Yan:
Contributions of temporal fine structure cues to Chinese speech recognition in cochlear implant simulation.
386-389

- Xihong Wu, Jing Chen, Zhigang Yang, Qiang Huang, Mengyuan Wang, Liang Li:
Effect of number of masking talkers on speech-on-speech masking in Chinese.
390-393

- Odile Bagou, Sophie Dufour, Cécile Fougeron, Alain Content, Ulrich H. Frauenfelder:
Do different boundary types induce subtle acoustic cues to which French listeners are sensitive?
394-397

- Svante Stadler, Arne Leijon, Björn Hagerman:
An information theoretic approach to predict speech intelligibility for listeners with normal and impaired hearing.
398-401

- Travis Wade, Bernd Möbius:
Speaking rate effects in a landmark-based phonetic exemplar model.
402-405

- Kazumi Maniwa, Allard Jongman, Travis Wade:
Acoustic correlates of intelligibility enhancements in clearly produced fricatives.
406-409

- Tim Jürgens, Thomas Brand, Birger Kollmeier:
Modelling the human-machine gap in speech reception: microscopic speech intelligibility prediction for normal-hearing subjects with an auditory model.
410-413

- Ayako Ikeno, John H. L. Hansen:
Lombard speech impact on perceptual speaker recognition.
414-417

- Huiwen Goy, Kathleen Pichora-Fuller, Pascal van Lieshout, Gurjit Singh, Bruce Schneider:
Effect of within- and between-talker variability on word identification in noise by younger and older adults.
418-421

- H. Timothy Bunnell, N. Carolyn Schanen, Linda D. Vallino, Thierry G. Morlet, James B. Polikoff, Jennette D. Driscoll, James T. Mantell:
Speech perception in children with speech sound disorder.
422-425

- Huan Wang, Werner Hemmert:
Speech coding and information processing by auditory neurons.
426-429

- Annie C. Gilbert, Victor J. Boucher:
What do listeners attend to in hearing prosodic structures? Investigating the human speech-parser using short-term recall.
430-433

Prosody:
Prosodic Structure
Prosodic Modeling I, II
- Jian Yu, Lixing Huang, Jianhua Tao, Xia Wang:
Modeling incompletion phenomenon in Mandarin dialog prosody.
462-465

- Anne Tamm, Kálmán Abari, Gábor Olaszy:
Accent assignment algorithm in Hungarian, based on syntactic analysis.
466-469

- Cheng-Yuan Lin, Pei-Chi Jao, Jyh-Shing Roger Jang:
An effective initial/final duration prediction method for corpus-based singing voice synthesis of Mandarin Chinese.
470-473

- Géza Németh, Márk Fék, Tamás Gábor Csapó:
Increasing prosodic variability of text-to-speech synthesizers.
474-477

- Damien Lolive, Nelly Barbot, Olivier Boëffard:
Unsupervised HMM classification of F0 curves.
478-481

- Ian Read, Stephen Cox:
Automatic pitch accent prediction for text-to-speech synthesis.
482-485

- Xinqiang Ni, Yining Chen, Frank K. Soong, Min Chu, Ping Zhang:
An unsupervised approach to automatic prosodic annotation.
486-489

- Zeynep Inanoglu, Steve Young:
A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality.
490-493

- Chen-Yu Chiang, Hsiu-Min Yu, Yih-Ru Wang, Sin-Horng Chen:
An automatic prosody labeling method for Mandarin speech.
494-497

Speech Analysis
Spectral Analysis, Formants and Vocal Tract Models
- Toon van Waterschoot, Marc Moonen:
Linear prediction of audio signals.
518-521

- Carlo Magi, Tomas Bäckström, Paavo Alku:
Stabilised weighted linear prediction - a robust all-pole method for speech processing.
522-525

- Daniel Rudoy, Daniel N. Spendley, Patrick J. Wolfe:
Conditionally linear Gaussian models for estimating vocal tract resonances.
526-529

- Karl Schnell, Arild Lacroix:
Time-varying pre-emphasis and inverse filtering of speech.
530-533

- Joachim Thiemann, Peter Kabal:
Reconstructing audio signals from modified non-coherent Hilbert envelopes.
534-537

- Binh Phu Nguyen, Masato Akagi:
A flexible spectral modification method based on temporal decomposition and Gaussian mixture model.
538-541

- Jonathan Darch, Ben Milner:
A comparison of estimated and MAP-predicted formants and fundamental frequencies with a speech reconstruction application.
542-545

- Huiqun Deng, Douglas D. O'Shaughnessy:
Effect of incomplete glottal closures on estimates of glottal waves via inverse filtering of vowel sounds.
546-549

- Kaustubh Kalgaonkar, Mark A. Clements:
Vocal tract and area function estimation with both lip and glottal losses.
550-553

- S. Guruprasad, B. Yegnanarayana, K. Sri Rama Murty:
Detection of instants of glottal closure using characteristics of excitation source.
554-557

- Nicolas Sturmel, Christophe d'Alessandro, Boris Doval:
A comparative evaluation of the zeros of z transform representation for voice source estimation.
558-561

Speech and Audio Processing for Intelligent Environments
- Aki Härmä:
Ambient telephony: scenarios and research challenges.
562-565

- Yasunari Obuchi, Akio Amano:
Always listening to you: creating exhaustive audio database in home environments.
566-569

- Joerg Schmalenstroeer, Reinhold Haeb-Umbach:
Joint speaker segmentation, localization and identification for streaming audio.
570-573

- Yan-Chen Lu, Martin Cooke, Heidi Christensen:
Active binaural distance estimation for dynamic sources.
574-577

- Bengt J. Borgström, Abeer Alwan:
A packetization and variable bitrate interframe compression scheme for vector quantizer-based distributed speech recognition.
578-581

- Matthias Wölfel:
Channel selection by class separability measures for automatic transcriptions on distant microphones.
582-585

- Danny Wyatt, Tanzeem Choudhury, Jeff Bilmes:
Conversation detection and speaker segmentation in privacy-sensitive situated speech data.
586-589

- Alberto Abad, Carlos Segura, Climent Nadeu, Javier Hernando:
Audio-based approaches to head orientation estimation in a smart-room.
590-593

- Valentin Ion, Reinhold Haeb-Umbach:
Multi-resolution soft features for channel-robust distributed speech recognition.
594-597

Language Modeling I, II
- Yi Su, Frederick Jelinek, Sanjeev Khudanpur:
Large-scale random forest language models for speech recognition.
598-601

- Yuya Akita, Yusuke Nemoto, Tatsuya Kawahara:
PLSA-based topic detection in meetings for adaptation of lexicon and language model.
602-605

- Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki:
Language modeling using PLSA-based topic HMM.
606-609

- Yi-Cheng Pan, Lin-Shan Lee:
Lexicon adaptation with reduced character error (LARCE) - a new direction in Chinese language modeling.
610-613

- Meng-Sung Wu, Jen-Tzung Chien:
Minimum rank error training for language modeling.
614-617

- Wen Wang, Andreas Stolcke:
Integrating MAP, marginals, and unsupervised language model adaptation.
618-621

Prosody Production and Perception
Multimodal Speech Recognition
- Noureddine Aboutabit, Denis Beautemps, Jeanne Clarke, Laurent Besacier:
A HMM recognition of consonant-vowel syllables from lip contours: the cued speech case.
646-649

- Patrick Lucey, Gerasimos Potamianos, Sridha Sridharan:
A unified approach to multi-pose audio-visual ASR.
650-653

- Rowan Seymour, Darryl Stewart, Ji Ming:
Audio-visual integration for robust speech recognition using maximum weighted stream posteriors.
654-657

- Thomas Hueber, Gérard Chollet, Bruce Denby, Gérard Dreyfus, Maureen Stone:
Continuous-speech phone recognition from ultrasound and optical images of the tongue and lips.
658-661

- Bo Zhu, Timothy J. Hazen, James R. Glass:
Multimodal speech recognition with ultrasonic sensors.
662-665

- David Dean, Patrick Lucey, Sridha Sridharan, Tim Wark:
Fused HMM-adaptation of multi-stream HMMs for audio-visual speech recognition.
666-669

Speech and Other Modalities
- Carlos Toshinori Ishi, Hiroshi Ishiguro, Norihiro Hagita:
Analysis of head motions and speech in spoken dialogue.
670-673

- Lars Bo Larsen, Kasper Løvborg Jensen, Søren Larsen, Morten H. Rasmussen:
A paradigm for mobile speech-centric services.
674-677

- Pavel Campr, Marek Hrúz, Milos Zelezný:
Design and recording of Czech sign language corpus for automatic sign language recognition.
678-681

- Jens Edlund, Jonas Beskow:
Pushy versus meek - using avatars to influence turn-taking behaviour.
682-685

- Michael Wand, Szu-Chen Stan Jou, Tanja Schultz:
Wavelet-based front-end for electromyographic speech recognition.
686-689

- Gaëlle Ferré, Roxane Bertrand, Philippe Blache, Robert Espesser, Stéphane Rauzy:
Intensive gestures in French and their multimodal correlates.
690-693

- Slim Ouni, Kaïs Ouni:
Aspects of visual speech in Arabic.
694-697

- Denis Burnham, Jessica Reynolds, Guillaume Vignali, Sandra Bollwerk, Caroline Jones:
Rigid vs non-rigid face and head motion in phone and tone perception.
698-701

Multimodal/Multimedia Signal Processing
- Hedvig Kjellström, Olov Engwall, Sherif Mahdy Abdou, Olle Bälter:
Audio-visual phoneme classification for pronunciation training applications.
702-705

- Katja Grauwinkel, Britta Dewitt, Sascha Fagel:
Visual information and redundancy conveyed by internal articulator dynamics in synthetic audiovisual speech.
706-709

- Wei Zhou, Zengfu Wang:
A speech rate related lip movement model for speech animation.
710-713

- Guanyong Wu, Jie Zhu:
An extension 2DPCA based visual feature extraction method for audio-visual speech recognition.
714-717

- Soo-jong Lee, Jun Park, Eung-kyeu Kim:
Preventing an external acoustic noise from being misrecognized as a speech recognition object by confirming the lip movement image signal.
718-721

- Gregor Hofer, Hiroshi Shimodaira:
Automatic head motion prediction from speech data.
722-725

- Yuki Denda, Takanobu Nishiura, Yoichi Yamashita:
Omnidirectional audio-visual talker localizer with dynamic feature fusion based on validity and reliability criteria.
726-729

- Nick Campbell, Damien Douxchamps:
Processing image and audio information for recognising discourse participation status through features of face and voice.
730-733

Speaker Verification & Identification I-IV
- José R. Calvo, Rafael Fernández, Gabriel Hernández:
Application of shifted delta cepstral features in speaker verification.
734-737

- Luciana Ferrer, M. Kemal Sönmez, Elizabeth Shriberg:
A smoothing kernel for spatially related features and its application to speaker verification.
738-741

- Delphine Charlet, Mikaël Collet, Frédéric Bimbot:
VZ-norm: an extension of z-norm to the multivariate case for anchor model based speaker verification.
742-745

- Howard Lei, Nikki Mirghafori:
Word-conditioned HMM supervectors for speaker recognition.
746-749

- Wei-Ho Tsai:
Speaker clustering using direct maximization of a BIC-based score.
750-753

- Alexandre Preti, Jean-François Bonastre, Driss Matrouf, François Capman, Bertrand Ravera:
Confidence measure based unsupervised target model adaptation for speaker verification.
754-757

- Huanjun Bao, Ming-Xing Xu, Thomas Fang Zheng:
Emotion attribute projection for speaker recognition on emotional speech.
758-761

- Shi-Xiong Zhang, Man-Wai Mak, Helen M. Meng:
High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling.
762-765

- T. Yingthawornsuk, H. Kaymaz Keskinpala, D. Mitchell Wilkes, Richard G. Shiavi, R. M. Salomon:
Direct acoustic feature using iterative EM algorithm and spectral energy for classifying suicidal speech.
766-769

- Claudio Garretón, Néstor Becerra Yoma, Fernando Huenupán, Carlos Molina:
On comparing and combining intra-speaker variability compensation and unsupervised model adaptation in speaker verification.
770-773

- Xianyu Zhao, Yuan Dong, Hao Yang, Jian Zhao, Liang Lu, Haila Wang:
Comparison of two kinds of speaker location representation for SVM-based speaker verification.
774-777

- Mireia Farrús, Javier Hernando, Pascual Ejarque:
Jitter and shimmer measurements for speaker recognition.
778-781

- Zhenyu Shan, Yingchun Yang, Ruizhi Ye:
Natural-emotion GMM transformation algorithm for emotional speaker recognition.
782-785

- Ivy H. Tseng, Olivier Verscheure, Deepak S. Turaga, Upendra V. Chaudhari:
Optimized one-bit quantization for adapted GMM-based speaker verification.
786-789

- Mitchell McLaren, Robbie Vogt, Brendan Baker, Sridha Sridharan:
A comparison of session variability compensation techniques for SVM-based speaker recognition.
790-793

- Benoit G. B. Fauve, Nicholas W. D. Evans, Neil Pearson, Jean-François Bonastre, John S. D. Mason:
Influence of task duration in text-independent speaker verification.
794-797

Speech Enhancement
- Kamil K. Wójcicki, Stephen So, Kuldip K. Paliwal:
The effect of the additivity assumption on time and frequency domain Wiener filtering for speech enhancement.
798-801

- Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki:
Noise reduction based on adaptive β-order generalized spectral subtraction for speech enhancement.
802-805

- Amit Das, John H. L. Hansen:
Class constrained ROVER based speech enhancement.
806-809

- Erhan Deger, Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu, Md. Kamrul Hasan:
EMD based soft-thresholding for speech enhancement.
810-813

- Adam Borowicz, Alexander A. Petrovsky:
An approximate solution for perceptually constrained signal subspace speech enhancement method.
814-817

- Tim Fingscheidt, Suhadi Suhadi:
Quality assessment of speech enhancement systems by separation of enhanced speech, noise, and echo.
818-821

- Anis Ben Aicha, Sofia Ben Jebara:
Perceptual musical noise reduction using critical bands tonality coefficients and masking thresholds.
822-825

- Dirk Mauler, Anil M. Nagathil, Rainer Martin:
On optimal estimation of compressed speech for hearing aids.
826-829

- Richard C. Hendriks, Jesper Jensen, Richard Heusdens:
DFT domain subspace based noise tracking for speech enhancement.
830-833

- Nitish Krishnamurthy, John H. L. Hansen:
Noise tracking for speech systems in adverse environments.
834-837

- Abderrahman Essebbar, Tristan Poinsard:
Speech enhancement using multi-reference noise reduction in a vehicle environment.
838-841

- Ernst Warsitz, Reinhold Haeb-Umbach, Dang Hai Tran Vu:
Blind adaptive principal eigenvector beamforming for acoustical source separation.
842-845

- Zbynek Koldovský, Petr Tichavský:
Time-domain blind audio source separation using advanced ICA methods.
846-849

- Siu Wa Lee, Frank K. Soong, Pak-Chung Ching:
Model-based speech separation with single-microphone input.
850-853

- Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Masato Miyoshi:
Multi-step linear prediction based speech dereverberation in noisy reverberant environment.
854-857

- Seung Yeol Lee, Jong Won Shin, Hwan Sik Yun, Nam Soo Kim:
A statistical model based post-filtering algorithm for residual echo suppression.
858-861

- Xiaoshan Huang, Xiaoqun Zhao:
An optimal speech enhancement under speech uncertainty probability and masking property of auditory system.
862-865

Structure-based and Template-based Automatic Speech Recognition
- Viktoria Maier, Roger K. Moore:
Temporal episodic memory model: an evolution of MINERVA2.
866-869

- Gianpaolo Coro, Francesco Cutugno, Fulvio Caropreso:
Speech recognition with factorial-HMM syllabic acoustic models.
870-873

- Mathias De Wachter, Kris Demuynck, Patrick Wambacq, Dirk Van Compernolle:
Evaluating acoustic distance measures for template based recognition.
874-877

- Yan Han, Lou Boves:
Hierarchical acoustic modeling based on random-effects regression for automatic speech recognition.
878-881

- Annika Hämäläinen, Louis ten Bosch, Lou Boves:
Construction and analysis of multiple paths in syllable models.
882-885

- Carol Y. Espy-Wilson, Tarun Pruthi, Amit Juneja, Om Deshmukh:
Landmark-based approach to speech recognition: an alternative to HMMs.
886-889

- Satoshi Asakawa, Nobuaki Minematsu, Keikichi Hirose:
Automatic recognition of connected vowels only using speaker-invariant representation of speech dynamics.
890-893

- Roberto Togneri, Li Deng:
A structured speech model parameterized by recursive dynamics and neural networks.
894-897

- Li Deng, Helmer Strik:
Structure-based and template-based automatic speech recognition - comparing parametric and non-parametric approaches.
898-901

- David Grangier, Samy Bengio:
Learning the inter-frame distance for discriminative template-based keyword detection.
902-905

- Dong Yu, Li Deng, Alex Acero:
Handling phonetic context and speaker variation in a structure-based speech recognizer.
906-909

Robust ASR Against Noise and Reverberation
- Maarten Van Segbroeck, Hugo Van hamme:
Vector-quantization based mask estimation for missing data automatic speech recognition.
910-913

- Sébastien Demange, Christophe Cerisara, Jean Paul Haton:
Accurate marginalization range for missing data recognition.
914-917

- Marco Kühne, Roberto Togneri, Sven Nordholm:
Smooth soft mel-spectrographic masks based on blind sparse source separation.
918-921

- Jonathan Laidler, Martin Cooke, Neil D. Lawrence:
Model-driven detection of clean speech patches in noise.
922-925

- Richard M. Stern, Evandro B. Gouvêa, Govindarajan Thattai:
"polyaural" array processing for automatic speech recognition in degraded environments.
926-929

- Nicolás Morales, Liang Gu, Yuqing Gao:
Adding noise to improve noise robustness in speech recognition.
930-933

Language Resources and Tools
- Eric Fosler-Lussier, Laura Dilley, Na'im Tyson, Mark Pitt:
The Buckeye corpus of speech: updates and enhancements.
934-937

- Nora Barroso, Aitzol Ezeiza, N. Gilisagasti, Karmele López de Ipiña, A. López, Juan Miguel López:
Development of multimodal resources for multilingual information retrieval in the Basque context.
938-941

- Reva Schwartz, Wade Shen, Joseph P. Campbell, Shelley Paget, Julie Vonwiller, Dominique Estival, Christopher Cieri:
Construction of a phonotactic dialect corpus using semiautomatic annotation.
942-945

- Slim Abdennadher, Mohamed Aly, Dirk Bühler, Wolfgang Minker, Johannes Pittermann:
BECAM tool - a semi-automatic tool for bootstrapping emotion corpus annotation and management.
946-949

- Christopher Cieri, Linda Corson, David Graff, Kevin Walker:
Resources for new research directions in speaker recognition: the Mixer 3, 4 and 5 corpora.
950-953

- Peter A. Heeman, Andy McMillin, J. Scott Yaruss:
Intercoder reliability in annotating complex disfluencies.
954-957

Single-channel Speech Enhancement
- Mohammad H. Radfar, Richard M. Dansereau:
Single channel speech separation using maximum a posteriori estimation.
958-961

- Suhadi Suhadi, Tim Fingscheidt:
Speech enhancement with improved a posteriori SNR computation.
962-965

- Thang Vu Tat, Germine Seide, Masashi Unoki, Masato Akagi:
Method of LP-based blind restoration for improving intelligibility of bone-conducted speech.
966-969

- Tiago H. Falk, Svante Stadler, W. Bastiaan Kleijn, Wai-Yip Chan:
Noise suppression based on extending a speech-dominated modulation band.
970-973

- Amin Haji Abolhassani, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy, Mohamed-Faouzi Harkat:
Speech enhancement using PCA and variance of the reconstruction error model identification.
974-977

- Jong Won Shin, Woohyung Lim, June Sig Sung, Nam Soo Kim:
Speech reinforcement based on partial specific loudness.
978-981

Phonetics and Phonology
- Tamara Rathcke, Jonathan Harrington:
The phonetics and phonology of high and low tones in two falling f0-contours in standard German.
982-985

- Tina John, Jonathan Harrington:
Temporal alignment of creaky voice in neutralised realisations of an underlying, post-nasal voicing contrast in German.
986-989

- Mike Demol, Werner Verhelst, Piet Verhoeve:
The duration of speech pauses in a multilingual environment.
990-993

- Dafydd Gibbon, Jolanta Bachan, Grazyna Demenko:
Syllable timing patterns in Polish: results from annotation mining.
994-997

- Constandinos Kalimeris, Stelios Bakamidis:
Minimal pairs and functional loads of sound contrasts obtained from a list of Modern Greek words.
998-1001

- Daan Wissing:
More on acoustic correlates of stress.
1002-1005

- Cécile Woehrling, Philippe Boula de Mareüil:
Comparing Praat and Snack formant measurements on two large corpora of northern and southern French.
1006-1009

- William J. Barry, Bistra Andreeva, Ingmar Steiner:
The phonetic exponency of phrasal accentuation in French and German.
1010-1013

- Christiana Christodoulou:
Phonetic geminates in Cypriot Greek: the case of voiceless plosives.
1014-1017

- Darcie Williams, François Poiré:
Predicting vowel duration in spontaneous Canadian French speech.
1018-1021

- Ivan Chow, François Poiré:
Rhotic variation and schwa epenthesis in Windsor French.
1022-1025

- Audrey Bürki, Cécile Fougeron, Cédric Gendrot:
On the categorical nature of the process involved in schwa elision in French.
1026-1029

- Yue-Ning Hu, Min Chu, Chao Huang, Yan-Ning Zhang:
Exploring tonal variations via context-dependent tone models.
1030-1033

- Philippe Martin, Jun Li:
Acoustic analysis of the neutral tone in Mandarin.
1034-1037

- Rerrario Shui-Ching Ho, Yoshinori Sagisaka:
F0 analysis of perceptual distance among Cantonese level tones.
1038-1041

Robust ASR I, II
- Yu Hu, Qiang Huo:
Irrelevant variability normalization based HMM training using VTS approximation of an explicit model of environmental distortions.
1042-1045

- Luis Buera, Antonio Miguel, Eduardo Lleida, Oscar Saz, Alfonso Ortega:
On the jointly unsupervised feature vector normalization and acoustic model compensation for robust speech recognition.
1046-1049

- Yu Tsao, Chin-Hui Lee:
An ensemble modeling approach to joint characterization of speaker and speaking environments.
1050-1053

- Shih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen:
Cluster-based polynomial-fit histogram equalization (CPHEQ) for robust speech recognition.
1054-1057

- Pedro M. Martinez, José C. Segura, Luz García:
Robust distributed speech recognition using histogram equalization and correlation information.
1058-1061

- Jen-Tzung Chien, Koichi Shinoda, Sadaoki Furui:
Predictive minimum Bayes risk classification for robust speech recognition.
1062-1065

- Ning Ma, Jon Barker, Phil Green:
Applying word duration constraints by using unrolled HMMs.
1066-1069

- Xiong Xiao, Engsiong Chng, Haizhou Li:
Evaluating the temporal structure normalisation technique on the Aurora-4 task.
1070-1073

- Hynek Boril, Petr Fousek, Harald Höge:
Two-stage system for robust neutral/Lombard speech recognition.
1074-1077

- Takatoshi Jitsuhiro, Tomoji Toriyama, Kiyoshi Kogure:
Noise suppression using search strategy with multi-model compositions.
1078-1081

- Takanobu Nishiura, Yoshiki Hirano, Yuki Denda, Masato Nakayama:
Investigations into early and late reflections on distant-talking speech recognition toward suitable reverberation criteria.
1082-1085

- Stefan Windmann, Reinhold Haeb-Umbach:
An approach to iterative speech feature enhancement and recognition.
1086-1089

- Jeih-Weih Hung:
Optimization of temporal filters in the modulation frequency domain for constructing robust features in speech recognition.
1090-1093

- Rico Petrick, Kevin Lohde, Matthias Wolff, Rüdiger Hoffmann:
The harming part of room acoustics in automatic speech recognition.
1094-1097

- Yuan-Fu Liao, Yh-Her Yang, Chi-Hui Hsu, Cheng-Chang Lee, Jing-Teng Zeng:
A reference model weighting-based method for robust speech recognition.
1098-1101

- Babak Nasersharif, Ahmad Akbari, Mohammad Mehdi Homayounpour:
Mel sub-band filtering and compression for robust speech recognition.
1102-1105

Features for ASR
- Chang-Wen Hsu, Lin-Shan Lee:
Extended powered cepstral normalization (p-CN) with range equalization for robust features in speech recognition.
1106-1109

- Makoto Sakai, Norihide Kitaoka, Seiichi Nakagawa:
Selection of optimal dimensionality reduction method using Chernoff bound for segmental unit input HMM.
1110-1113

- Vivek Tyagi:
Fepstrum: an improved modulation spectrum for ASR.
1114-1117

- Dusan Macho:
Narrowband to wideband feature expansion for robust multilingual ASR.
1118-1121

- Weifeng Li, Hervé Bourlard:
Non-linear spectral contrast stretching for in-car speech recognition.
1122-1125

- Xiao-Bing Li, Douglas D. O'Shaughnessy:
Clustering-based two-dimensional linear discriminant analysis for speech recognition.
1126-1129

- Yotaro Kubo, Shigeki Okawa, Akira Kurematsu, Katsuhiko Shirai:
A study on temporal features derived by analytic signal.
1130-1133

- Stephen A. Zahorian, Tara Singh, Hongbing Hu:
Dimensionality reduction of speech features using nonlinear principal components analysis.
1134-1137

- D. Rama Sanand, D. Dinesh Kumar, Srinivasan Umesh:
Linear transformation approach to VTLN using dynamic frequency warping.
1138-1141

- Vladimir Fabregas Surigué de Alencar, Abraham Alcaim:
Features interpolation domain for distributed speech recognition and performance for ITU-T G.723.1 CODEC.
1142-1145

- Shoei Sato, Kazuo Onoe, Akio Kobayashi, Shinichi Homma, Toru Imai, Tohru Takagi, Tetsunori Kobayashi:
Dynamic integration of multiple feature streams for robust real-time LVCSR.
1146-1149

- Hironori Matsumasa, Tetsuya Takiguchi, Yasuo Ariki, Ichao Li, Toshitaka Nakabayashi:
PCA-based feature extraction for fluctuation in speaking style of articulation disorders.
1150-1153

- Fabio Valente, Jithendra Vepa, Hynek Hermansky:
Multi-stream features combination based on Dempster-Shafer rule for LVCSR system.
1154-1157

- Natasha Singh-Miller, Michael Collins, Timothy J. Hazen:
Dimensionality reduction for speech recognition using neighborhood components analysis.
1158-1161

- Dan Su, Xihong Wu, Huisheng Chi:
Probabilistic latent speaker analysis for large vocabulary speech recognition.
1162-1165

- S. R. Mahadeva Prasanna, Hynek Hermansky:
MRASTA and PLP in automatic speech recognition.
1166-1169

Objective Assessment of Voice and Speech Quality
- Markus Brückl:
Women's vocal aging: a longitudinal approach.
1170-1173

- Laurence Cnockaert, Jean Schoentgen, Canan Ozsancak, Pascal Auzou, Francis Grenez:
Effect of intensive voice therapy on vocal tremor for Parkinson speakers.
1174-1177

- Ali Alpan, Abdellah Kacha, Francis Grenez, Jean Schoentgen:
Assessment of vocal dysperiodicities in connected disordered speech.
1178-1181

- Anne-Maria Laukkanen, Jaromír Horácek, Pavel Svancara, Elina Lehtinen:
Effects of FE modelled consequences of tonsillectomy on perceptual evaluation of voice.
1182-1185

- Irma Verdonck-de Leeuw, Louis ten Bosch, Li Ying Chao, Rico N. P. M. Rinkel, Pepijn A. Borggreven, Lou Boves, C. René Leemans:
Speech quality after major surgery of the oral cavity and oropharynx with microvascular soft tissue reconstruction.
1186-1189

- Christel G. de Bruijn, Sandra P. Whiteside:
Voice fatigue and use of speech recognition: a study of voice quality ratings.
1190-1193

- Jean-François Bonastre, Corinne Fredouille, Alain Ghio, Antoine Giovanni, Gilles Pouchoulin, Joana Revis, Bernard Teston, P. Yu:
Complementary approaches for voice disorder assessment.
1194-1197

- Gilles Pouchoulin, Corinne Fredouille, Jean-François Bonastre, Alain Ghio, Antoine Giovanni:
Frequency study for the characterization of the dysphonic voices.
1198-1201

- Victor J. Boucher:
Acoustic correlates of laryngeal-muscle fatigue: findings for a phonometric prevention of acquired voice pathologies.
1202-1205

- Andreas Maier, Maria Schuster, Anton Batliner, Elmar Nöth, Emeka Nkenke:
Automatic scoring of the intelligibility in patients with cancer of the oral cavity.
1206-1209

- Jacques Duchateau, Leen Cleuren, Hugo Van hamme, Pol Ghesquière:
Automatic assessment of children's reading level.
1210-1213

- Carlos A. Ferrer, María Esperanza Hernández-Díaz, Eduardo González:
Using waveform matching techniques in the measurement of shimmer in voiced signals.
1214-1217

- Rubén Fraile, Juan Ignacio Godino-Llorente, Nicolás Sáenz-Lechón, Víctor Osma-Ruiz, Pedro Gómez Vilda:
Analysis of the impact of analogue telephone channel on MFCC parameters for voice pathology detection.
1218-1221

- Claudia Manfredi, Leonardo Bocchi, G. Cantarella, Giorgio Peretti, G. Guidi, V. Mezzatesta:
Objective parameters from videokymographic images: a user-friendly interface.
1222-1225

Speaker Verification & Identification I-IV
- Elizabeth Shriberg, Luciana Ferrer:
A text-constrained prosodic system for speaker verification.
1226-1229

- Asmaa El Hannani, Dijana Petrovska-Delacrétaz:
Fusing acoustic, phonetic and data-driven systems for text-independent speaker verification.
1230-1233

- Najim Dehak, Patrick Kenny, Pierre Dumouchel:
Continuous prosodic features and formant modeling with joint factor analysis for speaker verification.
1234-1237

- Claudio Vair, Daniele Colibro, Fabio Castaldo, Emanuele Dalmasso, Pietro Laface:
Loquendo - Politecnico di Torino's 2006 NIST speaker recognition evaluation system.
1238-1241

- Driss Matrouf, Nicolas Scheffer, Benoit G. B. Fauve, Jean-François Bonastre:
A straightforward and efficient implementation of the factor analysis model for speaker verification.
1242-1245

- Timothy J. Hazen, Daniel Schultz:
Multi-modal user authentication from video for mobile or variable-environment applications.
1246-1249

Discourse, Dialog and Emotion Expression
Prosodic Modeling I, II
- Keikichi Hirose, Keiko Ochi, Nobuaki Minematsu:
Corpus-based generation of prosodic features from text based on generation process model.
1274-1277

- Jilei Tian, Jani Nurminen, Imre Kiss:
Novel eigenpitch-based prosody model for text-to-speech synthesis.
1278-1281

- Volker Strom, Ani Nenkova, Robert A. J. Clark, Yolanda Vazquez-Alvarez, Jason M. Brenier, Simon King, Dan Jurafsky:
Modelling prominence and emphasis improves unit-selection synthesis.
1282-1285

- Seiya Takada, Yuji Yagi, Keikichi Hirose, Nobuaki Minematsu:
A framework of reply speech generation for concept-to-speech conversion in spoken dialogue systems.
1286-1289

- Thorsten Stocksmeier, Stefan Kopp, Dafydd Gibbon:
Synthesis of prosodic attitudinal variants in German backchannel ja.
1290-1293

- Ke Li, Yoko Greenberg, Yoshinori Sagisaka:
Inter-language prosodic style modification experiment using word impression vector for communicative speech generation.
1294-1297

Resource Acquisition and Preparation; Resource and System Evaluation
- Ivan Habernal, Miloslav Konopík:
JAAE: the Java abstract annotation editor.
1298-1301

- Goshu Nagino, Makoto Shozakai, Kiyohiro Shikano:
How to judge reusability of existing speech corpora for target task by utilizing statistical multidimensional scaling.
1302-1305

- Peter Rutten:
Feasibility of constructing an expressive speech corpus from television soap opera dialogue.
1306-1309

- Rosemary Orr, Bernat González i Llinares, Françoise Petersen, Helge Hüttenrauch, Martin Böcker, Michael Tate:
Collection of empirical data for standardization of generic vocabularies in speech driven ICT devices and services.
1310-1313

- Antonio Marcos Selmini, Fábio Violaro:
Acoustic-phonetic features for refining the explicit speech segmentation.
1314-1317

- Benjamin Lecouteux, Georges Linarès, Frédéric Beaugendre, Pascal Nocera:
Text island spotting in large speech databases.
1318-1321

- Tim Paek, Yun-Cheng Ju, Christopher Meek:
People watcher: a game for eliciting human-transcribed data for automated directory assistance.
1322-1325

- Andrew L. Kun, Tim Paek, Zeljko Medenica:
The effect of speech interface accuracy on driving performance.
1326-1329

- Hua Zhang, Lijuan Wang, Frank K. Soong, Wenju Liu:
Context constrained-generalized posterior probability for verifying phone transcriptions.
1330-1333

- Pongtep Angkititrakul, DongGu Kwak, SangJo Choi, JeongHee Kim, Anh PhucPhan, Amardeep Sathyanarayana, John H. L. Hansen:
Getting start with UTDrive: driver-behavior modeling and assessment of distraction for in-vehicle speech systems.
1334-1337

- BalaKrishna Kolluru, Yoshihiko Gotoh:
Relative evaluation of informativeness in machine generated summaries.
1338-1341

- Toshiyuki Takezawa, Masahide Mizushima, Tohru Shimizu, Gen-ichiro Kikui:
A method for evaluating task-oriented spoken dialog translation systems based on communication efficiency.
1342-1345

- Charlotte van Hooijdonk, Edwin Commandeur, Reinier Cozijn, Emiel Krahmer, Erwin Marsi:
Using eye movements for online evaluation of speech synthesis.
1346-1349

- Jian Li, Dmitry Sityaev, Jie Hao:
Sentence level intelligibility evaluation for Mandarin text-to-speech systems using semantically unpredictable sentences.
1350-1353

- Judith M. Kessens, David A. van Leeuwen:
N-best: the Northern- and Southern-Dutch benchmark evaluation of speech recognition technology.
1354-1357

- Trym Holter, Svein Sørsdal:
A MAP based approach to adaptive speech intelligibility measurements.
1358-1361

- Sirinoot Boonsuk, Proadpran Punyabukkana, Atiwong Suchato:
Phone boundary detection using selective refinements and context-dependent acoustic features.
1362-1365

Speech Production I, II
- Sorin Dusan:
Vocal tract length during speech production.
1366-1369

- Nobuhiro Miki, Kyohei Hayashi:
Approximation method of subglottal system using ARMA filter.
1370-1373

- Asterios Toutios, Konstantinos G. Margaritis:
Enhancing acoustic-to-EPG mapping with lip position information.
1374-1377

- Tokihiko Kaburagi, Yosuke Tanabe:
A model of glottal flow incorporating viscous-inviscid interaction.
1378-1381

- Kilian G. Seeber:
Thinking outside the cube: modeling language processing tasks in a multiple resource paradigm.
1382-1385

- Julien Cisonni, Annemie Van Hirtum, Jan Willems, Xavier Pelorson:
Experimental validation of direct and inverse glottal flow models for unsteady flow conditions.
1386-1389

- Hideyuki Nomura, Tetsuo Funada:
Effect of unsteady glottal flow on the speech production process.
1390-1393

- Katrin Schneider, Bernd Möbius:
Word stress correlates in spontaneous child-directed speech in German.
1394-1397

- Michael Aron, Nicolas Ferveur, Erwan Kerrien, Marie-Odile Berger, Yves Laprie:
Acquisition and synchronization of multimodal articulatory data.
1398-1401

- Vincent Robert, Yves Laprie, Anne Bonneau:
A phonetic concatenative approach of labial coarticulation.
1402-1405

- Aseel Turkmani, Adrian Hilton, Philip J. B. Jackson, James D. Edge:
Visual analysis of lip coarticulation in VCV utterances.
1406-1409

- Matti Airas, Paavo Alku:
Comparison of multiple voice source parameters in different phonation types.
1410-1413

- Monja A. Knoll, Lisa Scharrer:
Acoustic and affective comparisons of natural and imaginary infant-, foreigner- and adult-directed speech.
1414-1417

- André Araújo, Luis M. T. Jesus, Isabel M. Costa:
Vowel production in two occlusal classes.
1418-1421

- Rajesh Khatiwada:
Nepalese retroflex stops: a static palatography study of inter- and intra-speaker variability.
1422-1425

- Charles A. Lamoureux, Victor J. Boucher:
Effects of testosterone levels on temporal and intonational aspects of speech: more exploratory data.
1426-1428

ASR: New Paradigms
- Tien Ping Tan, Laurent Besacier:
Modeling context and language variation for non-native speech recognition.
1429-1432

- Xufang Zhao, Douglas D. O'Shaughnessy:
An evaluation of cross-language adaptation and native speech training for rapid HMM construction based on very limited training data.
1433-1436

- Konstantin Markov, Satoshi Nakamura:
Never-ending learning with dynamic hidden Markov network.
1437-1440

- Catherine Breslin, Mark J. F. Gales:
Building multiple complementary systems using directed decision trees.
1441-1444

- Hiroaki Nanjo, Yuichi Oku, Takehiko Yoshimi:
Automatic speech recognition framework for multilingual audio contents.
1445-1448

- Ghazi Bouselmi, Dominique Fohr, Irina Illina:
Combined acoustic and pronunciation modelling for non-native speech recognition.
1449-1452

- Tadashi Emori, Yoshifumi Onishi, Koichi Shinoda:
Automatic estimation of scaling factors among probabilistic models in speech recognition.
1453-1456

- Emilian Stoimenov, John W. McDonough:
Memory efficient modeling of polyphone context with weighted finite-state transducers.
1457-1460

- Valeriy Pylypenko:
Extra large vocabulary continuous speech recognition algorithm based on information retrieval.
1461-1464

- I. Lee Hetherington:
PocketSUMMIT: small-footprint continuous speech recognition.
1465-1468

- Tobias Cincarek, Izumi Shindo, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano:
Development of preschool children subsystem for ASR and Q&A in a real-environment speech-oriented guidance task.
1469-1472

- Chengyuan Ma, Chin-Hui Lee:
A study on word detector design and knowledge-based pruning and rescoring.
1473-1476

- Thomas Colthurst, Tresi Arvizo, Chia-Lin Kao, Owen Kimball, Stephen A. Lowe, David R. H. Miller, Jim Van Sciver:
Parameter tuning for fast speech recognition.
1477-1480

- Louis ten Bosch, Bert Cranen:
A computational model for unsupervised word discovery.
1481-1484

- Bernd T. Meyer, Matthias Wächter, Thomas Brand, Birger Kollmeier:
Phoneme confusions in human and automatic speech recognition.
1485-1488

- Kengo Ohta, Masatoshi Tsuchiya, Seiichi Nakagawa:
Construction of spoken language model including fillers using filler prediction model.
1489-1492

- Raghunandan Kumaran, Jeff Bilmes, Katrin Kirchhoff:
Attention shift decoding for conversational speech recognition.
1493-1496

Speech and Language Technology for Less-resourced Languages
- Péter Mihajlik, Tibor Fegyó, Zoltán Tüske, Pavel Ircing:
A morpho-graphemic approach for the recognition of spontaneous speech in agglutinative languages - like Hungarian.
1497-1500

- Mei Yang, Jing Zheng, Andreas Kathol:
A semi-supervised learning approach for morpheme segmentation for an Arabic dialect.
1501-1504

- Gerhard B. Van Huyssteen, Martin J. Puttkammer:
Accelerating the annotation of lexical data for less-resourced languages.
1505-1508

- Christoph Draxler:
On web-based creation of speech resources for less-resourced languages.
1509-1512

- Miroslav Martinovic, Srdjan Vesic, Goran Rakic:
Building an information retrieval system for Serbian - challenges and solutions.
1513-1516

- Guy De Pauw, Peter Waiganjo Wagacha:
Bootstrapping morphological analysis of Gĩkũyũ using unsupervised maximum entropy learning.
1517-1520

- Jerneja Zganec-Gros, Stanislav Gruden:
The voiceTRAN machine translation system.
1521-1524

- Sérgio Paulo, Luís C. Oliveira:
MuLAS: a framework for automatically building multi-tier corpora.
1525-1528

- Jacquelijn Ringersma, Marc Kemps-Snijders:
Creating multimedia dictionaries of endangered languages using LEXUS.
1529-1532

- Hrafn Loftsson, Eiríkur Rögnvaldsson:
IceNLP: a natural language processing toolkit for Icelandic.
1533-1536

- Marius Peche, Marelie H. Davel, Etienne Barnard:
Phonotactic spoken language identification with limited training data.
1537-1540

- Solomon Teferra Abate, Wolfgang Menzel:
Automatic speech recognition for an under-resourced language - Amharic.
1541-1544

- Abdillahi Nimaan, Pascal Nocera, Frédéric Béchet, Jean-François Bonastre:
Information retrieval strategies for accessing African audio corpora.
1545-1548

- Vesa Siivola, Mathias Creutz, Mikko Kurimo:
Morfessor and VariKN machine learning tools for speech and language technology.
1549-1552

- Markpong Jongtaveesataporn, Issara Thienlikit, Chai Wutiwiwatchai, Sadaoki Furui:
Towards better language modeling for Thai LVCSR.
1553-1556

Adaptation in ASR I, II
Speech Perception I, II
- Douglas Brungart, Nandini Iyer:
Time-compressed speech perception with speech and noise maskers.
1581-1584

- Anne Cutler, Martin Cooke, Maria Luisa Garcia Lecumberri, Dennis Pasveer:
L2 consonant identification in noise: cross-language comparisons.
1585-1588

- Jennifer T. Le, Catherine T. Best, Michael D. Tyler, Christian Kroos:
Effects of non-native dialects on spoken word recognition.
1589-1592

- Julien Meyer, Fanny Meunier, Laure Dentel:
Identification of natural whistled vowels by non-whistlers.
1593-1596

- Alexandra Jesse, James M. McQueen:
Prelexical adjustments to speaker idiosyncrasies: are they position-specific?
1597-1600

- Holger Mitterer:
Top-down effects on compensation for coarticulation are not replicable.
1601-1604

Spoken Language Understanding
- Christian Raymond, Giuseppe Riccardi:
Generative and discriminative algorithms for spoken language understanding.
1605-1608

- Elias Iosif, Alexandros Potamianos:
A soft-clustering algorithm for automatic induction of semantic classes.
1609-1612

- Agustín Gravano, Stefan Benus, Julia Hirschberg, Shira Mitchell, Ilia Vovsha:
Classification of discourse functions of affirmative words in spoken dialogue.
1613-1616

- Bogdan Minescu, Géraldine Damnati, Frédéric Béchet, Renato de Mori:
Conditional use of word lattices, confusion networks and 1-best string hypotheses in a sequential interpretation strategy.
1617-1620

- Jáchym Kolár, Yang Liu, Elizabeth Shriberg:
Speaker adaptation of language models for automatic dialog act segmentation of meetings.
1621-1624

- Amparo Albalate, Dimitar Dimitrov, Roberto Pieraccini:
Unsupervised categorisation approaches for technical support automated agents.
1625-1628

Pitch Extraction I, II
- Michael Wohlmayr, Marián Képesi:
Joint position-pitch extraction from multichannel audio.
1629-1632

- Hyun Soo Kim:
Morphological pre-processing technique and its applications on speech signal.
1633-1636

- Patricia A. Pelle, Claudio Estienne:
A pitch extraction system based on phase locked loops and consensus decision.
1637-1640

- Milan Legát, Jindrich Matousek, Daniel Tihelka:
A robust multi-phase pitch-mark detection algorithm.
1641-1644

- Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu, Md. Kamrul Hasan:
Pitch estimation of noisy speech signals using empirical mode decomposition.
1645-1648

- Daniel Hirst, Hyongsil Cho, Sunhee Kim, Hyunji Yu:
Evaluating two versions of the Momel pitch modelling algorithm on a corpus of read speech in Korean.
1649-1652

- Hussein Hussein, Oliver Jokisch:
Hybrid electroglottograph and speech signal based algorithm for pitch marking.
1653-1656

Speech Coding and Transmission
- Saikat Chatterjee, Thippur V. Sreenivas:
Normalized two stage SVQ for minimum complexity wide-band LSF quantization.
1657-1660

- Peng Zhang, Changchun Bao:
A novel 2kb/s waveform interpolation speech coder based on non-negative matrix factorization.
1661-1664

- Ahmed Ismail, Yasser Dakroury, Hazem Abbas:
A novel energy distribution comparison approach for robust speech spectrum vector quantization.
1665-1668

- Ahmed Ismail, Yasser Dakroury, Hazem Abbas:
Novel low-band phase representation for low bit-rate speech coding.
1669-1672

- Chun-Feng Wu, Cheng-Lung Lee, Wen-Whei Chang:
Perceptual-based playout mechanisms for multi-stream voice over IP networks.
1673-1676

- Robert Zopf, Jes Thyssen, Juin-Hwey Chen:
Time-warping and re-phasing in packet loss concealment.
1677-1680

- Yannis Agiomyrgiannakis, Yannis Stylianou:
The harmonic model codec (HMC) framework for VoIP.
1681-1684

- Yannis Agiomyrgiannakis, Yannis Stylianou:
Bit-erasure channel decoding for GMM-based multiple description coding.
1685-1688

- Hua Yuan, Tiago H. Falk, Wai-Yip Chan:
Degradation-classification assisted single-ended quality measurement of speech.
1689-1692

- Alexander Raake, Sascha Spors, Jens Ahrens, Jitendra Ajmera:
Concept and evaluation of a downward-compatible system for spatial teleconferencing using automatic speaker clustering.
1693-1696

- Min-Ki Lee, Kyung-Tae Kim, Hong-Goo Kang, Dae Hee Youn:
Speech quality estimation using packet loss effects in CELP-type speech coders.
1697-1700

- Masahiro Oshikiri, Hiroyuki Ehara, Toshiyuki Morii, Tomofumi Yamanashi, Kaoru Satoh, Koji Yoshida:
An 8-32 kbit/s scalable wideband coder extended with MDCT-based bandwidth extension on top of a 6.8 kbit/s narrowband CELP coder.
1701-1704

Topics in Acoustic Modeling
- Robert Wielgat, Tomasz P. Zielinski, Pawel Swietojanski, Piotr Zoladz, Daniel Król, Tomasz Wozniak, Stanislaw Grabias:
Comparison of HMM and DTW methods in automatic recognition of pathological phoneme pronunciation.
1705-1708

- Kai Yu, Mark J. F. Gales, Philip C. Woodland:
Unsupervised training with directed manual transcription for recognising Mandarin broadcast audio.
1709-1712

- Hao Wu, Xihong Wu:
Context dependent syllable acoustic model for continuous Chinese speech recognition.
1713-1716

- Dimitris Oikonomidis, Vassilios Diakoloukas, Vassilios Digalakis:
A sub-optimal Viterbi-like search for linear dynamic models classification.
1717-1720

- Georg Heigold, Ralf Schlüter, Hermann Ney:
On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields.
1721-1724

- Stefano Scanzio, Pietro Laface, Roberto Gemello, Franco Mana:
Speeding-up neural network training using sentence and frame selection.
1725-1728

- Linquan Liu, Thomas Fang Zheng, Makoto Akabane, Ruxin Chen, Wenhu Wu:
Using a small development set to build a robust dialectal Chinese speech recognizer.
1729-1732

Confidence Measures (and Related Topics)
- Carlos Molina, Néstor Becerra Yoma, Fernando Huenupán, Claudio Garretón:
Unsupervised re-scoring of observation probability in Viterbi based on reinforcement learning by using confidence measure and HMM neighborhood.
1733-1736

- Shiuan-Sung Lin, François Yvon:
Optimization on decoding graphs by discriminative training.
1737-1740

- Stéphane Huet, Guillaume Gravier, Pascale Sébillot:
Morphosyntactic processing of n-best lists for improved recognition and confidence measure computation.
1741-1744

- Xiang Li, Juan M. Huerta:
How predictable is ASR confidence in dialog applications?
1745-1748

- Alexandre Allauzen:
Error detection in confusion network.
1749-1752

- Takanobu Oba, Takaaki Hori, Atsushi Nakamura:
An approach to efficient generation of high-accuracy and compact error-corrective models for speech recognition.
1753-1756

- Hamed Ketabdar, Mirko Hannemann, Hynek Hermansky:
Detection of out-of-vocabulary words in posterior based ASR.
1757-1760

Grapheme-to-Phoneme Conversion
- Daniela Braga, Luís Pinto Coelho, Fernando Gil Vianna Resende Jr.:
Homograph ambiguity resolution in front-end design for Portuguese TTS systems.
1761-1764

- Ghinwa F. Choueiter, Stephanie Seneff, James R. Glass:
New word acquisition using subword modeling.
1765-1768

- Samuel Thomas, Ashish Verma:
Language identification of person names using CF-IOF based weighing function.
1769-1772

- Henk van den Heuvel, Jean-Pierre Martens, Nanneke Konings:
G2P conversion of names: what can we do (better)?
1773-1776

- Ausdang Thangthai, Chai Wutiwiwatchai, Anocha Rugchatjaroen, Sittipong Saychum:
A learning method for Thai phonetization of English words.
1777-1780

- Steffen Werner, Rüdiger Hoffmann:
Spontaneous speech synthesis by pronunciation variant selection - a comparison to natural speech.
1781-1784

- Nikos Tsourakis, Vassilios Digalakis:
A generic methodology of converting transliterated text to phonetic strings case study: Greeklish.
1785-1788

- Rita Singh, Evandro B. Gouvêa, Bhiksha Raj:
Probabilistic deduction of symbol mappings for extension of lexicons.
1789-1792

Lexical and Prosodic Modeling
- Sergey Astrov, Joachim Hofer, Harald Höge:
Use of syllable center detection for improved duration modeling in Chinese Mandarin connected digits recognition.
1793-1796

- Thomas Pellegrini, Lori Lamel:
Using phonetic features in unsupervised word decompounding for ASR with application to a less-represented language.
1797-1800

- Sheng Qiang, Yao Qian, Frank K. Soong, Congfu Xu:
Robust F0 modeling for Mandarin speech recognition in noise.
1801-1804

- Dino Seppi, Daniele Falavigna, Georg Stemmer, Roberto Gretter:
Word duration modeling for word graph rescoring in LVCSR.
1805-1808

- Fabio Tamburini, Petra Wagner:
On automatic prominence detection for German.
1809-1812

- Sankaranarayanan Ananthakrishnan, Shrikanth S. Narayanan:
Prosody-enriched lattices for improved syllable recognition.
1813-1816

- Joel Pinto, Andrew Lovitt, Hynek Hermansky:
Exploiting phoneme similarities in hybrid HMM-ANN keyword spotting.
1817-1820

- C. E. Liu, Kishan Thambiratnam, Frank Seide:
Online vocabulary adaptation using limited adaptation data.
1821-1824

Speech Recognition by Automatic Attribute Transcription
- Chin-Hui Lee, Mark A. Clements, Sorin Dusan, Eric Fosler-Lussier, Keith Johnson, Biing-Hwang Juang, Lawrence R. Rabiner:
An overview on automatic speech attribute transcription (ASAT).
1825-1828

- Ilana Bromberg, Qian Qian, Jun Hou, Jinyu Li, Chengyuan Ma, Brett Matthews, Antonio Moreno-Daniel, Jeremy Morris, Sabato Marco Siniscalchi, Yu Tsao, Yu Wang:
Detection-based ASR in the automatic speech attribute transcription project.
1829-1832

- Chi-Yueh Lin, Hsiao-Chuan Wang:
Attribute-based Mandarin speech recognition using conditional random fields.
1833-1836

- Helmer Strik, Khiet P. Truong, Febe de Wet, Catia Cucchiarini:
Comparing classifiers for pronunciation error detection.
1837-1840

- Jarek Krajewski, Bernd J. Kröger:
Using prosodic and spectral characteristics for sleepiness detection.
1841-1844

- Brian M. Ore, Raymond E. Slyh:
Score fusion for articulatory feature detection.
1845-1848

Speaker Diarization
First and Second Language Learning
- Wai-Sum Lee:
Tone production by the speakers of different age-and-gender groups.
1873-1876

- Nan Xu, Denis Burnham, Christine Kitamura:
Vowels and tones in infant directed speech: hyperarticulation for both, but different developmental patterns.
1877-1880

- Eon-Suk Ko:
Acquisition of vowel duration in children speaking American English.
1881-1884

- Hiroko Hirano, Keikichi Hirose, Goh Kawai, Wentao Gu, Nobuaki Minematsu:
F0 models show Chinese speakers of Japanese insert intonational boundaries and drop pitch.
1885-1888

- Paola Escudero, Jelle Kastelein, Klara A. Weiand, R. J. J. H. van Son:
Formal modelling of L1 and L2 perceptual learning: computational linguistics versus machine learning.
1889-1892

- Mirjam Broersma:
Kettle hinders cat, shadow does not hinder shed: activation of 'almost embedded' words in nonnative listening.
1893-1896

Speech Synthesis I, II
- Sacha Krstulovic, Anna Hunecke, Marc Schröder:
An HMM-based speech synthesis system applied to German and its adaptation to a limited set of expressive football announcements.
1897-1900

- Liang Gu, Wei Zhang, Lazkin Tahir, Yuqing Gao:
Statistical vowelization of Arabic text for speech synthesis in speech-to-speech translation systems.
1901-1904

- Wu Liu, Dezhi Huang, Yuan Dong, Xinnian Mao, Haila Wang:
A pair-based language model for the robust lexical analysis in Chinese text-to-speech synthesis.
1905-1908

- Ranniery Maia, Tomoki Toda, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda:
A trainable excitation model for HMM-based speech synthesis.
1909-1912

- Jochen Steigner, Marc Schröder:
Cross-language phonemisation in German text-to-speech synthesis.
1913-1916

- Ryuki Tachibana, Tohru Nagano, Gakuto Kurata, Masafumi Nishimura, Noboru Babaguchi:
Preliminary experiments toward automatic generation of new TTS voices from recorded speech alone.
1917-1920

Phonetic Segmentation and Classification I, II
- Xiaochuan Niu, Jan P. H. van Santen:
Dual-channel acoustic detection of nasalization states.
1921-1924

- Tarun Pruthi, Carol Y. Espy-Wilson:
Acoustic parameters for the automatic detection of vowel nasalization.
1925-1928

- Jun Hou, Lawrence R. Rabiner, Sorin Dusan:
On the use of time-delay neural networks for highly accurate classification of stop consonants.
1929-1932

- Ladan Golipour, Douglas D. O'Shaughnessy:
A new approach for phoneme segmentation of speech signals.
1933-1936

- Veronique Stouten, Kris Demuynck, Hugo Van hamme:
Automatically learning the units of speech by non-negative matrix factorisation.
1937-1940

- Ozlem Kalinli, Shrikanth S. Narayanan:
A saliency-based auditory attention model with applications to unsupervised prominent syllable detection in speech.
1941-1944

- Sung Jun An, Young-Ik Kim, Rhee Man Kil:
Zero-crossing-based ratio masking for sound segregation.
1945-1948

- Satomi Tanaka, Minoru Tsuzaki, Hiroaki Kato, Yoshinori Sagisaka:
Event detection of speech signals based on auditory processing with a dynamic compressive gammachirp filterbank.
1949-1952

- Odette Scharenborg, Mirjam Ernestus, Vincent Wan:
Segmentation of speech: child's play?
1953-1956

- Andrew Errity, John McKenna, Barry Kirkpatrick:
Dimensionality reduction methods applied to both magnitude and phase derived features.
1957-1960

Voice Conversion and Modification
- Zdenek Hanzlícek, Jindrich Matousek:
F0 transformation within the voice conversion framework.
1961-1964

- Daniel Erro, Asunción Moreno:
Weighted frequency warping for voice conversion.
1965-1968

- Daniel Erro, Asunción Moreno:
Frame alignment method for cross-lingual voice conversion.
1969-1972

- Jani Nurminen, Jilei Tian, Victor Popa:
Voicing level control with application in voice conversion.
1973-1976

- Winston S. Percybrooks, Elliot Moore:
New algorithm for LPC residual estimation from LSF vectors for a voice conversion system.
1977-1980

- Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano:
Speaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model.
1981-1984

- Petko N. Petkov, W. Bastiaan Kleijn:
Improving the phase vocoder approach to pitch-shifting.
1985-1988

- Larbi Mesbahi, Vincent Barreaud, Olivier Boëffard:
Comparing GMM-based speech transformation systems.
1989-1992

Speaker Verification & Identification I-IV
- Michael Gerber, René Beutler, Beat Pfister:
Quasi text-independent speaker-verification based on pattern matching.
1993-1996

- Yosef A. Solewicz, Moshe Koppel:
Virtual fusion for speaker recognition.
1997-2000

- Yi-Hsiang Chao, Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang, Ruei-Chuan Chang:
Evolutionary minimum verification error learning of the alternative hypothesis model for LLR-based speaker verification.
2001-2004

- Seiichi Nakagawa, Kouhei Asakawa, Longbiao Wang:
Speaker recognition by combining MFCC and phase information.
2005-2008

- Sandeep Manocha, Carol Y. Espy-Wilson:
A semi-automatic approach for speaker mining of tapped telephone conversations.
2009-2012

- Hao Yang, Yuan Dong, Xianyu Zhao, Jian Zhao, Liang Lu, Haila Wang:
Cluster adaptive training weights as features in SVM-based speaker verification.
2013-2016

- Hideki Okamoto, Mariko Kojima, Tomoko Matsui, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano:
Study on speaker verification with non-audible murmur segments.
2017-2020

- Xugang Lu, Jianwu Dang:
Dimension reduction for speaker identification based on mutual information.
2021-2024

- Jonas Lindh, Anders Eriksson:
Robustness of long time measures of fundamental frequency.
2025-2028

- Vinod Prakash, John H. L. Hansen:
Score distribution scaling for speaker recognition.
2029-2032

- Andrew C. Morris, Jacques C. Koreman, B. Ly-Van, Harin Sellahewa, Sabah Jassim, R. Llarena Gómez:
Global features for rapid identity verification with dynamic biometric data.
2033-2036

- Tuan Van Pham, Michael Neffe, Gernot Kubin:
Robust voice activity detection for narrow-bandwidth speaker verification under adverse environments.
2037-2040

- Fernando Huenupán, Néstor Becerra Yoma, Carlos Molina, Claudio Garretón:
Speaker verification with multiple classifier fusion using Bayes based confidence measure.
2041-2044

- Girija Chetty, Michael Wagner:
Audiovisual speaker identity verification based on lip motion features.
2045-2048

- Gökhan Tür, Elizabeth Shriberg, Andreas Stolcke, Sachin S. Kajarekar:
Duration and pronunciation conditioned lexical modeling for speaker verification.
2049-2052

- Jean-François Bonastre, Driss Matrouf, Corinne Fredouille:
Artificial impostor voice transformation effects on false acceptance rates.
2053-2056

Improved Acoustic Modeling for ASR
- Jen-Wei Kuo, Hung-Yi Lo, Hsin-Min Wang:
Improved HMM/SVM methods for automatic phoneme segmentation.
2057-2060

- Takahiro Shinozaki, Tatsuya Kawahara:
Gaussian mixture optimization for HMM based on efficient cross-validation.
2061-2064

- Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda:
Model-space MLLR for trajectory HMMs.
2065-2068

- Hamed Ketabdar, Hervé Bourlard:
In-context phone posteriors as complementary features for tandem ASR.
2069-2072

- Qian Qian, Xiaodong He, Li Deng:
Phone-discriminating minimum classification error (p-MCE) training for phonetic recognition.
2073-2076

- Lori Lamel, Abdelkhalek Messaoudi, Jean-Luc Gauvain:
Improved acoustic modeling for transcribing Arabic broadcast data.
2077-2080

- Erik McDermott, Atsushi Nakamura:
String and lattice based discriminative training for the Corpus of Spontaneous Japanese lecture transcription task.
2081-2084

- Byung-Ok Kang, Ho-Young Jung, Yunkeun Lee:
Discriminative noise adaptive training approach for an environment migration.
2085-2088

- Jia-Yu Chen, Peder A. Olsen, John R. Hershey:
Word confusability - measuring hidden Markov model similarity.
2089-2092

- Thomas Deselaers, Georg Heigold, Hermann Ney:
Speech recognition with state-based nearest neighbour classifiers.
2093-2096

- Remco Teunen, Masami Akamine:
HMM-based speech recognition using decision trees instead of GMMs.
2097-2100

- Christian Gollan, Stefan Hahn, Ralf Schlüter, Hermann Ney:
An improved method for unsupervised training of LVCSR systems.
2101-2104

- Mohamed Kamal Omar:
A variational approach to robust maximum likelihood estimation for speech recognition.
2105-2108

- Kai Yu, Rob A. Rutenbar:
Generating small, accurate acoustic models with a modified Bayesian information criterion.
2109-2112

- Peter Bell, Simon King:
Sparse Gaussian graphical models for speech recognition.
2113-2116

- Sakriani Sakti, Konstantin Markov, Satoshi Nakamura:
An HMM acoustic model incorporating various additional knowledge sources.
2117-2120

- Matti Varjokallio, Mikko Kurimo:
Comparison of subspace methods for Gaussian mixture models in speech recognition.
2121-2124

Multilingualism in Speech and Language Processing
- Tanja Schultz, Alan W. Black, Sameer Badaskar, Matthew Hornyak, John Kominek:
SPICE: web-based tools for rapid language adaptation in speech processing systems.
2125-2128

- Filip Deprez, Jan Odijk, Jan De Moortel:
Introduction to multilingual corpus-based concatenative speech synthesis.
2129-2132

- Frederik Stouten, Jean-Pierre Martens:
Recognition of foreign names spoken by native speakers.
2133-2136

- Ricardo de Córdoba, Luis Fernando D'Haro, Fernando Fernández-Martínez, Juan Manuel Montero, Roberto Barra-Chicote:
Language identification using several sources of information with a multiple-Gaussian classifier.
2137-2140

- Carmen del Solar, Guillermo Pérez, Eva Florencio, David Moral, Gabriel Amores Carredano, Pilar Manchón Portillo:
Dynamic language change in MIMUS.
2141-2144

Systems for LVCSR and Rich Transcription I, II
- Jonas Lööf, Christian Gollan, Stefan Hahn, Georg Heigold, Björn Hoffmeister, Christian Plahl, David Rybach, Ralf Schlüter, Hermann Ney:
The RWTH 2007 TC-STAR evaluation system for European English and Spanish.
2145-2148

- Chin-Wei Eugene Koh, Hanwu Sun, Tin Lay Nwe, Trung Hieu Nguyen, Bin Ma, Engsiong Chng, Haizhou Li, Susanto Rahardja:
Using direction of arrival estimate and acoustic feature information in speaker diarization.
2149-2152

- Fernando Batista, Diamantino Caseiro, Nuno J. Mamede, Isabel Trancoso:
Recovering punctuation marks for automatic speech recognition.
2153-2156

- Jui-Feng Yeh, Chung-Hsien Wu, Wei-Yen Wu:
Disfluency correction of spontaneous speech using conditional random fields with variable-length features.
2157-2160

- Jing Huang, Etienne Marcheret, Karthik Visweswariah, Vit Libal, Gerasimos Potamianos:
Detection, diarization, and transcription of far-field lecture speech.
2161-2164

- Timothy J. Hazen, Brennan Sherry, Mark Adler:
Speech-based annotation and retrieval of digital photographs.
2165-2168

Language Learning and Assessment
- Joseph Tepperman, Abe Kazemzadeh, Shrikanth S. Narayanan:
A text-free approach to assessing nonnative intonation.
2169-2172

- John Lee, Stephanie Seneff:
Automatic generation of cloze items for prepositions.
2173-2176

- Christopher J. Waple, Hongcui Wang, Tatsuya Kawahara, Yasushi Tsubota, Masatake Dantsuji:
Evaluating and optimizing Japanese tutor system featuring dynamic question generation and interactive guidance.
2177-2180

- Catia Cucchiarini, Ambra Neri, Febe de Wet, Helmer Strik:
ASR-based pronunciation training: scoring accuracy and pedagogical effectiveness of a system for Dutch L2 learners.
2181-2184

- Joseph Tepperman, Matthew Black, Patti Price, Sungbok Lee, Abe Kazemzadeh, Matteo Gerosa, Margaret Heritage, Abeer Alwan, Shrikanth S. Narayanan:
A Bayesian network classifier for word-level reading assessment.
2185-2188

Multimodal Interaction:
Analysis and Technology
- Hartwig Holzapfel, Alex Waibel:
Behavior models for learning and receptionist dialogs.
2189-2192

- Markku Turunen, Jaakko Hakulinen, Anssi Kainulainen, Aleksi Melto, Topi Hurtig:
Design of a rich multimodal interface for mobile spoken route guidance.
2193-2196

- Mariët Theune, Dennis Hofs, Marco van Kessel:
The virtual guide: a direction giving embodied conversational agent.
2197-2200

- Sudeep Gandhe, David R. Traum:
Creating spoken dialogue characters from corpora without annotations.
2201-2204

- Pui-Yu Hui, Zhengyu Zhou, Helen M. Meng:
Complementarity and redundancy in multimodal user inputs with speech and pen gestures.
2205-2208

- Linda Bell, Joakim Gustafson:
Children's convergence in referring expressions to graphical objects in a speech-enabled computer game.
2209-2212

Emotion
- Hiromi Kawatsu, Sumio Ohno:
An analysis of individual differences in the F0 contour and the duration of anger utterances at several degrees.
2213-2216

- Yoshiko Arimoto, Sumio Ohno, Hitoshi Iida:
Acoustic features of anger utterances during natural dialog.
2217-2220

- Fadi Biadsy, Julia Hirschberg, Andrew Rosenberg, Wisam Dakka:
Comparing American and Palestinian perceptions of charisma using acoustic-prosodic and lexical analysis.
2221-2224

- Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan:
Using neutral speech models for emotional speech analysis.
2225-2228

- N. Satoh, Katsuya Yamauchi, Shoichi Matsunaga, Masaru Yamashita, R. Nakagawa, Kazuyuki Shinohara:
Emotion clustering using the results of subjective opinion tests for emotion recognition in infants' cries.
2229-2232

- Roberto Barra-Chicote, Juan Manuel Montero, Javier Macías Guarasa, Juana M. Gutiérrez-Arriola, Javier Ferreiros, José Manuel Pardo:
On the limitations of voice conversion techniques in emotion identification tasks.
2233-2236

- Kate Dupuis, Kathleen Pichora-Fuller:
Use of lexical and affective prosodic cues to emotion by younger and older adults.
2237-2240

- Purnima Gupta, Nitendra Rajput:
Two-stream emotion recognition for call center monitoring.
2241-2244

- Ioulia Grichkovtsova, Anne Lacheret, Michel Morel:
The role of intonation and voice quality in the affective speech perception.
2245-2248

- Bogdan Vlasenko, Björn Schuller, Andreas Wendemuth, Gerhard Rigoll:
Combining frame and turn-level information for robust recognition of emotions within speech.
2249-2252

Speakers:
Expression, Emotion and Personality Recognition
- Björn Schuller, Anton Batliner, Dino Seppi, Stefan Steidl, Thurid Vogt, Johannes Wagner, Laurence Devillers, Laurence Vidrascu, Noam Amir, Loïc Kessous, Vered Aharonson:
The relevance of feature type for the automatic classification of emotional user states: low level descriptors and functionals.
2253-2256

- Minh-Quang Vu, Laurent Besacier, Eric Castelli:
Automatic question detection: prosodic-lexical features and crosslingual experiments.
2257-2260

- Makoto Tachibana, Keigo Kawashima, Junichi Yamagishi, Takao Kobayashi:
Performance evaluation of HMM-based style classification with a small amount of training data.
2261-2264

- Khiet P. Truong, David A. van Leeuwen:
Visualizing acoustic similarities between emotions in speech: an acoustic map of emotions.
2265-2268

- Hao Hu, Ming-Xing Xu, Wei Wu:
Fusion of global statistical and segmental spectral features for speech emotion recognition.
2269-2272

- Vidhyasaharan Sethu, Eliathamby Ambikairajah, Julien Epps:
Group delay features for emotion detection.
2273-2276

- Christian A. Müller, Felix Burkhardt:
Combining short-term cepstral and long-term pitch features for automatic recognition of speaker age.
2277-2280

- Frank Enos, Elizabeth Shriberg, Martin Graciarena, Julia Hirschberg, Andreas Stolcke:
Detecting deception using critical segments.
2281-2284

- Takashi Nose, Yoichi Kato, Takao Kobayashi:
Style estimation of speech based on multiple regression hidden semi-Markov model.
2285-2288

- Chi Zhang, John H. L. Hansen:
Analysis and classification of speech mode: whispered through shouted.
2289-2292

First Language, Second Language, Cross-language
- Melissa Bettoni-Techio, Andréia S. Rauber, Rosana Denise Koerich:
Perception and production of word-final alveolar stops by Brazilian Portuguese learners of English.
2293-2296

- Denise Cristina Kluge, Andréia S. Rauber, Mara Silvia Reis, Ricardo Augusto Hoffmann Bion:
The relationship between the perception and production of English nasal codas by Brazilian learners of English.
2297-2300

- Takafumi Utashiro, Goh Kawai:
CALL courseware for learning reactive tokens in face-to-face dialogs.
2301-2304

- Shinya Kiriyama, Ryo Tsuji, Tomohiko Kasami, Shogo Ishikawa, Naofumi Otani, Hiroaki Horiuchi, Yoichi Takebayashi, Shigeyoshi Kitazawa:
The developmental analysis of demonstrative expression skills utilizing a multimodal infant behavior corpus.
2305-2308

- Elena E. Lyakso, Olga V. Frolova:
Russian vowels system acoustic features development in ontogenesis.
2309-2312

- Petra van Alphen, Elise de Bree, Paula Fikkert, Frank Wijnen:
The role of metrical stress in comprehension and production in Dutch children at-risk of dyslexia.
2313-2316

- Seiichi Nakagawa, Kei Ohta:
A statistical method of evaluating pronunciation proficiency for presentation in English.
2317-2320

- Akiyo Joto, Yoshiki Nagase, Seiya Funatsu:
The intelligibility and its relations to acoustic characteristics of English /s/ and /ʃ/ produced by native speakers of Japanese.
2321-2324

- Martijn Goudbeek, Daniel Swingley, Keith R. Kluender:
The limits of multidimensional category learning.
2325-2328

- Maria Uther, James Uther, Panos Athanasopoulos, Pushpendra Singh, Reiko Akahane-Yamada:
Mobile adaptive CALL (MAC): a lightweight speech-based intervention for mobile language learners.
2329-2332

- Catherine T. Best, Pierre A. Hallé, Jennifer S. Pardo:
English and French speakers' perception of voicing distinctions in non-native lateral consonant syllable onsets.
2333-2336

- Francisco Lacerda, Lisa Gustavsson:
Predicting the consequences of vocalizations in early infancy.
2337-2340

- David Weenink, Guangqin Chen, Zongyan Chen, Stefan de Konink, Dennis Vierkant, Eveline van Hagen, R. J. J. H. van Son:
Learning tone distinctions for Mandarin Chinese.
2341-2344

- Catherine Lai, Kyle Gorman, Jiahong Yuan, Mark Liberman:
Perception of disfluency: language differences and listener bias.
2345-2348

Language Modeling I, II
- Hiroki Yamazaki, Koji Iwano, Koichi Shinoda, Sadaoki Furui, Haruo Yokota:
Dynamic language model adaptation using presentation slides for lecture speech recognition.
2349-2352

- Cosmin Munteanu, Gerald Penn, Ronald Baecker:
Web-based language modelling for automatic lecture transcription.
2353-2356

- Tanel Alumäe, Toomas Kirt:
LSA-based language model adaptation for highly inflected languages.
2357-2360

- Aaron Heidel, Hung-an Chang, Lin-Shan Lee:
Language model adaptation using latent Dirichlet allocation and an efficient topic inference algorithm.
2361-2364

- Sibel Yaman, Jen-Tzung Chien, Chin-Hui Lee:
Structural Bayesian language modeling and adaptation.
2365-2368

- Ciro Martins, António J. S. Teixeira, João Paulo Neto:
Vocabulary selection for a broadcast news transcription system using a morpho-syntactic approach.
2369-2372

- Nguyen Bach, Mohamed Noamany, Ian R. Lane, Tanja Schultz:
Handling OOV words in Arabic ASR via flexible morphological constraints.
2373-2376

- Raquel Justo, M. Inés Torres:
Phrases in category-based language models for Spanish and Basque ASR.
2377-2380

- Ebru Arisoy, Hasim Sak, Murat Saraclar:
Language modeling for automatic Turkish broadcast news transcription.
2381-2384

Spoken Data Retrieval I, II
- Roy Wallace, Robbie Vogt, Sridha Sridharan:
A phonetic search approach to the 2006 NIST spoken term detection evaluation.
2385-2388

- Yoshiaki Itoh, Kohei Iwata, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee:
An integration method of retrieval results using plural subword models for vocabulary-free spoken document retrieval.
2389-2392

- Dimitra Vergyri, Izhak Shafran, Andreas Stolcke, Venkata Ramana Rao Gadde, Murat Akbacak, Brian Roark, Wen Wang:
The SRI/OGI 2006 spoken term detection system.
2393-2396

- Masataka Goto, Jun Ogata, Kouichirou Eto:
PodCastle: a Web 2.0 approach to speech recognition research.
2397-2400

- Nathalie Camelin, Frédéric Béchet, Géraldine Damnati, Renato de Mori:
Speech mining in noisy audio message corpus.
2401-2404

- Jian Shao, Qingwei Zhao, Pengyuan Zhang, Zhaojie Liu, Yonghong Yan:
A fast fuzzy keyword spotting algorithm based on syllable confusion network.
2405-2408

- Wooil Kim, John H. L. Hansen:
Advances in SpeechFind: transcript reliability estimation employing confidence measure based on discriminative sub-word model for SDR.
2409-2412

- Benoît Favre, Jean-François Bonastre, Patrice Bellot:
An interactive timeline for speech database browsing.
2413-2416

Novel Techniques for the NATO Non-native Air-traffic Control and HIWIRE Cockpit Databases
- Stéphane Pigeon, Wade Shen, Aaron D. Lawson, David A. van Leeuwen:
Design and characterization of the non-native military air traffic communications database (nnMATC).
2417-2420

- Wade Shen, Douglas A. Reynolds:
A comparison of speaker clustering and speech recognition techniques for air situational awareness.
2421-2424

- Dimitrios Dimitriadis, José C. Segura, Luz García, Alexandros Potamianos, Petros Maragos, Vassilis Pitsikalis:
Advanced front-end for robust speech recognition in extremely adverse environments.
2425-2428

- Roberto Gemello, Franco Mana, Stefano Scanzio:
Experiments on HIWIRE database using denoising and adaptation with a hybrid HMM-ANN model.
2429-2432

- Brett Y. Smolenski:
Detection and removal of switching noise in push-to-talk and voice operated exchange communications systems.
2433-2436

- Luis Buera, Antonio Miguel, Oscar Saz, Eduardo Lleida, Alfonso Ortega:
Evaluation of the combined use of MEMLIN and MLLR on the non-native adaptation task of HIWIRE project database.
2437-2440

Systems for Spoken Language Translation I, II
- Daniel Déchelotte, Holger Schwenk, Gilles Adda, Jean-Luc Gauvain:
Improved machine translation of speech-to-text outputs.
2441-2444

- Shirin Saleem, Krishna Subramanian, Rohit Prasad, David Stallard, Chia-Lin Kao, Prem Natarajan, Raid Suleiman:
Improvements in machine translation for English/Iraqi speech translation.
2445-2448

- Evgeny Matusov, Dustin Hillard, Mathew Magimai-Doss, Dilek Z. Hakkani-Tür, Mari Ostendorf, Hermann Ney:
Improving speech translation with automatic boundary prediction.
2449-2452

- Roldano Cattoni, Nicola Bertoldi, Marcello Federico:
Punctuating confusion networks for speech translation.
2453-2456

- Aarthi Reddy, Richard C. Rose, Alain Désilets:
Integration of ASR and machine translation models in a document translation task.
2457-2460

- Yik-Cheung Tam, Tanja Schultz:
Bilingual LSA-based translation lexicon adaptation for spoken language translation.
2461-2464

Articulatory Features
Wideband Speech Processing
- Amr H. Nour-Eldin, Peter Kabal:
Objective analysis of the effect of memory inclusion on bandwidth extension of narrowband speech.
2489-2492

- Bernd Geiser, Hervé Taddei, Peter Vary:
Artificial bandwidth extension without side information for ITU-T G.729.1.
2493-2496

- Hannu Pulakka, Paavo Alku, Laura Laaksonen, Päivi Valve:
The effect of highband harmonic structure in the artificial bandwidth expansion of telephone speech.
2497-2500

- Shingo Kuroiwa, Masashi Takashina, Satoru Tsuge, Fuji Ren:
Artificial bandwidth extension for speech signals using speech recognition.
2501-2504

- Driss Guerchi, Tamer F. Rabie, Abdelrhani Louzi:
Voicing-based codebook in low-rate wideband CELP coding.
2505-2508

- Ethan R. Duni, Bhaskar D. Rao:
Performance of speaker-dependent wideband speech coding.
2509-2512

Accessibility Issues
- Philippe Dreuw, David Rybach, Thomas Deselaers, Morteza Zahedi, Hermann Ney:
Speech recognition techniques for a sign language recognition system.
2513-2516

- Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano:
Impact of various small sound source signals on voice conversion accuracy in speech communication aid for laryngectomees.
2517-2520

- Petr Cerva, Jan Nouza:
Design and development of voice controlled aids for motor-handicapped persons.
2521-2524

- Kouichi Katsurada, Yuji Okuma, Makoto Yano, Yurie Iribe, Tsuneo Nitta:
Management of static/dynamic properties in a multimodal interaction system.
2525-2528

- Rubén San Segundo, Alicia Pérez, Daniel Ortiz, Luis Fernando D'Haro, M. Inés Torres, Francisco Casacuberta:
Evaluation of alternatives on speech to sign language translation.
2529-2532

- Géza Németh, Gábor Olaszy, Mátyás Bartalis, Géza Kiss, Csaba Zainkó, Péter Mihajlik:
Speech based drug information system for aged and visually impaired persons.
2533-2536

- Waldo Nogueira Vazquez, Tamás Harczos, Bernd Edler, Jörn Ostermann, Andreas Büchner:
Automatic speech recognition with a cochlear implant front-end.
2537-2540

- Soo-Young Suk, Hiroaki Kojima:
Voice activated powered wheelchair with non-voice rejection algorithm.
2541-2544

- Laurianne Sitbon, Patrice Bellot, Philippe Blache:
Phonetic based sentence level rewriting of questions typed by dyslexic spellers in an information retrieval context.
2545-2548

New Application Areas
- André Berton, Peter Regel-Brietzmann, Hans Ulrich Block, Stefanie Schachtl, Manfred Gehrke:
How to integrate speech-operated internet information dialogs into a car.
2549-2552

- James R. Glass, Timothy J. Hazen, D. Scott Cyphers, Igor Malioutov, David Huynh, Regina Barzilay:
Recent progress in the MIT spoken lecture processing project.
2553-2556

- Philipp Fischer, Andreas Österle, André Berton, Peter Regel-Brietzmann:
How to personalize speech applications for web-based information in a car.
2557-2560

- Satoshi Ikeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno:
Topic estimation with domain extensibility for guiding user's out-of-grammar utterances in multi-domain spoken dialogue systems.
2561-2564

- Ryota Nishimura, Norihide Kitaoka, Seiichi Nakagawa:
Prosody change and response timing analysis in spontaneously spoken dialogs and their modeling in a spoken dialog system.
2565-2568

- Satoshi Tamura, Kunihiko Takamatsu, Shinji Ogura, Satoru Hayamizu:
GEMSIS - a novel application of speech recognition to emergency and disaster medicine.
2569-2572

- Rachel Coulston, Esther Klabbers, Jacques de Villiers, John-Paul Hosom:
Application of speech technology in a home based assessment kiosk for early detection of Alzheimer's disease.
2573-2576

- Olga Vybornova, Monica Gemo, Ronald Moncarey, Benoit M. Macq:
Ontology-based multimodal high level fusion involving natural language analysis for aged people home care application.
2577-2580

Story Segmentation
- Shing-kai Chan, Lei Xie, Helen M. Meng:
Modeling the statistical behavior of lexical chains to capture word cohesiveness for automatic story segmentation.
2581-2584

- James G. Fung, Dilek Z. Hakkani-Tür, Mathew Magimai-Doss, Elizabeth Shriberg, Sébastien Cuendet, Nikki Mirghafori:
Cross-linguistic analysis of prosodic features for sentence segmentation.
2585-2588

- Andrew Rosenberg, Mehrbod Sharifi, Julia Hirschberg:
Varying input segmentation for story boundary detection in English, Arabic and Mandarin broadcast news.
2589-2592

- BalaKrishna Kolluru, Yoshihiko Gotoh:
Speaker role based structural classification of broadcast news stories.
2593-2596

Systems for LVCSR and Rich Transcription I, II
- Ümit Güz, Sébastien Cuendet, Dilek Z. Hakkani-Tür, Gökhan Tür:
Co-training using prosodic and lexical information for sentence segmentation.
2597-2600

- Yannick Estève, Sylvain Meignier, Paul Deléglise, Julie Mauclair:
Extracting true speaker identities from transcriptions.
2601-2604

- Rong Fu, Ian D. Benest:
An improved speaker diarization system.
2605-2608

- Sebastian Stüker, Christian Fügen, Florian Kraft, Matthias Wölfel:
The ISL 2007 English speech transcription system for European Parliament speeches.
2609-2612

- Mei-Yuh Hwang, Wen Wang, Xin Lei, Jing Zheng, Özgür Çetin, Gang Peng:
Advances in Mandarin broadcast speech recognition.
2613-2616

- Jun Ogata, Masataka Goto, Kouichirou Eto:
Automatic transcription for a Web 2.0 service to search podcasts.
2617-2620

Prosody:
Production
- Matthias Jilka, Bernd Möbius:
The influence of vowel quality features on peak alignment.
2621-2624

- Yen-Liang Shue, Markus Iseli, Nanette Veilleux, Abeer Alwan:
Pitch accent versus lexical stress: quantifying acoustic measures related to the voice source.
2625-2628

- Stefan Benus, Agustín Gravano, Julia Hirschberg:
Prosody, emotions, and... 'whatever'.
2629-2632

- Wentao Gu, Rerrario Shui-Ching Ho, Tan Lee:
Modeling tones in Hakka on the basis of the command-response model.
2633-2636

- Gerrit Kentner:
Length, ordering preference and intonational phrasing: evidence from pauses.
2637-2640

- Jörg Peters, Judith Hanssen, Carlos Gussenhoven:
Alignment of the second low target in Dutch falling-rising pitch contours.
2641-2644

- Helena Moniz, Ana Isabel Mata, Céu Viana:
On filled-pauses and prolongations in European Portuguese.
2645-2648

Prosody:
Perception
- Michael Olsberg, Yi Xu, Jeremy Green:
Dependence of tone perception on syllable perception.
2649-2652

- Ralf Winkler:
Testing the relevance of speech rate, pitch and a glottal chink for the perception of age in synthesized speech using formant synthesis.
2653-2656

- Tamás Böhm, Stefanie Shattuck-Hufnagel:
Utterance-final glottalization as a cue for familiar speaker recognition.
2657-2660

- Chun-Fang Huang, Masato Akagi:
A rule-based speech morphing for verifying an expressive speech perception model.
2661-2664

- Elina Helander, Jani Nurminen:
On the importance of pure prosody in the perception of speaker identity.
2665-2668

- Shi-Han Chen, Chih-Chung Kuo:
Perceptual relevance of pitch contours of Mandarin tones and its efficacy in prosody generation of speech synthesis.
2669-2672

- Hiromitsu Nishizaki, Mitsuhiro Somiya, Kenji Kobayashi, Yoshihiro Sekiguchi:
The effect of filled pauses in a lecture speech on impressive evaluation of listeners.
2673-2676

- Yujia Li, Tan Lee:
Perceptual equivalence of approximated Cantonese tone contours.
2677-2680

- Suleman Shahid, Emiel Krahmer, Marc Swerts:
Audiovisual emotional speech of game playing children: effects of age and culture.
2681-2684

Machine Learning for Spoken Dialog Systems
Spoken Dialog Systems I, II
- Dong Yu, Yun-Cheng Ju, Ye-Yi Wang, Geoffrey Zweig, Alex Acero:
Automated directory assistance system - from theory to practice.
2709-2712

- Geoffrey Zweig, Patrick Nguyen, Yun-Cheng Ju, Ye-Yi Wang, Dong Yu, Alex Acero:
The voice-rate dialog system for consumer ratings.
2713-2716

- Andi Winterboer, Jiang Hu, Johanna D. Moore, Clifford Nass:
The influence of user tailoring and cognitive load on user performance in spoken dialogue systems.
2717-2720

- Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Geoffrey Zweig, Alex Acero:
Confidence measures for voice search applications.
2721-2724

- Ryuichiro Higashinaka, Kohji Dohsaka, Shigeaki Amano, Hideki Isozaki:
Effects of quiz-style information presentation on user understanding.
2725-2728

- Hong-Kwang Jeff Kuo, Vaibhava Goel:
A data visualization and analysis method for natural language call routing system design.
2729-2732

Phonetics
- Christiane Ulbrich, Horst Ulbrich:
Realisations and alternations in German /r/-realisation.
2733-2736

- Christopher S. Doty, Kaori Idemaru, Susan G. Guion:
Singleton and geminate stops in Finnish - acoustic correlates.
2737-2740

- Christophe Van Bael, R. Harald Baayen, Helmer Strik:
Segment deletion in spontaneous speech: a corpus study using mixed effects models with crossed random effects.
2741-2744

- Hongying Zheng, Peter W. M. Tsang, William S.-Y. Wang:
Categorical perception of Cantonese tones in context: a cross-linguistic study.
2745-2748

- Yiya Chen, Jiahong Yuan:
A corpus study of the 3rd tone sandhi in standard Chinese.
2749-2752

- Jonathan Harrington, Sallyanne Palethorpe, Catherine I. Watson:
Age-related changes in fundamental frequency and formants: a longitudinal study of four speakers.
2753-2756

Pitch Extraction I, II
- Jasha Droppo, Alex Acero:
A fine pitch model for speech.
2757-2760

- Prasanta Kumar Ghosh, Antonio Ortega, Shrikanth S. Narayanan:
Pitch period estimation using multipulse model and wavelet transform.
2761-2764

- Martin Heckmann, Frank Joublin, Christian Goerick:
Combining rate and place information for robust pitch extraction.
2765-2768

- Heidi Christensen, Ning Ma, Stuart N. Wrigley, Jon Barker:
Integrating pitch and localisation cues at a speech fragment level.
2769-2772

- Jean-Sylvain Liénard, François Signol, Claude Barras:
Speech fundamental frequency estimation using the alternate comb.
2773-2776

- Andrew Rosenberg, Julia Hirschberg:
Detecting pitch accent using pitch-corrected energy-based predictors.
2777-2780

Spoken Language Understanding and Summarization
- Jian Zhang, Ricky Ho Yin Chan, Pascale Fung, Lu Cao:
A comparative study on speech summarization of broadcast news and lecture speech.
2781-2784

- Gabriel Murray, Steve Renals:
Towards online speech summarization.
2785-2788

- Tomoyuki Yamagata, Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki:
System request detection in conversation based on acoustic and speaker alternation features.
2789-2792

- Michael Levit, Elizabeth Boschee, Marjorie Freedman:
Selecting on-topic sentences from natural language corpora.
2793-2796

- Seokhwan Kim, Minwoo Jeong, Gary Geunbae Lee:
A semi-supervised method for efficient construction of statistical spoken language understanding resources.
2797-2800

- Yasuhisa Fujii, Norihide Kitaoka, Seiichi Nakagawa:
Automatic extraction of cue phrases for important sentences in lecture speech and automatic lecture speech summarization.
2801-2804

- Yi-Ting Chen, Hsuan-Sheng Chiu, Hsin-Min Wang, Berlin Chen:
A unified probabilistic generative framework for extractive spoken document summarization.
2805-2808

- Matthieu Hébert:
Generic class-based statistical language models for robust speech understanding in directed dialog applications.
2809-2812

- Michael L. Seltzer, Yun-Cheng Ju, Ivan Tashev, Alex Acero:
Robust location understanding in spoken dialog systems using intersections.
2813-2816

Systems for Spoken Language Translation I, II
- David Stallard, Fred Choi, Chia-Lin Kao, Kriste Krstovski, Premkumar Natarajan, Rohit Prasad, Shirin Saleem, Krishna Subramanian:
The BBN 2007 displayless English/Iraqi speech-to-speech translation system.
2817-2820

- Ruhi Sarikaya, Yonggang Deng, Yuqing Gao:
Context dependent word modeling for statistical machine translation using part-of-speech tags.
2821-2824

- Darren Scott Appling, Nick Campbell:
Translating conversational speech to standard linguistic form.
2825-2828

- Caroline Lavecchia, Kamel Smaïli, David Langlois, Jean Paul Haton:
Using inter-lingual triggers for machine translation.
2829-2832

- Daniele Falavigna, Nicola Bertoldi, Fabio Brugnara, Roldano Cattoni, Mauro Cettolo, Boxing Chen, Marcello Federico, Diego Giuliani, Roberto Gretter, Deepa Gupta, Dino Seppi:
The IRST English-Spanish translation system for European Parliament speeches.
2833-2836

- Christian Fügen, Muntsin Kolss:
The influence of utterance chunking on machine translation performance.
2837-2840

- Kristin Precoda, Jing Zheng, Dimitra Vergyri, Horacio Franco, Colleen Richey, Andreas Kathol, Sachin S. Kajarekar:
IraqComm: a next generation translation system.
2841-2844

- Sharath Rao, Ian R. Lane, Tanja Schultz:
Optimizing sentence segmentation for spoken language translation.
2845-2848

Speech Synthesis I, II
- Suphattharachai Chomphan, Takao Kobayashi:
Implementation and evaluation of an HMM-based Thai speech synthesis system.
2849-2852

- Davide Bonardo, Enrico Zovato:
Speech synthesis enhancement in noisy environments.
2853-2856

- Helmut Schmid, Bernd Möbius, Julia Weidenkaff:
Tagging syllable boundaries with joint n-gram models.
2857-2860

- Jun Xu, Dezhi Huang, Yongxin Wang, Yuan Dong, Lianhong Cai, Haila Wang:
Hierarchical non-uniform unit selection based on prosodic structure.
2861-2864

- Peter Birkholz:
Control of an articulatory speech synthesizer based on dynamic approximation of spatial articulatory targets.
2865-2868

- Nobuyuki Nishizawa, Hisashi Kawai:
A preselection method based on cost degradation from the optimal sequence for concatenative speech synthesis.
2869-2872

- Guntram Strecha, Matthias Eichner, Rüdiger Hoffmann:
Line cepstral quefrencies and their use for acoustic inventory coding.
2873-2876

- Peter Cahill, Daniel Aioanei, Julie Carson-Berndsen:
Articulatory acoustic feature applications in speech synthesis.
2877-2880

- Aleksandra Krul, Géraldine Damnati, François Yvon, Cédric Boidin, Thierry Moudenc:
Approaches for adaptive database reduction for text-to-speech synthesis.
2881-2884

- Richard Tzong-Han Tsai, Hsi-Chuan Hung, Hong-Jie Dai, Wen-Lian Hsu:
Exploiting unlabeled internal data in conditional random fields to reduce word segmentation errors for Chinese texts.
2885-2888

- Barry Kirkpatrick, Darragh O'Brien, Ronan Scaife, Andrew Errity:
On the role of spectral dynamics in unit selection speech synthesis.
2889-2892

- Brian Langner, Alan W. Black:
uGloss: a framework for improving spoken language generation understandability.
2893-2896

- Karl Schnell, Arild Lacroix:
Combination of LSF and pole based parameter interpolation for model-based diphone concatenation.
2897-2900

- Kishore Prahallad, Arthur R. Toth, Alan W. Black:
Automatic building of synthetic voices from large multi-paragraph speech databases.
2901-2904

- Ascensión Gallardo-Antolín, Roberto Barra-Chicote, Marc Schröder, Sacha Krstulovic, Juan Manuel Montero:
Automatic phonetic segmentation of Spanish emotional speech.
2905-2908

- Dacheng Lin, Yong Zhao, Frank K. Soong, Min Chu, Jieyu Zhao:
Iterative unit selection with unnatural prosody detection.
2909-2912

Voice Activity Detection and Sound Classification