INTERSPEECH 2006: Pittsburgh, PA, USA
INTERSPEECH 2006 - ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006. ISCA 2006
Language Modeling for Spoken Dialog Systems
Matthew Purver, Florin Ratiu, Lawrence Cavedon: Robust interpretation in dialogue by combining confidence scores with contextual features.
Teruhisa Misu, Tatsuya Kawahara: A bootstrapping approach for developing language model of new spoken dialogue systems by selecting web texts.
Axel Horndasch, Elmar Nöth, Anton Batliner, Volker Warnke: Phoneme-to-grapheme mapping for spoken inquiries to the semantic web.
Karl Weilhammer, Matthew N. Stuttle, Steve Young: Bootstrapping language models for dialogue systems.
Junlan Feng: Question answering with discriminative learning algorithms.
Feature Enhancement for Robust ASR
Patrick Kenny, Vishwa Gupta, Gilles Boulianne, Pierre Ouellet, Pierre Dumouchel: Feature normalization using smoothed mixture transformations.
Chia-Hsin Hsieh, Chung-Hsien Wu, Jun-Yu Lin: Stochastic vector mapping-based feature enhancement using prior model and environment adaptation for noisy speech recognition.
Babak Nasersharif, Ahmad Akbari: A framework for robust MFCC feature extraction using SNR-dependent compression of enhanced mel filter bank energies.
Friedrich Faubel, Matthias Wölfel: Coupling particle filters with automatic speech recognition for speech feature enhancement.
Chang-Wen Hsu, Lin-Shan Lee: Extension and further analysis of higher order cepstral moment normalization (HOCMN) for robust features in speech recognition.
Md. Babul Islam, Hiroshi Matsumoto, Kazumasa Yamamoto: An improved mel-Wiener filter for mel-LPC based speech recognition.
Dialog and Discourse
Lluís F. Hurtado, David Griol, Encarna Segarra, Emilio Sanchis: A stochastic approach for dialog management based on neural networks.
Satanjeev Banerjee, Alexander I. Rudnicky: A TextTiling based approach to topic boundary detection in meetings.
Stefan Schulz, Hilko Donker: An user-centered development of an intuitive dialog control for speech-controlled music selection in cars.
Antoine Raux, Dan Bohus, Brian Langner, Alan W. Black, Maxine Eskenazi: Doing research on a deployed spoken dialogue system: one year of Let's Go! experience.
Jackson Liscombe, Jennifer J. Venditti, Julia Hirschberg: Detecting question-bearing turns in spoken tutorial dialogues.
The Speech Separation Challenge
Soundararajan Srinivasan, Yang Shao, Zhaozhang Jin, DeLiang Wang: A computational auditory scene analysis system for robust speech recognition.
Runqiang Han, Pei Zhao, Qin Gao, Zhiping Zhang, Hao Wu, Xihong Wu: CASA based speech separation for robust speech recognition.
Mark R. Every, Philip J. B. Jackson: Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm.
Jon Barker, André Coy, Ning Ma, Martin Cooke: Recent advances in speech fragment decoding techniques.
Tuomas Virtanen: Speech recognition using factorial hidden Markov models for separation in the feature space.
Ji Ming, Timothy J. Hazen, James R. Glass: Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation.
Trausti T. Kristjansson, John R. Hershey, Peder A. Olsen, Steven J. Rennie, Ramesh A. Gopinath: Super-human multi-talker speech recognition: the IBM 2006 speech separation challenge system.
Om Deshmukh, Carol Y. Espy-Wilson: Modified phase opponency based solution to the speech separation challenge.
Multilingual and Multi-Accent Processing
Jonas Lööf, Maximilian Bisani, Christian Gollan, Georg Heigold, Björn Hoffmeister, Christian Plahl, Ralf Schlüter, Hermann Ney: The 2006 RWTH parliamentary speeches transcription system.
Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean Paul Haton: Multilingual non-native speech recognition using phonetic confusion-based acoustic model modification and graphemic constraints.
Joyce Y. C. Chan, P. C. Ching, Tan Lee, Houwei Cao: Automatic speech recognition of Cantonese-English code-mixing utterances.
M. Zimmerman, Dilek Hakkani-Tür, James G. Fung, Nikki Mirghafori, L. Gottlieb, Elizabeth Shriberg, Yang Liu: The ICSI+ multilingual sentence segmentation system.
Yan Ming Cheng, Changxue Ma, Lynette Melnar: Cross-language evaluation of voice-to-phoneme conversions for voice-tag application in embedded platforms.
Huanliang Wang, Yao Qian, Frank K. Soong, Jian-Lai Zhou, Jiqing Han: A multi-space distribution (MSD) approach to speech recognition of tonal languages.
Viet Bac Le, Laurent Besacier: Comparison of acoustic modeling techniques for Vietnamese and Khmer ASR.
Seyed Ghorshi, Saeed Vaseghi, Qin Yan: Comparative analysis of formants of British, American and Australian accents.
Linquan Liu, Thomas Fang Zheng, Wenhu Wu: Automatic initial/final generation for dialectal Chinese speech recognition.
Ruhi Sarikaya, Ossama Emam, Imed Zitouni, Yuqing Gao: Maximum entropy modeling for diacritization of Arabic text.
Slavomír Lihan, Jozef Juhar, Anton Cizmar: Comparison of Slovak and Czech speech recognition based on grapheme and phoneme acoustic models.
Corpora, Annotation, and Assessment Metrics I, II

Cosmin Munteanu, Gerald Penn, Ronald Baecker, Elaine G. Toms, David James: Measuring the acceptable word error rate of machine-generated webcast transcripts.
Goshu Nagino, Makoto Shozakai: Analyzing reusability of speech corpus based on statistical multidimensional scaling method.
Susan Fitt, Korin Richmond: Redundancy and productivity in the speech technology lexicon - can we do better?
Takeshi Yamada, Masakazu Kumakura, Nobuhiko Kitawaki: Word intelligibility estimation of noise-reduced speech.
Christoph Draxler: Exploring the unknown - collecting 1000 speakers over the internet for the ph@ttsessionz database of adolescent speakers.
Timothy Murphy, Dorel Picovici, Abdulhussain E. Mahdi: A new single-ended measure for assessment of speech quality.
Ailbhe Ní Chasaide, John Wogan, Brian Ó Raghallaigh, Áine Ní Bhriain, Eric Zoerner, Harald Berthelsen, Christer Gobl: Speech technology for minority languages: the case of Irish (Gaelic).
Francisco José Fraga, Carlos Alberto Ynoguti, André Godoi Chiovato: Further investigations on the relationship between objective measures of speech quality and speech recognition rates in noisy environments.
Volodya Grancharov, David Y. Zhao, Jonas Lindblom, W. Bastiaan Kleijn: Non-intrusive speech quality assessment with low computational complexity.
Min-Siong Liang, Ren-Yuan Lyu, Yuang-Chin Chiang: Using speech recognition technique for constructing a phonetically transcribed Taiwanese (Min-nan) text corpus.
Andrej Zgank, Tomaz Rotovnik, Matej Grasic, Marko Kos, Damjan Vlaj, Zdravko Kacic: Sloparl - Slovenian parliamentary speech and text corpus for large vocabulary continuous speech recognition.
Hitoshi Aoki, Atsuko Kurashima, Akira Takahashi: Conversational quality estimation model for wideband IP-telephony services.
Kelley Kilanski, Jonathan Malkin, Xiao Li, Richard Wright, Jeff A. Bilmes: The vocal joystick data collection effort and vowel corpus.
Dmitry Sityaev, Katherine Knill, Tina Burrows: Comparison of the ITU-T P.85 standard to other methods for the evaluation of text-to-speech systems.
Christophe Van Bael, Lou Boves, Henk van den Heuvel, Helmer Strik: Automatic phonetic transcription of large speech corpora: a comparative study.
Speech Coding
Joon-Hyuk Chang, Woohyung Lim, Nam Soo Kim: Signal modification incorporating perceptual weighting filter.
Jani Nurminen: Enhanced dynamic codebook reordering for advanced quantizer structures.
Chang-Heon Lee, Sung-Kyo Jung, Thomas Eriksson, Won-Suk Jun, Hong-Goo Kang: An efficient segment-based speech compression technique for hand-held TTS systems.
V. Ramasubramanian, D. Harish: An unified unit-selection framework for ultra low bit-rate speech coding.
Jes Thyssen, Juin-Hwey Chen: Efficient VQ techniques and general noise shaping in noise feedback coding.
Yasheng Qian, Wei-Shou Hsu, Peter Kabal: Classified comfort noise generation for efficient voice transmission.
Balázs Kövesi, Dominique Massaloux, David Virette, Julien Bensa: Integration of a CELP coder in the ARDOR universal sound codec.
Saikat Chatterjee, T. V. Sreenivas: Two stage transform vector quantization of LSFs for wideband speech coding.
Saikat Chatterjee, T. V. Sreenivas: Comparison of prediction based LSF quantization methods using split VQ.
Kyle D. Anderson, Philippe Gournay: Pitch resynchronization while recovering from a late frame in a predictive speech decoder.
Speech Enhancement I, II
Suhadi Suhadi, Sorel Stan, Tim Fingscheidt: A novel environment-dependent speech enhancement method with optimized memory footprint.
Esfandiar Zavarehei, Saeed Vaseghi, Qin Yan: Weighted codebook mapping for noisy speech enhancement using harmonic-noise model.
Jesper Jensen, Richard C. Hendriks, Jan S. Erkelens, Richard Heusdens: MMSE estimation of complex-valued discrete Fourier coefficients with generalized gamma priors.
Amarnag Subramanya, Michael L. Seltzer, Alex Acero: Automatic removal of typed keystrokes from speech signals.

Wen Jin, Michael S. Scordilis: Single channel speech enhancement by frequency domain constrained optimization and temporal masking.
Jong Won Shin, Seung Yeol Lee, Hwan Sik Yun, Nam Soo Kim: Speech enhancement based on residual noise shaping.
Hannu Pulakka, Laura Laaksonen, Paavo Alku: Quality improvement of telephone speech by artificial bandwidth expansion - listening tests in three languages.
Benjamin J. Shannon, Kuldip K. Paliwal, Climent Nadeu: Speech enhancement based on spectral estimation from higher-lag autocorrelation.
Nitish Krishnamurthy, John H. L. Hansen: Noise update modeling for speech enhancement: when do we do enough?

Takahiro Murakami, Yoshihisa Ishida: Adaptive filtering for attenuating musical noise caused by spectral subtraction.
Myung-Suk Song, Chang-Heon Lee, Hong-Goo Kang: Performance analysis of various single channel speech enhancement algorithms for automatic speech recognition.
ASR Other I, II
Gilles Boulianne, Jean-Francois Beaumont, Maryse Boisvert, Julie Brousseau, Patrick Cardinal, Claude Chapdelaine, Michel Comeau, Pierre Ouellet, Frédéric Osterrath: Computer-assisted closed-captioning of live TV broadcasts in French.
Mohamed Afify, Ruhi Sarikaya, Hong-Kwang Jeff Kuo, Laurent Besacier, Yuqing Gao: On the use of morphological analysis for dialectal Arabic speech recognition.
Isabel Trancoso, Ricardo Nunes, Luís Neves, Céu Viana, Helena Moniz, Diamantino Caseiro, Ana Isabel Mata: Recognition of classroom lectures in European Portuguese.
Thomas Pellegrini, Lori Lamel: Investigating automatic decomposition for ASR in less represented languages.
Abdillahi Nimaan, Pascal Nocera, Jean-François Bonastre: Automatic transcription of Somali language.
Özgür Çetin, Elizabeth Shriberg: Analysis of overlaps in meetings by dialog factors, hot spots, speakers, and collection site: insights for automatic speech recognition.
Ryu Takeda, Shun'ichi Yamamoto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno: Improving speech recognition of two simultaneous speech signals by integrating ICA BSS and automatic missing feature mask generation.
Wooil Kim, John H. L. Hansen: Missing-feature reconstruction for band-limited speech recognition in spoken document retrieval.
Hahn Koo, Yan Ming Cheng: Incremental learning of MAP context-dependent edit operations for spoken phone number recognition in an embedded platform.
Yasunari Obuchi, Nobuo Hataoka: Development and evaluation of speech database in automotive environments for practical speech recognition systems.
Dong Yu, Yun-Cheng Ju, Alex Acero: An effective and efficient utterance verification technology using word n-gram filler models.
J. M. Górriz, Javier Ramírez, Carlos García Puntonet, José C. Segura: An efficient bispectrum phase entropy-based algorithm for VAD.
Petr Cerva, Jan Nouza, Jan Silovský: Two-step unsupervised speaker adaptation based on speaker and gender recognition and HMM combination.
Satoshi Nakamura, Masakiyo Fujimoto, Kazuya Takeda: CENSREC2: corpus and evaluation environments for in car continuous digit speech recognition.
Cheng-Tao Chu, Yun-Hsuan Sung, Yuan Zhao, Daniel Jurafsky: Detection of word fragments in Mandarin telephone conversation.
Angel M. Gomez, Juan J. Ramos-Muñoz, Antonio M. Peinado, Victoria E. Sánchez: Multi-flow block interleaving applied to distributed speech recognition over IP networks.
Edward C. Lin, Kai Yu, Rob A. Rutenbar, Tsuhan Chen: Moving speech recognition from software to silicon: the in silico vox project.

Modeling Prosodic Features
Sankaranarayanan Ananthakrishnan, Shrikanth Narayanan: Combining acoustic, lexical, and syntactic evidence for automatic unsupervised prosody labeling.
Andrew Rosenberg, Julia Hirschberg: On the correlation between energy and pitch accent in read English speech.
Keikichi Hirose, Yasufumi Asano, Nobuaki Minematsu: Corpus-based generation of fundamental frequency contours using generation process model and considering emotional focuses.
Tomás Dubeda: Prosodic boundaries in Czech: an experiment based on delexicalized speech.
Lifu Yi, Jian Li, Xiaoyan Lou, Jie Hao: Totally data-driven intonation prediction model using a novel F0 contour parametric representation.
Laura Dilley, Mara Breen, Marti Bolivar, John Kraemer, Edward Gibson: A comparison of inter-transcriber reliability for two systems of prosodic annotation: RaP (Rhythm and Pitch) and ToBI (Tones and Break Indices).
Spoken Information Retrieval

Kohei Iwata, Yoshiaki Itoh, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee: Open-vocabulary spoken document retrieval based on new subword models and subword phonetic similarity.
Xiang Li, Ea-Ee Jan, Cheng Wu, David Lubensky: Improved topic classification over maximum entropy model using k-norm based new objectives.
Yi-Cheng Pan, Jia-Yu Chen, Yen-shin Lee, Yi-Sheng Fu, Lin-Shan Lee: Efficient interactive retrieval of spoken documents with key terms ranked by reinforcement learning.
Katsuhito Sudoh, Hajime Tsukada, Hideki Isozaki: Discriminative named entity recognition of speech data using speech recognition confidence.
Ville T. Turunen, Mikko Kurimo: Using latent semantic indexing for morph-based spoken document retrieval.
Front-End Methods for ASR
Ralf Schlüter, András Zolnay, Hermann Ney: Feature combination using linear discriminant analysis and its pitfalls.

Frederik Stouten, Jean-Pierre Martens: Speech recognition with phonological features: some issues to attend.
Matthias Wölfel, Christian Fügen, Shajith Ikbal, John W. McDonough: Multi-source far-distance microphone selection and combination for automatic transcription of lectures.
Colin Breithaupt, Rainer Martin: Statistical analysis and performance of DFT domain noise reduction filters for robust speech recognition.
Luz García, José C. Segura, M. Carmen Benítez, Javier Ramírez, Ángel de la Torre: Normalization of the inter-frame information using smoothing filtering.
Muhammad Ghulam, Junsei Horikawa, Tsuneo Nitta: Comparative study on contributions of pitch-synchronization and peak-amplitude towards robustness issue of ASR.
Yasuo Ariki, Shunsuke Kato, Tetsuya Takiguchi: Phoneme recognition based on Fisher weight map to higher-order local auto-correlation.
Hynek Boril, Petr Fousek, Petr Pollák: Data-driven design of front-end filter bank for Lombard speech recognition.
Andrej Ljolje: Optimization of class weights for LDA feature transformations.
Janne Pylkkönen: LDA based feature estimation methods for LVCSR.
Gholamreza Farahani, Seyed Mohammad Ahadi, Mohammad Mehdi Homayounpour: Robust feature extraction based on spectral peaks of group delay and autocorrelation function and phase domain analysis.
Sankaran Panchapagesan: Frequency warping by linear transformation of standard MFCC.
Language and Dialect Recognition
Ana Lilia Reyes-Herrera, Luis Villaseñor Pineda, Manuel Montes-y-Gómez: Automatic language identification using wavelets.
Josef G. Bauer, Ekaterina Timoshenko: Minimum classification error training of hidden Markov models for acoustic language identification.

Xi Yang, Lu-Feng Zhai, Man-Hung Siu, Herbert Gish: Improved language identification using support vector machines for language modeling.
Chi-Yueh Lin, Hsiao-Chuan Wang: Fusion of phonotactic and prosodic knowledge for language identification.
Víctor G. Guijarrubia, M. Inés Torres: Basque-Spanish language identification using phone-based methods.
Bianca Vieru-Dimulescu, Philippe Boula de Mareüil: Perceptual identification and phonetic analysis of 6 foreign accents in French.
Spoken Dialog Systems I, II
Petra Gieselmann, Alex Waibel: Dynamic extension of a grammar-based dialogue system: constructing an all-recipes knowing robot.
Alexander Gruenstein, Stephanie Seneff, Chao Wang: Scalable and portable web-based multimodal dialogue interaction with geographical databases.
Chantal Ackermann, Marion Libossek: System- versus user-initiative dialog strategy for driver information systems.
Filip Krsmanovic, Curtis Spencer, Daniel Jurafsky, Andrew Y. Ng: Have we met? MDP based speaker ID for robot dialogue.
Heriberto Cuayáhuitl, Steve Renals, Oliver Lemon, Hiroshi Shimodaira: Learning multi-goal dialogue strategies using reinforcement learning with reduced state-action spaces.
Jörg Mayer, Ekaterina Jasinskaja, Ulrike Kölsch: Pitch range and pause duration as markers of discourse hierarchy: perception experiments.
Antonio Roque, Anton Leuski, Vivek Kumar Rangarajan Sridhar, Susan Robinson, Ashish Vaswani, Shrikanth Narayanan, David R. Traum: Radiobot-CFF: a spoken dialogue system for military training.
Shinya Yamada, Toshihiko Itoh, Kenji Araki: Is voice quality enough? - study on how the situation and user's awareness influence the utterance features.
Jozef Juhar, Stanislav Ondás, Anton Cizmar, Milan Rusko, Gregor Rozinaj, Roman Jarina: Development of Slovak GALAXY/VoiceXML based spoken language dialogue system to retrieve information from the internet.
Akinori Ito, Keisuke Shimada, Motoyuki Suzuki, Shozo Makino: A user simulator based on VoiceXML for evaluation of spoken dialog systems.
Kristiina Jokinen, Topi Hurtig: User expectations and real experience on a multimodal interactive system.
Felix Burkhardt, Jitendra Ajmera, Roman Englert, Joachim Stegmann, Winslow Burleson: Detecting anger in automated voice portal dialogs.
Markku Turunen, Jaakko Hakulinen, Anssi Kainulainen: Evaluation of a spoken dialogue system with usability tests and long-term pilot studies: similarities and differences.
Fuliang Weng, Sebastian Varges, Badri Raghunathan, Florin Ratiu, Heather Pon-Barry, Brian Lathrop, Qi Zhang, Harry Bratt, Tobias Scheideck, Kui Xu, Matthew Purver, Rohit Mishra, Annie Lien, Madhuri Raya, Stanley Peters, Yao Meng, J. Russell, Lawrence Cavedon, Elizabeth Shriberg, Hauke Schmidt, R. Prieto: CHAT: a conversational helper for automotive tasks.
Kallirroi Georgila, James Henderson, Oliver Lemon: User simulation for spoken dialogue systems: learning and evaluation.
Speaker Characterization and Recognition I-IV
Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang, Ruei-Chuan Chang: Improving the characterization of the alternative hypothesis via kernel discriminant analysis for likelihood ratio-based speaker verification.
Zhenchun Lei, Yingchun Yang, Zhaohui Wu: A discriminative method for speaker verification using the difference information.
Nicolas Scheffer, Jean-François Bonastre: A multiclass framework for speaker verification within an acoustic event sequence system.
Bin Ma, Donglai Zhu, Rong Tong, Haizhou Li: Speaker cluster based GMM tokenization for speaker recognition.
Claudio Garretón, Néstor Becerra Yoma, Carlos Molina, Fernando Huenupán: Intra-speaker variability compensation in speaker verification with limited enrolling data.
Kishore Prahallad, Varanasi Sudhakar, Veluru Ranganatham, Krishna M. Bharat, S. Roy Debashish: Significance of formants from difference spectrum for speaker identification.
Maider Zamalloa, Germán Bordel, Luis Javier Rodríguez, Mikel Peñagarikano, Juan Pedro Uribe: Using genetic algorithms to weight acoustic features for speaker recognition.
Michael T. Padilla, Thomas F. Quatieri, Douglas A. Reynolds: Missing feature theory with soft spectral subtraction for speaker verification.
Ming Liu, Thomas S. Huang: Unsupervised learning of HMM topology for text-dependent speaker verification.
Jan Anguita, Javier Hernando: On the use of Jacobian adaptation in real speaker verification applications.
Ming Liu, Huazhong Ning, Thomas S. Huang, Zhengyou Zhang: A novel framework of text-independent speaker verification based on utterance transform and iterative cohort modeling.
Vinod Prakash, John H. L. Hansen: A cohort - UBM approach to mitigate data sparseness for in-set/out-of-set speaker recognition.
Vaishnevi S. Varadarajan, John H. L. Hansen: Analysis of Lombard effect under different types and levels of noise with application to in-set speaker ID systems.
Alan McCree: Reducing speech coding distortion for speaker identification.
Tsuneo Kato, Hisashi Kawai: A text-prompted distributed speaker verification system implemented on a cellular phone and a mobile terminal.
Srikanth Vishnubhotla, Carol Y. Espy-Wilson: Automatic detection of irregular phonation in continuous speech.
V. Ramasubramanian, Deepak Vijaywargiay, Kumar V. Praveen: Highly noise robust text-dependent speaker recognition based on hypothesized Wiener filtering.
Hiromasa Fujihara, Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno: Speaker identification under noisy environments by using harmonic structure extraction and reliable frame weighting.
Andreas Stergiou, Aristodemos Pnevmatikakis, Lazaros C. Polymenakos: Enhancing the performance of a GMM-based speaker identification system in a multi-microphone setup.
Andrew O. Hatch, Sachin S. Kajarekar, Andreas Stolcke: Within-class covariance normalization for SVM-based speaker recognition.
Carol Y. Espy-Wilson, Sandeep Manocha, Srikanth Vishnubhotla: A new set of features for text-independent speaker identification.
Uchechukwu O. Ofoegbu, Ananth N. Iyer, Robert E. Yantorno, Stanley J. Wenndt: Detection of a third speaker in telephone conversations.
Narayanaswamy Balakrishnan, Rashmi Gangadharaiah, Richard M. Stern: Voting for two speaker segmentation.
Rong Zheng, Shuwu Zhang, Bo Xu: A quality measure method using Gaussian mixture models and divergence measure for speaker identification.
Yushi Zhang, Waleed H. Abdulla: Gammatone auditory filterbank and independent component analysis for speaker identification.
Wei Wu, Thomas Fang Zheng, Ming-Xing Xu, Huanjun Bao: Study on speaker verification on emotional speech.
M. Farrús, Ainara Garde, Pascual Ejarque, Jordi Luque, Javier Hernando: On the fusion of prosody, voice spectrum and face features for multimodal person verification.
Tarun Pruthi, Carol Y. Espy-Wilson: An MRI based study of the acoustic effects of sinus cavities and its application to speaker recognition.
Mariko Kojima, Tomoko Matsui, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano: Speaker verification with non-audible murmur segments.
Christian Müller: Automatic recognition of speakers' age and gender on the basis of empirical studies.
Ilyas Potamitis, Todor Ganchev, Nikos Fakotakis: Automatic acoustic identification of insects inspired by the speaker recognition paradigm.
System Combination
Sabato Marco Siniscalchi, Jinyu Li, Chin-Hui Lee: A study on lattice rescoring with knowledge scores for automatic speech recognition.
Sebastian Stüker, Christian Fügen, Susanne Burger, Matthias Wölfel: Cross-system adaptation and combination for continuous speech recognition: the influence of phoneme set and acoustic front-end.
Rong Zhang, Alexander I. Rudnicky: Investigations of issues for using multiple acoustic models to improve continuous speech recognition.
I-Fan Chen, Lin-Shan Lee: A new framework for system combination based on integrated hypothesis space.
Björn Hoffmeister, Tobias Klein, Ralf Schlüter, Hermann Ney: Frame based system combination and a comparison with weighted ROVER and CNC.
Interpreting Prosodic Variation
Jiahong Yuan, Mark Liberman, Christopher Cieri: Towards an integrated understanding of speaking rate in conversation.
Minh-Quang Vu, Do Dat Tran, Eric Castelli: Prosody of interrogative and affirmative sentences in Vietnamese language: analysis and perceptive results.
Jennifer J. Venditti, Julia Hirschberg, Jackson Liscombe: Intonational cues to student questions in tutoring dialogs.
Emiel Krahmer, Marc Swerts: Testing the effect of audiovisual cues to prominence via a reaction-time experiment.
Agustín Gravano, Julia Hirschberg: Effect of genre, speaker, and word class on the realization of given and new information.
Martti Vainio, Juhani Järvikivi, Stefan Werner: Word order and tonal shape in the production of focus in short Finnish utterances.
Articulatory Modeling
Bernd J. Kröger, Peter Birkholz, Jim Kannampuzha, Christiane Neuschaefer-Rube: Modeling sensory-to-motor mappings using neural nets and a 3D articulatory speech synthesizer.
Julie Fontecave, Frédéric Berthommier: Semi-automatic extraction of vocal tract movements from cineradiographic data.
Szu-Chen Stan Jou, Tanja Schultz, Matthias Walliczek, Florian Kraft, Alex Waibel: Towards continuous speech recognition using surface electromyography.
Korin Richmond: A trajectory mixture density network for the acoustic-articulatory inversion mapping.
Florian Metze: Articulatory features for "meeting" speech recognition.
Zdenek Krnoul, Milos Zelezný, Ludek Müller, Jakub Kanis: Training of coarticulation models using dominance functions and visual unit selection methods for audio-visual speech synthesis.
Acoustic Modeling I - Training and Topologies

Joseph Keshet, Shai Shalev-Shwartz, Samy Bengio, Yoram Singer, Dan Chazan: Discriminative kernel-based phoneme sequence recognition.
T. Nagarajan, Douglas D. O'Shaughnessy: Discriminative MLE training using a product of Gaussian likelihoods.
Xiaolong Li, Li Deng, Dong Yu, Alex Acero: A time-synchronous phonetic decoder for a long-contextual-span hidden trajectory model.
Marta Casar, José A. R. Fonollosa: Analysis of HMM temporal evolution for automatic speech recognition and utterance verification.
Min Tang, Aravind Ganapathiraju: Improvements to bucket box intersection algorithm for fast GMM computation in embedded speech recognition systems.
Dirk Gehrig, Thomas Schaaf: A comparative study of Gaussian selection methods in large vocabulary continuous speech recognition.
Soo-Young Suk, Seong-Jun Hahm, Ho-Youl Jung, Hyun-Yeol Chung: A successive state and mixture splitting for optimizing the size of models in speech recognition.
Valentin Ion, Reinhold Haeb-Umbach: Improved source modeling and predictive classification for channel robust speech recognition.
Acoustic Signal Segmentation and Classification
Marco Kühne, Roberto Togneri: Automatic English stop consonants classification using wavelet analysis and hidden Markov models.
Tingyao Wu, Dirk Van Compernolle, Jacques Duchateau, Hugo Van Hamme: Single frame selection for phoneme classification.
Sorin Dusan, Lawrence R. Rabiner: On the relation between maximum spectral transition positions and phone boundaries.
T. Yingthawornsuk, H. Kaymaz Keskinpala, Daniel J. France, D. Mitchell Wilkes, Richard G. Shiavi, R. M. Salomon: Objective estimation of suicidal risk using vocal output characteristics.
E. Didiot, Irina Illina, Odile Mella, Dominique Fohr, Jean Paul Haton: A wavelet-based parameterization for speech/music segmentation.
Goshu Nagino, Makoto Shozakai: Distance measure between Gaussian distributions for discriminating speaking styles.
Franz Pernkopf, Tuan Van Pham: Bayesian networks for phonetic classification using time-scale features.
Nicole Beringer: Fast and effective retraining on contrastive vocal characteristics with bidirectional long short-term memory nets.
Ning Ma, Phil Green, André Coy: Exploiting dendritic autocorrelogram structure to identify spectro-temporal regions dominated by a single sound source.
Pairote Leelaphattarakij, Proadpran Punyabukkana, Atiwong Suchato: Locating phone boundaries from acoustic discontinuities using a two-staged approach.
Qiang Fu, Biing-Hwang Juang: Investigation on rescoring using minimum verification error (MVE) detectors.
Qiang Fu, Antonio Moreno-Daniel, Biing-Hwang Juang, Jian-Lai Zhou, Frank K. Soong: Generalization of the minimum classification error (MCE) training based on maximizing generalized posterior probability (GPP).
Michael A. Carlin, Brett Y. Smolenski, Stanley J. Wenndt: Unsupervised detection of whispered speech in the presence of normal phonation.
Xavier Anguera, Chuck Wooters, Javier Hernando: Friends and enemies: a novel initialization for speaker diarization.
Linguistics, Phonology, and Phonetics I, II
Kushan Surana, Janet Slifka: Acoustic cues for the classification of regular and irregular phonation.
Rattima Nitisaroj: Realizations and representations of Thai tones in monomoraic syllables.
Irene Jacobi, Louis C. W. Pols, Jan Stroop: Measuring and comparing vowel qualities in a Dutch spontaneous speech corpus.
Aijun Li, Qiang Fang, Ziyu Xiong: Phonetic research on accented Chinese in three dialectal regions: Shanghai, Wuhan and Xiamen.
Kuniko Y. Nielsen: Specificity and generalizability of spontaneous phonetic imitation.
Christophe Van Bael, Hans van Halteren: On the sufficiency of automatic phonetic transcriptions for pronunciation variation research.
Abe Kazemzadeh, Joseph Tepperman, Jorge F. Silva, Hong You, Sungbok Lee, Abeer Alwan, Shrikanth Narayanan: Automatic detection of voice onset time contrasts for use in pronunciation assessment.
Hiroko Hirano, Goh Kawai, Keikichi Hirose, Nobuaki Minematsu: Unfilled pauses in Japanese sentences read aloud by non-native learners.
Ryoji Hamabe, Kiyotaka Uchimoto, Tatsuya Kawahara, Hitoshi Isahara: Detection of quotations and inserted clauses and its application to dependency structure analysis in spontaneous Japanese.
Yoshimi Suzuki, Fumiyo Fukumoto: Thesaurus expansion using similar word pairs from patent documents.
Patrick Schone: Low-resource autodiacritization of abjads for speech keyword search.
Susan R. Hertz: A model of the regularities underlying speaker variation: evidence from hybrid synthesis.
Augustin Speyer: Pauses as a tool to ensure rhythmic wellformedness.
Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Shusaku Miwa, Nobuaki Minematsu: Factors affecting speakers' choice of fillers in Japanese presentations.
Marelie H. Davel, Etienne Barnard: Developing consistent pronunciation models for phonemic variants.
Jinsik Lee, Seungwon Kim, Gary Geunbae Lee: Grapheme-to-phoneme conversion using automatically extracted associative rules for Korean TTS system.
Speech Translation
Jason Riesa, Behrang Mohit, Kevin Knight, Daniel Marcu: Building an English-Iraqi Arabic machine translation system for spoken utterances with limited resources.
Sameer Maskey, Bowen Zhou, Yuqing Gao: A phrase-level machine translation approach for disfluency detection using weighted finite state transducers.
Jonghoon Lee, Donghyeon Lee, Gary Geunbae Lee: Improving phrase-based Korean-English statistical machine translation.
David Stallard, Fred Choi, Kriste Krstovski, Prem Natarajan, Rohit Prasad, Shirin Saleem: A hybrid phrase-based/statistical speech translation system.
Roger Hsiao, Ashish Venugopal, Thilo Köhler, Ying Zhang, Paisarn Charoenpornsawat, Andreas Zollmann, Stephan Vogel, Alan W. Black, Tanja Schultz, Alex Waibel: Optimizing components for handheld two-way speech translation for an English-Iraqi Arabic system.
Acoustic Modeling II - Adaptation
Armin Sehr, Marcus Zeller, Walter Kellermann: Distant-talking continuous speech recognition based on a novel reverberation model in the feature domain.
Xin Lei, Jon Hamaker, Xiaodong He: Robust feature space adaptation for telephony speech recognition.
Nattanun Thatphithakkul, Boontee Kruatrachue, Chai Wutiwiwatchai, Sanparith Marukatat, Vataya Boonpiam: A simulated-data adaptation technique for robust speech recognition.
Hans-Günter Hirsch, Harald Finster: A new HMM adaptation approach for the case of a hands-free speech input in reverberant rooms.
Yu Tsao, Chin-Hui Lee: A vector space approach to environment modeling for robust speech recognition.
Emotional Speech and Speaker State
Björn Schuller, Niels Köhler, Ronald Müller, Gerhard Rigoll: Recognition of interest in human conversational speech.
Hua Ai, Diane J. Litman, Katherine Forbes-Riley, Mihai Rotaru, Joel R. Tetreault, Amruta Purandare: Using system and user performance features to improve emotion detection in spoken tutoring dialogs.
Laurence Devillers, Laurence Vidrascu: Real-life emotions detection with lexical and paralinguistic cues on human-human call center dialogs.
Daniel Neiberg, Kjell Elenius, Kornel Laskowski: Emotion recognition in spontaneous speech using GMMs.
Frank Enos, Stefan Benus, Robin L. Cautin, Martin Graciarena, Julia Hirschberg, Elizabeth Shriberg: Personality factors in human deception detection: comparing human to machine performance.
Speech and Language in Education
Leen Cleuren, Jacques Duchateau, Alain Sips, Pol Ghesquière, Hugo Van Hamme: Developing an automatic assessment tool for children's oral reading.
Christopher J. Waple, Yasushi Tsubota, Masatake Dantsuji, Tatsuya Kawahara: Prototyping a CALL system for students of Japanese using dynamic diagram generation and interactive hints.
Dominic W. Massaro, Ying Liu, Trevor H. Chen, Charles Perfetti: A multilingual embodied conversational agent for tutoring speech and language learning.
Michael Heilman, Kevyn Collins-Thompson, Jamie Callan, Maxine Eskenazi: Classroom success of an intelligent tutoring system for lexical practice and reading comprehension.
Jack Mostow: Is ASR accurate enough for automated reading tutors, and how can we tell?
Chiharu Tsurutani, Yutaka Yamauchi, Nobuaki Minematsu, Dean Luo, Kazutaka Maruyama, Keikichi Hirose: Development of a program for self assessment of Japanese pronunciation by English learners.
Joseph Tepperman, Jorge F. Silva, Abe Kazemzadeh, Hong You, Sungbok Lee, Abeer Alwan, Shrikanth Narayanan: Pronunciation verification of children's speech for automatic literacy assessment.
Sherif Mahdy Abdou, Salah Eldeen Hamid, Mohsen Rashwan, Abdurrahman Samir, Ossama Abdel-Hamid, Mostafa Shahin, Waleed Nazih: Computer aided pronunciation learning system using speech recognition techniques.
Speech Perception I, II

Geoffrey Stewart Morrison: An adaptive sampling procedure for speech perception experiments.
Navin Viswanathan, James S. Magnuson, Carol A. Fowler: Disentangling gestural and auditory contrast accounts of compensation for coarticulation.
Michael C. W. Yip: The role of positional probability in the segmentation of Cantonese speech.
Nao Hodoshima, Dawn M. Behne, Takayuki Arai: Steady-state suppression in reverberation: a comparison of native and nonnative speech perception.
Akiyo Joto: Effect of dynamic information of formants on discrimination of English vowels in consonantal contexts by Japanese listeners.
Yue Wang, Dawn M. Behne, Haisheng Jiang, Chad Danyluck: Native and nonnative audio-visual perception of English fricatives in quiet and cafe-noise backgrounds.
Sven Grawunder, Ines Bose, Birgit Hertha, Franziska Trauselt, Lutz Christian Anders: Perceptive and acoustic measurement of average speaking pitch of female and male speakers in German radio news.
Peter F. Assmann, Sophia Dembling, Terrance M. Nearey: Effects of frequency shifts on perceived naturalness and gender information in speech.
Hitomi Tohyama, Shigeki Matsubara: Influence of pause length on listeners' impressions in simultaneous interpretation.
Iris-Corinna Schwarz, Denis Burnham: New measures to chart toddlers' speech perception and language development: a test of the lexical restructuring hypothesis.
Ángel de la Torre, Cristina Roldán, Manuel Sainz: Perception of fundamental frequency in cochlear implant patients.
Sarah C. Creel, Delphine Dahan, Daniel Swingley: Effects of featural similarity and overlap position on lexical confusions and overt similarity judgments.
Cécile Woehrling, Philippe Boula de Mareüil: Identification of regional accents in French: perception and categorization.
Mirjam Broersma: Accident - execute: increased activation in nonnative listening.
Kirstin Scholz, Marcel Wältermann, Lu Huo, Alexander Raake, Sebastian Möller, Ulrich Heute: Estimation of the quality dimension "directness/frequency content" for the instrumental assessment of speech quality.
Speech Production, Physiology, and Pathology I, II
Mark Pluymaekers, Mirjam Ernestus, R. Harald Baayen: Effects of word frequency on the acoustic durations of affixes.
Xiaochuan Niu, Alexander Kain, Jan P. H. van Santen: A noninvasive, low-cost device to study the velopharyngeal port during speech and some preliminary results.
Noureddine Aboutabit, Denis Beautemps, Laurent Besacier: Characterization of cued speech vowels from the inner lip contour.
Christer Gobl: Modelling aspiration noise during phonation using the LF voice source model.
Jianguo Wei, Xugang Lu, Jianwu Dang: A simulation based parameter optimization for a coarticulation model.
Abdellah Kacha, Francis Grenez, Jean Schoentgen: Multivariate analysis of frame-based acoustic cues of dysperiodicities in connected speech.
Tom Kovacs, Donald S. Finan: Effects of midline tongue piercing on spectral centroid frequencies of sibilants.
P. Vijayalakshmi, M. Ramasubba Reddy, Douglas D. O'Shaughnessy: Assessment of articulatory sub-systems of dysarthric speech using an isolated-style phoneme recognition system.
Donald S. Finan, Carol A. Boliek: Respiratory/laryngeal interactions during sustained vowel production in children.
Oscar Saz, Antonio Miguel, Eduardo Lleida, Alfonso Ortega, Luis Buera: Study of time and frequency variability in pathological speech and error reduction methods for automatic speech recognition.
Markus Iseli, Yen-Liang Shue, Melissa A. Epstein, Patricia A. Keating, Jody Kreiman, Abeer Alwan: Voice source correlates of prosodic features in American English: a pilot study.
Louis ten Bosch, R. Harald Baayen, Mirjam Ernestus: On speech variation and word type differentiation by articulatory feature representations.
Sungbok Lee, Erik Bresch, Jason Adams, Abe Kazemzadeh, Shrikanth Narayanan: A study of emotional speech articulation using a fast magnetic resonance imaging technique.
Gang Feng, Cyril Kotenkoff: New considerations for vowel nasalization based on separate mouth-nose recording.
