Speaker & Noise Variability – Making Speech Systems Robust

John Hansen

University of Texas at Dallas

Tuesday, February 26
12:30 p.m., Conference Room 5A

There is significant interest in developing effective human-machine interactive systems for a wider range of personal services. While speech and speaker recognition research has advanced significantly in recent years, performance in real environments remains a major challenge. In this talk, we consider recent research efforts in speech, speaker, and noise modeling for robust speech recognition that account for speaker and noise variability. We provide a brief overview of the research activities of CRSS (Center for Robust Speech Systems), including related research interests within the Erik Jonsson School of Engineering & Computer Science. The talk is organized in two main parts.

Part 1: Differences in speech production have a significant impact on speech and speaker recognition systems. Here, we focus on speech production variability due to vocal effort (e.g., whisper, soft, neutral, loud, shout), Lombard Effect (speech produced in noise), singing vs. speaking, and how such speaker variability impacts speaker recognition systems.
Part 2: Robust speech and speaker recognition requires advancements in modeling the environment. In this area, we consider recent advancements in recognizing and monitoring human interaction for (i) the DARPA RATS program, (ii) the Prof-Life-Log task, including Spoken Document Retrieval (SpeechFind® http://SpeechFind.utdallas.edu), and (iii) speech from Earth to the Moon.

Bio:

John H.L. Hansen received the Ph.D. and M.S. degrees in Electrical Engineering from Georgia Institute of Technology, and the B.S.E.E. degree from Rutgers Univ., College of Engineering, N.J. He joined Univ. of Texas at Dallas (UTDallas), Erik Jonsson School of Engineering & Computer Science in 2005, where he is Associate Dean for Research, Professor of Electrical Engineering, and Distinguished Univ. Chair in Telecommunications Engineering, and holds a joint appointment in the School of Behavioral & Brain Sciences (Speech & Hearing). At UTDallas, he established the Center for Robust Speech Systems (CRSS). He is an ISCA Fellow, IEEE Fellow, Member and past TC Chair of the IEEE Signal Processing Society Speech & Language Processing Technical Committee, and ISCA Distinguished Lecturer. He previously served as Technical Advisor to the U.S. Delegate for NATO (IST/TG-01), Associate Editor for IEEE Trans. Speech & Audio Processing, Associate Editor for IEEE Signal Processing Letters, and Editorial Board Member for the IEEE Signal Processing Magazine. He has supervised 59 PhD/MS thesis candidates, was the recipient of the 2005 University of Colorado Teacher Recognition Award, and is author/co-author of 448 journal and conference papers, including 11 books in the field of speech processing and language technology. He served as General Chair for Interspeech-2002, and as Co-Organizer and Technical Program Chair for IEEE ICASSP-2010, Dallas, TX.