September 2
Date: September 2
Session A1-O1: L2 Speech, Bilingualism and Code-Switching
Time: 11:30 – 12:10
Location: Panacea Amphitheater
Title: Second Language (L2) Speech and Perception
Abstract: 6102
Author/Presenter: Ann Bradlow (Northwestern University)
Date: September 2
Session A3-O1: Paralinguistics
Time: 15:00 – 15:40
Location: Aegle B
Title: Paralinguistics: Past and Present
Abstract: 6302
Author/Presenter: Anton Batliner (University of Munich)
Date: September 2
Session A7-O2: Zero-shot TTS
Time: 15:00 – 15:40
Location: Aegle A
Title: Zero-shot Text-to-Speech
Abstract: 6702
Author/Presenter: Zhizheng Wu (Chinese University of Hong Kong, Shenzhen)
September 3
Date: September 3
Session A8-O3: LLM in ASR
Time: 13:30 – 14:10
Location: Iasso
Title: Development of Spoken Language Models
Abstract: 6802
Author/Presenter: Hung-yi Lee (National Taiwan University)
Date: September 3
Session A9-O2: Vision and Speech
Time: 13:30 – 14:10
Location: Aegle B
Title: Visually Grounded Speech Models
Abstract: 6901
Author/Presenter: David Harwath (The University of Texas at Austin)
September 4
Date: September 4
Session A3-O3: Speech Emotion Recognition
Time: 10:00 – 10:40
Location: Iasso
Title: Automatic Emotion Detection/Recognition
Abstract: 6301
Author/Presenter: Carlos Busso (University of Texas at Dallas)
Date: September 4
Session A5-O5: Speech Quality Assessment
Time: 10:00 – 10:40
Location: Aegle B
Title: Neural Speech Assessment Metrics
Abstract: 6503
Author/Presenter: Yu Tsao (Academia Sinica)
Date: September 4
Session A9-O3: Novel Architectures for ASR
Time: 10:00 – 10:40
Location: Hippocrates
Title: Novel Architecture for ASR
Abstract: 6902
Author/Presenter: Rohit Prabhavalkar (Google)
Date: September 4
Session A5-O6: Speech and Audio Modelling
Time: 13:30 – 14:10
Location: Melambus
Title: Toward Speech and Audio Foundation Models
Abstract: 6501
Author/Presenter: Shinji Watanabe (Carnegie Mellon University)
September 5
Date: September 5
Session A7-O7: Privacy and Security in Speech Communication 2
Time: 10:00 – 10:40
Location: Aegle A
Title: Fake Voice and Voice Privacy
Abstract: 6701
Author/Presenter: Junichi Yamagishi (National Institute of Informatics); Xin Wang (National Institute of Informatics)
Date: September 5
Session A4-O7: Self and Weakly-Labelled Speaker Verification
Time: 13:30 – 14:10
Location: Iasso
Title: Self-supervised Based Speaker Recognition
Abstract: 6404
Author/Presenter: Themos Stafylakis (Omilia – Conversational Intelligence)
Date: September 5
Session A6-O7: Deep Learning-Based Speech Enhancement: Approaches, Scalability, and Evaluation
Time: 13:30 – 14:10
Location: Melambus
Title: Deep Learning Based Speech Enhancement
Abstract: 6601
Author/Presenter: Timo Gerkmann (Universität Hamburg)
Date: September 5
Session A8-O8: Self-Supervised Learning for ASR
Time: 13:30 – 14:10
Location: Aegle B
Title: Multilingual Speech Representations
Abstract: 6801
Author/Presenter: Bhuvana Ramabhadran (Google)