Program at a Glance

Sunday 1 September 2024

Time / Session Melambus Acesso Homer Syndicate 2.1
8:00 – 19:00 Registration at KICC First Floor (Level 1)
9:00 – 12:15
Tutorial 1: Seamless Communication: Towards a Universal Translation System
Tutorial 2: Neural Speech and Audio Coding
Tutorial 3: Responsible Development and Translation of Clinical Speech Analytics
Tutorial 4: Embracing Inclusivity: Bridging Gaps, Breaking Barriers, and Beyond Challenges in developing real-time embedded conversational AI for assistive technology
12:15 – 13:45 Break
13:45 – 17:00
13:45 – 16:45 Tutorial 7: Intracortical Brain-Computer Interface (iBCI) Speech Neuro-prosthesis Foundations and Techniques
13:45 – 16:45 Tutorial 6: Recent Advances in Speech Language Models
Tutorial 3: Responsible Development and Translation of Clinical Speech Analytics
13:45 – 16:45 Tutorial 8: Optimal transport (OT) in speech: OT meets speech

Monday 2 September 2024

Time / Session Hippocrates Panacea Amphitheater Acesso Aegle A Aegle B Iasso Melambus Corridor - Hippocrates right side Corridor - Hippocrates left side Yanis Club Exhibition
Poster Area 1A Poster Area 1B Poster Area 2A Poster Area 2B Poster Area 3A Poster Area 3B Poster Area 4A Poster Area 4B
7:30 – 19:00 Registration at KICC First Floor (Level 1)
9:00 – 10:00 Opening Ceremony (Hippocrates)
10:00 – 11:00 Keynote 1 (Hippocrates) - Isabel Trancoso "Towards Responsible Speech Processing"
11:00 – 11:30 Break
11:30 – 13:30
A9-O1 Decoding Algorithms
A1-O1 L2 Speech, Bilingualism and Code-Switching
A5-O2 Detection and Classification of Bioacoustic Signals
A7-O1 Speech Synthesis: Voice Conversion 1
A4-O1 Speaker Diarization 1
A10-O1 Pronunciation Assessment
A6-O1 Acoustic Echo Cancellation
A8-P1 Neural Network Architectures for ASR 2
A5-P1-A Speech and Audio Analysis and Representations
A5-P1-B Acoustic Event Detection and Classification 2
A12-P1-A Spoken Language Processing
A12-P1-B Spoken Machine Translation 2
SS2 Biosignal-enabled Spoken Communication
13:30 – 15:00 Break
15:00 – 17:00
A8-O1 Noise Robustness, Far-Field, and Multi-Talker ASR
A2-O1 Individual and Social Factors in Phonetics
A5-O1 Audio Event Detection and Classification 1
A7-O2 Zero-shot TTS
A3-O1 Paralinguistics
A11-O1 Spoken Language Understanding
A12-O1 Spoken Machine Translation 1
A13-P1-A Hearing Disorders
A13-P1-B Speech Disorders 2
A4-P1 Speaker Recognition: Adversarial and Spoofing Attacks
A5-P2-B Source Separation 2
A9-P1 Contextual Biasing and Adaptation
A6-P1-A Noise Reduction, Dereverberation, and Echo Cancellation
A6-P1-B Computationally Efficient Speech Enhancement
SS13 TAUKADIAL Challenge: Speech-Based Cognitive Assessment in Chinese and English (Special Session)
ST1 Show and Tell 1
17:00 – 17:30 Break
17:30 – 19:00 GA ISCA General Assembly
19:00 – 20:30 Welcome Reception at Kos Island Marina

Tuesday 3 September 2024

Time / Session Hippocrates Panacea Amphitheater Acesso Aegle A Aegle B Iasso Melambus Corridor - Hippocrates right side Corridor - Hippocrates left side Yanis Club Exhibition
Poster Area 1A Poster Area 1B Poster Area 2A Poster Area 2B Poster Area 3A Poster Area 3B Poster Area 4A Poster Area 4B
8:00 – 17:00 Registration at KICC First Floor (Level 1)
8:30 – 9:30 Keynote 2 (Hippocrates) - Shoko Araki "Frontier of Frontend for Conversational Speech Processing"
9:30 – 10:00 Break
10:00 – 12:00
A7-O3 Speech Synthesis: Evaluation
A13-O1 Pathological Speech Analysis 1
A2-O2 Phonetics and Phonology of Second Language Acquisition
A4-O2 Spoofing and Deepfake Detection
A8-O2 Multilingual ASR
A6-O2 Generative Speech Enhancement
A3-P1-A Corpora-based Approaches in Automatic Emotion Recognition
A3-P1-B Analysis of Speakers States and Traits
A5-P2-A Audio Captioning, Tagging, and Audio-Text Retrieval
A8-P2 General Topics in ASR
A12-P2-A Spoken Language Understanding
A12-P2-B Speech and Multimodal Resources
SS5A Speech and Language in Health: from Remote Monitoring to Medical Conversations - 1 (Special Session)
12:00 – 13:30 Break
13:30 – 15:30
A7-O4 Speech Synthesis: Singing Voice Synthesis
A5-O3 Audio-Text Retrieval
A1-O2 Speech and Brain
A3-O2 Emotion Recognition: Resources and Benchmarks
A9-O2 Vision and Speech
A8-O3 LLM in ASR
A12-O2 Spoken Document Summarization
A2-P1-A Innovative Methods in Phonetics and Phonology
A2-P1-B Voice, Tones and F0
A6-P2-A Speech Enhancement
A6-P2-B Speech Coding
A4-P2 Speaker and Language Identification and Diarization
A7-P1-A Speech Synthesis: Expressivity and Emotion
A7-P1-B Speech Synthesis: Tools and Data
SS5B Speech and Language in Health: from Remote Monitoring to Medical Conversations - 2 (Special Session)
ST2 Show and Tell 2
15:30 – 16:00 Break
16:00 – 18:00
A7-O5 Speech Synthesis: Prosody
A5-O4 Source Separation 1
A2-O3 Prosody
A4-O3 Foundational Models for Deepfake and Spoofed Speech Detection
A10-O2 ASR and LLMs
A8-O4 Neural Network Adaptation
SS10 Speech Processing Using Discrete Speech Units (Special Session)
A13-P2-A Pathological Speech Analysis 3
A13-P2-B Speech Disorders 3
A8-P3 Accented Speech, Prosodic Features, Dialect, Emotion, Sound Classification
A6-P3-A Audio-Visual and Generative Speech Enhancement
A6-P3-B Speech Privacy and Bandwidth Expansion
A4-P3 Speaker Recognition 1
SS9 Speech Recognition with Large Pretrained Speech Models for Under-represented Languages (Special Session)
19:00 – 21:00 Student Evening at the Olympic Pool of the Kipriotis Village Hotel / Reviewer Reception at the Kipriotis Panorama (by invitation only)

Wednesday 4 September 2024

Time / Session Hippocrates Panacea Amphitheater Acesso Aegle A Aegle B Iasso Melambus Corridor - Hippocrates right side Corridor - Hippocrates left side Yanis Club Exhibition
Poster Area 1A Poster Area 1B Poster Area 2A Poster Area 2B Poster Area 3A Poster Area 3B Poster Area 4A Poster Area 4B
8:00 – 17:00 Registration at KICC First Floor (Level 1)
8:30 – 9:30 Keynote 3 (Hippocrates) - Elmar Nöth "Analysis of Pathological Speech – Pitfalls along the Way"
9:30 – 10:00 Break
10:00 – 12:00
A9-O3 Novel Architectures for ASR
A13-O2 Pathological Speech Analysis 2
A4-O4 Self-Supervised Models in Speaker Recognition
A6-O3 Privacy and Security in Speech Communication 1
A5-O5 Speech Quality Assessment
A3-O3 Speech Emotion Recognition
A11-O2 Spoken Dialogue Systems and Conversational Analysis 1
A1-P1-A Databases and Progress in Methodology
A1-P1-B Articulation, Convergence and Perception
A8-P4 Training Methods, Self Supervised Learning, Adaptation
A10-P1 Multimodality and Foundation Models
A12-P4 Speech Technology
A7-P2-A Speech Synthesis: Voice Conversion 2
A7-P2-B Speech Synthesis: Text Processing
SS1 Speech Science, Speech Technology, and Gender (Special Session)
12:00 – 13:30 Break
13:30 – 15:30
A8-O5 Neural Network Architectures for ASR 1
A1-O3 Speech Production and Perception
A2-O4 Phonetics and Phonology: Segmentals and Suprasegmentals
A6-O4 Multi-Channel Speech Enhancement
A9-O4 Error Correction and Rescoring
A4-O5 Speaker Verification
A5-O6 Speech and Audio Modelling
A3-P2 Topics in Paralinguistics
A3-P2-B Emotion Recognition: Fairness, Variability, Uncertainty
A5-P3-A Spatial Audio and Acoustics
A5-P3-B Generative Models for Speech and Audio
A11-P1-A Spoken Language Understanding
A11-P1-B Spoken Dialogue Systems and Conversational Analysis 2
A7-P3-A Speech Synthesis: Paradigms and Methods 1
A7-P3-B Speech Synthesis: Paradigms and Methods 2
SS3 Computational Models of Human Language Acquisition, Perception, and Production (Special Session)
ST3 Show and Tell 3
15:30 – 16:00 Break
16:00 – 18:00
A8-O6 ASR Model Training Methods
A13-O3 Dysarthric Speech Assessment
A5-O7 Speech and Audio Analysis
A6-O5 Speech Quality and Intelligibility: Prediction and Enhancement
A10-O3 Speech Assessment
A3-O4 New Avenues in Emotion Recognition
A7-O6 Speech Synthesis: Vocoders
A4-P4-A Speaker Diarization 2
A4-P4-B Speaker Recognition 2
A9-P2 Cross-Lingual and Multilingual Processing
A11-P2-A Question Answering from Speech and Spoken Dialogue Systems
A11-P2-B Spoken Dialogue Systems and Conversational Analysis 3
A2-P2-A Phonetics, Phonology and Prosody
A2-P2-B Segmentals
SS6 Spoken Language Models for Universal Speech Processing (Special Session)
19:30 – 23:00 Conference Banquet at the Kipriotis Panorama's Pool

Thursday 5 September 2024

Time / Session Hippocrates Panacea Amphitheater Acesso Aegle A Aegle B Iasso Melambus Corridor - Hippocrates right side Corridor - Hippocrates left side Yanis Club Exhibition
Poster Area 1A Poster Area 1B Poster Area 2A Poster Area 2B Poster Area 3A Poster Area 3B Poster Area 4A Poster Area 4B
8:00 – 16:30 Registration at KICC First Floor (Level 1)
8:30 – 9:30 Keynote 4 (Hippocrates) - Barbara Tillmann "Perception of Music and Speech: Focus on Rhythm Processing"
9:30 – 10:00 Break
10:00 – 12:00
A2-O5 Experimental Phonetics and Laboratory Phonology
A5-O8 Speech Type Classification
SS4 Leveraging Large Language Models and Contextual Features for Phonetic Analysis (Special Session)
A7-O7 Privacy and Security in Speech Communication 2
A8-O7 Streaming ASR
A4-O6 Speaker recognition evaluation and resources
A6-O6 Target Speaker Extraction
A1-P2-A L1/L2 Acquisition and Cross-Linguistic Factors
A1-P2-B Speaker Stance, Emotion and Language External Factors
A9-P3 Computational Resource Constrained ASR
A12-P3-A Evaluation of Speech Technology Systems
A12-P3-B Neural Network Training for Speech Recognition
A7-P4-A Speech Synthesis: Voice Conversion 3
A7-P4-B Speech Synthesis: Paradigms and Methods 3
SS7 Responsible Speech Foundation Models (Special Session)
12:00 – 13:30 Break
13:30 – 15:30
A13-O4 Speech Disorders 1
A5-O9 Fake Audio Detection
A12-O3 Spoken Term Detection and Speech Retrieval
A7-O8 Speech synthesis: Crosslingual and multilingual aspects
A8-O8 Self-Supervised Learning for ASR
A4-O7 Self and Weakly-Labelled Speaker Verification
A6-O7 Deep Learning Based Speech Enhancement: Approaches, Scalability, and Evaluation
A3-P3-A Multimodal Paralinguistics
A3-P3-B Automatic Emotion Recognition
A5-P4-A Acoustic Event Detection, Segmentation and Classification
A5-P4-B Speech and Audio Modelling
A7-P5-A Speech Synthesis: Other Topics 1
A7-P5-B Speech Synthesis: Other Topics 2
A8-P5 Noise, Far-Field, Multi-Talker, Enhancement, Audio Classification
SS8 Connecting Speech-science and Speech-technology for Children’s Speech (Special Session)
ST4 Show and Tell 4
15:30 – 16:00 Break
16:00 – 17:00 Closing Ceremony (Hippocrates)