Sunday 1 September 2024
Time / Session | Melambus | Acesso | Homer | Syndicate 2.1 |
8:00 – 19:00 | Registration at KICC First Floor (Level 1) | |||
9:00 – 12:15 | Tutorial 1: Seamless Communication: Towards a Universal Translation System | Tutorial 2: Neural Speech and Audio Coding | Tutorial 3: Responsible Development and Translation of Clinical Speech Analytics | Tutorial 4: Embracing Inclusivity: Bridging Gaps, Breaking Barriers, and Beyond Challenges in developing real-time embedded conversational AI for assistive technology |
12:15 – 13:45 | Break | |||
13:45 – 17:00 | 13:45 – 16:45 Tutorial 7: Intracortical Brain-Computer Interface (iBCI) Speech Neuro-prosthesis Foundations and Techniques | 13:45 – 16:45 Tutorial 6: Recent Advances in Speech Language Models | Tutorial 3: Responsible Development and Translation of Clinical Speech Analytics | 13:45 – 16:45 Tutorial 8: Optimal transport (OT) in speech: OT meets speech |
Monday 2 September 2024
Time / Session | Hippocrates | Panacea Amphitheater | Acesso | Aegle A | Aegle B | Iasso | Melambus | Corridor - Hippocrates right side | Corridor - Hippocrates left side | Yanis Club | Exhibition | ||||||
Poster Area 1A | Poster Area 1B | Poster Area 2A | Poster Area 2B | Poster Area 3A | Poster Area 3B | Poster Area 4A | Poster Area 4B | ||||||||||
7:30 – 19:00 | Registration at KICC First Floor (Level 1) | ||||||||||||||||
9:00 – 10:00 | Opening Ceremony (Hippocrates) | ||||||||||||||||
10:00 – 11:00 | Keynote 1 (Hippocrates) - Isabel Trancoso "Towards Responsible Speech Processing" | ||||||||||||||||
11:00 – 11:30 | Break | ||||||||||||||||
11:30 – 13:30 | A9-O1 Decoding Algorithms | A1-O1 L2 Speech, Bilingualism and Code-Switching | A5-O2 Detection and Classification of Bioacoustic Signals | A7-O1 Speech Synthesis: Voice Conversion 1 | A4-O1 Speaker Diarization 1 | A10-O1 Pronunciation Assessment | A6-O1 Acoustic Echo Cancellation | A8-P1 Neural Network Architectures for ASR 2 | A5-P1-A Speech and Audio Analysis and Representations | A5-P1-B Acoustic Event Detection and Classification 2 | A12-P1-A Spoken Language Processing | A12-P1-B Spoken Machine Translation 2 | SS2 Biosignal-enabled Spoken Communication | ||||
13:30 – 15:00 | Break | ||||||||||||||||
15:00 – 17:00 | A8-O1 Noise Robustness, Far-Field, and Multi-Talker ASR | A2-O1 Individual and Social Factors in Phonetics | A5-O1 Audio Event Detection and Classification 1 | A7-O2 Zero-shot TTS | A3-O1 Paralinguistics | A11-O1 Spoken Language Understanding | A12-O1 Spoken Machine Translation 1 | A13-P1-A Hearing Disorders | A13-P1-B Speech Disorders 2 | A4-P1 Speaker Recognition: Adversarial and Spoofing Attacks | A5-P2-B Source Separation 2 | A9-P1 Contextual Biasing and Adaptation | A6-P1-A Noise Reduction, Dereverberation, and Echo Cancellation | A6-P1-B Computationally Efficient Speech Enhancement | SS13 TAUKADIAL Challenge: Speech-Based Cognitive Assessment in Chinese and English (Special Session) | ST1 Show and Tell 1 | |
17:00 – 17:30 | Break | ||||||||||||||||
17:30 – 19:00 | GA ISCA General Assembly | ||||||||||||||||
19:00 – 20:30 | Welcome Reception at Kos Island Marina |
Tuesday 3 September 2024
Time / Session | Hippocrates | Panacea Amphitheater | Acesso | Aegle A | Aegle B | Iasso | Melambus | Corridor - Hippocrates right side | Corridor - Hippocrates left side | Yanis Club | Exhibition | ||||||
Poster Area 1A | Poster Area 1B | Poster Area 2A | Poster Area 2B | Poster Area 3A | Poster Area 3B | Poster Area 4A | Poster Area 4B | ||||||||||
8:00 – 17:00 | Registration at KICC First Floor (Level 1) | ||||||||||||||||
8:30 – 9:30 | Keynote 2 (Hippocrates) - Shoko Araki "Frontier of Frontend for Conversational Speech Processing" | ||||||||||||||||
9:30 – 10:00 | Break | ||||||||||||||||
10:00 – 12:00 | A7-O3 Speech Synthesis: Evaluation | A13-O1 Pathological Speech Analysis 1 | A2-O2 Phonetics and Phonology of Second Language Acquisition | A4-O2 Spoofing and Deepfake Detection | A8-O2 Multilingual ASR | A6-O2 Generative Speech Enhancement | A3-P1-A Corpora-based Approaches in Automatic Emotion Recognition | A3-P1-B Analysis of Speakers States and Traits | A5-P2-A Audio Captioning, Tagging, and Audio-Text Retrieval | A8-P2 General Topics in ASR | A12-P2-A Spoken Language Understanding | A12-P2-B Speech and Multimodal Resources | SS5A Speech and Language in Health: from Remote Monitoring to Medical Conversations - 1 (Special Session) | ||||
12:00 – 13:30 | Break | ||||||||||||||||
13:30 – 15:30 | A7-O4 Speech Synthesis: Singing Voice Synthesis | A5-O3 Audio-Text Retrieval | A1-O2 Speech and Brain | A3-O2 Emotion Recognition: Resources and Benchmarks | A9-O2 Vision and Speech | A8-O3 LLM in ASR | A12-O2 Spoken Document Summarization | A2-P1-A Innovative Methods in Phonetics and Phonology | A2-P1-B Voice, Tones and F0 | A6-P2-A Speech Enhancement | A6-P2-B Speech Coding | A4-P2 Speaker and Language Identification and Diarization | A7-P1-A Speech Synthesis: Expressivity and Emotion | A7-P1-B Speech Synthesis: Tools and Data | SS5B Speech and Language in Health: from Remote Monitoring to Medical Conversations - 2 (Special Session) | ST2 Show and Tell 2 | |
15:30 – 16:00 | Break | ||||||||||||||||
16:00 – 18:00 | A7-O5 Speech Synthesis: Prosody | A5-O4 Source Separation 1 | A2-O3 Prosody | A4-O3 Foundational Models for Deepfake and Spoofed Speech Detection | A10-O2 ASR and LLMs | A8-O4 Neural Network Adaptation | SS10 Speech Processing Using Discrete Speech Units (Special Session) | A13-P2-A Pathological Speech Analysis 3 | A13-P2-B Speech Disorders 3 | A8-P3 Accented Speech, Prosodic Features, Dialect, Emotion, Sound Classification | A6-P3-A Audio-Visual and Generative Speech Enhancement | A6-P3-B Speech Privacy and Bandwidth Expansion | A4-P3 Speaker Recognition 1 | SS9 Speech Recognition with Large Pretrained Speech Models for Under-represented Languages (Special Session) | |||
19:00 – 21:00 | Student Evening at the Olympic Pool of the Kipriotis Village Hotel / Reviewer Reception at the Kipriotis Panorama (by invitation only) |
Wednesday 4 September 2024
Time / Session | Hippocrates | Panacea Amphitheater | Acesso | Aegle A | Aegle B | Iasso | Melambus | Corridor - Hippocrates right side | Corridor - Hippocrates left side | Yanis Club | Exhibition | ||||||
Poster Area 1A | Poster Area 1B | Poster Area 2A | Poster Area 2B | Poster Area 3A | Poster Area 3B | Poster Area 4A | Poster Area 4B | ||||||||||
8:00 – 17:00 | Registration at KICC First Floor (Level 1) | ||||||||||||||||
8:30 – 9:30 | Keynote 3 (Hippocrates) - Elmar Nöth "Analysis of Pathological Speech – Pitfalls along the Way" | ||||||||||||||||
9:30 – 10:00 | Break | ||||||||||||||||
10:00 – 12:00 | A9-O3 Novel Architectures for ASR | A13-O2 Pathological Speech Analysis 2 | A4-O4 Self-Supervised Models in Speaker Recognition | A6-O3 Privacy and Security in Speech Communication 1 | A5-O5 Speech Quality Assessment | A3-O3 Speech Emotion Recognition | A11-O2 Spoken Dialogue Systems and Conversational Analysis 1 | A1-P1-A Databases and Progress in Methodology | A1-P1-B Articulation, Convergence and Perception | A8-P4 Training Methods, Self-Supervised Learning, Adaptation | A10-P1 Multimodality and Foundation Models | A12-P4 Speech Technology | A7-P2-A Speech Synthesis: Voice Conversion 2 | A7-P2-B Speech Synthesis: Text Processing | SS1 Speech Science, Speech Technology, and Gender (Special Session) | ||
12:00 – 13:30 | Break | ||||||||||||||||
13:30 – 15:30 | A8-O5 Neural Network Architectures for ASR 1 | A1-O3 Speech Production and Perception | A2-O4 Phonetics and Phonology: Segmentals and Suprasegmentals | A6-O4 Multi-Channel Speech Enhancement | A9-O4 Error Correction and Rescoring | A4-O5 Speaker Verification | A5-O6 Speech and Audio Modelling | A3-P2 Topics in Paralinguistics | A3-P2-B Emotion Recognition: Fairness, Variability, Uncertainty | A5-P3-A Spatial Audio and Acoustics | A5-P3-B Generative Models for Speech and Audio | A11-P1-A Spoken Language Understanding | A11-P1-B Spoken Dialogue Systems and Conversational Analysis 2 | A7-P3-A Speech Synthesis: Paradigms and Methods 1 | A7-P3-B Speech Synthesis: Paradigms and Methods 2 | SS3 Computational Models of Human Language Acquisition, Perception, and Production (Special Session) | ST3 Show and Tell 3 |
15:30 – 16:00 | Break | ||||||||||||||||
16:00 – 18:00 | A8-O6 ASR Model Training Methods | A13-O3 Dysarthric Speech Assessment | A5-O7 Speech and Audio Analysis | A6-O5 Speech Quality and Intelligibility: Prediction and Enhancement | A10-O3 Speech Assessment | A3-O4 New Avenues in Emotion Recognition | A7-O6 Speech Synthesis: Vocoders | A4-P4-A Speaker Diarization 2 | A4-P4-B Speaker Recognition 2 | A9-P2 Cross-Lingual and Multilingual Processing | A11-P2-A Question Answering from Speech and Spoken Dialogue Systems | A11-P2-B Spoken Dialogue Systems and Conversational Analysis 3 | A2-P2-A Phonetics, Phonology and Prosody | A2-P2-B Segmentals | SS6 Spoken Language Models for Universal Speech Processing (Special Session) | ||
19:30 – 23:00 | Conference Banquet at the Kipriotis Panorama's Pool |
Thursday 5 September 2024
Time / Session | Hippocrates | Panacea Amphitheater | Acesso | Aegle A | Aegle B | Iasso | Melambus | Corridor - Hippocrates right side | Corridor - Hippocrates left side | Yanis Club | Exhibition | ||||||
Poster Area 1A | Poster Area 1B | Poster Area 2A | Poster Area 2B | Poster Area 3A | Poster Area 3B | Poster Area 4A | Poster Area 4B | ||||||||||
8:00 – 16:30 | Registration at KICC First Floor (Level 1) | ||||||||||||||||
8:30 – 9:30 | Keynote 4 (Hippocrates) - Barbara Tillmann "Perception of Music and Speech: Focus on Rhythm Processing" | ||||||||||||||||
9:30 – 10:00 | Break | ||||||||||||||||
10:00 – 12:00 | A2-O5 Experimental Phonetics and Laboratory Phonology | A5-O8 Speech Type Classification | SS4 Leveraging Large Language Models and Contextual Features for Phonetic Analysis (Special Session) | A7-O7 Privacy and Security in Speech Communication 2 | A8-O7 Streaming ASR | A4-O6 Speaker Recognition Evaluation and Resources | A6-O6 Target Speaker Extraction | A1-P2-A L1/L2 Acquisition and Cross-Linguistic Factors | A1-P2-B Speaker Stance, Emotion and Language External Factors | A9-P3 Computational Resource Constrained ASR | A12-P3-A Evaluation of Speech Technology Systems | A12-P3-B Neural Network Training for Speech Recognition | A7-P4-A Speech Synthesis: Voice Conversion 3 | A7-P4-B Speech Synthesis: Paradigms and Methods 3 | SS7 Responsible Speech Foundation Models (Special Session) | ||
12:00 – 13:30 | Break | ||||||||||||||||
13:30 – 15:30 | A13-O4 Speech Disorders 1 | A5-O9 Fake Audio Detection | A12-O3 Spoken Term Detection and Speech Retrieval | A7-O8 Speech Synthesis: Cross-Lingual and Multilingual Aspects | A8-O8 Self-Supervised Learning for ASR | A4-O7 Self and Weakly-Labelled Speaker Verification | A6-O7 Deep Learning Based Speech Enhancement: Approaches, Scalability, and Evaluation | A3-P3-A Multimodal Paralinguistics | A3-P3-B Automatic Emotion Recognition | A5-P4-A Acoustic Event Detection, Segmentation and Classification | A5-P4-B Speech and Audio Modelling | A7-P5-A Speech Synthesis: Other Topics 1 | A7-P5-B Speech Synthesis: Other Topics 2 | A8-P5 Noise, Far-Field, Multi-Talker, Enhancement, Audio Classification | SS8 Connecting Speech-science and Speech-technology for Children’s Speech (Special Session) | ST4 Show and Tell 4 | |
15:30 – 16:00 | Break | ||||||||||||||||
16:00 – 17:00 | Closing Ceremony (Hippocrates) |