Program at a glance

The following program is not final; minor alterations may still be made.

Wednesday, November 23

ALBAYZIN Evaluations

Session Chair: Luis Javier Rodríguez-Fuentes

  • E1.1: GTM-UVigo System for Albayzin 2016 Speaker Diarisation Evaluation – Paula Lopez-Otero, Laura Docio-Fernandez, Carmen Garcia-Mateo
  • E1.2: Aholab Speaker Diarization System for Albayzin 2016 Evaluation Campaign – David Tavárez, Xabier Sarasola, Eva Navas, Luis Serrano, Agustin Alonso, Ibon Saratxaga, Inma Hernaez
  • E1.3: EURECOM submission to the Albayzin 2016 Speaker Diarization Evaluation – Jose Patino, Héctor Delgado, Nicholas Evans, Xavier Anguera
  • E1.4: ATVS-UAM System Description for the Albayzin 2016 Speaker Diarization Evaluation – Pablo Ramirez Hereza, Javier Franco-Pedroso, Joaquin Gonzalez-Rodriguez
  • E2.1: Aholab system for Albayzin 2016 Search-on-Speech Evaluation – Luis Serrano, David Tavárez, Igor Odriozola, Inma Hernaez, Ibon Saratxaga
  • E2.2: The SPL-IT-UC QbESTD systems for Albayzin 2016 Search on Speech – Jorge Proença, Fernando Perdigão
  • E2.3: The ATVS-FOCUS STD System for ALBAYZIN 2016 Search-on-Speech Evaluation – María Pilar Fernández-Gallego, Doroteo T. Toledano, Javier Tejedor
  • E2.4: GTH-UPM System for Albayzin 2016 Search on Speech Evaluation – Alejandro Coucheiro-Limeres, Javier Ferreiros-López
  • E2.5: GTM-UVigo Systems for Albayzin 2016 Search on Speech Evaluation – Paula Lopez-Otero, Laura Docio-Fernandez, Carmen Garcia-Mateo
  • E2.6: The ViVoLab-I3A-UZ System for Albayzin 2016 Search on Speech Evaluation – Julia Olcoz, Jorge Llombart, Antonio Miguel, Alfonso Ortega, Eduardo Lleida
  • E2.7: The ELiRF Query-by-Example STD systems for the Albayzin 2016 Search on Speech Evaluation – Sergio Laguna, Emilio Sanchis, Lluís-F. Hurtado, Fernando García
  • E2.8: The L2F Query-by-Example Spoken Term Detection system for the ALBAYZIN 2016 – Anna Pompili, Alberto Abad

P1 – Speech Processing in Different Application Fields

Session Chair: Antonio Peinado

  • P1.1: Towards aural saliency detection with logarithmic Bayesian Surprise under different spectro-temporal representations – Antonio Rodriguez-Hidalgo, Ascensión Gallardo-Antolín and Carmen Peláez-Moreno
  • P1.2: Phone-gram units in RNN-LM for language identification with vocabulary reduction based on neural embeddings – Christian Salamea, Luis Fernando D’Haro, Ricardo Córdoba and Juan Montero
  • P1.3: Articulatory-based Audiovisual Speech Synthesis: Proof of Concept for European Portuguese – Samuel Silva, António Teixeira and Verónica Orvalho
  • P1.4: Generating Storytelling Suspense from Neutral Speech using a Hybrid TTS Synthesis framework driven by a Rule-based Prosodic Model – Raúl Montaño, Marc Freixes, Francesc Alías and Joan Claudi Socoró
  • P1.5: Automatic Text-to-Audio Alignment of Multimedia Broadcast Content – Julia Olcoz, Pablo Gimeno, Alfonso Ortega, Adolfo Arguedas, Antonio Miguel and Eduardo Lleida
  • P1.6: Some ASR experiments using Deep Neural Networks on Spanish databases – M. Inés Torres, Asier López Zorrilla, Nazim Dugan, Neil Glackin, Gerard Chollet and Nigel Cannings
  • P1.7: Phrase Verification on the RSR2015 Corpus – Álvaro Mesa-Castellanos, María Pilar Fernández-Gallego, Alicia Lozano-Diez and Doroteo T. Toledano
  • P1.8: SecuVoice: A Spanish Speech Corpus for Secure Applications with Smartphones – Juan M. Martín-Doñas, Iván López-Espejo, Carlos R. González-Lao, David Gallardo-Jiménez, Angel M. Gomez, José L. Pérez-Córdoba, Victoria Sánchez, Juan A. Morales-Cordovilla and Antonio M. Peinado
  • P1.9: Improving L2 Production with a Gamified Computer-Assisted Pronunciation Training Tool, TipTopTalk! – Cristian Tejedor-García, David Escudero-Mancebo, César González-Ferreras, Enrique Cámara-Arenas and Valentín Cardeñoso-Payo
  • P1.10: Application of the Kaldi toolkit for continuous speech recognition using Hidden-Markov Models and Deep Neural Networks – Simon Guiroy, Ricardo de Cordoba and Amelia Villegas
  • P1.11: Rich Transcription and Automatic Subtitling for Basque and Spanish – Aitor Álvarez, Haritz Arzelus, Santiago Prieto and Arantza Del Pozo

O1 – Speech Processing

Session Chair: Fernando Perdigão

  • O1.1: A Dynamic FEC for Improved Robustness of CELP-Based Codec – Nadir Benamirouche, Bachir Boudraa, Angel M. Gomez, José Luis Pérez Córdoba and Iván López-Espejo
  • O1.2: A novel error mitigation scheme based on replacement vectors and FEC codes for speech recovery in loss-prone channels – Domingo López-Oller, Ángel Gómez García and José Luis Pérez-Córdoba
  • O1.3: Automatic speech feature learning for continuous prediction of customer satisfaction in contact center phone calls – Carlos Segura, Jordi Luque Serrano, Martí Umbert Morist, Daniel Balcells Eichenberger and Javier Arias Losada
  • O1.4: Evaluating different non-native pronunciation scoring metrics with the Japanese speakers of the SAMPLE Corpus – Vandria Álvarez Álvarez, David Escudero Mancebo, César González-Ferreras and Valentín Cardeñoso-Payo
  • O1.5: Reversible speech de-identification using parametric transformations and watermarking – Aitor Valdivielso, Daniel Erro and Inma Hernaez
  • O1.6: Bottleneck Based Front-end for Diarization Systems – Ignacio Viñals, Jesús Villalba, Alfonso Ortega, Antonio Miguel and Eduardo Lleida

Thursday, November 24

O2 – Speaker Paralinguistic Characterisation

Session Chair: Inma Hernaez

  • O2.1: Acoustic Analysis of Anomalous Use of Prosodic Features in a Corpus of People with Intellectual Disability – Mario Corrales Astorgano, David Escudero Mancebo and César González Ferreras
  • O2.2: Automatic Annotation of Disfluent Speech in Children’s Reading Tasks – Jorge Proença, Dirce Celorico, Carla Lopes, Sara Candeias and Fernando Perdigão
  • O2.3: Detecting psychological distress in adults through transcriptions of clinical interviews – Joana Correia, Isabel Trancoso and Bhiksha Raj
  • O2.4: Acoustic-Prosodic Automatic Personality Trait Assessment for Adults and Children – Rubén Solera-Ureña, Helena Moniz, Fernando Batista, Ramón Fernández-Astudillo, Joana Campos, Ana Paiva and Isabel Trancoso
  • O2.5: Automatic Detection of Hyperarticulated Speech – Eugénio Ribeiro, Fernando Batista, Isabel Trancoso, Ricardo Ribeiro and David Martins de Matos

P2 – Natural Language Processing in Different Application Fields

Session Chair: Carlos D. Martínez Hinarejos

  • P2.1: A train-on-target strategy for Multilingual Spoken Language Understanding – Fernando Garcia-Granada, Encarna Segarra, Carlos Millán, Emilio Sanchis and Lluís-F. Hurtado
  • P2.2: Making better use of data selection methods – Mara Chinea Rios, Germán Sanchis-Trilles and Francisco Casacuberta
  • P2.3: Dialogue Act Annotation of a Multiparty Meeting Corpus with Discriminative Models – Rosa-M. Giménez-Pérez, Iván Sánchez-Padilla and Carlos-D. Martinez-Hinarejos
  • P2.4: Comparing rule-based and statistical methods in automatic subtitle segmentation for Basque and Spanish – Aitor Alvarez, Carlos-D. Martínez-Hinarejos and Haritz Arzelus
  • P2.5: Do Word Embeddings Capture Sarcasm in Online Dialogues? – Unai Unda and Raquel Justo
  • P2.6: From Web to Persons – Providing Useful Information on Hotels Combining Information Extraction and Natural Language Generation – António Teixeira, Pedro Miguel, Mário Rodrigues, José Casimiro Pereira and Marlene Amorim
  • P2.7: Character Sequence to Sequence Applications: Subtitle Segmentation and Part-of-Speech Tagging – Jorge Llombart, Antonio Miguel, Eduardo Lleida and Alfonso Ortega
  • P2.8: Towards Integration of Fusion in a W3C-based Multimodal Interaction Framework: Fusion of Events – Nuno Almeida, António Teixeira, Samuel Silva and João Freitas

O3 – Speech Synthesis

Session Chair: Antonio Bonafonte

  • O3.1: Surgery of Speech Synthesis Models to Overcome the Scarcity of Training Data – Arnaud Pierard, Daniel Erro, Inma Hernaez, Eva Navas and Thierry Dutoit
  • O3.2: Language-Independent Acoustic Cloning of HTS Voices: an Objective Evaluation – Carmen Magariños, Daniel Erro, Paula Lopez-Otero and Eduardo R. Banga
  • O3.3: Objective comparison of four GMM-based methods for PMA-to-speech conversion – Daniel Erro, Inma Hernaez, Luis Serrano, Ibon Saratxaga and Eva Navas
  • O3.4: Study of the effect of reducing training data in speech synthesis adaptation based on Frequency Warping – Agustin Alonso, Daniel Erro, Eva Navas and Inma Hernaez
  • O3.5: Prosodic Break Prediction with RNNs – Santiago Pascual and Antonio Bonafonte
  • O3.6: Adding singing capabilities to Unit Selection TTS through HNM-based conversion – Marc Freixes, Joan Claudi Socoró and Francesc Alías

Special Session: Projects, Demos & PhDs

Session Chairs: Doroteo T. Toledano & Xavier Anguera

  • S1.1: Context, multimodality, and user collaboration in handwritten text processing: the CoMUN-HaT project – Carlos David Martinez Hinarejos, Josep Lladós, Alicia Fornés, Francisco Casacuberta, Lluis de Las Heras, Joan Mas, Moisés Pastor, Oriol Ramos, Joan Andreu Sánchez, Enrique Vidal and Fernando Vilariño
  • S1.2: Multi-style Text-to-Speech using Recurrent Neural Networks for Chilean Spanish – Pilar Oplustil Gallegos
  • S1.3: Computer Assisted Pronunciation Training of Spanish as Second Language with a Social Videogame – David Escudero, Valentín Cardeñoso-Payo, Eva Estebas Vilaplana, César González-Ferreras, Lourdes Aguilar, Valle Flórez-Lucas, Joaquim Llisterri-Boix, Mario Carranza, María Machuca and Antonio Rios Mestre
  • S1.4: A graphic adventure video game to develop pragmatic and prosodic skills in individuals affected by Down Syndrome – Lourdes Aguilar, Ferrán Adell, Valentín Cardeñoso-Payo, David Escudero, César González-Ferreras, Valle Flórez-Lucas, Mario Corrales Astorgano and Pastora Martínez Castilla
  • S1.5: CrowdSience: Crowdsourcing for research and development – Raquel Justo, José M Alcaide and M. Inés Torres
  • S2.1: Read4SpeechExperiments: A Tool for Speech Acquisition from Mobile Devices – Emilio Granell and Carlos David Martinez Hinarejos
  • S2.2: The Magic Stone: a video game for training language skills of people with Down syndrome – Mario Corrales Astorgano, David Escudero Mancebo, César González Ferreras, Valentín Cardeñoso Payo, Yurena Gutiérrez González, Valle Flores Lucas, Lourdes Aguilar Cuevas and Patricia Sinobas
  • S2.3: TipTopTalk! Mobile application for speech training using minimal pairs and gamification – Cristian Tejedor-García, David Escudero, César González-Ferreras, Enrique Cámara-Arenas and Valentín Cardeñoso-Payo
  • S2.4: LetsRead demo – Automatic Evaluation of Children’s Reading Aloud Performance – Jorge Proença, Carla Lopes, Sara Candeias and Fernando Perdigão
  • S2.5: ELSA: English Language Speech Assistant – Xavier Anguera and Vu Van
  • S3.1: Different Contributions to Cost-Effective Transcription and Translation of Video Lectures – Joan Albert Silvestre-Cerdà, Alfons Juan and Jorge Civera
  • S3.2: Advances on Speaker Recognition in non Collaborative Environments – Jesus Antonio Villalba Lopez and Eduardo Lleida
  • S3.3: Use of the harmonic phase in synthetic speech detection – Jon Sanchez, Inma Hernaez and Ibon Saratxaga
  • S3.4: Non-negative Matrix Factorization Applications to Speech Technologies – Jimmy Diestin Ludeña Choez and Ascensión Gallardo Antolín
  • S3.5: A Strategy for Multilingual Spoken Language Understanding Based on Graphs of Linguistic Units – Marcos Calvo, Fernando Garcia Granada and Emilio Sanchis
  • S3.6: Numerical production of vowels and diphthongs using finite element methods – Marc Arnela

Friday, November 25

O4 – Speech Recognition

Session Chair: Carmen García Mateo

  • O4.1: Deep Neural Network-Based Noise Estimation for Robust ASR in Dual-Microphone Smartphones – Iván López-Espejo, Antonio M. Peinado, Angel M. Gomez and Juan M. Martín-Doñas
  • O4.2: Automatic Speech Recognition with Deep Neural Networks for Impaired Speech – Cristina España-Bonet and José A. R. Fonollosa
  • O4.3: An Analysis of Deep Neural Networks in Broad Phonetic Classes for Noisy Speech Recognition – Fernando de La Calle Silos, Ascensión Gallardo Antolín and Carmen Peláez Moreno
  • O4.4: Detection of publicity mentions in broadcast radio: preliminary results – María Pilar Fernández-Gallego, Álvaro Mesa-Castellanos, Alicia Lozano-Díez and Doroteo T. Toledano
  • O4.5: Better Phoneme Recognisers Lead to Better Phoneme Posteriorgrams for Search on Speech? An Experimental Analysis – Paula Lopez-Otero, Laura Docio-Fernandez and Carmen Garcia-Mateo

O5 – Speech and Natural Language Processing Applications

Session Chair: António Teixeira

  • O5.1: Crowdsourced Video Subtitling with Adaptation based on User-Corrected Lattices – João Miranda, Ramon Astudillo, Ângela Costa, André Silva, Hugo Silva, João Graça and Bhiksha Raj
  • O5.2: Collaborator Effort Optimisation in Multimodal Crowdsourcing for Transcribing Historical Manuscripts – Emilio Granell and Carlos David Martinez Hinarejos
  • O5.3: Global analysis of entrainment in dialogues – Vera Cabarrão, Isabel Trancoso, Ana Isabel Mata, Helena Moniz and Fernando Batista
  • O5.4: Assessing User Expertise in Spoken Dialog System Interactions – Eugénio Ribeiro, Fernando Batista, Isabel Trancoso, José Lopes, Ricardo Ribeiro and David Martins de Matos