The Spanish Thematic Network on Speech Technology (RTTH) and the ISCA Special Interest Group on Iberian Languages (SIG-IL) are pleased to announce the ALBAYZIN 2016 Evaluation Campaign, which will be part of the program of the IberSPEECH 2016 conference. Research groups worldwide are invited to participate in the following evaluations:
ALBAYZIN 2016 SEARCH ON SPEECH EVALUATION
The ALBAYZIN 2016 Search on Speech evaluation involves searching audio content for a list of terms/queries. The evaluation focuses on retrieving the audio files that contain any of those terms/queries. Two different tasks are defined:
- SPOKEN TERM DETECTION (STD), where the input to the system is a list of terms that is not known when the audio is processed. This is the same task as in the NIST STD 2006 evaluation [2] and the Open Keyword Search evaluations in 2013 [3], 2014 [4], 2015 [5], and 2016 [6].
- QUERY-BY-EXAMPLE SPOKEN TERM DETECTION (QbE STD), where the input to the system is an acoustic example of each query, so no prior knowledge of the correct word/phone transcription of each query can be assumed. As in the STD task, the system must output a set of occurrences for each query detected in the audio files, along with their timestamps (a sketch of a typical matching approach follows this list). QbE STD is the same task as the one proposed in MediaEval 2011, 2012, and 2013 [1].
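As an illustration of the QbE STD setting, the sketch below matches a spoken query example against a longer audio document with subsequence dynamic time warping (DTW), a common zero-resource approach to this task. It is not taken from the evaluation plan: it assumes both signals are already encoded as per-frame feature matrices (e.g., MFCCs), and all function names and parameters are illustrative.

    import numpy as np

    def subsequence_dtw(query: np.ndarray, utterance: np.ndarray):
        """Best match of `query` inside `utterance` via subsequence DTW.

        query:     (Q, D) feature matrix of the spoken query example.
        utterance: (U, D) feature matrix of the audio document to search.
        Returns (normalized_cost, start_frame, end_frame).
        """
        Q, U = len(query), len(utterance)
        # Cosine distance between every query/utterance frame pair.
        qn = query / (np.linalg.norm(query, axis=1, keepdims=True) + 1e-9)
        un = utterance / (np.linalg.norm(utterance, axis=1, keepdims=True) + 1e-9)
        dist = 1.0 - qn @ un.T                    # shape (Q, U)

        acc = np.full((Q, U), np.inf)             # accumulated path cost
        start = np.zeros((Q, U), dtype=int)       # start frame of the best path
        acc[0, :] = dist[0, :]                    # the match may start anywhere
        start[0, :] = np.arange(U)
        for i in range(1, Q):
            for j in range(U):
                # Candidate predecessors: vertical, horizontal, diagonal steps.
                cands = [(acc[i - 1, j], start[i - 1, j])]
                if j > 0:
                    cands.append((acc[i, j - 1], start[i, j - 1]))
                    cands.append((acc[i - 1, j - 1], start[i - 1, j - 1]))
                best_cost, best_start = min(cands)
                acc[i, j] = dist[i, j] + best_cost
                start[i, j] = best_start

        end = int(np.argmin(acc[-1, :]))          # ... and end anywhere
        return float(acc[-1, end] / Q), int(start[-1, end]), end

With, say, 10 ms feature frames, the returned start/end frames map directly to timestamps, and sweeping a decision threshold over the length-normalized cost yields the per-query occurrence lists that both tasks require.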
The full evaluation plan can be found here.
Registration
Interested groups must register for the evaluation before July 15, 2016, by contacting the organizing team at javiertejedornoguerales@gmail.com, with CC to the ALBAYZIN 2016 Evaluations Organising Committee. The registration e-mail should include the following information:
- Research group (name and acronym)
- Institution (university, research center, etc.)
- Contact person (name)
Schedule
- June 30, 2016: Release of training and development data
- July 15, 2016: Registration deadline
- September 15, 2016: Release of evaluation data
- October 15, 2016: Deadline for the submission of system outputs and description papers
- October 31, 2016: Results distributed to participants
- November 23-25, 2016: Evaluation Workshop at IberSPEECH 2016
References
[1] Metze, F., Anguera, X., Barnard, E., Davel, M., Gravier, G.: Language independent search in MediaEval's Spoken Web Search task. Computer Speech and Language (2014).
[2] NIST: The spoken term detection (STD) 2006 evaluation plan. National Institute of Standards and Technology (NIST), Gaithersburg, MD, USA, 10 edn. (September 2006).
[3] NIST: NIST Open Keyword Search 2013 Evaluation (OpenKWS13). National Institute of Standards and Technology (NIST), Washington DC, USA, 1 edn. (July 2013).
[4] NIST: NIST Open Keyword Search 2014 Evaluation (OpenKWS14). National Institute of Standards and Technology (NIST), Washington DC, USA, 1 edn. (July 2014).
[5] NIST: NIST Open Keyword Search 2015 Evaluation (OpenKWS15). National Institute of Standards and Technology (NIST), Washington DC, USA, 1 edn. (July 2015).
[6] NIST: NIST Open Keyword Search 2016 Evaluation (OpenKWS16). National Institute of Standards and Technology (NIST), Washington DC, USA, 1 edn. (July 2016).
ALBAYZIN 2016 SPEAKER DIARIZATION EVALUATION
The ALBAYZIN 2016 Speaker Diarization evaluation consists of segmenting broadcast audio documents according to speaker and linking the segments that originate from the same speaker. For this evaluation, the database donated by the Corporación Aragonesa de Radio y Televisión (CARTV) will be used: around four hours of the Aragón Radio database for development and another sixteen hours for testing. For training, the Catalan broadcast news database from the 3/24 TV channel, proposed for the 2010 Albayzin Audio Segmentation Evaluation [1,2], is provided. This database was recorded by the TALP Research Center of the UPC in 2009 under the Tecnoparla project [3], funded by the Generalitat de Catalunya.
No a priori knowledge is provided about the number or the identity of the speakers in the audio to be analysed. In the provided data (train and test), the presence of noise, music, and speech will be annotated. The Diarization Error Rate (DER), as defined in the Rich Transcription (RT) evaluations organized by NIST [4], will be used as the scoring metric. Two conditions are proposed this year: a closed-set condition, in which only the data provided within this Albayzin evaluation may be used for training, and an open-set condition, in which external training data may also be used as long as it is publicly accessible to everyone (not necessarily free of charge). Participants may submit systems to either or both conditions independently.
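To make the metric concrete, the following is a minimal, frame-based sketch of how DER aggregates missed speech, false-alarm speech, and speaker-confusion time relative to the total reference speech time. It deliberately simplifies the official NIST RT scoring: the forgiveness collar around reference boundaries and overlapped speech are ignored, and the optimal one-to-one speaker mapping is replaced by a greedy one. All names are illustrative.

    from collections import Counter

    def frame_der(reference, hypothesis, step=0.01):
        """Approximate DER from (start_sec, end_sec, speaker) segment lists."""
        def label_at(segments, t):
            for seg_start, seg_end, spk in segments:
                if seg_start <= t < seg_end:
                    return spk
            return None                              # non-speech

        # Sample both annotations on a common frame grid.
        horizon = max(end for _, end, _ in reference + hypothesis)
        frames = []
        t = 0.0
        while t < horizon:
            frames.append((label_at(reference, t), label_at(hypothesis, t)))
            t += step

        # Greedy speaker mapping by co-occurrence (NIST computes an optimal
        # one-to-one mapping; greedy is a simplification).
        pairs = Counter((r, h) for r, h in frames
                        if r is not None and h is not None)
        mapping, used_ref, used_hyp = {}, set(), set()
        for (r, h), _ in pairs.most_common():
            if r not in used_ref and h not in used_hyp:
                mapping[r] = h
                used_ref.add(r)
                used_hyp.add(h)

        miss = false_alarm = confusion = speech = 0
        for r, h in frames:
            if r is not None:
                speech += 1
                if h is None:
                    miss += 1                        # reference speech not covered
                elif mapping.get(r) != h:
                    confusion += 1                   # wrong speaker label
            elif h is not None:
                false_alarm += 1                     # hypothesized speech in non-speech
        return (miss + false_alarm + confusion) / max(speech, 1)

For example, frame_der([(0.0, 5.0, "A"), (5.0, 10.0, "B")], [(0.0, 6.0, "spk1"), (6.0, 10.0, "spk2")]) maps A to spk1 and B to spk2, counts the second between 5.0 and 6.0 as speaker confusion, and returns roughly 0.1.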
The full evaluation plan can be found here.
Registration
Interested groups must register for the evaluation before July 15, 2016, by contacting the organizing team at ortega@unizar.es, with CC to the ALBAYZIN 2016 Evaluations Organising Committee. The registration e-mail should include the following information:
- Research group (name and acronym)
- Institution (university, research center, etc.)
- Contact person (name)
Schedule
- June 15, 2016: Release of training data
- July 15, 2016: Registration deadline
- September 15, 2016: Release of evaluation data
- October 15, 2016: Deadline for the submission of system outputs and description papers
- October 31, 2016: Results distributed to participants
- November 23-25, 2016: Evaluation Workshop at IberSPEECH 2016
References
[1] Zelenák, M., Schulz, H., Hernando, J.: Albayzin 2010 Evaluation Campaign: Speaker Diarization. VI Jornadas en Tecnología del Habla, FALA 2010, Vigo, November 2010.
[2] Zelenák, M., Schulz, H., Hernando, J.: Speaker diarization of broadcast news in Albayzin 2010 evaluation campaign. EURASIP Journal on Audio, Speech, and Music Processing, December 2012.
[3] Tecnoparla Project, accessed on June 2, 2016.
[4] The 2009 (RT-09) Rich Transcription Meeting Recognition Evaluation Plan, accessed on June 2, 2016.