An Approach to the Main Task of QA4MRE-2013
Main author: | Santos, Marília |
---|---|
Publication date: | 2013 |
Other authors: | Saias, José; Quaresma, Paulo |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10174/10323 |
Abstract: | This article describes the participation of a group from the University of Évora in the CLEF 2013 QA4MRE main task. Our system takes an approach based on superficial text analysis. The methodology starts by preprocessing the background collection documents, whose texts are lemmatized and then indexed. Named entities and numerical expressions are identified in the questions and their candidate answers; the lemmatizer is then applied and stop words are removed. An answer pattern is formed for each question+answer pair, together with a search query for document retrieval. The original search terms are expanded with synonyms and hypernyms. Finally, the texts retrieved for each candidate answer are segmented and scored for answer selection. Considering only the main questions, the system's best result was obtained in the third run, which answered 206 questions, 51 of them correctly, for a c@1 of 0.24. When main and auxiliary questions are evaluated together, the final run again gave our best results: 245 questions answered, 64 of them correctly, for a c@1 of 0.26. The use of hypernyms proved to be an improvement factor in the third run, whose results showed a 12% increase in correct answers and a 0.02 gain in c@1. |
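
The c@1 figures reported in the abstract can be checked with a short calculation. The sketch below uses the usual definition of c@1, which credits unanswered questions in proportion to the accuracy achieved on the whole test set: c@1 = (n_R + n_U · n_R / n) / n, where n_R is the number of correct answers, n_U the number of unanswered questions and n the total number of questions. The record itself does not state the test-set sizes; the totals of 240 main questions and 284 main plus auxiliary questions are assumptions made here for illustration, and the function name `c_at_1` is likewise hypothetical.

```python
def c_at_1(n_correct: int, n_answered: int, n_total: int) -> float:
    """Return c@1 = (n_R + n_U * n_R / n) / n.

    n_R = correct answers, n_U = unanswered questions, n = all questions.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_correct <= n_answered <= n_total:
        raise ValueError("expected 0 <= n_correct <= n_answered <= n_total")
    n_unanswered = n_total - n_answered
    return (n_correct + n_unanswered * n_correct / n_total) / n_total


# ASSUMPTION: test-set sizes are not given in this record.
ASSUMED_MAIN_TOTAL = 240  # main questions only
ASSUMED_ALL_TOTAL = 284   # main + auxiliary questions

# Answered/correct counts are the ones reported in the abstract.
main_score = c_at_1(n_correct=51, n_answered=206, n_total=ASSUMED_MAIN_TOTAL)
all_score = c_at_1(n_correct=64, n_answered=245, n_total=ASSUMED_ALL_TOTAL)

print(f"main questions only: c@1 = {main_score:.2f}")
print(f"main + auxiliary:    c@1 = {all_score:.2f}")
```

Under these assumed totals the rounded values agree with the 0.24 and 0.26 given in the abstract. When every question is answered, c@1 reduces to plain accuracy.
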
id |
RCAP_61cf0d4d47b2d350b270a9dac36b6506 |
---|---|
oai_identifier_str |
oai:dspace.uevora.pt:10174/10323 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
An Approach to the Main Task of QA4MRE-2013 |
author |
Santos, Marília |
author_facet |
Santos, Marília Saias, José Quaresma, Paulo |
author_role |
author |
author2 |
Saias, José Quaresma, Paulo |
author2_role |
author author |
dc.contributor.author.fl_str_mv |
Santos, Marília Saias, José Quaresma, Paulo |
dc.subject.por.fl_str_mv |
Question Answering NLP Machine Reading |
topic |
Question Answering NLP Machine Reading |
publishDate |
2013 |
dc.date.none.fl_str_mv |
2013-09-01T00:00:00Z 2014-01-29T16:49:05Z 2014-01-29 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10174/10323 http://hdl.handle.net/10174/10323 |
url |
http://hdl.handle.net/10174/10323 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Marilia Santos, Jose Saias, and Paulo Quaresma. An approach to the main task of qa4mre-2013. In Pamela Forner, Roberto Navigli, and Dan Tufis, editors, CLEF 2013 Evaluation Labs and Workshop Online Working Notes - Question Answering for Machine Reading Evaluation (QA4MRE), Valencia, Spain, September 2013. ISBN 978-88-904810-5-5. http://www.clef-initiative.eu/documents/71612/75d1893f-1714-487d-a641-353ec8d86063 m9210@alunos.uevora.pt jsaias@uevora.pt pq@uevora.pt 283 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
clef2013.org |
publisher.none.fl_str_mv |
clef2013.org |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799136526371651584 |