An Approach to the Main Task of QA4MRE-2013

Bibliographic details
Main author: Santos, Marília
Publication date: 2013
Other authors: Saias, José, Quaresma, Paulo
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10174/10323
Abstract: This article describes the participation of a group from the University of Évora in the CLEF 2013 QA4MRE main task. Our system relies on shallow text analysis. The methodology starts by preprocessing the background collection documents, whose texts are lemmatized and then indexed. Named entities and numerical expressions are identified in the questions and their candidate answers; the lemmatizer is then applied and stop words are removed. An answer pattern is formed for each question+answer pair, together with a search query for document retrieval. The original search terms are expanded with synonyms and hypernyms. Finally, the texts retrieved for each candidate answer are segmented and scored for answer selection. Considering only the main questions, the system's best result was obtained in the third run, which answered 206 questions, 51 of them correctly, for a c@1 of 0.24. When main and auxiliary questions are evaluated together, the final run again gave our best results, answering 245 questions, 64 of them correctly, for a c@1 of 0.26. The use of hypernyms proved to be an improvement factor in the third run, whose results showed a 12% increase in correct answers and a 0.02 gain in c@1.
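The c@1 figures quoted in the abstract can be illustrated with a short sketch. It assumes the usual definition of c@1 used in the QA4MRE evaluations, c@1 = (n_R + n_U * n_R / n) / n, where n_R is the number of correct answers, n_U the number of unanswered questions and n the total number of questions. The totals used below (240 main questions, 284 main plus auxiliary questions) are not stated in this record and are an assumption; the function name `c_at_1` is hypothetical.

```python
def c_at_1(n_correct: int, n_answered: int, n_total: int) -> float:
    """Compute c@1, assuming the definition (n_R + n_U * n_R / n) / n.

    Unanswered questions earn partial credit proportional to the
    accuracy observed over the whole test set.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_correct <= n_answered <= n_total:
        raise ValueError("expected 0 <= n_correct <= n_answered <= n_total")
    n_unanswered = n_total - n_answered
    return (n_correct + n_unanswered * n_correct / n_total) / n_total


if __name__ == "__main__":
    # Answered/correct counts come from the abstract; the totals
    # (240 and 284) are assumptions, not values given in this record.
    print(f"main only:        {c_at_1(51, 206, 240):.2f}")  # prints 0.24
    print(f"main + auxiliary: {c_at_1(64, 245, 284):.2f}")  # prints 0.26
```

Under those assumed totals the sketch reproduces the rounded values reported in the abstract, which suggests the assumption is at least consistent with the published results.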
id RCAP_61cf0d4d47b2d350b270a9dac36b6506
oai_identifier_str oai:dspace.uevora.pt:10174/10323
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv An Approach to the Main Task of QA4MRE-2013
title An Approach to the Main Task of QA4MRE-2013
author Santos, Marília
author_facet Santos, Marília
Saias, José
Quaresma, Paulo
author_role author
author2 Saias, José
Quaresma, Paulo
author2_role author
author
dc.contributor.author.fl_str_mv Santos, Marília
Saias, José
Quaresma, Paulo
dc.subject.por.fl_str_mv Question Answering
NLP
Machine Reading
topic Question Answering
NLP
Machine Reading
publishDate 2013
dc.date.none.fl_str_mv 2013-09-01T00:00:00Z
2014-01-29T16:49:05Z
2014-01-29
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10174/10323
url http://hdl.handle.net/10174/10323
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Marília Santos, José Saias, and Paulo Quaresma. An Approach to the Main Task of QA4MRE-2013. In Pamela Forner, Roberto Navigli, and Dan Tufis, editors, CLEF 2013 Evaluation Labs and Workshop, Online Working Notes: Question Answering for Machine Reading Evaluation (QA4MRE), Valencia, Spain, September 2013. ISBN 978-88-904810-5-5.
http://www.clef-initiative.eu/documents/71612/75d1893f-1714-487d-a641-353ec8d86063
m9210@alunos.uevora.pt
jsaias@uevora.pt
pq@uevora.pt
283
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv clef2013.org
publisher.none.fl_str_mv clef2013.org
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799136526371651584