Improving author name disambiguation with user relevance feedback.
Main author: | Ferreira, Anderson Almeida |
---|---|
Publication date: | 2012 |
Other authors: | Machado, Tales Mota; Gonçalves, Marcos André |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UFOP |
Full text: | http://www.repositorio.ufop.br/handle/123456789/4334 |
Abstract: | Author name ambiguity in the context of bibliographic citations is a very hard problem. It occurs when there are citation records of the same author under distinct names or when there exist citation records belonging to distinct authors with very similar names. Among the several methods proposed in the literature, the most effective ones are those that perform a direct assignment of the records to their respective authors by means of the application of supervised machine learning techniques. However, those methods usually need large amounts of labeled training examples to properly disambiguate the author names. To deal with this issue, in previous work, we have proposed a method that automatically obtains and labels the training examples, showing competitive performance compared to representative author name disambiguation methods. In this work, we propose to improve our previous method by exploiting user relevance feedback. In more detail, we select a very small portion of the citation records for which our method was most unsure about the correct authorship and ask the administrators to label them. This feedback is then used to improve the effectiveness of the whole process. In our experimental evaluation, we observed that with a very small labeling effort (usually around 5% of the records), the overall disambiguation effectiveness improves by almost 10% on average, with gains of up to 61% in some of the largest ambiguous groups. |
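
The abstract describes selecting a small share of the records about which the method was most unsure and asking administrators to label them. The record does not say how that uncertainty is measured, so the following is only a minimal sketch under stated assumptions: each record carries a hypothetical score per candidate author, uncertainty is taken to be the margin between the two best scores, and the function names `select_for_feedback` and `apply_feedback` are illustrative, not taken from the paper.

```python
from typing import Dict, List


def select_for_feedback(
    confidences: Dict[str, Dict[str, float]],
    fraction: float = 0.05,
) -> List[str]:
    """Return ids of the records whose authorship prediction is least certain.

    `confidences` maps a record id to {candidate author: score}.
    Assumption: uncertainty is the margin between the two highest scores;
    the paper may use a different criterion.
    """

    def margin(scores: Dict[str, float]) -> float:
        ranked = sorted(scores.values(), reverse=True)
        if len(ranked) < 2:
            # A single candidate counts as certain; no candidate as fully uncertain.
            return ranked[0] if ranked else 0.0
        return ranked[0] - ranked[1]

    if not confidences:
        return []
    # Labeling budget: roughly `fraction` of the records, but at least one.
    budget = max(1, round(len(confidences) * fraction))
    by_uncertainty = sorted(confidences, key=lambda rid: margin(confidences[rid]))
    return by_uncertainty[:budget]


def apply_feedback(predicted: Dict[str, str], labels: Dict[str, str]) -> Dict[str, str]:
    """Override predicted authors with the labels supplied by administrators."""
    merged = dict(predicted)
    merged.update(labels)
    return merged


# Hypothetical scores for three citation records and two candidate authors.
example = {
    "r1": {"author_1": 0.95, "author_2": 0.05},
    "r2": {"author_1": 0.52, "author_2": 0.48},
    "r3": {"author_1": 0.80, "author_2": 0.20},
}
to_label = select_for_feedback(example, fraction=0.34)  # picks the closest call, "r2"
```

In a full pipeline the corrected labels would presumably also be fed back as training examples before the remaining records are reassigned; that step is omitted here because the record gives no detail on it.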
id |
UFOP_d6a4a4d9c968b867310cdbab3e8503a5 |
---|---|
oai_identifier_str |
oai:repositorio.ufop.br:123456789/4334 |
network_acronym_str |
UFOP |
network_name_str |
Repositório Institucional da UFOP |
repository_id_str |
3233 |
spelling |
Copyright 2012. Permission to copy without fee all or part of the material printed in JIDM is granted provided that the copies are not made or distributed for commercial advantage, and that notice is given that copying is by permission of the Sociedade Brasileira de Computação. Source: information contained in the article. |
dc.title.none.fl_str_mv |
Improving author name disambiguation with user relevance feedback. |
author |
Ferreira, Anderson Almeida |
author_facet |
Ferreira, Anderson Almeida; Machado, Tales Mota; Gonçalves, Marcos André |
author_role |
author |
author2 |
Machado, Tales Mota; Gonçalves, Marcos André |
author2_role |
author; author |
dc.contributor.author.fl_str_mv |
Ferreira, Anderson Almeida; Machado, Tales Mota; Gonçalves, Marcos André |
dc.subject.por.fl_str_mv |
Bibliographic citation; Digital library; Name disambiguation; Relevance feedback |
topic |
Bibliographic citation; Digital library; Name disambiguation; Relevance feedback |
publishDate |
2012 |
dc.date.none.fl_str_mv |
2012; 2015-01-22T14:56:35Z; 2015-01-22T14:56:35Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
FERREIRA, A. A.; MACHADO, T. M.; GONÇALVES, M. A. Improving author name disambiguation with user relevance feedback. Journal of Information and Data Management - JIDM, v. 3, p. 332-347, 2012. Available at: <https://seer.lcc.ufmg.br/index.php/jidm/article/view/200/135>. Accessed: 21 Jan. 2015. 21787107 http://www.repositorio.ufop.br/handle/123456789/4334 |
identifier_str_mv |
FERREIRA, A. A.; MACHADO, T. M.; GONÇALVES, M. A. Improving author name disambiguation with user relevance feedback. Journal of Information and Data Management - JIDM, v. 3, p. 332-347, 2012. Available at: <https://seer.lcc.ufmg.br/index.php/jidm/article/view/200/135>. Accessed: 21 Jan. 2015. 21787107 |
url |
http://www.repositorio.ufop.br/handle/123456789/4334 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFOP; instname:Universidade Federal de Ouro Preto (UFOP); instacron:UFOP |
instname_str |
Universidade Federal de Ouro Preto (UFOP) |
instacron_str |
UFOP |
institution |
UFOP |
reponame_str |
Repositório Institucional da UFOP |
collection |
Repositório Institucional da UFOP |
repository.name.fl_str_mv |
Repositório Institucional da UFOP - Universidade Federal de Ouro Preto (UFOP) |
repository.mail.fl_str_mv |
repositorio@ufop.edu.br |
_version_ |
1813002843311833088 |