Improving author name disambiguation with user relevance feedback.

Bibliographic details
Main author: Ferreira, Anderson Almeida
Publication date: 2012
Other authors: Machado, Tales Mota; Gonçalves, Marcos André
Document type: Article
Language: eng
Source title: Repositório Institucional da UFOP
Full text: http://www.repositorio.ufop.br/handle/123456789/4334
Abstract: Author name ambiguity in the context of bibliographic citations is a very hard problem. It occurs when citation records of the same author appear under distinct names, or when citation records belonging to distinct authors carry very similar names. Among the several methods proposed in the literature, the most effective are those that assign records directly to their respective authors by applying supervised machine learning techniques. However, those methods usually need large amounts of labeled training examples to properly disambiguate the author names. To deal with this issue, in previous work we proposed a method that automatically obtains and labels the training examples, showing competitive performance compared to representative author name disambiguation methods. In this work, we propose to improve our previous method by exploiting user relevance feedback. In more detail, we select a very small portion of the citation records for which our method was most unsure about the correct authorship and ask the administrators to label them. This feedback is then used to improve the effectiveness of the whole process. In our experimental evaluation, we observed that with a very small labeling effort (usually around 5% of the records), the overall disambiguation effectiveness improves by almost 10% on average, with gains of up to 61% in some of the largest ambiguous groups.
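The selection step mentioned in the abstract (picking the small share of records the method is least sure about) can be illustrated with a minimal sketch. The abstract does not say how uncertainty is measured, so the margin between the two highest author scores used here is an assumption, as are the function name `select_for_feedback`, the `scores` mapping, and the default `fraction` of 0.05 (taken from the "around 5%" figure). This is not the authors' implementation.

```python
import math
from typing import Dict, List


def select_for_feedback(scores: Dict[str, List[float]],
                        fraction: float = 0.05) -> List[str]:
    """Return the ids of the records whose authorship is least certain.

    `scores` maps a record id to hypothetical per-author confidence
    scores. Uncertainty is measured (by assumption) as the margin
    between the two highest scores: a small margin means the top two
    candidate authors are nearly tied.
    """
    def margin(values: List[float]) -> float:
        top = sorted(values, reverse=True)
        # With a single candidate author there is no runner-up.
        return top[0] - top[1] if len(top) > 1 else top[0]

    # Always ask for at least one label when there are records.
    budget = max(1, math.ceil(len(scores) * fraction)) if scores else 0
    ranked = sorted(scores, key=lambda rid: margin(scores[rid]))
    return ranked[:budget]


if __name__ == "__main__":
    example = {
        "r1": [0.90, 0.10],
        "r2": [0.55, 0.45],
        "r3": [0.50, 0.50],
        "r4": [0.70, 0.30],
    }
    # Records to hand to an administrator for manual labeling.
    print(select_for_feedback(example, fraction=0.5))
```

The labels returned by the administrators would then be fed back into training; that part depends on the underlying classifier and is left out of this sketch.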
id UFOP_d6a4a4d9c968b867310cdbab3e8503a5
oai_identifier_str oai:repositorio.ufop.br:123456789/4334
network_acronym_str UFOP
network_name_str Repositório Institucional da UFOP
repository_id_str 3233
Rights: Copyright 2012. Permission to copy without fee all or part of the material printed in JIDM is granted provided that the copies are not made or distributed for commercial advantage, and that notice is given that copying is by permission of the Sociedade Brasileira de Computação. Source: information contained in the article.
Keywords: Bibliographic citation; Digital library; Name disambiguation; Relevance feedback
publishDate 2012
dc.date.none.fl_str_mv 2012
2015-01-22T14:56:35Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv FERREIRA, A. A.; MACHADO, T. M.; GONÇALVES, M. A. Improving author name disambiguation with user relevance feedback. Journal of Information and Data Management - JIDM, v. 3, p. 332-347, 2012. Available at: <https://seer.lcc.ufmg.br/index.php/jidm/article/view/200/135>. Accessed: 21 Jan. 2015.
21787107
http://www.repositorio.ufop.br/handle/123456789/4334
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFOP
instname:Universidade Federal de Ouro Preto (UFOP)
instacron:UFOP
repository.name.fl_str_mv Repositório Institucional da UFOP - Universidade Federal de Ouro Preto (UFOP)
repository.mail.fl_str_mv repositorio@ufop.edu.br