A tool for generating synthetic authorship records for evaluating author name disambiguation methods.

Bibliographic details
Main author: Ferreira, Anderson Almeida
Publication date: 2012
Other authors: Gonçalves, Marcos André; Almeida, Jussara Marques de; Laender, Alberto Henrique Frade; Veloso, Adriano Alonso
Document type: Article
Language: eng
Source title: Repositório Institucional da UFOP
Full text: http://www.repositorio.ufop.br/handle/123456789/1728
Abstract: The author name disambiguation task has to deal with uncertainties related to the possible many-to-many correspondences between ambiguous names and unique authors. Despite the variety of name disambiguation methods available in the literature to solve the problem, most of them are rarely compared against each other. Moreover, they are often evaluated without considering a time-evolving digital library, susceptible to dynamic (and therefore challenging) patterns such as the introduction of new authors and the change of researchers’ interests over time. In order to facilitate the evaluation of name disambiguation methods in various realistic scenarios and under controlled conditions, in this article we propose SyGAR, a new Synthetic Generator of Authorship Records that generates citation records based on author profiles. SyGAR can be used to generate successive loads of citation records simulating a living digital library that evolves according to various publication patterns. We validate SyGAR by comparing the results produced by three representative name disambiguation methods on real as well as synthetically generated collections of citation records. We also demonstrate its applicability by evaluating those methods on a time-evolving digital library collection generated with the tool, considering several dynamic and realistic scenarios.
id UFOP_f281579899baa39550b58d2386fc6f32
oai_identifier_str oai:repositorio.ufop.br:123456789/1728
network_acronym_str UFOP
network_name_str Repositório Institucional da UFOP
repository_id_str 3233
spelling The journal Information Sciences grants permission to deposit the article in the Repositório Institucional da UFOP. License number: 3303030527825; 2019-03-13T14:55:07Z; http://www.repositorio.ufop.br/oai/request; opendoar:3233
dc.title.none.fl_str_mv A tool for generating synthetic authorship records for evaluating author name disambiguation methods.
dc.contributor.author.fl_str_mv Ferreira, Anderson Almeida
Gonçalves, Marcos André
Almeida, Jussara Marques de
Laender, Alberto Henrique Frade
Veloso, Adriano Alonso
dc.subject.por.fl_str_mv Author name disambiguation
Digital library
Bibliographic citation
Synthetic generator
publishDate 2012
dc.date.none.fl_str_mv 2012-10-22T17:14:31Z
2012-10-22T17:14:31Z
2012
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv FERREIRA, A. A. et al. A tool for generating synthetic authorship records for evaluating author name disambiguation methods. Information Sciences, v. 206, p. 42-62, 2012. Available at: <https://www.sciencedirect.com/science/article/pii/S0020025512002861>. Accessed: 22 Oct. 2012.
00200255
http://www.repositorio.ufop.br/handle/123456789/1728
url http://www.repositorio.ufop.br/handle/123456789/1728
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFOP
instname:Universidade Federal de Ouro Preto (UFOP)
instacron:UFOP
instname_str Universidade Federal de Ouro Preto (UFOP)
instacron_str UFOP
institution UFOP
reponame_str Repositório Institucional da UFOP
collection Repositório Institucional da UFOP
repository.name.fl_str_mv Repositório Institucional da UFOP - Universidade Federal de Ouro Preto (UFOP)
repository.mail.fl_str_mv repositorio@ufop.edu.br
_version_ 1813002802752913408