A tool for generating synthetic authorship records for evaluating author name disambiguation methods.

Bibliographic details
Main author: Ferreira, Anderson Almeida
Publication date: 2012
Other authors: Gonçalves, Marcos André; Almeida, Jussara Marques de; Laender, Alberto Henrique Frade; Veloso, Adriano Alonso
Document type: Article
Language: eng
Source title: Repositório Institucional da UFOP
Full text: http://www.repositorio.ufop.br/handle/123456789/1728
Abstract: The author name disambiguation task has to deal with uncertainties related to the possible many-to-many correspondences between ambiguous names and unique authors. Despite the variety of name disambiguation methods available in the literature to solve the problem, most of them are rarely compared against each other. Moreover, they are often evaluated without considering a time-evolving digital library, susceptible to dynamic (and therefore challenging) patterns such as the introduction of new authors and the change of researchers’ interests over time. In order to facilitate the evaluation of name disambiguation methods in various realistic scenarios and under controlled conditions, in this article we propose SyGAR, a new Synthetic Generator of Authorship Records that generates citation records based on author profiles. SyGAR can be used to generate successive loads of citation records simulating a living digital library that evolves according to various publication patterns. We validate SyGAR by comparing the results produced by three representative name disambiguation methods on real as well as synthetically generated collections of citation records. We also demonstrate its applicability by evaluating those methods on a time-evolving digital library collection generated with the tool, considering several dynamic and realistic scenarios.
id UFOP_0b85270635a17683f03ff1cd40c04139
oai_identifier_str oai:localhost:123456789/1728
network_acronym_str UFOP
network_name_str Repositório Institucional da UFOP
repository_id_str 3233
spelling Rights: The journal Information Sciences grants permission to deposit the article in the Repositório Institucional da UFOP. License number: 3303030527825.
Files: license.txt (text/plain; charset=utf-8, size 1748); ARTIGO_ToolGeneratingSynthetic.pdf (application/pdf, size 674496)
Timestamps: 2019-03-13 10:55:07.487; 2019-03-13T14:55:07
OAI endpoint: http://www.repositorio.ufop.br/oai/request
OpenDOAR: opendoar:3233
dc.title.pt_BR.fl_str_mv A tool for generating synthetic authorship records for evaluating author name disambiguation methods.
title A tool for generating synthetic authorship records for evaluating author name disambiguation methods.
spellingShingle A tool for generating synthetic authorship records for evaluating author name disambiguation methods.
Ferreira, Anderson Almeida
Author name disambiguation
Digital library
Bibliographic citation
Synthetic generator
author Ferreira, Anderson Almeida
author_facet Ferreira, Anderson Almeida
Gonçalves, Marcos André
Almeida, Jussara Marques de
Laender, Alberto Henrique Frade
Veloso, Adriano Alonso
author_role author
author2 Gonçalves, Marcos André
Almeida, Jussara Marques de
Laender, Alberto Henrique Frade
Veloso, Adriano Alonso
author2_role author
author
author
author
dc.contributor.author.fl_str_mv Ferreira, Anderson Almeida
Gonçalves, Marcos André
Almeida, Jussara Marques de
Laender, Alberto Henrique Frade
Veloso, Adriano Alonso
dc.subject.por.fl_str_mv Author name disambiguation
Digital library
Bibliographic citation
Synthetic generator
topic Author name disambiguation
Digital library
Bibliographic citation
Synthetic generator
publishDate 2012
dc.date.accessioned.fl_str_mv 2012-10-22T17:14:31Z
dc.date.available.fl_str_mv 2012-10-22T17:14:31Z
dc.date.issued.fl_str_mv 2012
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.citation.fl_str_mv FERREIRA, A. A. et al. A tool for generating synthetic authorship records for evaluating author name disambiguation methods. Information Sciences, v. 206, p. 42-62, 2012. Available at: <https://www.sciencedirect.com/science/article/pii/S0020025512002861>. Accessed: 22 Oct. 2012.
dc.identifier.uri.fl_str_mv http://www.repositorio.ufop.br/handle/123456789/1728
dc.identifier.issn.none.fl_str_mv 00200255
identifier_str_mv FERREIRA, A. A. et al. A tool for generating synthetic authorship records for evaluating author name disambiguation methods. Information Sciences, v. 206, p. 42-62, 2012. Available at: <https://www.sciencedirect.com/science/article/pii/S0020025512002861>. Accessed: 22 Oct. 2012.
00200255
url http://www.repositorio.ufop.br/handle/123456789/1728
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFOP
instname:Universidade Federal de Ouro Preto (UFOP)
instacron:UFOP
instname_str Universidade Federal de Ouro Preto (UFOP)
instacron_str UFOP
institution UFOP
reponame_str Repositório Institucional da UFOP
collection Repositório Institucional da UFOP
bitstream.url.fl_str_mv http://www.repositorio.ufop.br/bitstream/123456789/1728/5/license.txt
http://www.repositorio.ufop.br/bitstream/123456789/1728/1/ARTIGO_ToolGeneratingSynthetic.pdf
bitstream.checksum.fl_str_mv 8a4605be74aa9ea9d79846c1fba20a33
2e6a56b51f1d7ebcf03677a1049518df
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFOP - Universidade Federal de Ouro Preto (UFOP)
repository.mail.fl_str_mv repositorio@ufop.edu.br
_version_ 1801685711993176064