A tool for generating synthetic authorship records for evaluating author name disambiguation methods.
Main author: | Ferreira, Anderson Almeida |
---|---|
Publication date: | 2012 |
Other authors: | Gonçalves, Marcos André; Almeida, Jussara Marques de; Laender, Alberto Henrique Frade; Veloso, Adriano Alonso |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UFOP |
Full text: | http://www.repositorio.ufop.br/handle/123456789/1728 |
Abstract: | The author name disambiguation task has to deal with uncertainties related to the possible many-to-many correspondences between ambiguous names and unique authors. Despite the variety of name disambiguation methods available in the literature to solve the problem, most of them are rarely compared against each other. Moreover, they are often evaluated without considering a time-evolving digital library, susceptible to dynamic (and therefore challenging) patterns such as the introduction of new authors and the change of researchers’ interests over time. In order to facilitate the evaluation of name disambiguation methods in various realistic scenarios and under controlled conditions, in this article we propose SyGAR, a new Synthetic Generator of Authorship Records that generates citation records based on author profiles. SyGAR can be used to generate successive loads of citation records simulating a living digital library that evolves according to various publication patterns. We validate SyGAR by comparing the results produced by three representative name disambiguation methods on real as well as synthetically generated collections of citation records. We also demonstrate its applicability by evaluating those methods on a time-evolving digital library collection generated with the tool, considering several dynamic and realistic scenarios. |
id |
UFOP_0b85270635a17683f03ff1cd40c04139 |
---|---|
oai_identifier_str |
oai:localhost:123456789/1728 |
network_acronym_str |
UFOP |
network_name_str |
Repositório Institucional da UFOP |
repository_id_str |
3233 |
spelling |
Ferreira, Anderson Almeida; Gonçalves, Marcos André; Almeida, Jussara Marques de; Laender, Alberto Henrique Frade; Veloso, Adriano Alonso. 2012-10-22T17:14:31Z; 2012-10-22T17:14:31Z; 2012. FERREIRA, A. A. et al. A tool for generating synthetic authorship records for evaluating author name disambiguation methods. Information Sciences, v. 206, p. 42-62, 2012. Available at: <https://www.sciencedirect.com/science/article/pii/S0020025512002861>. Accessed on: 22 Oct. 2012. 00200255. http://www.repositorio.ufop.br/handle/123456789/1728. Author name disambiguation; Digital library; Bibliographic citation; Synthetic generator. info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article. The journal Information Sciences grants permission to deposit the article in the Repositório Institucional da UFOP. License number: 3303030527825. info:eu-repo/semantics/openAccess; eng. reponame:Repositório Institucional da UFOP; instname:Universidade Federal de Ouro Preto (UFOP); instacron:UFOP. LICENSE: license.txt; text/plain; charset=utf-8; 1748; http://www.repositorio.ufop.br/bitstream/123456789/1728/5/license.txt; 8a4605be74aa9ea9d79846c1fba20a33; MD5; 5. ORIGINAL: ARTIGO_ToolGeneratingSynthetic.pdf; application/pdf; 674496; http://www.repositorio.ufop.br/bitstream/123456789/1728/1/ARTIGO_ToolGeneratingSynthetic.pdf; 2e6a56b51f1d7ebcf03677a1049518df; MD5; 1. 123456789/1728; 2019-03-13 10:55:07.487; oai:localhost:123456789/1728. Repositório Institucional; PUB; http://www.repositorio.ufop.br/oai/request; repositorio@ufop.edu.br; opendoar:3233; 2019-03-13T14:55:07; Repositório Institucional da UFOP - Universidade Federal de Ouro Preto (UFOP); false |
dc.title.pt_BR.fl_str_mv |
A tool for generating synthetic authorship records for evaluating author name disambiguation methods. |
title |
A tool for generating synthetic authorship records for evaluating author name disambiguation methods. |
spellingShingle |
A tool for generating synthetic authorship records for evaluating author name disambiguation methods. Ferreira, Anderson Almeida; Author name disambiguation; Digital library; Bibliographic citation; Synthetic generator |
title_short |
A tool for generating synthetic authorship records for evaluating author name disambiguation methods. |
title_full |
A tool for generating synthetic authorship records for evaluating author name disambiguation methods. |
title_fullStr |
A tool for generating synthetic authorship records for evaluating author name disambiguation methods. |
title_full_unstemmed |
A tool for generating synthetic authorship records for evaluating author name disambiguation methods. |
title_sort |
A tool for generating synthetic authorship records for evaluating author name disambiguation methods. |
author |
Ferreira, Anderson Almeida |
author_facet |
Ferreira, Anderson Almeida; Gonçalves, Marcos André; Almeida, Jussara Marques de; Laender, Alberto Henrique Frade; Veloso, Adriano Alonso |
author_role |
author |
author2 |
Gonçalves, Marcos André; Almeida, Jussara Marques de; Laender, Alberto Henrique Frade; Veloso, Adriano Alonso |
author2_role |
author author author author |
dc.contributor.author.fl_str_mv |
Ferreira, Anderson Almeida; Gonçalves, Marcos André; Almeida, Jussara Marques de; Laender, Alberto Henrique Frade; Veloso, Adriano Alonso |
dc.subject.por.fl_str_mv |
Author name disambiguation; Digital library; Bibliographic citation; Synthetic generator |
topic |
Author name disambiguation; Digital library; Bibliographic citation; Synthetic generator |
description |
The author name disambiguation task has to deal with uncertainties related to the possible many-to-many correspondences between ambiguous names and unique authors. Despite the variety of name disambiguation methods available in the literature to solve the problem, most of them are rarely compared against each other. Moreover, they are often evaluated without considering a time-evolving digital library, susceptible to dynamic (and therefore challenging) patterns such as the introduction of new authors and the change of researchers’ interests over time. In order to facilitate the evaluation of name disambiguation methods in various realistic scenarios and under controlled conditions, in this article we propose SyGAR, a new Synthetic Generator of Authorship Records that generates citation records based on author profiles. SyGAR can be used to generate successive loads of citation records simulating a living digital library that evolves according to various publication patterns. We validate SyGAR by comparing the results produced by three representative name disambiguation methods on real as well as synthetically generated collections of citation records. We also demonstrate its applicability by evaluating those methods on a time-evolving digital library collection generated with the tool, considering several dynamic and realistic scenarios. |
publishDate |
2012 |
dc.date.accessioned.fl_str_mv |
2012-10-22T17:14:31Z |
dc.date.available.fl_str_mv |
2012-10-22T17:14:31Z |
dc.date.issued.fl_str_mv |
2012 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
FERREIRA, A. A. et al. A tool for generating synthetic authorship records for evaluating author name disambiguation methods. Information Sciences, v. 206, p. 42-62, 2012. Available at: <https://www.sciencedirect.com/science/article/pii/S0020025512002861>. Accessed on: 22 Oct. 2012. |
dc.identifier.uri.fl_str_mv |
http://www.repositorio.ufop.br/handle/123456789/1728 |
dc.identifier.issn.none.fl_str_mv |
00200255 |
identifier_str_mv |
FERREIRA, A. A. et al. A tool for generating synthetic authorship records for evaluating author name disambiguation methods. Information Sciences, v. 206, p. 42-62, 2012. Available at: <https://www.sciencedirect.com/science/article/pii/S0020025512002861>. Accessed on: 22 Oct. 2012. 00200255 |
url |
http://www.repositorio.ufop.br/handle/123456789/1728 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFOP instname:Universidade Federal de Ouro Preto (UFOP) instacron:UFOP |
instname_str |
Universidade Federal de Ouro Preto (UFOP) |
instacron_str |
UFOP |
institution |
UFOP |
reponame_str |
Repositório Institucional da UFOP |
collection |
Repositório Institucional da UFOP |
bitstream.url.fl_str_mv |
http://www.repositorio.ufop.br/bitstream/123456789/1728/5/license.txt http://www.repositorio.ufop.br/bitstream/123456789/1728/1/ARTIGO_ToolGeneratingSynthetic.pdf |
bitstream.checksum.fl_str_mv |
8a4605be74aa9ea9d79846c1fba20a33 2e6a56b51f1d7ebcf03677a1049518df |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFOP - Universidade Federal de Ouro Preto (UFOP) |
repository.mail.fl_str_mv |
repositorio@ufop.edu.br |
_version_ |
1801685711993176064 |