Thesaurus-Based Tag Clouds for Test-Driven Code Search

Bibliographic details
Main author: Lemos, Otávio Augusto Lazzarini [UNIFESP]
Publication date: 2014
Other authors: Paula, Adriano Carvalho de [UNIFESP], Konishi, Gustavo [UNIFESP], Bajracharya, Sushil, Ossher, Joel, Lopes, Cristina Videira
Document type: Article
Language: eng
Source title: Repositório Institucional da UNIFESP
Full text: http://dx.doi.org/10.3217/jucs-020-05-0772
http://repositorio.unifesp.br/handle/11600/44780
Abstract: Test-driven code search (TDCS) is an approach to code search and reuse that uses test cases as inputs to form the search query. Although the test cases add semantics to the search task, keywords taken from class and method names are still required. The effectiveness of the approach therefore also depends on how good these keywords are, i.e., how frequently developers choose them to name the desired functions. Visual aids can help users choose adequate words in their query test cases. In this paper we propose thesaurus-based tag clouds that show developers the terms most frequently used in the code repository, so that they can improve their search. Terms are generated by looking up words similar to the initial keywords in a thesaurus. Tag clouds are then formed based on the frequency with which these terms appear in the code base. Our approach was implemented with an English thesaurus as an extension to CodeGenie, a Java- and Eclipse-based TDCS tool. Our evaluation shows evidence that the approach can help improve the number of returned results, recall (by approximately 28%, on average), and precision (by approximately 14%, on average). We also noticed that the visual aid can be especially useful for non-native speakers of the language in which the code repository is written. These users are frequently unaware of the most common terms used to name specific functionality in the code, in the given language.
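The two steps named in the abstract (thesaurus lookup, then frequency-based weighting) can be illustrated with a minimal sketch. It is not CodeGenie's implementation: the toy thesaurus, the term counts, the function names `related_terms` and `tag_cloud`, and the linear font-size scaling are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy thesaurus; the paper used an English thesaurus, not this dict.
THESAURUS = {
    "remove": ["delete", "erase", "drop"],
    "find": ["search", "lookup", "locate"],
}


def related_terms(keyword, thesaurus):
    """Return the keyword followed by its thesaurus entries, without duplicates."""
    terms = [keyword]
    for synonym in thesaurus.get(keyword, []):
        if synonym not in terms:
            terms.append(synonym)
    return terms


def tag_cloud(keyword, thesaurus, term_counts, min_size=10, max_size=32):
    """Map each related term that occurs in the code base to a font size.

    Sizes grow linearly with frequency; this scaling rule is an assumption.
    """
    counts = {}
    for term in related_terms(keyword, thesaurus):
        count = term_counts.get(term, 0)
        if count > 0:  # terms absent from the code base are left out of the cloud
            counts[term] = count
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    cloud = {}
    for term, count in counts.items():
        if span == 0:
            cloud[term] = max_size
        else:
            cloud[term] = round(min_size + (max_size - min_size) * (count - low) / span)
    return cloud


# Hypothetical frequencies, as if counted from method names in a repository.
METHOD_NAME_TERMS = Counter({"delete": 120, "remove": 300, "erase": 15, "search": 80})

cloud = tag_cloud("remove", THESAURUS, METHOD_NAME_TERMS)
for term, size in sorted(cloud.items(), key=lambda item: -item[1]):
    print(f"{term}: {size}pt")
```

In this sketch a developer who typed `remove` would see that it is already the most common of its related terms, while `drop` does not appear because it never occurs in the hypothetical counts.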
id UFSP_fa35a25d0bd0faef1841cdd1c6573597
oai_identifier_str oai:repositorio.unifesp.br/:11600/44780
network_acronym_str UFSP
network_name_str Repositório Institucional da UNIFESP
repository_id_str 3465
spelling Univ Fed Sao Paulo, Sao Jose Dos Campos, SP, Brazil
Black Duck Software Inc, Burlington, MA USA
Univ Calif Irvine, Irvine, CA USA
Web of Science
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
FAPESP: 2010/15540-2
2024-05-02T15:52:16Z
http://www.repositorio.unifesp.br/oai/request
opendoar:3465
dc.contributor.none.fl_str_mv Innolut Sistemas Informat
Universidade Federal de São Paulo (UNIFESP)
Universidade de São Paulo (USP)
dc.contributor.author.fl_str_mv Lemos, Otávio Augusto Lazzarini [UNIFESP]
Paula, Adriano Carvalho de [UNIFESP]
Konishi, Gustavo [UNIFESP]
Bajracharya, Sushil
Ossher, Joel
Lopes, Cristina Videira
dc.subject.por.fl_str_mv Test-driven code search
code search
software reuse
tag clouds
publishDate 2014
dc.date.none.fl_str_mv 2014-01-01
2018-06-18T10:54:32Z
2018-06-18T10:54:32Z
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.3217/jucs-020-05-0772
Journal Of Universal Computer Science. Graz: Graz Univ Technology, Inst Information Systems Computer Media-iicm, v. 20, n. 5, p. 772-796, 2014.
10.3217/jucs-020-05-0772
0948-695X
http://repositorio.unifesp.br/handle/11600/44780
WOS:000339391100009
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv Journal Of Universal Computer Science
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv 1154-1173
dc.publisher.none.fl_str_mv Graz Univ Technology, Inst Information Systems Computer Media-iicm
dc.source.none.fl_str_mv reponame:Repositório Institucional da UNIFESP
instname:Universidade Federal de São Paulo (UNIFESP)
instacron:UNIFESP
repository.name.fl_str_mv Repositório Institucional da UNIFESP - Universidade Federal de São Paulo (UNIFESP)
repository.mail.fl_str_mv biblioteca.csp@unifesp.br