Learning to detect text-code inconsistencies with weak and manual supervision
Main author: | SOUZA, Beatriz Bezerra de |
---|---|
Publication date: | 2023 |
Document type: | Master's dissertation |
Language: | eng |
Source title: | Repositório Institucional da UFPE |
dARK ID: | ark:/64986/001300000pd0z |
Full text: | https://repositorio.ufpe.br/handle/123456789/49318 |
Abstract: | Source code is often associated with a natural language summary that helps developers understand the behavior and intent of the code. For example, method-level comments summarize the behavior of a method, and test descriptions summarize the intent of a test case. Unfortunately, the text and its corresponding code are sometimes inconsistent, which may hinder code understanding, reuse, and maintenance. We propose TCID, an approach for Text-Code Inconsistency Detection, which trains a neural model to distinguish consistent from inconsistent text-code pairs. Our key contribution is to combine two ways of training such a model. First, TCID performs weakly supervised pre-training on large amounts of consistent examples extracted from code as-is and inconsistent examples created by randomly recombining text-code pairs. Then, TCID fine-tunes the model on a small, curated set of manually labeled examples. This combination is motivated by the observation that weak supervision alone leads to models that generalize poorly to real-world inconsistencies. Our evaluation applies the two-step training procedure to four state-of-the-art models and evaluates it on two text-vs-code problems: 40.7K method-level comments checked against the corresponding Java method bodies and, as a problem not considered in prior work, 338.8K test case descriptions checked against the corresponding JavaScript implementations. Our results show that a small amount of manual labeling enables the approach to significantly improve effectiveness, outperforming the current state of the art and improving the F1 score by 5% in Java and by 17% in JavaScript. We validate the usefulness of TCID’s predictions by submitting pull requests, of which 10 have been accepted so far. |
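
The weakly supervised step described in the abstract (consistent examples taken from code as-is, inconsistent examples made by randomly recombining text-code pairs) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the `Example` type, the `build_weak_dataset` function, the label encoding, and the one-negative-per-positive ratio are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    text: str
    code: str
    label: int  # hypothetical encoding: 1 = consistent, 0 = inconsistent


def build_weak_dataset(pairs, seed=0):
    """Build weakly labeled examples from (text, code) pairs mined as-is.

    Assumption: every mined pair is treated as consistent, and one
    inconsistent example per pair is created by attaching the text to the
    code of a different, randomly chosen pair.
    """
    if len(pairs) < 2:
        raise ValueError("need at least two pairs to recombine")
    rng = random.Random(seed)
    examples = [Example(text, code, 1) for text, code in pairs]
    for i, (text, _) in enumerate(pairs):
        # Draw an index j != i uniformly from the remaining pairs.
        j = rng.randrange(len(pairs) - 1)
        if j >= i:
            j += 1
        examples.append(Example(text, pairs[j][1], 0))
    rng.shuffle(examples)
    return examples
```

A recombined pair may occasionally still be consistent by chance, so these labels are noisy; in the approach summarized above, fine-tuning on a small set of manually labeled examples follows this step.
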
id |
UFPE_daf23fd7bf6b2e3473064c3b514e0b1e |
---|---|
oai_identifier_str |
oai:repositorio.ufpe.br:123456789/49318 |
network_acronym_str |
UFPE |
network_name_str |
Repositório Institucional da UFPE |
repository_id_str |
2221 |
dc.title.pt_BR.fl_str_mv |
Learning to detect text-code inconsistencies with weak and manual supervision |
title |
Learning to detect text-code inconsistencies with weak and manual supervision |
spellingShingle |
Learning to detect text-code inconsistencies with weak and manual supervision; SOUZA, Beatriz Bezerra de; Software engineering; Inconsistency detection |
title_short |
Learning to detect text-code inconsistencies with weak and manual supervision |
title_full |
Learning to detect text-code inconsistencies with weak and manual supervision |
title_fullStr |
Learning to detect text-code inconsistencies with weak and manual supervision |
title_full_unstemmed |
Learning to detect text-code inconsistencies with weak and manual supervision |
title_sort |
Learning to detect text-code inconsistencies with weak and manual supervision |
author |
SOUZA, Beatriz Bezerra de |
author_facet |
SOUZA, Beatriz Bezerra de |
author_role |
author |
dc.contributor.authorLattes.pt_BR.fl_str_mv |
http://lattes.cnpq.br/2008820285345452 |
dc.contributor.advisorLattes.pt_BR.fl_str_mv |
http://lattes.cnpq.br/3762670242328435 |
dc.contributor.author.fl_str_mv |
SOUZA, Beatriz Bezerra de |
dc.contributor.advisor1.fl_str_mv |
D'AMORIM, Marcelo Bezerra |
contributor_str_mv |
D'AMORIM, Marcelo Bezerra |
dc.subject.por.fl_str_mv |
Software engineering; Inconsistency detection |
topic |
Software engineering; Inconsistency detection |
description |
Source code is often associated with a natural language summary that helps developers understand the behavior and intent of the code. For example, method-level comments summarize the behavior of a method, and test descriptions summarize the intent of a test case. Unfortunately, the text and its corresponding code are sometimes inconsistent, which may hinder code understanding, reuse, and maintenance. We propose TCID, an approach for Text-Code Inconsistency Detection, which trains a neural model to distinguish consistent from inconsistent text-code pairs. Our key contribution is to combine two ways of training such a model. First, TCID performs weakly supervised pre-training on large amounts of consistent examples extracted from code as-is and inconsistent examples created by randomly recombining text-code pairs. Then, TCID fine-tunes the model on a small, curated set of manually labeled examples. This combination is motivated by the observation that weak supervision alone leads to models that generalize poorly to real-world inconsistencies. Our evaluation applies the two-step training procedure to four state-of-the-art models and evaluates it on two text-vs-code problems: 40.7K method-level comments checked against the corresponding Java method bodies and, as a problem not considered in prior work, 338.8K test case descriptions checked against the corresponding JavaScript implementations. Our results show that a small amount of manual labeling enables the approach to significantly improve effectiveness, outperforming the current state of the art and improving the F1 score by 5% in Java and by 17% in JavaScript. We validate the usefulness of TCID’s predictions by submitting pull requests, of which 10 have been accepted so far. |
publishDate |
2023 |
dc.date.accessioned.fl_str_mv |
2023-03-10T13:08:35Z |
dc.date.available.fl_str_mv |
2023-03-10T13:08:35Z |
dc.date.issued.fl_str_mv |
2023-02-15 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
SOUZA, Beatriz Bezerra de. Learning to detect text-code inconsistencies with weak and manual supervision. 2023. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2023. |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufpe.br/handle/123456789/49318 |
dc.identifier.dark.fl_str_mv |
ark:/64986/001300000pd0z |
identifier_str_mv |
SOUZA, Beatriz Bezerra de. Learning to detect text-code inconsistencies with weak and manual supervision. 2023. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2023. ark:/64986/001300000pd0z |
url |
https://repositorio.ufpe.br/handle/123456789/49318 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de Pernambuco |
dc.publisher.program.fl_str_mv |
Programa de Pos Graduacao em Ciencia da Computacao |
dc.publisher.initials.fl_str_mv |
UFPE |
dc.publisher.country.fl_str_mv |
Brazil |
publisher.none.fl_str_mv |
Universidade Federal de Pernambuco |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFPE instname:Universidade Federal de Pernambuco (UFPE) instacron:UFPE |
instname_str |
Universidade Federal de Pernambuco (UFPE) |
instacron_str |
UFPE |
institution |
UFPE |
reponame_str |
Repositório Institucional da UFPE |
collection |
Repositório Institucional da UFPE |
bitstream.url.fl_str_mv |
https://repositorio.ufpe.br/bitstream/123456789/49318/1/DISSERTA%c3%87%c3%83O%20Beatriz%20Bezerra%20de%20Souza.pdf https://repositorio.ufpe.br/bitstream/123456789/49318/2/license_rdf https://repositorio.ufpe.br/bitstream/123456789/49318/3/license.txt https://repositorio.ufpe.br/bitstream/123456789/49318/4/DISSERTA%c3%87%c3%83O%20Beatriz%20Bezerra%20de%20Souza.pdf.txt https://repositorio.ufpe.br/bitstream/123456789/49318/5/DISSERTA%c3%87%c3%83O%20Beatriz%20Bezerra%20de%20Souza.pdf.jpg |
bitstream.checksum.fl_str_mv |
0df3a684d568b9b7551e3229dc9fcc28 e39d27027a6cc9cb039ad269a5db8e34 5e89a1613ddc8510c6576f4b23a78973 9a44b214df53c1bf7b3426437f6065f0 5971ac83e9a0fb40c70ad94bc2da6b6d |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE) |
repository.mail.fl_str_mv |
attena@ufpe.br |
_version_ |
1815172875587420160 |