Automatic syntax error reporting and recovery in parsing expression grammars

Bibliographic details
Main author: Medeiros, Sérgio Queiroz de
Publication date: 2020
Other authors: Junior, Gilney de Azevedo Alvez; Mascarenhas, Fabio
Document type: Article
Language: English
Source title: Repositório Institucional da UFRN
Full text: https://repositorio.ufrn.br/handle/123456789/30867
Abstract: Error recovery is an essential feature for a parser that is to be plugged into Integrated Development Environments (IDEs), which must build Abstract Syntax Trees (ASTs) even for syntactically invalid programs in order to offer features such as automated refactoring and code completion. Parsing Expression Grammars (PEGs) are a formalism that naturally describes recursive top-down parsers using a restricted form of backtracking. Labeled failures are a conservative extension of PEGs that adds an error reporting mechanism for PEG parsers, and these labels can also be associated with recovery expressions to provide an error recovery mechanism. These expressions can use the full expressivity of PEGs to recover from syntactic errors. Manually annotating a large grammar with labels and recovery expressions can be difficult. In this work, we present two approaches, Standard and Unique, to automatically annotate a PEG with labels and to build their corresponding recovery expressions. The Standard approach annotates a grammar in a way similar to manual annotation, but it may insert labels incorrectly, while the Unique approach annotates a grammar more conservatively and does not insert labels incorrectly. We evaluate both approaches by using them to generate error-recovering parsers for four programming languages: Titan, C, Pascal and Java. In our evaluation, the parsers produced using the Standard approach, after a manual intervention to remove the incorrectly added labels, gave an acceptable recovery for at least 70% of the files in each language. In turn, the acceptable recovery rate of the parsers produced via the Unique approach, without the need for manual intervention, ranged from 41% to 76%.
id UFRN_b3e094d4523b445ac74754b16736705c
oai_identifier_str oai:https://repositorio.ufrn.br:123456789/30867
network_acronym_str UFRN
network_name_str Repositório Institucional da UFRN
repository_id_str
dc.title.pt_BR.fl_str_mv Automatic syntax error reporting and recovery in parsing expression grammars
dc.contributor.author.fl_str_mv Medeiros, Sérgio Queiroz de
Junior, Gilney de Azevedo Alvez
Mascarenhas, Fabio
dc.subject.por.fl_str_mv Parsing expression grammars
Labeled failures
Error reporting
Error recovery
publishDate 2020
dc.date.accessioned.fl_str_mv 2020-12-07T18:39:54Z
dc.date.available.fl_str_mv 2020-12-07T18:39:54Z
dc.date.issued.fl_str_mv 2020-02-15
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.citation.fl_str_mv MEDEIROS, Sérgio Queiroz de; ALVEZ JUNIOR, Gilney de Azevedo; MASCARENHAS, Fabio. Automatic syntax error reporting and recovery in parsing expression grammars. Science of Computer Programming, [S.L.], v. 187, p. 102373-102373, Feb. 2020. Available at: https://www.sciencedirect.com/science/article/abs/pii/S0167642319301662?via%3Dihub. Accessed: 6 Oct. 2020. http://dx.doi.org/10.1016/j.scico.2019.102373.
dc.identifier.uri.fl_str_mv https://repositorio.ufrn.br/handle/123456789/30867
dc.identifier.issn.none.fl_str_mv 0167-6423
dc.identifier.doi.none.fl_str_mv 10.1016/j.scico.2019.102373
url https://repositorio.ufrn.br/handle/123456789/30867
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Attribution 3.0 Brazil
http://creativecommons.org/licenses/by/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution 3.0 Brazil
http://creativecommons.org/licenses/by/3.0/br/
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Elsevier
publisher.none.fl_str_mv Elsevier
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFRN
instname:Universidade Federal do Rio Grande do Norte (UFRN)
instacron:UFRN
instname_str Universidade Federal do Rio Grande do Norte (UFRN)
instacron_str UFRN
institution UFRN
reponame_str Repositório Institucional da UFRN
collection Repositório Institucional da UFRN
bitstream.url.fl_str_mv https://repositorio.ufrn.br/bitstream/123456789/30867/2/license_rdf
https://repositorio.ufrn.br/bitstream/123456789/30867/3/license.txt
https://repositorio.ufrn.br/bitstream/123456789/30867/1/AutomaticSyntaxError_MEDEIROS_2019.pdf
https://repositorio.ufrn.br/bitstream/123456789/30867/4/AutomaticSyntaxError_MEDEIROS_2019.pdf.txt
https://repositorio.ufrn.br/bitstream/123456789/30867/5/AutomaticSyntaxError_MEDEIROS_2019.pdf.jpg
bitstream.checksum.fl_str_mv 4d2950bda3d176f570a9f8b328dfbbef
e9597aa2854d128fd968be5edc8a28d9
c50d39058589f204964cbadd72fd9edf
9274ebf19335bc8df9b6bff0f2df02e5
39aab08363a571c2d7bb7a3beb2e5253
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFRN - Universidade Federal do Rio Grande do Norte (UFRN)
repository.mail.fl_str_mv
_version_ 1797777124936908800