Automatic syntax error reporting and recovery in parsing expression grammars
Main author: | Medeiros, Sérgio Queiroz de |
---|---|
Publication date: | 2020 |
Other authors: | Junior, Gilney de Azevedo Alvez; Mascarenhas, Fabio |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UFRN |
Full text: | https://repositorio.ufrn.br/handle/123456789/30867 |
Abstract: | Error recovery is an essential feature for a parser that is to be plugged into an Integrated Development Environment (IDE), which must build Abstract Syntax Trees (ASTs) even for syntactically invalid programs in order to offer features such as automated refactoring and code completion. Parsing Expression Grammars (PEGs) are a formalism that naturally describes recursive top-down parsers using a restricted form of backtracking. Labeled failures are a conservative extension of PEGs that adds an error reporting mechanism for PEG parsers, and these labels can also be associated with recovery expressions to provide an error recovery mechanism. These expressions can use the full expressivity of PEGs to recover from syntactic errors. Manually annotating a large grammar with labels and recovery expressions can be difficult. In this work, we present two approaches, Standard and Unique, to automatically annotate a PEG with labels and to build their corresponding recovery expressions. The Standard approach annotates a grammar in a way similar to manual annotation, but it may insert labels incorrectly, while the Unique approach annotates a grammar more conservatively and does not insert labels incorrectly. We evaluate both approaches by using them to generate error-recovering parsers for four programming languages: Titan, C, Pascal and Java. In our evaluation, the parsers produced using the Standard approach, after manual intervention to remove the incorrectly added labels, gave an acceptable recovery for at least 70% of the files in each language. In turn, the acceptable recovery rate of the parsers produced via the Unique approach, without the need for manual intervention, ranged from 41% to 76%. |
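
The abstract's idea of labeled failures paired with recovery expressions can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy grammar `stmt <- NAME '=' NUMBER^ErrNum ';'^ErrSemi`, the class name `Parser`, the helper `expect`, and the labels `ErrNum` and `ErrSemi` are hypothetical. The recovery expressions are hand-picked "skip until a synchronization token" rules; the sketch does not implement the Standard or Unique annotation algorithms.

```python
from dataclasses import dataclass, field


@dataclass
class Parser:
    """Toy PEG-style parser over a token list (hypothetical, for illustration)."""
    tokens: list
    pos: int = 0
    errors: list = field(default_factory=list)  # recorded (label, position) pairs

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, pred):
        """Ordinary PEG match: consume one token if pred holds, else fail."""
        tok = self.peek()
        if tok is not None and pred(tok):
            self.pos += 1
            return tok
        return None

    def expect(self, pred, label, sync):
        """Labeled match p^label: on failure, record the label and recover."""
        tok = self.match(pred)
        if tok is not None:
            return tok
        self.errors.append((label, self.pos))
        # Recovery expression (!sync .)* : skip tokens until a sync token.
        while self.peek() is not None and not sync(self.peek()):
            self.pos += 1
        return None

    def stmt(self):
        # stmt <- NAME '=' NUMBER^ErrNum ';'^ErrSemi
        start = self.pos
        name = self.match(str.isidentifier)
        if name is None or self.match(lambda t: t == "=") is None:
            self.pos = start  # unlabeled failure: backtrack as a plain PEG would
            return None
        num = self.expect(str.isdigit, "ErrNum", lambda t: t == ";")
        self.expect(lambda t: t == ";", "ErrSemi", str.isidentifier)
        return (name, num)  # an AST node is built even when num is missing

    def program(self):
        # program <- stmt*
        stmts = []
        while True:
            node = self.stmt()
            if node is None:
                return stmts
            stmts.append(node)


parser = Parser("x = 1 ; y = ; z = 3 ;".split())
print(parser.program())  # AST for all three statements
print(parser.errors)     # labels reported along the way
```

In this sketch the missing number in `y = ;` is reported under `ErrNum`, yet parsing continues and a node for `y` is still produced, which is the behaviour an IDE needs from an error-recovering parser.
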
id |
UFRN_b3e094d4523b445ac74754b16736705c |
---|---|
oai_identifier_str |
oai:https://repositorio.ufrn.br:123456789/30867 |
network_acronym_str |
UFRN |
network_name_str |
Repositório Institucional da UFRN |
repository_id_str |
|
dc.title.pt_BR.fl_str_mv |
Automatic syntax error reporting and recovery in parsing expression grammars |
title |
Automatic syntax error reporting and recovery in parsing expression grammars |
author |
Medeiros, Sérgio Queiroz de |
author_facet |
Medeiros, Sérgio Queiroz de; Junior, Gilney de Azevedo Alvez; Mascarenhas, Fabio |
author_role |
author |
author2 |
Junior, Gilney de Azevedo Alvez; Mascarenhas, Fabio |
author2_role |
author; author |
dc.contributor.author.fl_str_mv |
Medeiros, Sérgio Queiroz de; Junior, Gilney de Azevedo Alvez; Mascarenhas, Fabio |
dc.subject.por.fl_str_mv |
Parsing expression grammars; Labeled failures; Error reporting; Error recovery |
topic |
Parsing expression grammars; Labeled failures; Error reporting; Error recovery |
description |
Error recovery is an essential feature for a parser that is to be plugged into an Integrated Development Environment (IDE), which must build Abstract Syntax Trees (ASTs) even for syntactically invalid programs in order to offer features such as automated refactoring and code completion. Parsing Expression Grammars (PEGs) are a formalism that naturally describes recursive top-down parsers using a restricted form of backtracking. Labeled failures are a conservative extension of PEGs that adds an error reporting mechanism for PEG parsers, and these labels can also be associated with recovery expressions to provide an error recovery mechanism. These expressions can use the full expressivity of PEGs to recover from syntactic errors. Manually annotating a large grammar with labels and recovery expressions can be difficult. In this work, we present two approaches, Standard and Unique, to automatically annotate a PEG with labels and to build their corresponding recovery expressions. The Standard approach annotates a grammar in a way similar to manual annotation, but it may insert labels incorrectly, while the Unique approach annotates a grammar more conservatively and does not insert labels incorrectly. We evaluate both approaches by using them to generate error-recovering parsers for four programming languages: Titan, C, Pascal and Java. In our evaluation, the parsers produced using the Standard approach, after manual intervention to remove the incorrectly added labels, gave an acceptable recovery for at least 70% of the files in each language. In turn, the acceptable recovery rate of the parsers produced via the Unique approach, without the need for manual intervention, ranged from 41% to 76%. |
publishDate |
2020 |
dc.date.accessioned.fl_str_mv |
2020-12-07T18:39:54Z |
dc.date.available.fl_str_mv |
2020-12-07T18:39:54Z |
dc.date.issued.fl_str_mv |
2020-02-15 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
MEDEIROS, Sérgio Queiroz de; ALVEZ JUNIOR, Gilney de Azevedo; MASCARENHAS, Fabio. Automatic syntax error reporting and recovery in parsing expression grammars. Science of Computer Programming, [S.L.], v. 187, p. 102373-102373, Feb. 2020. Available at: https://www.sciencedirect.com/science/article/abs/pii/S0167642319301662?via%3Dihub. Accessed: 6 Oct. 2020. http://dx.doi.org/10.1016/j.scico.2019.102373. |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufrn.br/handle/123456789/30867 |
dc.identifier.issn.none.fl_str_mv |
0167-6423 |
dc.identifier.doi.none.fl_str_mv |
10.1016/j.scico.2019.102373 |
identifier_str_mv |
MEDEIROS, Sérgio Queiroz de; ALVEZ JUNIOR, Gilney de Azevedo; MASCARENHAS, Fabio. Automatic syntax error reporting and recovery in parsing expression grammars. Science of Computer Programming, [S.L.], v. 187, p. 102373-102373, Feb. 2020. Available at: https://www.sciencedirect.com/science/article/abs/pii/S0167642319301662?via%3Dihub. Accessed: 6 Oct. 2020. http://dx.doi.org/10.1016/j.scico.2019.102373; 0167-6423; 10.1016/j.scico.2019.102373 |
url |
https://repositorio.ufrn.br/handle/123456789/30867 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
Attribution 3.0 Brazil; http://creativecommons.org/licenses/by/3.0/br/; info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Attribution 3.0 Brazil; http://creativecommons.org/licenses/by/3.0/br/ |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Elsevier |
publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFRN; instname:Universidade Federal do Rio Grande do Norte (UFRN); instacron:UFRN |
instname_str |
Universidade Federal do Rio Grande do Norte (UFRN) |
instacron_str |
UFRN |
institution |
UFRN |
reponame_str |
Repositório Institucional da UFRN |
collection |
Repositório Institucional da UFRN |
bitstream.url.fl_str_mv |
https://repositorio.ufrn.br/bitstream/123456789/30867/2/license_rdf; https://repositorio.ufrn.br/bitstream/123456789/30867/3/license.txt; https://repositorio.ufrn.br/bitstream/123456789/30867/1/AutomaticSyntaxError_MEDEIROS_2019.pdf; https://repositorio.ufrn.br/bitstream/123456789/30867/4/AutomaticSyntaxError_MEDEIROS_2019.pdf.txt; https://repositorio.ufrn.br/bitstream/123456789/30867/5/AutomaticSyntaxError_MEDEIROS_2019.pdf.jpg |
bitstream.checksum.fl_str_mv |
4d2950bda3d176f570a9f8b328dfbbef; e9597aa2854d128fd968be5edc8a28d9; c50d39058589f204964cbadd72fd9edf; 9274ebf19335bc8df9b6bff0f2df02e5; 39aab08363a571c2d7bb7a3beb2e5253 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5; MD5; MD5; MD5; MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFRN - Universidade Federal do Rio Grande do Norte (UFRN) |
repository.mail.fl_str_mv |
|
_version_ |
1797777124936908800 |