Compilation and analysis of textual metrics of an essay's corpus

Soares Vital, Átila Augusto

Bibliographic details
Main author:	Soares Vital, Átila Augusto
Publication date:	2023
Document type:	Article
Language:	Portuguese (por)
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text:	https://doi.org/10.21814/lm.15.1.393
Abstract:	The writing test of the National High School Exam (Enem) is decisive for securing a place in undergraduate institutions in Brazil. From 2010 to 2020, the number of essays awarded the maximum grade (one thousand points) dropped sharply: 3,694 essays received 1,000 points in 2011, against only 28 in 2020. The objective of this research is to present a corpus of essays graded one thousand points by Enem's examiners, to describe them, and to offer brief considerations based on the analysis of textual metrics over the historical series from 2010 to 2020. The corpus was compiled manually from the internet. The descriptions used Orange: Data Mining and the NILC-Metrix textual complexity analyzer. The results suggest a substantial increase in the number of words and a decrease in the type/token ratio over the period. In addition, syntactic measurements found an increase in textual complexity.
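
The type/token ratio mentioned in the abstract is the number of distinct word forms (types) divided by the total number of words (tokens). The following is a minimal sketch of that computation, not the procedure used by NILC-Metrix or Orange: the function names `tokenize` and `type_token_ratio` and the regex-based tokenization are assumptions made for illustration, and a real analyzer may tokenize differently.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and return its word tokens.

    Assumption: a token is a run of letters (accented letters included),
    optionally joined by hyphens, as in "descrevê-las".
    """
    return re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())


def type_token_ratio(text: str) -> float:
    """Return distinct tokens (types) divided by total tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0  # convention chosen here for empty input
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    sample = "A prova de redação é a prova decisiva."
    print(len(tokenize(sample)), type_token_ratio(sample))
```

Because longer texts tend to repeat more words, this ratio generally falls as text length grows, so a rising word count and a falling type/token ratio can appear together.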

Item metadata

id	RCAP_c220a78c7b532d77422390c1615acfae
oai_identifier_str	oai:linguamatica.com:article/393
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str	7160
dc.title.none.fl_str_mv	Compilation and analysis of textual metrics of an essay's corpus; A compilação e a análise de métricas textuais de um corpus de redações
title	Compilation and analysis of textual metrics of an essay's corpus
author	Soares Vital, Átila Augusto
author_facet	Soares Vital, Átila Augusto
author_role	author
dc.contributor.author.fl_str_mv	Soares Vital, Átila Augusto
dc.subject.por.fl_str_mv	Essays (Redações); Corpus linguistics (Linguística de corpus); Textual complexity (Complexidade textual)
topic	Essays (Redações); Corpus linguistics (Linguística de corpus); Textual complexity (Complexidade textual)
description	The writing test of the National High School Exam (Enem) is decisive for securing a place in undergraduate institutions in Brazil. From 2010 to 2020, the number of essays awarded the maximum grade (one thousand points) dropped sharply: 3,694 essays received 1,000 points in 2011, against only 28 in 2020. The objective of this research is to present a corpus of essays graded one thousand points by Enem's examiners, to describe them, and to offer brief considerations based on the analysis of textual metrics over the historical series from 2010 to 2020. The corpus was compiled manually from the internet. The descriptions used Orange: Data Mining and the NILC-Metrix textual complexity analyzer. The results suggest a substantial increase in the number of words and a decrease in the type/token ratio over the period. In addition, syntactic measurements found an increase in textual complexity.
publishDate	2023
dc.date.none.fl_str_mv	2023-07-08
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://doi.org/10.21814/lm.15.1.393
url	https://doi.org/10.21814/lm.15.1.393
dc.language.iso.fl_str_mv	por
language	por
dc.relation.none.fl_str_mv	https://linguamatica.com/index.php/linguamatica/article/view/393 https://linguamatica.com/index.php/linguamatica/article/view/393/497
dc.rights.driver.fl_str_mv	Copyright (c) 2023 Átila Augusto Soares Vital; http://creativecommons.org/licenses/by/4.0; info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Copyright (c) 2023 Átila Augusto Soares Vital; http://creativecommons.org/licenses/by/4.0
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Universidade do Minho e Universidade de Vigo
publisher.none.fl_str_mv	Universidade do Minho e Universidade de Vigo
dc.source.none.fl_str_mv	Linguamática; Vol. 15 No. 1; 131-140; 1647-0818; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799133554110627840
