Evaluating a typology of signals for automatic detection of complementarity

Bibliographic details
Main author: Cruz Souza, Jackson Wilke da
Publication date: 2022
Other authors: Di Felippo, Ariani
Document type: Article
Language: eng
por
Source title: Domínios de Lingu@gem
Full text: https://seer.ufu.br/index.php/dominiosdelinguagem/article/view/63776
Abstract: In a cluster of news texts on the same event, two sentences from different documents might express different multi-document phenomena (redundancy, complementarity, and contradiction). Cross-Document Structure Theory (CST) provides labels to explicitly represent these phenomena. The automatic identification of the multi-document phenomena and their correspondent CST relations is definitely handy for Automatic Multi-Document Summarization since it helps computers understand text meaning. In this paper, we evaluated a typology of (textual) signals for the automatic detection of the CST relations of complementarity (i.e., Historical background, Follow-up and Elaboration) in a multi-document corpus of news texts in Brazilian Portuguese. Using algorithms from different machine-learning paradigms, we obtained classifiers that achieved high general accuracy (higher than 90%), indicating the potential of the signals.
id UFU-12_4305cd6a4c675a795d70aaa39fa039f8
oai_identifier_str oai:ojs.www.seer.ufu.br:article/63776
network_acronym_str UFU-12
network_name_str Domínios de Lingu@gem
repository_id_str
dc.title.none.fl_str_mv Evaluating a typology of signals for automatic detection of complementarity
Avaliação de uma tipologia de sinais para a detecção automática da complementaridade
author Cruz Souza, Jackson Wilke da
author_facet Cruz Souza, Jackson Wilke da
Di Felippo, Ariani
author_role author
author2 Di Felippo, Ariani
author2_role author
dc.contributor.author.fl_str_mv Cruz Souza, Jackson Wilke da
Di Felippo, Ariani
dc.subject.por.fl_str_mv Cross-Document Structure Theory
Sumarização automática
Complementaridade
Corpus multidocumento
Sinal textual
Cross-Document Structure Theory
Automatic summarization
Multi-document Corpus
Complementarity
Textual signal
publishDate 2022
dc.date.none.fl_str_mv 2022-09-12
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://seer.ufu.br/index.php/dominiosdelinguagem/article/view/63776
10.14393/DL52-v16n4a2022-10
dc.language.iso.fl_str_mv eng
por
dc.relation.none.fl_str_mv https://seer.ufu.br/index.php/dominiosdelinguagem/article/view/63776/33822
https://seer.ufu.br/index.php/dominiosdelinguagem/article/view/63776/35235
dc.rights.driver.fl_str_mv Copyright (c) 2022 Jackson Wilke da Cruz Souza, Ariani Di Felippo
http://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
text/xml
dc.publisher.none.fl_str_mv PP/UFU
dc.source.none.fl_str_mv Domínios de Lingu@gem; Vol. 16 No. 4 (2022): The computational treatment of Brazilian Portuguese; 1517-1543
Domínios de Lingu@gem; Vol. 16 Núm. 4 (2022): El tratamiento computacional del portugués brasileño; 1517-1543
Domínios de Lingu@gem; v. 16 n. 4 (2022): Tratamento Computacional do Português Brasileiro; 1517-1543
1980-5799
reponame:Domínios de Lingu@gem
instname:Universidade Federal de Uberlândia (UFU)
instacron:UFU
instname_str Universidade Federal de Uberlândia (UFU)
instacron_str UFU
institution UFU
reponame_str Domínios de Lingu@gem
collection Domínios de Lingu@gem
repository.name.fl_str_mv Domínios de Lingu@gem - Universidade Federal de Uberlândia (UFU)
repository.mail.fl_str_mv revistadominios@ileel.ufu.br||
_version_ 1797067717705990144