Revisiting Centrality-as-Relevance: Support Sets and Similarity as Geometric Proximity

Bibliographic details
Main author: Ribeiro, R.
Publication date: 2011
Other authors: de Matos, D. M.
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://jair.org/media/3387/live-3387-5920-jair.pdf
https://ciencia.iscte-iul.pt/public/pub/id/6669
http://hdl.handle.net/10071/6933
Abstract: In automatic summarization, centrality-as-relevance means that the most important content of an information source, or a collection of information sources, corresponds to the most central passages, considering a representation where such a notion makes sense (graph, spatial, etc.). We assess the main paradigms and introduce a new centrality-based relevance model for automatic summarization that relies on support sets to better estimate the relevant content. Geometric proximity is used to compute semantic relatedness. Centrality (relevance) is determined by considering the whole input source (and not only local information), and by taking into account the existence of minor topics or lateral subjects in the information sources to be summarized. The method consists of creating, for each passage of the input source, a support set containing only the most semantically related passages. The most relevant content is then determined by selecting the passages that occur in the largest number of support sets. This model produces extractive summaries that are generic, and language- and domain-independent. Thorough automatic evaluation shows that the method achieves state-of-the-art performance in both written-text and automatically transcribed speech summarization, including when compared to considerably more complex approaches.
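The two steps named in the abstract (build a support set per passage, then rank passages by how many support sets they appear in) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the term-frequency representation, the Euclidean distance, the fixed support-set size `k`, the index-based tie-breaking, and all function names are assumptions introduced here.

```python
import math
from collections import Counter


def vectorize(passage, vocabulary):
    """Term-frequency vector for a passage (an assumed representation)."""
    counts = Counter(passage.lower().split())
    return [counts[term] for term in vocabulary]


def euclidean(u, v):
    """Geometric proximity; Euclidean distance is one assumed choice."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def support_sets(vectors, k):
    """For each passage, the indices of its k closest other passages.

    A fixed size k is an assumption of this sketch. Ties in distance
    are broken by the lower passage index.
    """
    sets = []
    for i, u in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        others.sort(key=lambda j: (euclidean(u, vectors[j]), j))
        sets.append(set(others[:k]))
    return sets


def rank_by_support(vectors, k):
    """Rank passage indices by the number of support sets containing them."""
    votes = Counter(j for s in support_sets(vectors, k) for j in s)
    return sorted(range(len(vectors)), key=lambda i: (-votes[i], i))


def rank_passages(passages, k=2):
    """Convenience wrapper: vectorize raw text passages, then rank them."""
    vocabulary = sorted({t for p in passages for t in p.lower().split()})
    vectors = [vectorize(p, vocabulary) for p in passages]
    return rank_by_support(vectors, k)


if __name__ == "__main__":
    # Hypothetical passages, used only to exercise the sketch.
    passages = [
        "the summit discussed climate policy",
        "climate policy dominated the summit agenda",
        "leaders at the summit agreed on climate targets",
        "a local bakery won a pastry award",
    ]
    for index in rank_passages(passages, k=2):
        print(index, passages[index])
```

An extractive summary would then take the top-ranked passages up to a length budget. Because a passage is counted only when it falls among the closest neighbours of other passages, an off-topic passage tends to collect few votes, which is the intuition behind handling minor or lateral topics.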
id RCAP_28b6baa92d8e17d1b49a4a0263ef2153
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/6933
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Revisiting Centrality-as-Relevance: Support Sets and Similarity as Geometric Proximity
title Revisiting Centrality-as-Relevance: Support Sets and Similarity as Geometric Proximity
author Ribeiro, R.
author_facet Ribeiro, R.
de Matos, D. M.
author_role author
author2 de Matos, D. M.
author2_role author
dc.contributor.author.fl_str_mv Ribeiro, R.
de Matos, D. M.
publishDate 2011
dc.date.none.fl_str_mv 2011-01-01T00:00:00Z
2011
2014-04-14T13:47:54Z
2014-04-14T13:44:57Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://jair.org/media/3387/live-3387-5920-jair.pdf
https://ciencia.iscte-iul.pt/public/pub/id/6669
http://hdl.handle.net/10071/6933
url http://jair.org/media/3387/live-3387-5920-jair.pdf
https://ciencia.iscte-iul.pt/public/pub/id/6669
http://hdl.handle.net/10071/6933
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 1076-9757
dc.rights.driver.fl_str_mv info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv embargoedAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv AI Access Foundation
publisher.none.fl_str_mv AI Access Foundation
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799134716286205952