Revisiting Centrality-as-Relevance: Support Sets and Similarity as Geometric Proximity
Main author: | Ribeiro, R. |
---|---|
Publication date: | 2011 |
Other authors: | de Matos, D. M. |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://jair.org/media/3387/live-3387-5920-jair.pdf https://ciencia.iscte-iul.pt/public/pub/id/6669 http://hdl.handle.net/10071/6933 |
Abstract: | In automatic summarization, centrality-as-relevance means that the most important content of an information source, or a collection of information sources, corresponds to the most central passages, considering a representation where such a notion makes sense (graph, spatial, etc.). We assess the main paradigms, and introduce a new centrality-based relevance model for automatic summarization that relies on the use of support sets to better estimate the relevant content. Geometric proximity is used to compute semantic relatedness. Centrality (relevance) is determined by considering the whole input source (and not only local information), and by taking into account the existence of minor topics or lateral subjects in the information sources to be summarized. The method consists of creating, for each passage of the input source, a support set consisting only of the most semantically related passages. The most relevant content is then determined by selecting the passages that occur in the largest number of support sets. This model produces extractive summaries that are generic, and language- and domain-independent. Thorough automatic evaluation shows that the method achieves state-of-the-art performance in both written text and automatically transcribed speech summarization, including when compared to considerably more complex approaches. |
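
The procedure outlined in the abstract (build a support set for each passage, then rank passages by how many support sets they appear in) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words representation, the cosine distance, the fixed support-set size `k`, and all function names below are assumptions made for illustration. The record itself does not say which passage representation, proximity measure, or support-set selection rule the paper uses.

```python
import math
from collections import Counter


def bag_of_words(passage):
    """Lowercased term-frequency vector (an assumed, simplistic representation)."""
    return Counter(passage.lower().split())


def cosine_distance(u, v):
    """Cosine distance between two term-frequency vectors (assumed metric)."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    norm = norm_u * norm_v
    return 1.0 - (dot / norm if norm else 0.0)


def support_sets(passages, k):
    """For each passage, the indices of its k closest other passages.

    A fixed k is an assumption; ties are broken by passage index.
    """
    vectors = [bag_of_words(p) for p in passages]
    sets = []
    for i, u in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        others.sort(key=lambda j: (cosine_distance(u, vectors[j]), j))
        sets.append(set(others[:k]))
    return sets


def rank_by_support(passages, k=2):
    """Passage indices ordered by the number of support sets containing them."""
    counts = Counter()
    for s in support_sets(passages, k):
        counts.update(s)
    return sorted(range(len(passages)), key=lambda i: (-counts[i], i))


def summarize(passages, size=1, k=2):
    """Extract the `size` top-ranked passages, restored to document order."""
    chosen = sorted(rank_by_support(passages, k)[:size])
    return [passages[i] for i in chosen]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat sat on the sofa",
        "the cat likes the mat",
        "stock prices fell sharply today",
    ]
    print(summarize(docs, size=1, k=1))
```

With the four toy passages above and `k=1`, the first passage is the nearest neighbour of the second and third, so it appears in the most support sets and is the one extracted. The unrelated fourth passage (a "lateral subject") appears in no support set and does not influence the ranking of the main topic.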
id | RCAP_28b6baa92d8e17d1b49a4a0263ef2153 |
---|---|
oai_identifier_str | oai:repositorio.iscte-iul.pt:10071/6933 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str | 7160 |
dc.title.none.fl_str_mv | Revisiting Centrality-as-Relevance: Support Sets and Similarity as Geometric Proximity |
title | Revisiting Centrality-as-Relevance: Support Sets and Similarity as Geometric Proximity |
author | Ribeiro, R. |
author_facet | Ribeiro, R.; de Matos, D. M. |
author_role | author |
author2 | de Matos, D. M. |
author2_role | author |
dc.contributor.author.fl_str_mv | Ribeiro, R.; de Matos, D. M. |
publishDate | 2011 |
dc.date.none.fl_str_mv | 2011-01-01T00:00:00Z 2011 2014-04-14T13:47:54Z 2014-04-14T13:44:57Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://jair.org/media/3387/live-3387-5920-jair.pdf https://ciencia.iscte-iul.pt/public/pub/id/6669 http://hdl.handle.net/10071/6933 |
url | http://jair.org/media/3387/live-3387-5920-jair.pdf https://ciencia.iscte-iul.pt/public/pub/id/6669 http://hdl.handle.net/10071/6933 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | 1076-9757 |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/embargoedAccess |
eu_rights_str_mv | embargoedAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.publisher.none.fl_str_mv | AI Access Foundation |
publisher.none.fl_str_mv | AI Access Foundation |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799134716286205952 |