Summarization of films and documentaries based on subtitles and scripts
Main author: | Aparício, M. |
---|---|
Publication date: | 2016 |
Other authors: | Figueiredo, P.; Raposo, F.; de Matos, D.; Ribeiro, R.; Marujo, L. |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10071/11020 |
Abstract: | We assess the performance of generic text summarization algorithms applied to films and documentaries, using extracts from news articles produced by reference models of extractive summarization. We use three datasets: (i) news articles, (ii) film scripts and subtitles, and (iii) documentary subtitles. Standard ROUGE metrics are used for comparing generated summaries against news abstracts, plot summaries, and synopses. We show that the best performing algorithms are LSA, for news articles and documentaries, and LexRank and Support Sets, for films. Despite the different nature of films and documentaries, their relative behavior is in accordance with that obtained for news articles. |
id | RCAP_4f1e75cfd1b083bea5cf0e11004496b3 |
---|---|
oai_identifier_str | oai:repositorio.iscte-iul.pt:10071/11020 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str | 7160 |
dc.title.none.fl_str_mv | Summarization of films and documentaries based on subtitles and scripts |
title | Summarization of films and documentaries based on subtitles and scripts |
spellingShingle | Summarization of films and documentaries based on subtitles and scripts; Aparício, M.; Automatic text summarization; Generic summarization; Summarization of films; Summarization of documentaries |
title_short | Summarization of films and documentaries based on subtitles and scripts |
title_full | Summarization of films and documentaries based on subtitles and scripts |
title_fullStr | Summarization of films and documentaries based on subtitles and scripts |
title_full_unstemmed | Summarization of films and documentaries based on subtitles and scripts |
title_sort | Summarization of films and documentaries based on subtitles and scripts |
author | Aparício, M. |
author_facet | Aparício, M.; Figueiredo, P.; Raposo, F.; de Matos, D.; Ribeiro, R.; Marujo, L. |
author_role | author |
author2 | Figueiredo, P.; Raposo, F.; de Matos, D.; Ribeiro, R.; Marujo, L. |
author2_role | author; author; author; author; author |
dc.contributor.author.fl_str_mv | Aparício, M.; Figueiredo, P.; Raposo, F.; de Matos, D.; Ribeiro, R.; Marujo, L. |
dc.subject.por.fl_str_mv | Automatic text summarization; Generic summarization; Summarization of films; Summarization of documentaries |
topic | Automatic text summarization; Generic summarization; Summarization of films; Summarization of documentaries |
description | We assess the performance of generic text summarization algorithms applied to films and documentaries, using extracts from news articles produced by reference models of extractive summarization. We use three datasets: (i) news articles, (ii) film scripts and subtitles, and (iii) documentary subtitles. Standard ROUGE metrics are used for comparing generated summaries against news abstracts, plot summaries, and synopses. We show that the best performing algorithms are LSA, for news articles and documentaries, and LexRank and Support Sets, for films. Despite the different nature of films and documentaries, their relative behavior is in accordance with that obtained for news articles. |
publishDate | 2016 |
dc.date.none.fl_str_mv | 2016-03-04T14:34:46Z; 2016-01-01T00:00:00Z; 2016; 2019-04-09T09:26:45Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10071/11020 |
url | http://hdl.handle.net/10071/11020 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | 0167-8655; 10.1016/j.patrec.2015.12.016 |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.publisher.none.fl_str_mv | Elsevier Science BV |
publisher.none.fl_str_mv | Elsevier Science BV |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799134695637647360 |