Photo2Video: Semantic-Aware Deep Learning-Based Video Generation from Still Content

Viana, Paula; Andrade, Maria Teresa; Carvalho, Pedro; Vilaça, Luis; Teixeira, Inês N.; Costa, Tiago; Jonker, Pieter

Bibliographic details
Main author:	Viana, Paula
Publication date:	2022
Other authors:	Andrade, Maria Teresa; Carvalho, Pedro; Vilaça, Luis; Teixeira, Inês N.; Costa, Tiago; Jonker, Pieter
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10400.22/21682
Abstract:	Applying machine learning (ML), and especially deep learning, to understand visual content is becoming common practice in many application areas. However, little attention has been given to its use within the multimedia creative domain. It is true that ML is already popular for content creation, but the progress achieved so far essentially addresses textual content or the identification and selection of specific types of content. A wealth of possibilities has yet to be explored by bringing ML into the multimedia creative process, allowing the knowledge inferred by ML to automatically influence how new multimedia content is created. The work presented in this article contributes towards this goal in three distinct ways: firstly, it proposes a methodology to re-train popular neural network models to identify new thematic concepts in static visual content and to attach meaningful annotations to the detected regions of interest; secondly, it presents varied visual digital effects and corresponding tools that can be automatically called upon to apply such effects to a previously analyzed photo; thirdly, it defines a complete automated creative workflow, from the acquisition of a photograph and corresponding contextual data, through the ML region-based annotation, to the automatic application of digital effects and generation of a semantically aware multimedia story driven by the previously derived situational and visual contextual data. Additionally, it presents a variant of this automated workflow that lets the user manipulate the automatic annotations in an assisted manner. The final aim is to transform a static digital photo into a short video clip, taking into account the information acquired. The final result strongly contrasts with current standard approaches of creating random movements, by producing an intelligent content- and context-aware video.

Item metadata

id	RCAP_ae79ea5e1d12a9271872c2e6e5c46f51
oai_identifier_str	oai:recipp.ipp.pt:10400.22/21682
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
spelling	The work presented in this paper has been supported by the European Commission under contract number H2020-ICT-20-2017-1-RIA-780612 and by National Funds through the Portuguese funding agency, FCT (Fundação para a Ciência e a Tecnologia), within project LA/P/0063/2020.
dc.title.none.fl_str_mv	Photo2Video: Semantic-Aware Deep Learning-Based Video Generation from Still Content
author	Viana, Paula
author_role	author
author2	Andrade, Maria Teresa; Carvalho, Pedro; Vilaça, Luis; Teixeira, Inês N.; Costa, Tiago; Jonker, Pieter
author2_role	author author author author author author
dc.contributor.none.fl_str_mv	Repositório Científico do Instituto Politécnico do Porto
dc.subject.por.fl_str_mv	Deep Learning; Storytelling; Automated content creation; Semantic awareness; Context awareness; RoI
publishDate	2022
dc.date.none.fl_str_mv	2022-03-10 2022-03-10T00:00:00Z 2023-01-19T12:23:44Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10400.22/21682
dc.language.iso.fl_str_mv	eng
dc.relation.none.fl_str_mv	10.3390/jimaging8030068
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	MDPI
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
