A Comprehensive Investigation on Image Caption Generation using Deep Neural Networks

Naik, Haraprasad; Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita

Bibliographic details
Main author:	Naik, Haraprasad
Publication date:	2022
Other authors:	Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita
Document type:	Article
Language:	eng
Source title:	INFOCOMP: Jornal de Ciência da Computação
Full text:	https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495
Abstract:	Generating a caption from an image is a task closely related to object detection and natural language processing. Object detection has evolved tremendously over the past decade. The traditional approach to object detection is a three-step process: (1) region selection, (2) feature extraction, and (3) classification. In current research, neural networks are used to overcome hand-crafted feature extraction together with the application of classification algorithms such as SVM, AdaBoost, and the Deformable Part Model (DPM). Similarly, a caption can be generated with the help of a neural network, specifically a recurrent neural network (RNN). Most caption generators use a variant of the RNN, the Long Short-Term Memory (LSTM) network. Many researchers adopt either sampling or beam search to generate a valid sentence or caption. In this paper, our aim is to capture the fundamental idea behind the currently available image caption generators and to compare their architectural designs with their performance.
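
The abstract names beam search as one of the two common decoding strategies. The following is a minimal sketch of that idea only, not code from the paper: the toy next-token table `TOY_MODEL`, the helper `next_token_probs`, and the function `beam_search` are all hypothetical names introduced for illustration, and a real caption generator would obtain the token probabilities from an LSTM conditioned on image features.

```python
import math

# Hypothetical toy "language model": maps the previous token to a
# probability distribution over the next token. This stands in for an
# LSTM decoder and is an assumption made purely for illustration.
TOY_MODEL = {
    "<start>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.3, "<end>": 0.2},
    "the": {"dog": 0.9, "<end>": 0.1},
    "dog": {"runs": 0.6, "<end>": 0.4},
    "cat": {"<end>": 1.0},
    "runs": {"<end>": 1.0},
}


def next_token_probs(prefix):
    """Return the next-token distribution given the tokens generated so far."""
    return TOY_MODEL[prefix[-1]]


def beam_search(beam_width=2, max_len=6):
    """Keep the `beam_width` best partial captions at every step."""
    beams = [(0.0, ["<start>"])]  # (log-probability, token sequence)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for score, seq in beams:
            for token, prob in next_token_probs(seq).items():
                candidates.append((score + math.log(prob), seq + [token]))
        # Prune to the highest-scoring `beam_width` hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            if seq[-1] == "<end>":
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break
    finished.extend(beams)  # keep hypotheses cut off at max_len
    return max(finished, key=lambda c: c[0])[1]


if __name__ == "__main__":
    print("beam width 1 (greedy):", " ".join(beam_search(beam_width=1)))
    print("beam width 2:", " ".join(beam_search(beam_width=2)))
```

With this toy table, a beam width of 1 (greedy decoding) commits to "a" first and ends with "a dog runs" at probability 0.18, whereas a beam width of 2 keeps "the" alive and finds "the dog runs" at probability 0.216. The sketch applies no length normalization, which practical captioning systems often add.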

Item metadata

id	UFLA-5_bf4b0ed202557b2c3405b09eb02244c0
oai_identifier_str	oai:infocomp.dcc.ufla.br:article/1495
network_acronym_str	UFLA-5
network_name_str	INFOCOMP: Jornal de Ciência da Computação
repository_id_str
author	Naik, Haraprasad
author_facet	Naik, Haraprasad; Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita
author_role	author
author2	Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita
author2_role	author; author; author
dc.contributor.author.fl_str_mv	Naik, Haraprasad; Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita
publishDate	2022
dc.date.none.fl_str_mv	2022-06-01
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495
url	https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495/581
dc.rights.driver.fl_str_mv	Copyright (c) 2022 Haraprasad Naik, Saswati Tripathy, Laxmi Priya, Prajna Paramita Sahu; info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Copyright (c) 2022 Haraprasad Naik, Saswati Tripathy, Laxmi Priya, Prajna Paramita Sahu
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Editora da UFLA
publisher.none.fl_str_mv	Editora da UFLA
dc.source.none.fl_str_mv	INFOCOMP Journal of Computer Science; Vol. 21 No. 1 (2022): June 2022; 1982-3363; 1807-4545; reponame:INFOCOMP: Jornal de Ciência da Computação; instname:Universidade Federal de Lavras (UFLA); instacron:UFLA
instname_str	Universidade Federal de Lavras (UFLA)
instacron_str	UFLA
institution	UFLA
reponame_str	INFOCOMP: Jornal de Ciência da Computação
collection	INFOCOMP: Jornal de Ciência da Computação
repository.name.fl_str_mv	INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA)
repository.mail.fl_str_mv	infocomp@dcc.ufla.br||apfreire@dcc.ufla.br
_version_	1799874742668230656
