A Comprehensive Investigation on Image Caption Generation using Deep Neural Networks

Bibliographic details
Main author: Naik, Haraprasad
Publication date: 2022
Other authors: Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita
Document type: Article
Language: eng
Source title: INFOCOMP: Jornal de Ciência da Computação
Full text: https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495
Abstract: Caption generation from an image is a task closely related to object detection and natural language processing. Object detection, however, has evolved tremendously over the past decade. The traditional approach to object detection involves a three-step process: (1) region selection, (2) feature extraction, and (3) classification. In current research, neural networks are used to overcome hand-crafted feature extraction along with the application of classification algorithms such as SVM, AdaBoost, and the Deformable Part Model (DPM). Similarly, a caption can be generated with the help of a neural network, specifically a recurrent neural network (RNN). Most caption generators use a variant of the RNN, the Long Short-Term Memory (LSTM) network. Many researchers adopt either sampling or beam search to generate a valid sentence or caption. In this paper, our aim is to capture the fundamental idea behind the presently available image caption generators and to compare their architectural design with their performance.
id UFLA-5_bf4b0ed202557b2c3405b09eb02244c0
oai_identifier_str oai:infocomp.dcc.ufla.br:article/1495
network_acronym_str UFLA-5
network_name_str INFOCOMP: Jornal de Ciência da Computação
repository_id_str
dc.title.none.fl_str_mv A Comprehensive Investigation on Image Caption Generation using Deep Neural Networks
author Naik, Haraprasad
author_facet Naik, Haraprasad
Tripathy, Saswati
Laxmi Priya
Sahu, Prajna Paramita
author_role author
author2 Tripathy, Saswati
Laxmi Priya
Sahu, Prajna Paramita
author2_role author
author
author
dc.contributor.author.fl_str_mv Naik, Haraprasad
Tripathy, Saswati
Laxmi Priya
Sahu, Prajna Paramita
publishDate 2022
dc.date.none.fl_str_mv 2022-06-01
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495
url https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495/581
dc.rights.driver.fl_str_mv Copyright (c) 2022 Haraprasad Naik, Saswati Tripathy, Laxmi Priya, Prajna Paramita Sahu
info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Editora da UFLA
publisher.none.fl_str_mv Editora da UFLA
dc.source.none.fl_str_mv INFOCOMP Journal of Computer Science; Vol. 21 No. 1 (2022): June 2022
1982-3363
1807-4545
reponame:INFOCOMP: Jornal de Ciência da Computação
instname:Universidade Federal de Lavras (UFLA)
instacron:UFLA
instname_str Universidade Federal de Lavras (UFLA)
instacron_str UFLA
institution UFLA
reponame_str INFOCOMP: Jornal de Ciência da Computação
collection INFOCOMP: Jornal de Ciência da Computação
repository.name.fl_str_mv INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA)
repository.mail.fl_str_mv infocomp@dcc.ufla.br||apfreire@dcc.ufla.br
_version_ 1799874742668230656