A Comprehensive Investigation on Image Caption Generation using Deep Neural Networks
Main author: | Naik, Haraprasad |
---|---|
Publication date: | 2022 |
Other authors: | Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita |
Document type: | Article |
Language: | eng |
Source title: | INFOCOMP: Jornal de Ciência da Computação |
Full text: | https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495 |
Abstract: | Caption generation from an image is a task closely related to object detection and natural language processing. Object detection has evolved tremendously over the past decade. The traditional approach to object detection is a three-step process: (1) region selection, (2) feature extraction, and (3) classification. Current research instead uses neural networks to replace hand-crafted feature extraction and the subsequent application of classification algorithms such as SVM, AdaBoost, and the Deformable Part Model (DPM). Similarly, a caption can be generated with the help of a neural network, specifically a Recurrent Neural Network (RNN). Most caption generators use a variant of the RNN, the Long Short-Term Memory (LSTM) network. Many researchers adopt either sampling or beam search to generate a valid sentence or caption. In this paper, our aim is to capture the fundamental ideas behind the currently available image caption generators and to compare their architectural designs with their performance. |
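
The abstract mentions beam search as one way to decode a caption. The following is a minimal sketch of that idea, not the method of any system surveyed in the paper. `TOY_MODEL` is a hypothetical hand-written table of next-token probabilities standing in for an LSTM conditioned on image features, and `beam_search`, `beam_width`, and `max_len` are illustrative names.

```python
import math

# Hypothetical toy model: previous token -> distribution over next tokens.
# A real captioner would compute these probabilities with an LSTM that is
# conditioned on image features.
TOY_MODEL = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.3, "</s>": 0.2},
    "the": {"dog": 0.9, "</s>": 0.1},
    "dog": {"runs": 0.4, "</s>": 0.6},
    "cat": {"</s>": 1.0},
    "runs": {"</s>": 1.0},
}


def beam_search(model, beam_width=2, max_len=5):
    """Return the best-scoring token sequence found by beam search."""
    beams = [(0.0, ["<s>"])]  # each entry: (log-probability, tokens so far)
    finished = []             # hypotheses that already emitted "</s>"
    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            for tok, prob in model[tokens[-1]].items():
                cand = (score + math.log(prob), tokens + [tok])
                if tok == "</s>":
                    finished.append(cand)
                else:
                    candidates.append(cand)
        if not candidates:
            break
        # Keep only the beam_width most probable unfinished hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    # Prefer complete captions; fall back to partial ones if none finished.
    pool = finished if finished else beams
    return max(pool, key=lambda c: c[0])[1]


greedy = beam_search(TOY_MODEL, beam_width=1)
wider = beam_search(TOY_MODEL, beam_width=2)
print(" ".join(greedy))
print(" ".join(wider))
```

With `beam_width=1` the search is greedy and commits to the locally most likely first word. With `beam_width=2` it keeps a second hypothesis alive, which in this toy table leads to a caption with a higher overall probability.
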
id |
UFLA-5_bf4b0ed202557b2c3405b09eb02244c0 |
---|---|
oai_identifier_str |
oai:infocomp.dcc.ufla.br:article/1495 |
network_acronym_str |
UFLA-5 |
network_name_str |
INFOCOMP: Jornal de Ciência da Computação |
repository_id_str |
|
dc.title.none.fl_str_mv |
A Comprehensive Investigation on Image Caption Generation using Deep Neural Networks |
author |
Naik, Haraprasad |
author_facet |
Naik, Haraprasad; Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita |
author_role |
author |
author2 |
Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
Naik, Haraprasad; Tripathy, Saswati; Laxmi Priya; Sahu, Prajna Paramita |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-06-01 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495 |
url |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1495/581 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2022 Haraprasad Naik, Saswati Tripathy, Laxmi Priya, Prajna Paramita Sahu info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Copyright (c) 2022 Haraprasad Naik, Saswati Tripathy, Laxmi Priya, Prajna Paramita Sahu |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Editora da UFLA |
publisher.none.fl_str_mv |
Editora da UFLA |
dc.source.none.fl_str_mv |
INFOCOMP Journal of Computer Science; Vol. 21 No. 1 (2022): June 2022 1982-3363 1807-4545 reponame:INFOCOMP: Jornal de Ciência da Computação instname:Universidade Federal de Lavras (UFLA) instacron:UFLA |
instname_str |
Universidade Federal de Lavras (UFLA) |
instacron_str |
UFLA |
institution |
UFLA |
reponame_str |
INFOCOMP: Jornal de Ciência da Computação |
collection |
INFOCOMP: Jornal de Ciência da Computação |
repository.name.fl_str_mv |
INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA) |
repository.mail.fl_str_mv |
infocomp@dcc.ufla.br||apfreire@dcc.ufla.br |
_version_ |
1799874742668230656 |