Human action recognition in videos based on spatiotemporal features and bag-of-poses
Main author: | Varges da Silva, Murilo |
---|---|
Publication date: | 2020 |
Other authors: | Nilceu Marana, Aparecido [UNESP] |
Document type: | Article |
Language: | eng |
Source: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1016/j.asoc.2020.106513 http://hdl.handle.net/11449/199093 |
Abstract: | Currently, many methods use 2D poses to represent and recognize human actions in videos. Most of them extract features (e.g., angles and trajectories) computed from the straight line segments that form the body parts of a raw 2D pose model. In our work, we propose a new way of representing 2D poses: instead of using the straight line segments directly, the 2D pose is first converted to a parameter space in which each segment is mapped to a point. Spatiotemporal features are then extracted from the parameter space, encoded with a Bag-of-Poses approach, and used for human action recognition in video. Experiments on two well-known public datasets, Weizmann and KTH, showed that the proposed method, using 2D poses encoded in parameter space, improves recognition rates and obtains accuracy competitive with state-of-the-art methods. |
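The pipeline described in the abstract (body-part segments → parameter-space points → Bag-of-Poses histogram) can be sketched as follows. The record does not specify the parameterization, so this is a minimal sketch assuming the standard normal-form (ρ, θ) line parameterization (as in the Hough transform) and a k-means-style codebook; all function names, joints, and codebook values below are hypothetical illustrations, not the authors' implementation.

```python
import math

def segment_to_parameter_point(p1, p2):
    """Map a 2D pose segment (body part) to a point (rho, theta) in
    parameter space, using the normal form of the line through the
    segment's endpoints: rho = x*cos(theta) + y*sin(theta)."""
    (x1, y1), (x2, y2) = p1, p2
    # theta is the angle of the line's normal (segment direction + 90 deg).
    theta = math.atan2(y2 - y1, x2 - x1) + math.pi / 2.0
    rho = x1 * math.cos(theta) + y1 * math.sin(theta)
    return rho, theta

def bag_of_poses_histogram(points, codebook):
    """Quantize parameter-space points against a learned codebook
    (e.g., k-means centroids) into a normalized histogram."""
    hist = [0] * len(codebook)
    for rho, theta in points:
        nearest = min(
            range(len(codebook)),
            key=lambda i: (rho - codebook[i][0]) ** 2 + (theta - codebook[i][1]) ** 2,
        )
        hist[nearest] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]

# A toy 2D pose: joints as (x, y) coordinates, segments as joint-index pairs.
joints = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
segments = [(0, 1), (1, 2)]  # e.g., torso and upper arm

# Each segment becomes one point; a whole pose becomes a small point set
# that can be tracked over frames and quantized into a pose histogram.
pose_points = [segment_to_parameter_point(joints[a], joints[b]) for a, b in segments]

# A (hypothetical) 2-word codebook; in practice it would be learned with k-means.
codebook = [(0.0, math.pi), (1.0, math.pi / 2)]
histogram = bag_of_poses_histogram(pose_points, codebook)
```

A sequence of such histograms (one per frame or per temporal window) would then feed a classifier; the spatiotemporal features the abstract mentions would be derived from how the parameter-space points move across frames.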
id |
UNSP_24b0356090064277307ed78dce90f65c |
oai_identifier_str |
oai:repositorio.unesp.br:11449/199093 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
Title: Human action recognition in videos based on spatiotemporal features and bag-of-poses
Keywords: Bag-of-poses; Human action recognition; Spatiotemporal features; Surveillance systems; Video sequences
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Affiliations: Department of Computing, UFSCar - Federal University of São Carlos, Rod. Washington Luís, Km 235; Department of Computing, IFSP - Federal Institute of Education Science and Technology of São Paulo, Rua Pedro Cavalo, 709; Department of Computing, Faculty of Sciences, UNESP - São Paulo State University, Av. Eng. Luiz Edmundo Carrijo Coube, 14-01
Authors: Varges da Silva, Murilo; Nilceu Marana, Aparecido [UNESP]
Dates: deposited 2020-12-12T01:30:32Z; published 2020-10-01
Source: Applied Soft Computing Journal, v. 95. ISSN 1568-4946. DOI: 10.1016/j.asoc.2020.106513. Scopus: 2-s2.0-85087755333. http://hdl.handle.net/11449/199093
Rights: info:eu-repo/semantics/openAccess |
dc.title.none.fl_str_mv |
Human action recognition in videos based on spatiotemporal features and bag-of-poses |
author |
Varges da Silva, Murilo |
author_role |
author |
author2 |
Nilceu Marana, Aparecido [UNESP] |
author2_role |
author |
dc.contributor.none.fl_str_mv |
Universidade Federal de São Carlos (UFSCar); Science and Technology of São Paulo; Universidade Estadual Paulista (Unesp)
dc.subject.por.fl_str_mv |
Bag-of-poses; Human action recognition; Spatiotemporal features; Surveillance systems; Video sequences
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-12-12T01:30:32Z 2020-12-12T01:30:32Z 2020-10-01 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1016/j.asoc.2020.106513
Applied Soft Computing Journal, v. 95.
1568-4946
http://hdl.handle.net/11449/199093
10.1016/j.asoc.2020.106513
2-s2.0-85087755333
url |
http://dx.doi.org/10.1016/j.asoc.2020.106513 http://hdl.handle.net/11449/199093 |
identifier_str_mv |
Applied Soft Computing Journal, v. 95.
1568-4946
10.1016/j.asoc.2020.106513
2-s2.0-85087755333
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Applied Soft Computing Journal |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
|
_version_ |
1808128707758391296 |