Human action recognition in videos based on spatiotemporal features and bag-of-poses

Bibliographic details
Main author: Varges da Silva, Murilo
Publication date: 2020
Other authors: Nilceu Marana, Aparecido [UNESP]
Document type: Article
Language: English
Source: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1016/j.asoc.2020.106513
http://hdl.handle.net/11449/199093
Abstract: Currently, many methods use 2D poses to represent and recognize human actions in videos. Most of these methods extract features (e.g., angles and trajectories) from raw 2D poses, based on the straight line segments that form the body parts of a 2D pose model. In this work, we propose a new way of representing 2D poses. Instead of using the straight line segments directly, the 2D pose is first converted to a parameter space in which each segment is mapped to a point. Spatiotemporal features are then extracted from the parameter space, encoded with a Bag-of-Poses approach, and used for human action recognition in video. Experiments on two well-known public datasets, Weizmann and KTH, showed that the proposed method, using 2D poses encoded in parameter space, improves recognition rates and achieves accuracy competitive with state-of-the-art methods.
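The abstract does not specify the exact parameterization, but the description (each straight line segment mapped to a single point in a parameter space) matches the classical Hough-style (rho, theta) line parameterization. The sketch below, under that assumption, maps the segments of a 2D pose to parameter-space points; the function names are hypothetical and not taken from the paper.

```python
import math

def segment_to_parameter_point(x1, y1, x2, y2):
    """Map a body-part line segment to a point (rho, theta) in parameter space.

    Assumed Hough-style parameterization: theta is the direction of the
    segment's normal (folded into [0, pi)), and rho is the signed distance
    of the segment's supporting line from the image origin.
    """
    theta = (math.atan2(y2 - y1, x2 - x1) + math.pi / 2) % math.pi
    rho = x1 * math.cos(theta) + y1 * math.sin(theta)
    return rho, theta

def pose_to_points(segments):
    """A 2D pose, given as a list of (x1, y1, x2, y2) segments, becomes a
    set of parameter-space points, one point per body part."""
    return [segment_to_parameter_point(*seg) for seg in segments]

# Example: a vertical "torso" segment and a horizontal "arm" segment.
pose = [(5.0, 0.0, 5.0, 4.0), (5.0, 3.0, 8.0, 3.0)]
points = pose_to_points(pose)
# The vertical line x = 5 maps to (rho = 5, theta = 0); the horizontal
# line y = 3 maps to (rho = 3, theta = pi / 2).
```

In a full pipeline, these per-frame parameter-space points would be tracked over time to form spatiotemporal features and then quantized against a learned codebook (e.g., via k-means) to build the Bag-of-Poses histograms the abstract describes; those stages are omitted here.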
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Affiliations:
Department of Computing, UFSCar - Federal University of São Carlos, Rod. Washington Luís, Km 235
Department of Computing, IFSP - Federal Institute of Education, Science and Technology of São Paulo, Rua Pedro Cavalo, 709
Department of Computing, Faculty of Sciences, UNESP - São Paulo State University, Av. Eng. Luiz Edmundo Carrijo Coube, 14-01
Published in: Applied Soft Computing Journal, v. 95, 2020-10-01. ISSN: 1568-4946. DOI: 10.1016/j.asoc.2020.106513. Scopus: 2-s2.0-85087755333. Open access.
Keywords: Bag-of-poses; Human action recognition; Spatiotemporal features; Surveillance systems; Video sequences
Published: 2020-10-01
Deposited in repository: 2020-12-12
Status: published version (article)