Human action recognition based on spatiotemporal features from videos

Silva, Murilo Varges da

Bibliographic details
Main author:	Silva, Murilo Varges da
Publication date:	2020
Document type:	Thesis
Language:	eng
Source title:	Repositório Institucional da UFSCAR
Full text:	https://repositorio.ufscar.br/handle/ufscar/13976
Abstract:	There is currently a high demand for new techniques for automatic pattern recognition in videos, such as the automatic recognition of human actions. This demand is driven by advances in technologies for producing, storing, transmitting and sharing videos, which have led to a huge volume of videos that must be processed automatically to be useful. The main applications include surveillance in public places, detection of falls of elderly people in their homes, automation in checkout-free stores, detection of pedestrian actions by self-driving cars, and detection of inappropriate content posted on the Internet, such as violence or pornography. Automatic recognition of actions in videos is a challenging task because good classification rates require both spatial information (for example, shapes found in a single frame of the video) and temporal information (for example, movement patterns found across the frames of the video). This thesis proposes new methods for the automatic recognition of human actions based on spatiotemporal features extracted from videos. Initially, different architectures of 3D Convolutional Neural Networks (CNNs) were evaluated in the context of detecting pornography in videos. Afterwards, new methods were proposed for the recognition of human actions based on spatiotemporal information extracted from 2D poses. The use of 2D poses proved to be a promising strategy, as it has a lower computational cost than techniques that use deep learning. In addition, using 2D poses instead of raw images can preserve the privacy of the people and places where the video cameras are installed. The proposed method achieved accuracy rates comparable to the state of the art on the public datasets on which the experiments were carried out.
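
As an illustration of the distinction the abstract draws between spatial and temporal information, the following is a minimal Python sketch. It is not the method proposed in the thesis. It assumes each frame has already been reduced to a 2D pose given as a list of (x, y) joint coordinates; the function names, the centroid-relative coordinates and the frame-to-frame displacements are all assumptions made only for this example.

```python
from typing import List, Tuple

# Hypothetical representation: one (x, y) pair per body joint in a frame.
Pose = List[Tuple[float, float]]


def spatial_features(pose: Pose) -> List[float]:
    """Single-frame shape: joint coordinates relative to the pose centroid."""
    cx = sum(x for x, _ in pose) / len(pose)
    cy = sum(y for _, y in pose) / len(pose)
    feats: List[float] = []
    for x, y in pose:
        feats.extend([x - cx, y - cy])
    return feats


def temporal_features(prev: Pose, curr: Pose) -> List[float]:
    """Movement: per-joint displacement between two consecutive frames."""
    feats: List[float] = []
    for (x0, y0), (x1, y1) in zip(prev, curr):
        feats.extend([x1 - x0, y1 - y0])
    return feats


def spatiotemporal_features(poses: List[Pose]) -> List[List[float]]:
    """One vector per frame transition: shape of the current frame, then motion."""
    return [
        spatial_features(curr) + temporal_features(prev, curr)
        for prev, curr in zip(poses, poses[1:])
    ]
```

A video with N pose frames yields N - 1 feature vectors, each twice as long as a single flattened pose; a downstream classifier (not shown) would consume these vectors.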

Item metadata

id	SCAR_bb51922e86cc8c56f90f8e14851e217c
oai_identifier_str	oai:repositorio.ufscar.br:ufscar/13976
network_acronym_str	SCAR
network_name_str	Repositório Institucional da UFSCAR
repository_id_str	4322
spelling	Funding: none received. Files: ORIGINAL: TeseMuriloVersaoFinal.pdf (final text of the thesis; application/pdf; size 12917804); PPGCC_DeclaracaoOrientadorCorrecoes.pdf (signed confirmation letter, advisor; application/pdf; size 91485). CC-LICENSE: license_rdf (application/rdf+xml; charset=utf-8; size 811). TEXT: TeseMuriloVersaoFinal.pdf.txt (extracted text; text/plain; size 157004); PPGCC_DeclaracaoOrientadorCorrecoes.pdf.txt (extracted text; text/plain; size 1507). THUMBNAIL: TeseMuriloVersaoFinal.pdf.jpg (IM Thumbnail; image/jpeg; size 7458); PPGCC_DeclaracaoOrientadorCorrecoes.pdf.jpg (IM Thumbnail; image/jpeg; size 14040). Record last updated: 2023-09-18 18:32:07.504. OAI request endpoint: https://repositorio.ufscar.br/oai/request. OpenDOAR: 4322.
dc.title.eng.fl_str_mv	Human action recognition based on spatiotemporal features from videos
dc.title.alternative.por.fl_str_mv	Reconhecimento de ações humanas baseado em características espaço-temporais de vídeos
title	Human action recognition based on spatiotemporal features from videos
spellingShingle	Human action recognition based on spatiotemporal features from videos Silva, Murilo Varges da Human Action Recognition 2D Poses Video Classification Reconhecimento de Ações Humanas Poses em 2D Classificação de Vídeo CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
title_short	Human action recognition based on spatiotemporal features from videos
title_full	Human action recognition based on spatiotemporal features from videos
title_fullStr	Human action recognition based on spatiotemporal features from videos
title_full_unstemmed	Human action recognition based on spatiotemporal features from videos
title_sort	Human action recognition based on spatiotemporal features from videos
author	Silva, Murilo Varges da
author_facet	Silva, Murilo Varges da
author_role	author
dc.contributor.authorlattes.por.fl_str_mv	http://lattes.cnpq.br/5501575917932892
dc.contributor.author.fl_str_mv	Silva, Murilo Varges da
dc.contributor.advisor1.fl_str_mv	Marana, Aparecido Nilceu
dc.contributor.advisor1Lattes.fl_str_mv	http://lattes.cnpq.br/6027713750942689
dc.contributor.authorID.fl_str_mv	6567eed7-1bdf-4992-906f-4ab55412fc61
contributor_str_mv	Marana, Aparecido Nilceu
dc.subject.eng.fl_str_mv	Human Action Recognition 2D Poses Video Classification
topic	Human Action Recognition 2D Poses Video Classification Reconhecimento de Ações Humanas Poses em 2D Classificação de Vídeo CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
dc.subject.por.fl_str_mv	Reconhecimento de Ações Humanas Poses em 2D Classificação de Vídeo
dc.subject.cnpq.fl_str_mv	CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
publishDate	2020
dc.date.issued.fl_str_mv	2020-12-22
dc.date.accessioned.fl_str_mv	2021-03-13T21:53:42Z
dc.date.available.fl_str_mv	2021-03-13T21:53:42Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
format	doctoralThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	SILVA, Murilo Varges da. Human action recognition based on spatiotemporal features from videos. 2020. Tese (Doutorado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2020. Disponível em: https://repositorio.ufscar.br/handle/ufscar/13976.
dc.identifier.uri.fl_str_mv	https://repositorio.ufscar.br/handle/ufscar/13976
identifier_str_mv	SILVA, Murilo Varges da. Human action recognition based on spatiotemporal features from videos. 2020. Tese (Doutorado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2020. Disponível em: https://repositorio.ufscar.br/handle/ufscar/13976.
url	https://repositorio.ufscar.br/handle/ufscar/13976
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.confidence.fl_str_mv	600
dc.relation.authority.fl_str_mv	7130220c-6ef2-41e9-bc45-cc368a9c6597
dc.rights.driver.fl_str_mv	Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Universidade Federal de São Carlos Câmpus São Carlos
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Ciência da Computação - PPGCC
dc.publisher.initials.fl_str_mv	UFSCar
publisher.none.fl_str_mv	Universidade Federal de São Carlos Câmpus São Carlos
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFSCAR instname:Universidade Federal de São Carlos (UFSCAR) instacron:UFSCAR
instname_str	Universidade Federal de São Carlos (UFSCAR)
instacron_str	UFSCAR
institution	UFSCAR
reponame_str	Repositório Institucional da UFSCAR
collection	Repositório Institucional da UFSCAR
bitstream.url.fl_str_mv	https://repositorio.ufscar.br/bitstream/ufscar/13976/1/TeseMuriloVersaoFinal.pdf https://repositorio.ufscar.br/bitstream/ufscar/13976/2/PPGCC_DeclaracaoOrientadorCorrecoes.pdf https://repositorio.ufscar.br/bitstream/ufscar/13976/3/license_rdf https://repositorio.ufscar.br/bitstream/ufscar/13976/4/TeseMuriloVersaoFinal.pdf.txt https://repositorio.ufscar.br/bitstream/ufscar/13976/6/PPGCC_DeclaracaoOrientadorCorrecoes.pdf.txt https://repositorio.ufscar.br/bitstream/ufscar/13976/5/TeseMuriloVersaoFinal.pdf.jpg https://repositorio.ufscar.br/bitstream/ufscar/13976/7/PPGCC_DeclaracaoOrientadorCorrecoes.pdf.jpg
bitstream.checksum.fl_str_mv	5a4e6021c3e86eb3161c903cd239da3a 7f77c9a4d1b6aeb331c2a009dcd4d4e6 e39d27027a6cc9cb039ad269a5db8e34 8d509e54bd6e50badbb35a4906570dec e870168d50b8f16e64bc7b57d2caf44f 14428a14e96d179f688883289bd04f3d 64c6c72619d70ecc75f60dc4f0fa144f
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)
repository.mail.fl_str_mv
_version_	1802136386003795968
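
The record lists an MD5 checksum for each bitstream, in the same order as the bitstream URLs. A downloaded file could be checked against its listed value with a sketch like the one below, which uses Python's standard `hashlib` module. The helper names and the local file path are assumptions made for this example and are not part of the repository record.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, expected_hex: str) -> bool:
    """Compare a file's MD5 digest with a checksum copied from the record."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the thesis PDF was saved locally as `TeseMuriloVersaoFinal.pdf`, calling `matches_listed_checksum("TeseMuriloVersaoFinal.pdf", "5a4e6021c3e86eb3161c903cd239da3a")` would be expected to return `True` for an intact download. MD5 only guards against accidental corruption; it is not suitable as a defence against deliberate tampering.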
