Identificação automática de aves a partir de áudio

Carvalho, Silvestre Daniel Dias

Bibliographic details
Main author:	Carvalho, Silvestre Daniel Dias
Publication date:	2020
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text:	http://hdl.handle.net/10400.22/16439
Abstract:	Bird classification from audio is mainly useful for ornithologists and ecologists. As the amount of recorded data grows, manual bird classification becomes time-consuming and therefore costly. Birds react quickly to environmental changes, which makes their analysis an important problem in ecology: analyzing bird behaviour and population trends helps detect other organisms in the environment. A reliable methodology that automatically identifies bird species from audio would be a valuable tool for experts in the area. The main purpose of this work is to propose a methodology able to identify a bird species by its chirp. Many techniques can be used to process and classify audio data. This thesis explores the deep learning techniques used in this domain, such as Convolutional Neural Networks and Recurrent Neural Networks. Audio problems in deep learning are commonly approached by converting the audio into images using feature extraction techniques such as Mel Spectrograms and Mel Frequency Cepstral Coefficients. Multiple combinations of deep learning models and feature extraction techniques are compared in this thesis in order to find the most suitable approach to this problem.

Item metadata

id	RCAP_ace5797d0f271e9d86bc7c7483926745
oai_identifier_str	oai:recipp.ipp.pt:10400.22/16439
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str	7160
dc.title.none.fl_str_mv	Identificação automática de aves a partir de áudio
title	Identificação automática de aves a partir de áudio
spellingShingle	Identificação automática de aves a partir de áudio Carvalho, Silvestre Daniel Dias Bird Audio Classification Deep Learning Audio Feature Extraction
title_short	Identificação automática de aves a partir de áudio
title_full	Identificação automática de aves a partir de áudio
title_fullStr	Identificação automática de aves a partir de áudio
title_full_unstemmed	Identificação automática de aves a partir de áudio
title_sort	Identificação automática de aves a partir de áudio
author	Carvalho, Silvestre Daniel Dias
author_facet	Carvalho, Silvestre Daniel Dias
author_role	author
dc.contributor.none.fl_str_mv	Gomes, Elsa Maria de Carvalho Ferreira Repositório Científico do Instituto Politécnico do Porto
dc.contributor.author.fl_str_mv	Carvalho, Silvestre Daniel Dias
dc.subject.por.fl_str_mv	Bird Audio Classification Deep Learning Audio Feature Extraction
topic	Bird Audio Classification Deep Learning Audio Feature Extraction
publishDate	2020
dc.date.none.fl_str_mv	2020-11-05T15:00:55Z 2020 2020-01-01T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10400.22/16439 TID:202533476
url	http://hdl.handle.net/10400.22/16439
identifier_str_mv	TID:202533476
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799131451996766208
