Identification of archaeological sites with LiDAR data using machine learning and feature extraction

Bibliographic details
Main author: Costa, Miguel Reis da Conceição
Publication date: 2022
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10773/38724
Abstract: This dissertation was developed within the ODYSSEY project, funded by the PORTUGAL2020 program. ODYSSEY aims to implement a geographic information platform for archaeology and heritage experts. Identifying archaeological sites provides information that helps consolidate heritage information sources. On the platform, the identification and detection of archaeological sites are expected to be automatic, using Deep Learning and Machine Learning techniques together with data obtained by non-intrusive methods such as LiDAR. The objective of this dissertation is to develop a method that can identify archaeological objects using Machine Learning and feature extraction. This requires pre-processing the provided data, which is not ready for direct use: specific images of the target objects must be extracted to build the dataset used to train the classifier. Feature extraction is also applied to improve the training data. The dataset used in the experiments corresponds to LiDAR data acquired in a region around the city of Viana do Castelo, in the north of Portugal, and contains 136 annotated archaeological sites called “mamoas”. After pre-processing, several tests were run, starting with an evaluation of how different bounding-box sizes affect effectiveness. Several Machine Learning methods were evaluated, such as Random Forest, Adaptive Boosting, Bagging, Extra Trees, Gradient Boosting and Histogram-based Gradient Boosting. Texture filters were also applied to the images obtained from the data to extract features for the classifier. A combination that gave better results was identified and will be used in future work. Finally, an inference method based on a sliding window was implemented to identify the objects of interest present in an area and report their geographic coordinates.
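The sliding-window inference step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the `relief_score` stand-in for a trained classifier, the synthetic elevation grid, and the map origin and pixel size are all assumptions chosen for illustration.

```python
import numpy as np


def sliding_window_detect(raster, window, stride, score_fn, threshold,
                          origin_x, origin_y, pixel_size):
    """Slide a square window over a 2D raster and keep high-scoring windows.

    Returns a list of (x, y, score) tuples, where (x, y) is the map
    coordinate of the window centre on a north-up grid whose top-left
    corner is (origin_x, origin_y).
    """
    detections = []
    rows, cols = raster.shape
    for r in range(0, rows - window + 1, stride):
        for c in range(0, cols - window + 1, stride):
            patch = raster[r:r + window, c:c + window]
            score = float(score_fn(patch))
            if score >= threshold:
                # Geometric centre of the window in pixel units.
                centre_col = c + window / 2.0
                centre_row = r + window / 2.0
                # North-up raster: x grows with columns, y shrinks with rows.
                x = origin_x + centre_col * pixel_size
                y = origin_y - centre_row * pixel_size
                detections.append((x, y, score))
    return detections


def relief_score(patch):
    # Hypothetical stand-in for a trained classifier: height of the patch
    # centre above the mean height of the patch border.
    border = np.concatenate([patch[0, :], patch[-1, :],
                             patch[1:-1, 0], patch[1:-1, -1]])
    centre = patch[patch.shape[0] // 2, patch.shape[1] // 2]
    return centre - border.mean()


# Synthetic 40x40 elevation grid with one small mound at row 20, column 10.
yy, xx = np.mgrid[0:40, 0:40]
dem = 2.0 * np.exp(-((yy - 20) ** 2 + (xx - 10) ** 2) / (2 * 3.0 ** 2))

# Arbitrary origin and a 1-unit pixel size, used only for the example.
hits = sliding_window_detect(dem, window=10, stride=5, score_fn=relief_score,
                             threshold=1.0, origin_x=500000.0,
                             origin_y=4600000.0, pixel_size=1.0)
```

With these inputs, only the window centred on the synthetic mound exceeds the threshold. In a real pipeline, `score_fn` would be replaced by the probability output of a classifier trained on features extracted from each patch, and the origin and pixel size would come from the raster's georeferencing.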
id RCAP_beedc95296403c9f807e08d33a4cef62
oai_identifier_str oai:ria.ua.pt:10773/38724
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
author Costa, Miguel Reis da Conceição
author_facet Costa, Miguel Reis da Conceição
author_role author
dc.contributor.author.fl_str_mv Costa, Miguel Reis da Conceição
dc.subject.por.fl_str_mv Archaeology
LiDAR
Machine learning
Remote sensing
Random forest
Gradient
Entropy
publishDate 2022
dc.date.none.fl_str_mv 2022-10-17T00:00:00Z
2022-10-17
2023-07-17T13:55:29Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10773/38724
url http://hdl.handle.net/10773/38724
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799137740380438528