On the automated learning of air pollution prediction models from data collected by mobile sensor networks
Main author: | Mariano, P. |
---|---|
Publication date: | 2021 |
Other authors: | Almeida, S. M.; Santana, P. |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10071/24971 |
Abstract: | This paper addresses the automated learning of air pollution prediction models trained on data gathered by a set of mobile low-cost sensors. Concretely, fast-to-compute machine learning methods (Decision Trees and Support Vector Machines) were used to build regression models that predict air pollution levels for a given location. The models were trained using the data collected by the OpenSense project, in particular particle number, particle diameter, and lung-deposited surface area (LDSA). We examined two different sets of attributes: one based on a geographical description of the location under analysis (e.g. distribution of households and roads), and another based on a time series of past air pollution observations at that location. Overall, we found that past measurements lead to better pollution predictions. The best R² score, 0.751, was obtained by the model that predicts LDSA and was trained on the data set with time-series attributes, while the worst R², 0.009, was obtained with the geographical data set when predicting particle number. The performance of the best model is on par with similar air pollution systems. Moreover, it can be used in a production system that requires frequent updates. |
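
The two technical ingredients named in the abstract, time-series attributes built from past observations and the R² score used to rank models, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `make_lagged_dataset` and `r2_score`, the lag count, and the toy series are all assumptions made for illustration, and a naive "repeat the last value" predictor stands in for the Decision Tree and Support Vector Machine regressors used in the paper.

```python
from typing import List, Sequence, Tuple


def make_lagged_dataset(
    series: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, targets) where each feature row holds the
    n_lags observations that immediately precede the target value.
    Hypothetical helper; the paper's exact attributes may differ."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags, len(series)):
        features.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return features, targets


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean) ** 2 for yt in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the targets are constant")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up readings, for illustration only (not OpenSense data).
    readings = [12.0, 14.0, 13.0, 15.0, 18.0, 17.0, 16.0]
    X, y = make_lagged_dataset(readings, n_lags=2)
    # Naive baseline: predict that the next value equals the latest one.
    baseline = [row[-1] for row in X]
    print("R2 of the naive baseline:", r2_score(y, baseline))
```

An R² of 1 means the predictions match the observations exactly, 0 means the model does no better than always predicting the mean, and negative values mean it does worse. On that scale the reported 0.009 for the geographical attributes is close to the mean-only baseline.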
id |
RCAP_e8dd884f9416a780d0c6fabe2cd7704f |
---|---|
oai_identifier_str |
oai:repositorio.iscte-iul.pt:10071/24971 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
On the automated learning of air pollution prediction models from data collected by mobile sensor networks |
author |
Mariano, P. |
author_facet |
Mariano, P.; Almeida, S. M.; Santana, P. |
author_role |
author |
author2 |
Almeida, S. M.; Santana, P. |
author2_role |
author author |
dc.contributor.author.fl_str_mv |
Mariano, P.; Almeida, S. M.; Santana, P. |
dc.subject.por.fl_str_mv |
Air pollution; Decision tree; Land-use; Machine learning; Support vector machine; Time-series |
topic |
Air pollution; Decision tree; Land-use; Machine learning; Support vector machine; Time-series |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-01-01T00:00:00Z; 2021; 2022-08-28T00:00:00Z; 2022-04-01T16:12:01Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10071/24971 |
url |
http://hdl.handle.net/10071/24971 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
ISSN 1556-7036; DOI 10.1080/15567036.2021.1968076 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Taylor and Francis |
publisher.none.fl_str_mv |
Taylor and Francis |
dc.source.none.fl_str_mv |
reponame: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname: Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron: RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799134719955173376 |