Flow time series clustering for demand pattern recognition in drinking water distribution systems: New insights about the most adequate methods

Bibliographic details
Main author: Gomes, Pedro André Fonseca Garez
Publication date: 2019
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10071/20195
Abstract: This study proposes clustering methodologies for demand pattern recognition using network flow data collected from a large set of drinking water distribution networks in Portugal. Most existing studies on clustering flow time series rely on hierarchical or k-Means algorithms with inelastic distance measures. This study explores alternative clustering algorithms, distance measures, comparison time windows, internal index metrics and clustering prototypes. The performance of each alternative was assessed using multiple internal index metrics and the characterization of the cluster centroids. The best-performing methods were the partitional algorithm with DTW (dynamic time warping) distance, PAM (partitioning around medoids) prototype and a 15-minute time window, and the partitional algorithm with GAK (global alignment kernel) distance, PAM prototype and a 15-minute time window, because both yield a clear partition of the flow time series into three clusters. The first method identifies a night consumption pattern, a typical weekend pattern and a typical working day pattern, whereas the second identifies a pattern with small variability between night and daytime consumption. To improve knowledge extraction about typical and anomalous patterns, additional clustering was performed on the flow data belonging to the cluster with small variability between night and daytime consumption. The new clusters were characterized by weekday, geographical location, and dry versus wet months, showing that patterns associated with garden irrigation are independent of the time of day and the season of the year, which indicates inefficient water use.
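As a rough illustration of the kind of configuration the abstract names (partitional clustering with DTW distance and a PAM prototype), the sketch below clusters a few made-up daily flow profiles into three groups. It is not the thesis's code: the 8-point toy profiles, the function names `dtw_distance`, `total_cost` and `pam`, and the naive choice of initial medoids are all assumptions made for this example, and the study itself works with 15-minute time windows on measured flow data.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance with absolute difference as local cost."""
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping steps
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def total_cost(dist, medoids):
    """Sum of distances from every series to its nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(dist, k):
    """Swap-based k-medoids search on a precomputed distance matrix.

    Assumption: the first k series serve as initial medoids instead of
    PAM's BUILD step; only the SWAP idea is sketched here.
    """
    n = len(dist)
    medoids = list(range(k))
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best:  # accept only strict improvements
                    medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
    return medoids, labels


# Hypothetical daily profiles (one value per 3 hours), invented for illustration.
series = [
    [5, 5, 5, 5, 5, 5, 5, 5],  # little night/day variability
    [5, 5, 5, 6, 5, 5, 5, 5],
    [1, 1, 9, 5, 4, 8, 4, 1],  # morning and evening peaks
    [1, 1, 1, 9, 5, 4, 8, 2],  # same shape, shifted in time
    [9, 9, 2, 1, 1, 1, 2, 9],  # consumption concentrated at night
    [9, 8, 2, 1, 1, 1, 1, 9],
]

dist = [[dtw_distance(s, t) for t in series] for s in series]
medoids, labels = pam(dist, k=3)
print("medoid indices:", medoids)
print("cluster labels:", labels)
```

Because the medoid of each cluster is an actual series from the data set, the prototype can be read directly as a representative daily profile, and the same `pam` function works unchanged with any other precomputed distance matrix.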
id RCAP_9fb022a3b33b93eafa70bdc7274e7b8b
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/20195
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Flow time series clustering for demand pattern recognition in drinking water distribution systems: New insights about the most adequate methods
author Gomes, Pedro André Fonseca Garez
author_facet Gomes, Pedro André Fonseca Garez
author_role author
dc.contributor.author.fl_str_mv Gomes, Pedro André Fonseca Garez
dc.subject.por.fl_str_mv Unsupervised learning
Time series clustering
Flow time series
Demand pattern recognition
Water distribution systems
Aprendizagem não supervisionada
Clustering de series temporais
Séries temporais de caudal
Reconhecimento de padrões de consumo
Sistemas de distribuição de água
Algoritmo -- Algorithm
Clusters -- Clusters
Séries temporais
Utilização da água
Portugal
publishDate 2019
dc.date.none.fl_str_mv 2019-12-12T00:00:00Z
2019-12-12
2019-12
2021-12-11T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/20195
TID:202462005
url http://hdl.handle.net/10071/20195
identifier_str_mv TID:202462005
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799134886871695360