Detection of outliers in multivariate data: a method based on clustering and robust estimators
Main author: | Carla M. Santos Pereira |
---|---|
Publication date: | 2002 |
Other authors: | Ana M. Pires |
Document type: | Book |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | https://hdl.handle.net/10216/65794 |
Abstract: | Outlier identification is important in many applications of multivariate analysis, either because there is a specific interest in finding anomalous observations or as a pre-processing task before applying some multivariate method, in order to protect the results from the possibly harmful effects of those observations. It is also of great interest in supervised classification (or discriminant analysis) if, when predicting group membership, one wants the possibility of labelling an observation as "does not belong to any of the available groups". The identification of outliers in multivariate data is usually based on the Mahalanobis distance. The use of robust estimates of the mean and the covariance matrix is advised in order to avoid the masking effect (Rousseeuw and Leroy, 1985; Rousseeuw and van Zomeren, 1990; Rocke and Woodruff, 1996; Becker and Gather, 1999). However, the performance of these rules is still highly dependent on multivariate normality of the bulk of the data. The aim of the method described here is to remove this dependence. |
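
The abstract refers to the usual rule of flagging observations whose Mahalanobis distance, computed from robust estimates of the mean and covariance matrix, exceeds a cutoff. The sketch below illustrates that general rule only; it is not the clustering-based method of the record. The trimming scheme (a median/MAD start followed by concentration steps), the function names, the `keep_fraction` parameter, the toy data, and the chi-square 0.975 cutoff for two variables are all assumptions made for illustration. The chi-square cutoff is itself an instance of the normality assumption the abstract criticises.

```python
import math
import statistics


def mean_and_cov(points):
    """Sample mean vector and covariance matrix of a list of equal-length tuples."""
    n, p = len(points), len(points[0])
    mean = [sum(x[j] for x in points) / n for j in range(p)]
    cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in points) / (n - 1)
            for j in range(p)] for i in range(p)]
    return mean, cov


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def squared_mahalanobis(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)' cov^{-1} (x - mean)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return sum(d * z for d, z in zip(diff, solve(cov, diff)))


def trimmed_estimates(points, keep_fraction=0.75, max_steps=20):
    """Crude robust stand-in (hypothetical helper, not a full MCD estimator).

    Start from the h points closest to the coordinate-wise median (MAD-scaled),
    then repeatedly re-estimate mean/covariance from the h points with the
    smallest Mahalanobis distances until the subset stops changing.
    """
    n, p = len(points), len(points[0])
    h = max(p + 1, int(keep_fraction * n))
    med = [statistics.median(x[j] for x in points) for j in range(p)]
    mad = [statistics.median(abs(x[j] - med[j]) for x in points) or 1.0
           for j in range(p)]
    dist = [sum(((x[j] - med[j]) / mad[j]) ** 2 for j in range(p)) for x in points]
    subset = sorted(range(n), key=dist.__getitem__)[:h]
    for _ in range(max_steps):
        mean, cov = mean_and_cov([points[i] for i in subset])
        dist = [squared_mahalanobis(x, mean, cov) for x in points]
        new_subset = sorted(range(n), key=dist.__getitem__)[:h]
        if set(new_subset) == set(subset):
            break
        subset = new_subset
    return mean, cov


# Toy data (assumed): 20 points scattered around the line y = x, plus a small
# cluster of 3 gross outliers at indices 20, 21 and 22.
bulk = [(0.5 * i, 0.5 * i + ((i * 7) % 5 - 2) * 0.3) for i in range(20)]
outliers = [(0.0, 30.0), (1.0, 31.0), (0.5, 30.5)]
data = bulk + outliers

# 0.975 quantile of the chi-square distribution with 2 degrees of freedom,
# which has the closed form -2 * ln(1 - 0.975).
cutoff = -2.0 * math.log(0.025)

mean, cov = trimmed_estimates(data)
flagged = [i for i, x in enumerate(data)
           if squared_mahalanobis(x, mean, cov) > cutoff]
print(flagged)
```

Passing the output of `mean_and_cov(data)` to `squared_mahalanobis` instead gives the classical, non-robust distances, which is one way to check whether the clustered outliers mask one another on a given data set.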
id | RCAP_6af4ded2001c1fe8cdcf128ce4039331 |
---|---|
oai_identifier_str | oai:repositorio-aberto.up.pt:10216/65794 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str | 7160 |
spelling | Detection of outliers in multivariate data: a method based on clustering and robust estimators; Exact and natural sciences; Natural sciences; Outlier identification is important in many applications of multivariate analysis, either because there is a specific interest in finding anomalous observations or as a pre-processing task before applying some multivariate method, in order to protect the results from the possibly harmful effects of those observations. It is also of great interest in supervised classification (or discriminant analysis) if, when predicting group membership, one wants the possibility of labelling an observation as "does not belong to any of the available groups". The identification of outliers in multivariate data is usually based on the Mahalanobis distance. The use of robust estimates of the mean and the covariance matrix is advised in order to avoid the masking effect (Rousseeuw and Leroy, 1985; Rousseeuw and van Zomeren, 1990; Rocke and Woodruff, 1996; Becker and Gather, 1999). However, the performance of these rules is still highly dependent on multivariate normality of the bulk of the data. The aim of the method described here is to remove this dependence.; 2002; 2002-01-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/book; application/pdf; https://hdl.handle.net/10216/65794; eng; 10.1007/978-3-642-57489-4_41; Carla M. Santos Pereira; Ana M. Pires; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-11-29T12:41:44Z; oai:repositorio-aberto.up.pt:10216/65794; Aggregator portal; NGO; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T23:24:57.290044; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv | Detection of outliers in multivariate data: a method based on clustering and robust estimators |
title | Detection of outliers in multivariate data: a method based on clustering and robust estimators |
spellingShingle | Detection of outliers in multivariate data: a method based on clustering and robust estimators; Carla M. Santos Pereira; Exact and natural sciences; Natural sciences |
title_short | Detection of outliers in multivariate data: a method based on clustering and robust estimators |
title_full | Detection of outliers in multivariate data: a method based on clustering and robust estimators |
title_fullStr | Detection of outliers in multivariate data: a method based on clustering and robust estimators |
title_full_unstemmed | Detection of outliers in multivariate data: a method based on clustering and robust estimators |
title_sort | Detection of outliers in multivariate data: a method based on clustering and robust estimators |
author | Carla M. Santos Pereira |
author_facet | Carla M. Santos Pereira; Ana M. Pires |
author_role | author |
author2 | Ana M. Pires |
author2_role | author |
dc.contributor.author.fl_str_mv | Carla M. Santos Pereira; Ana M. Pires |
dc.subject.por.fl_str_mv | Exact and natural sciences; Natural sciences |
topic | Exact and natural sciences; Natural sciences |
description | Outlier identification is important in many applications of multivariate analysis, either because there is a specific interest in finding anomalous observations or as a pre-processing task before applying some multivariate method, in order to protect the results from the possibly harmful effects of those observations. It is also of great interest in supervised classification (or discriminant analysis) if, when predicting group membership, one wants the possibility of labelling an observation as "does not belong to any of the available groups". The identification of outliers in multivariate data is usually based on the Mahalanobis distance. The use of robust estimates of the mean and the covariance matrix is advised in order to avoid the masking effect (Rousseeuw and Leroy, 1985; Rousseeuw and van Zomeren, 1990; Rocke and Woodruff, 1996; Becker and Gather, 1999). However, the performance of these rules is still highly dependent on multivariate normality of the bulk of the data. The aim of the method described here is to remove this dependence. |
publishDate | 2002 |
dc.date.none.fl_str_mv | 2002; 2002-01-01T00:00:00Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/book |
format | book |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | https://hdl.handle.net/10216/65794 |
url | https://hdl.handle.net/10216/65794 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | 10.1007/978-3-642-57489-4_41 |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799135554120450048 |