Comparison of Statistical and Machine Learning Models on Road Traffic Accident Severity Classification
Main author: | Infante, Paulo |
---|---|
Publication date: | 2022 |
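Other authors: | Jacinto, Gonçalo; Afonso, Anabela; Rego, Leonor; Nogueira, Vitor; Quaresma, Paulo; Saias, José; Santos, Daniel; Nogueira, Pedro; Silva, Marcelo; Costa, Rosalina; Góis, Patrícia; Manuel, Paulo Rebelo |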
Document type: | Article |
Language: | por |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10174/33513 ; https://doi.org/10.3390/computers11050080 ; Infante, P., Jacinto, G., Afonso, A., Rego, L., Nogueira, V., Quaresma, P., Saias, J., Santos, D., Nogueira, P., Silva, M., Costa, R. P., Gois, P., Manuel, P. R. (2022). Comparison of Statistical and Machine Learning Models on Road Traffic Accident Severity Classification. Computers, 11, 80. |
Abstract: | Portugal has the sixth-highest road fatality rate among European Union members. This is a multidimensional problem with serious consequences for people’s lives. This study analyses daily data from police and government authorities on road traffic accidents that occurred between 2016 and 2019 in a district of Portugal. It looks for the determinants that contribute to the existence of victims in road traffic accidents, as well as the determinants of fatalities and/or serious injuries in accidents with victims. We use logistic regression models and compare their results with those of machine-learning models. For the severity model, where the response variable indicates whether the traffic accident resulted in property damage only or in casualties, we used a large sample with a small imbalance. For the serious injuries model, where the response variable indicates whether or not an accident with victims involved serious injuries and/or fatalities, we used a small sample with very imbalanced data. Empirical analysis supports the conclusion that, with a small sample of imbalanced data, machine-learning models generally do not perform better than statistical models; however, they perform similarly when the sample is large and has a small imbalance. |
id |
RCAP_94cdf8f2e1b014bd0f380dc7ed2fee88 |
---|---|
oai_identifier_str |
oai:dspace.uevora.pt:10174/33513 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Comparison of Statistical and Machine Learning Models on Road Traffic Accident Severity Classification |
author |
Infante, Paulo |
author_facet |
Infante, Paulo; Jacinto, Gonçalo; Afonso, Anabela; Rego, Leonor; Nogueira, Vitor; Quaresma, Paulo; Saias, José; Santos, Daniel; Nogueira, Pedro; Silva, Marcelo; Costa, Rosalina; Góis, Patrícia; Manuel, Paulo Rebelo |
author_role |
author |
author2 |
Jacinto, Gonçalo; Afonso, Anabela; Rego, Leonor; Nogueira, Vitor; Quaresma, Paulo; Saias, José; Santos, Daniel; Nogueira, Pedro; Silva, Marcelo; Costa, Rosalina; Góis, Patrícia; Manuel, Paulo Rebelo |
author2_role |
author author author author author author author author author author author author |
dc.contributor.author.fl_str_mv |
Infante, Paulo; Jacinto, Gonçalo; Afonso, Anabela; Rego, Leonor; Nogueira, Vitor; Quaresma, Paulo; Saias, José; Santos, Daniel; Nogueira, Pedro; Silva, Marcelo; Costa, Rosalina; Góis, Patrícia; Manuel, Paulo Rebelo |
dc.subject.por.fl_str_mv |
injury; logistic regression; machine learning; road traffic accidents; severity of victims |
topic |
injury; logistic regression; machine learning; road traffic accidents; severity of victims |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-05-16T00:00:00Z 2023-01-17T12:03:20Z 2023-01-17 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10174/33513 https://doi.org/10.3390/computers11050080 |
url |
http://hdl.handle.net/10174/33513 https://doi.org/10.3390/computers11050080 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.none.fl_str_mv |
80; 5; Computers; 11; pinfante@uevora.pt; gjcj@uevora.pt; aafonso@uevora.pt; lrego@uevora.pt; vbn@uevora.pt; pq@uevora.pt; jsaias@uevora.pt; dfsantos@uevora.pt; pmn@uevora.pt; marcelogs@uevora.pt; rosalina@uevora.pt; pafg@uevora.pt; pjsrm@uevora.pt; 336 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799136702330044416 |