Crime inference using machine learning and geographical data

Roque, Miguel Francisco Frade

Bibliographic details
Main author:	Roque, Miguel Francisco Frade
Publication date:	2023
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10400.14/41436
Abstract:	Crimes are not random events in society; something must influence their occurrence. By characterizing the environment, it is possible to create algorithms that predict criminal activity in a given place at a given time, which allows it to be anticipated and prevented through public policy decisions. This study focuses on finding the best way to predict crimes, that is, which types of features matter most for prediction and which methods are the most predictive. It analyzes the city of Philadelphia, Pennsylvania (USA), taking into account the urban, racial, demographic and socioeconomic characteristics of its geographical blocks and the number of criminal occurrences in each of them over multiple years. Both linear and non-linear methods are used. With non-linear machine learning methods, the prediction of the number of crimes is much more accurate for every type of variable, leading to the conclusion that the relationships studied here are not linear in nature and that tree-based models (especially gradient boosting and random forest) are the most suitable approach for this data. From this perspective, the models that consider only the socio-demographic characteristics of the neighborhoods are significantly more effective at forecasting than the entirely urban ones.

Item metadata

id	RCAP_79c724f3d64611ef2cbe9c2f264a79b5
oai_identifier_str	oai:repositorio.ucp.pt:10400.14/41436
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Crime inference using machine learning and geographical data
author	Roque, Miguel Francisco Frade
author_facet	Roque, Miguel Francisco Frade
author_role	author
dc.contributor.none.fl_str_mv	Bertani, Nicolò Veritati - Repositório Institucional da Universidade Católica Portuguesa
dc.contributor.author.fl_str_mv	Roque, Miguel Francisco Frade
dc.subject.por.fl_str_mv	Crimes Socio-demographic Urban Linear Non-linear Domínio/Área Científica::Ciências Sociais::Economia e Gestão
topic	Crimes Socio-demographic Urban Linear Non-linear Domínio/Área Científica::Ciências Sociais::Economia e Gestão
publishDate	2023
dc.date.none.fl_str_mv	2023-06-26T13:20:36Z 2023-02-03 2023-01 2023-02-03T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10400.14/41436 TID:203278755
url	http://hdl.handle.net/10400.14/41436
identifier_str_mv	TID:203278755
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799132067619930112
