A CASE STUDY ON THE EFFECT OF NEWS ON CRUDE OIL PRICE

Bibliographic details
Main author: Vieira, Joana da Silva
Publication date: 2023
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/160218
Abstract: Crude oil price volatility affects the global economy and oil-dependent industries, and is itself driven by supply and demand, geopolitical tensions, and global economic conditions. Every day, a massive amount of textual information circulates in the form of news articles, which people use to anticipate future trends. News articles can have a significant impact on the price of crude oil because they carry information about recent events, trends, and developments in the industry. This work investigates how news articles may affect crude oil prices, using topic modeling to handle the text data. The news data was collected by web scraping and forms a large dataset of articles about the crude oil industry, published between January 1 and December 31, 2022, by four different sources. Price data was compiled from Exchange Rates UK to show how the price of crude oil fluctuated during this period. After cleaning, the dataset contained a total of 1532 news articles. Latent Dirichlet Allocation (LDA) is proposed for extracting relevant keywords from the news articles, and its output is then used as input features to forecast the crude oil price. The forecasting methods employed were the Ridge model, Random Forest, XGBoost, and the time series method ARIMAX. The experimental results indicate that the association between the content of the news articles and the crude oil price is not sufficiently strong. It is also concluded that the XGBoost algorithm shows superior predictive performance on the training set. Consequently, XGBoost models were developed for each month of 2022 to investigate the impact of the features and determine the most important ones for the problem.
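The abstract describes a two-stage pipeline: topic-model output from news articles becomes the input features of a price forecasting model. The Python sketch below illustrates only that hand-off, under stated assumptions; it is not the thesis code. It assumes each article already has a vector of topic proportions (as an LDA model would produce), averages those vectors per publication date, and fits a closed-form ridge regression without an intercept. The function names, the per-day averaging step, the regularization value, and all numbers are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from datetime import date


def daily_topic_features(articles, n_topics):
    """Average per-article topic proportions for each publication date.

    `articles` is a list of (date, [topic weights]) pairs. Averaging per day
    is an assumption of this sketch, not a step stated in the abstract.
    """
    sums = defaultdict(lambda: [0.0] * n_topics)
    counts = defaultdict(int)
    for day, weights in articles:
        for k, w in enumerate(weights):
            sums[day][k] += w
        counts[day] += 1
    return {day: [s / counts[day] for s in sums[day]] for day in sums}


def ridge_fit(X, y, alpha):
    """Closed-form ridge regression: solve (X^T X + alpha * I) w = X^T y.

    No intercept term is fitted, to keep the sketch short.
    """
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * target for row, target in zip(X, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        b[c], b[pivot] = b[pivot], b[c]
        for r in range(c + 1, p):
            factor = A[r][c] / A[c][c]
            for k in range(c, p):
                A[r][k] -= factor * A[c][k]
            b[r] -= factor * b[c]
    # Back substitution.
    w = [0.0] * p
    for i in reversed(range(p)):
        w[i] = (b[i] - sum(A[i][k] * w[k] for k in range(i + 1, p))) / A[i][i]
    return w


# Toy inputs: topic weights and prices are made-up illustrative values.
articles = [
    (date(2022, 1, 3), [0.8, 0.2]),
    (date(2022, 1, 3), [0.4, 0.6]),
    (date(2022, 1, 4), [0.1, 0.9]),
    (date(2022, 1, 5), [0.5, 0.5]),
]
prices = {date(2022, 1, 3): 76.0, date(2022, 1, 4): 77.0, date(2022, 1, 5): 78.0}

features = daily_topic_features(articles, n_topics=2)
days = sorted(d for d in features if d in prices)  # keep time order
X = [features[d] for d in days]
y = [prices[d] for d in days]
weights = ridge_fit(X, y, alpha=1.0)
print(dict(zip(["topic_0", "topic_1"], weights)))
```

In practice a library implementation of LDA and of the regressors would replace these hand-written helpers; the sketch is only meant to make the feature hand-off concrete.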
id RCAP_dcf03be325ec12b56e579ab3045d6570
oai_identifier_str oai:run.unl.pt:10362/160218
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv A CASE STUDY ON THE EFFECT OF NEWS ON CRUDE OIL PRICE
title A CASE STUDY ON THE EFFECT OF NEWS ON CRUDE OIL PRICE
spellingShingle A CASE STUDY ON THE EFFECT OF NEWS ON CRUDE OIL PRICE
Vieira, Joana da Silva
Latent Dirichlet Allocation (LDA)
Ridge
Random Forest
XGBoost
ARIMAX
crude oil price forecast
Domain/Scientific Area::Natural Sciences::Mathematics
author Vieira, Joana da Silva
author_facet Vieira, Joana da Silva
author_role author
dc.contributor.none.fl_str_mv Norouzirad, Mina
Marques, Filipe
RUN
dc.contributor.author.fl_str_mv Vieira, Joana da Silva
dc.subject.por.fl_str_mv Latent Dirichlet Allocation (LDA)
Ridge
Random Forest
XGBoost
ARIMAX
crude oil price forecast
Domain/Scientific Area::Natural Sciences::Mathematics
topic Latent Dirichlet Allocation (LDA)
Ridge
Random Forest
XGBoost
ARIMAX
crude oil price forecast
Domain/Scientific Area::Natural Sciences::Mathematics
publishDate 2023
dc.date.none.fl_str_mv 2023-11-21T14:28:10Z
2023-07
2023-07-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/160218
url http://hdl.handle.net/10362/160218
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799138160930717696
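
The identifier fields in this record follow a recognizable pattern: the OAI identifier oai:run.unl.pt:10362/160218 ends with the same 10362/160218 suffix as the handle URL http://hdl.handle.net/10362/160218. As a minimal sketch, the Python below splits such an identifier into its parts and builds a standard OAI-PMH GetRecord request URL without sending it. The function names are hypothetical, and the base URL https://www.rcaap.pt/oai/openaire and the oai_dc metadata prefix are assumptions about what the aggregator accepts, not confirmed behavior.

```python
from urllib.parse import urlencode


def parse_oai_identifier(identifier):
    """Split 'oai:<namespace>:<local id>' into (namespace, local_id)."""
    scheme, namespace, local_id = identifier.split(":", 2)
    if scheme != "oai":
        raise ValueError(f"not an OAI identifier: {identifier!r}")
    return namespace, local_id


def handle_url(local_id):
    """Rebuild the handle URL, assuming the local id is the handle itself."""
    return f"http://hdl.handle.net/{local_id}"


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord request URL (no network access here)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


oai_id = "oai:run.unl.pt:10362/160218"
namespace, local_id = parse_oai_identifier(oai_id)
print(namespace)
print(handle_url(local_id))
# The base URL below is an assumption for illustration.
print(get_record_url("https://www.rcaap.pt/oai/openaire", oai_id))
```

Building the URL separately from fetching it keeps the sketch runnable offline; whether the endpoint returns this record for that identifier would need to be checked against the live service.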