A CASE STUDY ON THE EFFECT OF NEWS ON CRUDE OIL PRICE
Main author: | Vieira, Joana da Silva |
---|---|
Publication date: | 2023 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10362/160218 |
Abstract: | Crude oil price volatility affects the global economy and oil-dependent industries, and is driven by supply and demand, geopolitical tensions, and global economic conditions. Every day, a massive amount of textual information circulates in the form of news articles, which people use to forecast future trends. News articles can have a significant impact on the price of crude oil because they carry information about recent events, trends, and developments in the industry. The purpose of this work is to investigate how news articles may affect crude oil prices, using topic modeling to handle the textual data. The study uses a large dataset of news articles about the crude oil industry, collected by web scraping. The articles were published between January 1 and December 31, 2022, and come from four different sources. Price data were compiled from the source Exchange Rates UK to show how the price of crude oil fluctuated during this period. After cleaning, the dataset contained a total of 1532 news articles. The Latent Dirichlet Allocation (LDA) technique is proposed for extracting relevant keywords from the news articles, and its output is then used as input features to forecast the crude oil price. The forecasting methods employed were the Ridge model, the Random Forest and XGBoost techniques, and the time series method ARIMAX. The results indicate that the association between the content of the news articles and the crude oil price is not sufficiently strong. It is additionally concluded that the XGBoost algorithm shows superior predictive performance on the training set. As a result, XGBoost models were developed for each month of 2022 to investigate the impact of the features and determine the most important ones for the problem. |
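
The pipeline summarized in the abstract (LDA topic features fed into a price regressor) can be illustrated with a minimal sketch. This is not the thesis code: the record does not say which libraries or hyperparameters were used, so the collapsed Gibbs sampler, the closed-form Ridge solver, the function names (`lda_gibbs`, `ridge_fit`, `ridge_predict`), the toy headlines, and the prices below are all assumptions made up for illustration. Only the Python standard library is used.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    wid = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ndk = [[0] * n_topics for _ in docs]      # document-topic counts
    nkw = [[0] * V for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                       # tokens assigned to each topic
    z = []                                    # topic assignment of every token
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)       # random initial assignment
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][wid[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = z[d][i], wid[w]
                # Remove the token, then resample its topic from the conditional.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                weights = [(ndk[d][t] + alpha) * (nkw[t][v] + beta) / (nk[t] + V * beta)
                           for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    return [[(ndk[d][t] + alpha) / (len(doc) + n_topics * alpha)
             for t in range(n_topics)] for d, doc in enumerate(docs)]

def ridge_fit(X, y, lam=1.0):
    """Closed-form Ridge: solve (X^T X + lam*I) w = X^T y.
    A bias column is appended and, for brevity, penalized like the other weights."""
    Xb = [row + [1.0] for row in X]
    p = len(Xb[0])
    A = [[sum(r[i] * r[j] for r in Xb) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(Xb, y)) for i in range(p)]
    for c in range(p):  # Gaussian elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for j in range(c, p):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    w = [0.0] * p
    for i in reversed(range(p)):  # back substitution
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, p))) / A[i][i]
    return w

def ridge_predict(w, X):
    return [sum(wi * xi for wi, xi in zip(w, row + [1.0])) for row in X]

# Hypothetical toy data: four tokenized headlines and invented prices.
docs = [
    "opec cuts output supply falls".split(),
    "opec supply cuts extend".split(),
    "demand weak recession fears".split(),
    "recession fears demand slows".split(),
]
prices = [95.0, 97.0, 80.0, 78.0]  # made-up values, not data from the thesis

theta = lda_gibbs(docs, n_topics=2)    # topic proportions used as features
w = ridge_fit(theta, prices, lam=0.1)  # Ridge weights plus bias term
pred = ridge_predict(w, theta)         # in-sample predictions
```

Each row of `theta` is a probability vector over topics, which is what makes it usable as a fixed-length numeric feature vector for Ridge, Random Forest, XGBoost, or as exogenous regressors in ARIMAX. A real study would evaluate on held-out data rather than on the in-sample predictions shown here.
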
id |
RCAP_dcf03be325ec12b56e579ab3045d6570 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/160218 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
A CASE STUDY ON THE EFFECT OF NEWS ON CRUDE OIL PRICE |
author |
Vieira, Joana da Silva |
author_facet |
Vieira, Joana da Silva |
author_role |
author |
dc.contributor.none.fl_str_mv |
Norouzirad, Mina; Marques, Filipe; RUN |
dc.contributor.author.fl_str_mv |
Vieira, Joana da Silva |
dc.subject.por.fl_str_mv |
Latent Dirichlet Allocation (LDA); Ridge; Random Forest; XGBoost; ARIMAX; crude oil price forecast; Domínio/Área Científica::Ciências Naturais::Matemáticas |
topic |
Latent Dirichlet Allocation (LDA); Ridge; Random Forest; XGBoost; ARIMAX; crude oil price forecast; Domínio/Área Científica::Ciências Naturais::Matemáticas |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-11-21T14:28:10Z; 2023-07; 2023-07-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/160218 |
url |
http://hdl.handle.net/10362/160218 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799138160930717696 |