A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso

Detalhes bibliográficos
Autor(a) principal: Mussumeci, Elisa
Data de Publicação: 2018
Tipo de documento: Dissertação
Idioma: eng
Título da fonte: Repositório Institucional do FGV (FGV Repositório Digital)
Texto Completo: http://hdl.handle.net/10438/24093
Resumo: We used the Infodengue database of incidence and weather time-series, to train predictive models for the weekly number of cases of dengue in 790 cities of Brazil. To overcome a limitation in the length of time-series available to train the model, we proposed using the time series of epidemiologically similar cities as predictors for the incidence of each city. As Machine Learning-based forecasting models have been used in recent years with reasonable success, in this work we compare three machine learning models: Random Forest, lasso and Long-short term memory neural network in their forecasting performance for all cities monitored by the Infodengue Project.
id FGV_166e4a6dad08831bfd7afad169ed6d5a
oai_identifier_str oai:repositorio.fgv.br:10438/24093
network_acronym_str FGV
network_name_str Repositório Institucional do FGV (FGV Repositório Digital)
repository_id_str 3974
spelling Mussumeci, ElisaEscolas::EMApTargino, Rodrigo dos SantosBastos, Leonardo SoaresCoelho, Flávio Codeço2018-06-14T19:45:29Z2018-06-14T19:45:29Z2018-04-12http://hdl.handle.net/10438/24093We used the Infodengue database of incidence and weather time-series, to train predictive models for the weekly number of cases of dengue in 790 cities of Brazil. To overcome a limitation in the length of time-series available to train the model, we proposed using the time series of epidemiologically similar cities as predictors for the incidence of each city. As Machine Learning-based forecasting models have been used in recent years with reasonable success, in this work we compare three machine learning models: Random Forest, lasso and Long-short term memory neural network in their forecasting performance for all cities monitored by the Infodengue Project.engMachine learningNeural networksTime seriesForecastingEpidemiologyAprendizado por máquinaRedes neuraisMatemáticaAnálise de séries temporaisRedes neurais (Computação)Modelagem de dadosAnálise de regressãoDengueA machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lassoinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/masterThesis2018-04-12reponame:Repositório Institucional do FGV (FGV Repositório Digital)instname:Fundação Getulio Vargas (FGV)instacron:FGVinfo:eu-repo/semantics/openAccessTEXTmachine-learning-aproach (4).pdf.txtmachine-learning-aproach (4).pdf.txtExtracted texttext/plain46571https://repositorio.fgv.br/bitstreams/e39d46fa-2d4f-40a1-bf11-e1d147086657/downloade24a879264a1778f8a8f8e8bfe70877aMD55ORIGINALmachine-learning-aproach (4).pdfmachine-learning-aproach (4).pdfapplication/pdf11272802https://repositorio.fgv.br/bitstreams/18b2ab41-7c3a-406b-a2cd-7bc642ddb124/download52b25abf2711fdd6d1a338316c15c154MD51LICENSElicense.txtlicense.txttext/plain; charset=utf-84707https://repositorio.fgv.br/bitstreams/b8d0840a-d7b1-494b-bea2-42bc7ee0cbc4/downloaddfb340242cced38a6cca06c627998fa1MD52THUMBNAILmachine-learning-aproach (4).pdf.jpgmachine-learning-aproach (4).pdf.jpgGenerated Thumbnailimage/jpeg2682https://repositorio.fgv.br/bitstreams/00e825f7-4581-4509-8067-220b5abbc788/download9234575c9d4ea35156b172ab8f548e0bMD5610438/240932023-11-26 18:03:16.25open.accessoai:repositorio.fgv.br:10438/24093https://repositorio.fgv.brRepositório InstitucionalPRIhttp://bibliotecadigital.fgv.br/dspace-oai/requestopendoar:39742023-11-26T18:03:16Repositório Institucional do FGV (FGV Repositório Digital) - Fundação Getulio Vargas (FGV)falseVEVSTU9TIExJQ0VOQ0lBTUVOVE8gUEFSQSBBUlFVSVZBTUVOVE8sIFJFUFJPRFXDh8ODTyBFIERJVlVMR0HDh8ODTwpQw5pCTElDQSBERSBDT05URcOaRE8gw4AgQklCTElPVEVDQSBWSVJUVUFMIEZHViAodmVyc8OjbyAxLjIpCgoxLiBWb2PDqiwgdXN1w6FyaW8tZGVwb3NpdGFudGUgZGEgQmlibGlvdGVjYSBWaXJ0dWFsIEZHViwgYXNzZWd1cmEsIG5vCnByZXNlbnRlIGF0bywgcXVlIMOpIHRpdHVsYXIgZG9zIGRpcmVpdG9zIGF1dG9yYWlzIHBhdHJpbW9uaWFpcyBlL291CmRpcmVpdG9zIGNvbmV4b3MgcmVmZXJlbnRlcyDDoCB0b3RhbGlkYWRlIGRhIE9icmEgb3JhIGRlcG9zaXRhZGEgZW0KZm9ybWF0byBkaWdpdGFsLCBiZW0gY29tbyBkZSBzZXVzIGNvbXBvbmVudGVzIG1lbm9yZXMsIGVtIHNlIHRyYXRhbmRvCmRlIG9icmEgY29sZXRpdmEsIGNvbmZvcm1lIG8gcHJlY2VpdHVhZG8gcGVsYSBMZWkgOS42MTAvOTggZS9vdSBMZWkKOS42MDkvOTguIE7Do28gc2VuZG8gZXN0ZSBvIGNhc28sIHZvY8OqIGFzc2VndXJhIHRlciBvYnRpZG8sIGRpcmV0YW1lbnRlCmRvcyBkZXZpZG9zIHRpdHVsYXJlcywgYXV0b3JpemHDp8OjbyBwcsOpdmlhIGUgZXhwcmVzc2EgcGFyYSBvIGRlcMOzc2l0byBlCmRpdnVsZ2HDp8OjbyBkYSBPYnJhLCBhYnJhbmdlbmRvIHRvZG9zIG9zIGRpcmVpdG9zIGF1dG9yYWlzIGUgY29uZXhvcwphZmV0YWRvcyBwZWxhIGFzc2luYXR1cmEgZG9zIHByZXNlbnRlcyB0ZXJtb3MgZGUgbGljZW5jaWFtZW50bywgZGUKbW9kbyBhIGVmZXRpdmFtZW50ZSBpc2VudGFyIGEgRnVuZGHDp8OjbyBHZXR1bGlvIFZhcmdhcyBlIHNldXMKZnVuY2lvbsOhcmlvcyBkZSBxdWFscXVlciByZXNwb25zYWJpbGlkYWRlIHBlbG8gdXNvIG7Do28tYXV0b3JpemFkbyBkbwptYXRlcmlhbCBkZXBvc2l0YWRvLCBzZWphIGVtIHZpbmN1bGHDp8OjbyDDoCBCaWJsaW90ZWNhIFZpcnR1YWwgRkdWLCBzZWphCmVtIHZpbmN1bGHDp8OjbyBhIHF1YWlzcXVlciBzZXJ2acOnb3MgZGUgYnVzY2EgZSBkaXN0cmlidWnDp8OjbyBkZSBjb250ZcO6ZG8KcXVlIGZhw6dhbSB1c28gZGFzIGludGVyZmFjZXMgZSBlc3Bhw6dvIGRlIGFybWF6ZW5hbWVudG8gcHJvdmlkZW5jaWFkb3MKcGVsYSBGdW5kYcOnw6NvIEdldHVsaW8gVmFyZ2FzIHBvciBtZWlvIGRlIHNldXMgc2lzdGVtYXMgaW5mb3JtYXRpemFkb3MuCgoyLiBBIGFzc2luYXR1cmEgZGVzdGEgbGljZW7Dp2EgdGVtIGNvbW8gY29uc2Vxw7zDqm5jaWEgYSB0cmFuc2ZlcsOqbmNpYSwgYQp0w610dWxvIG7Do28tZXhjbHVzaXZvIGUgbsOjby1vbmVyb3NvLCBpc2VudGEgZG8gcGFnYW1lbnRvIGRlIHJveWFsdGllcwpvdSBxdWFscXVlciBvdXRyYSBjb250cmFwcmVzdGHDp8OjbywgcGVjdW5pw6FyaWEgb3UgbsOjbywgw6AgRnVuZGHDp8OjbwpHZXR1bGlvIFZhcmdhcywgZG9zIGRpcmVpdG9zIGRlIGFybWF6ZW5hciBkaWdpdGFsbWVudGUsIHJlcHJvZHV6aXIgZQpkaXN0cmlidWlyIG5hY2lvbmFsIGUgaW50ZXJuYWNpb25hbG1lbnRlIGEgT2JyYSwgaW5jbHVpbmRvLXNlIG8gc2V1CnJlc3Vtby9hYnN0cmFjdCwgcG9yIG1laW9zIGVsZXRyw7RuaWNvcywgbm8gc2l0ZSBkYSBCaWJsaW90ZWNhIFZpcnR1YWwKRkdWLCBhbyBww7pibGljbyBlbSBnZXJhbCwgZW0gcmVnaW1lIGRlIGFjZXNzbyBhYmVydG8uCgozLiBBIHByZXNlbnRlIGxpY2Vuw6dhIHRhbWLDqW0gYWJyYW5nZSwgbm9zIG1lc21vcyB0ZXJtb3MgZXN0YWJlbGVjaWRvcwpubyBpdGVtIDIsIHN1cHJhLCBxdWFscXVlciBkaXJlaXRvIGRlIGNvbXVuaWNhw6fDo28gYW8gcMO6YmxpY28gY2Fiw612ZWwKZW0gcmVsYcOnw6NvIMOgIE9icmEgb3JhIGRlcG9zaXRhZGEsIGluY2x1aW5kby1zZSBvcyB1c29zIHJlZmVyZW50ZXMgw6AKcmVwcmVzZW50YcOnw6NvIHDDumJsaWNhIGUvb3UgZXhlY3XDp8OjbyBww7pibGljYSwgYmVtIGNvbW8gcXVhbHF1ZXIgb3V0cmEKbW9kYWxpZGFkZSBkZSBjb211bmljYcOnw6NvIGFvIHDDumJsaWNvIHF1ZSBleGlzdGEgb3UgdmVuaGEgYSBleGlzdGlyLApub3MgdGVybW9zIGRvIGFydGlnbyA2OCBlIHNlZ3VpbnRlcyBkYSBMZWkgOS42MTAvOTgsIG5hIGV4dGVuc8OjbyBxdWUKZm9yIGFwbGljw6F2ZWwgYW9zIHNlcnZpw6dvcyBwcmVzdGFkb3MgYW8gcMO6YmxpY28gcGVsYSBCaWJsaW90ZWNhClZpcnR1YWwgRkdWLgoKNC4gRXN0YSBsaWNlbsOnYSBhYnJhbmdlLCBhaW5kYSwgbm9zIG1lc21vcyB0ZXJtb3MgZXN0YWJlbGVjaWRvcyBubwppdGVtIDIsIHN1cHJhLCB0b2RvcyBvcyBkaXJlaXRvcyBjb25leG9zIGRlIGFydGlzdGFzIGludMOpcnByZXRlcyBvdQpleGVjdXRhbnRlcywgcHJvZHV0b3JlcyBmb25vZ3LDoWZpY29zIG91IGVtcHJlc2FzIGRlIHJhZGlvZGlmdXPDo28gcXVlCmV2ZW50dWFsbWVudGUgc2VqYW0gYXBsaWPDoXZlaXMgZW0gcmVsYcOnw6NvIMOgIG9icmEgZGVwb3NpdGFkYSwgZW0KY29uZm9ybWlkYWRlIGNvbSBvIHJlZ2ltZSBmaXhhZG8gbm8gVMOtdHVsbyBWIGRhIExlaSA5LjYxMC85OC4KCjUuIFNlIGEgT2JyYSBkZXBvc2l0YWRhIGZvaSBvdSDDqSBvYmpldG8gZGUgZmluYW5jaWFtZW50byBwb3IKaW5zdGl0dWnDp8O1ZXMgZGUgZm9tZW50byDDoCBwZXNxdWlzYSBvdSBxdWFscXVlciBvdXRyYSBzZW1lbGhhbnRlLCB2b2PDqgpvdSBvIHRpdHVsYXIgYXNzZWd1cmEgcXVlIGN1bXByaXUgdG9kYXMgYXMgb2JyaWdhw6fDtWVzIHF1ZSBsaGUgZm9yYW0KaW1wb3N0YXMgcGVsYSBpbnN0aXR1acOnw6NvIGZpbmFuY2lhZG9yYSBlbSByYXrDo28gZG8gZmluYW5jaWFtZW50bywgZQpxdWUgbsOjbyBlc3TDoSBjb250cmFyaWFuZG8gcXVhbHF1ZXIgZGlzcG9zacOnw6NvIGNvbnRyYXR1YWwgcmVmZXJlbnRlIMOgCnB1YmxpY2HDp8OjbyBkbyBjb250ZcO6ZG8gb3JhIHN1Ym1ldGlkbyDDoCBCaWJsaW90ZWNhIFZpcnR1YWwgRkdWLgoKNi4gQ2FzbyBhIE9icmEgb3JhIGRlcG9zaXRhZGEgZW5jb250cmUtc2UgbGljZW5jaWFkYSBzb2IgdW1hIGxpY2Vuw6dhCkNyZWF0aXZlIENvbW1vbnMgKHF1YWxxdWVyIHZlcnPDo28pLCBzb2IgYSBsaWNlbsOnYSBHTlUgRnJlZQpEb2N1bWVudGF0aW9uIExpY2Vuc2UgKHF1YWxxdWVyIHZlcnPDo28pLCBvdSBvdXRyYSBsaWNlbsOnYSBxdWFsaWZpY2FkYQpjb21vIGxpdnJlIHNlZ3VuZG8gb3MgY3JpdMOpcmlvcyBkYSBEZWZpbml0aW9uIG9mIEZyZWUgQ3VsdHVyYWwgV29ya3MKKGRpc3BvbsOtdmVsIGVtOiBodHRwOi8vZnJlZWRvbWRlZmluZWQub3JnL0RlZmluaXRpb24pIG91IEZyZWUgU29mdHdhcmUKRGVmaW5pdGlvbiAoZGlzcG9uw612ZWwgZW06IGh0dHA6Ly93d3cuZ251Lm9yZy9waGlsb3NvcGh5L2ZyZWUtc3cuaHRtbCksIApvIGFycXVpdm8gcmVmZXJlbnRlIMOgIE9icmEgZGV2ZSBpbmRpY2FyIGEgbGljZW7Dp2EgYXBsaWPDoXZlbCBlbQpjb250ZcO6ZG8gbGVnw612ZWwgcG9yIHNlcmVzIGh1bWFub3MgZSwgc2UgcG9zc8OtdmVsLCB0YW1iw6ltIGVtIG1ldGFkYWRvcwpsZWfDrXZlaXMgcG9yIG3DoXF1aW5hLiBBIGluZGljYcOnw6NvIGRhIGxpY2Vuw6dhIGFwbGljw6F2ZWwgZGV2ZSBzZXIKYWNvbXBhbmhhZGEgZGUgdW0gbGluayBwYXJhIG9zIHRlcm1vcyBkZSBsaWNlbmNpYW1lbnRvIG91IHN1YSBjw7NwaWEKaW50ZWdyYWwuCgoKQW8gY29uY2x1aXIgYSBwcmVzZW50ZSBldGFwYSBlIGFzIGV0YXBhcyBzdWJzZXHDvGVudGVzIGRvIHByb2Nlc3NvIGRlCnN1Ym1pc3PDo28gZGUgYXJxdWl2b3Mgw6AgQmlibGlvdGVjYSBWaXJ0dWFsIEZHViwgdm9jw6ogYXRlc3RhIHF1ZSBsZXUgZQpjb25jb3JkYSBpbnRlZ3JhbG1lbnRlIGNvbSBvcyB0ZXJtb3MgYWNpbWEgZGVsaW1pdGFkb3MsIGFzc2luYW5kby1vcwpzZW0gZmF6ZXIgcXVhbHF1ZXIgcmVzZXJ2YSBlIG5vdmFtZW50ZSBjb25maXJtYW5kbyBxdWUgY3VtcHJlIG9zCnJlcXVpc2l0b3MgaW5kaWNhZG9zIG5vIGl0ZW0gMSwgc3VwcmEuCgpIYXZlbmRvIHF1YWxxdWVyIGRpc2NvcmTDom5jaWEgZW0gcmVsYcOnw6NvIGFvcyBwcmVzZW50ZXMgdGVybW9zIG91IG7Do28Kc2UgdmVyaWZpY2FuZG8gbyBleGlnaWRvIG5vIGl0ZW0gMSwgc3VwcmEsIHZvY8OqIGRldmUgaW50ZXJyb21wZXIKaW1lZGlhdGFtZW50ZSBvIHByb2Nlc3NvIGRlIHN1Ym1pc3PDo28uIEEgY29udGludWlkYWRlIGRvIHByb2Nlc3NvCmVxdWl2YWxlIMOgIGFzc2luYXR1cmEgZGVzdGUgZG9jdW1lbnRvLCBjb20gdG9kYXMgYXMgY29uc2Vxw7zDqm5jaWFzIG5lbGUKcHJldmlzdGFzLCBzdWplaXRhbmRvLXNlIG8gc2lnbmF0w6FyaW8gYSBzYW7Dp8O1ZXMgY2l2aXMgZSBjcmltaW5haXMgY2Fzbwpuw6NvIHNlamEgdGl0dWxhciBkb3MgZGlyZWl0b3MgYXV0b3JhaXMgcGF0cmltb25pYWlzIGUvb3UgY29uZXhvcwphcGxpY8OhdmVpcyDDoCBPYnJhIGRlcG9zaXRhZGEgZHVyYW50ZSBlc3RlIHByb2Nlc3NvLCBvdSBjYXNvIG7Do28gdGVuaGEKb2J0aWRvIHByw6l2aWEgZSBleHByZXNzYSBhdXRvcml6YcOnw6NvIGRvIHRpdHVsYXIgcGFyYSBvIGRlcMOzc2l0byBlCnRvZG9zIG9zIHVzb3MgZGEgT2JyYSBlbnZvbHZpZG9zLgoKClBhcmEgYSBzb2x1w6fDo28gZGUgcXVhbHF1ZXIgZMO6dmlkYSBxdWFudG8gYW9zIHRlcm1vcyBkZSBsaWNlbmNpYW1lbnRvIGUKbyBwcm9jZXNzbyBkZSBzdWJtaXNzw6NvLCBjbGlxdWUgbm8gbGluayAiRmFsZSBjb25vc2NvIi4K
dc.title.eng.fl_str_mv A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso
title A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso
spellingShingle A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso
Mussumeci, Elisa
Machine learning
Neural networks
Time series
Forecasting
Epidemiology
Aprendizado por máquina
Redes neurais
Matemática
Análise de séries temporais
Redes neurais (Computação)
Modelagem de dados
Análise de regressão
Dengue
title_short A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso
title_full A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso
title_fullStr A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso
title_full_unstemmed A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso
title_sort A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso
author Mussumeci, Elisa
author_facet Mussumeci, Elisa
author_role author
dc.contributor.unidadefgv.por.fl_str_mv Escolas::EMAp
dc.contributor.member.none.fl_str_mv Targino, Rodrigo dos Santos
Bastos, Leonardo Soares
dc.contributor.author.fl_str_mv Mussumeci, Elisa
dc.contributor.advisor1.fl_str_mv Coelho, Flávio Codeço
contributor_str_mv Coelho, Flávio Codeço
dc.subject.eng.fl_str_mv Machine learning
Neural networks
Time series
Forecasting
Epidemiology
topic Machine learning
Neural networks
Time series
Forecasting
Epidemiology
Aprendizado por máquina
Redes neurais
Matemática
Análise de séries temporais
Redes neurais (Computação)
Modelagem de dados
Análise de regressão
Dengue
dc.subject.por.fl_str_mv Aprendizado por máquina
Redes neurais
dc.subject.area.por.fl_str_mv Matemática
dc.subject.bibliodata.por.fl_str_mv Análise de séries temporais
Redes neurais (Computação)
Modelagem de dados
Análise de regressão
Dengue
description We used the Infodengue database of incidence and weather time-series, to train predictive models for the weekly number of cases of dengue in 790 cities of Brazil. To overcome a limitation in the length of time-series available to train the model, we proposed using the time series of epidemiologically similar cities as predictors for the incidence of each city. As Machine Learning-based forecasting models have been used in recent years with reasonable success, in this work we compare three machine learning models: Random Forest, lasso and Long-short term memory neural network in their forecasting performance for all cities monitored by the Infodengue Project.
publishDate 2018
dc.date.accessioned.fl_str_mv 2018-06-14T19:45:29Z
dc.date.available.fl_str_mv 2018-06-14T19:45:29Z
dc.date.issued.fl_str_mv 2018-04-12
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10438/24093
url http://hdl.handle.net/10438/24093
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional do FGV (FGV Repositório Digital)
instname:Fundação Getulio Vargas (FGV)
instacron:FGV
instname_str Fundação Getulio Vargas (FGV)
instacron_str FGV
institution FGV
reponame_str Repositório Institucional do FGV (FGV Repositório Digital)
collection Repositório Institucional do FGV (FGV Repositório Digital)
bitstream.url.fl_str_mv https://repositorio.fgv.br/bitstreams/e39d46fa-2d4f-40a1-bf11-e1d147086657/download
https://repositorio.fgv.br/bitstreams/18b2ab41-7c3a-406b-a2cd-7bc642ddb124/download
https://repositorio.fgv.br/bitstreams/b8d0840a-d7b1-494b-bea2-42bc7ee0cbc4/download
https://repositorio.fgv.br/bitstreams/00e825f7-4581-4509-8067-220b5abbc788/download
bitstream.checksum.fl_str_mv e24a879264a1778f8a8f8e8bfe70877a
52b25abf2711fdd6d1a338316c15c154
dfb340242cced38a6cca06c627998fa1
9234575c9d4ea35156b172ab8f548e0b
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional do FGV (FGV Repositório Digital) - Fundação Getulio Vargas (FGV)
repository.mail.fl_str_mv
_version_ 1802749781893709824