A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso
Main author: | Mussumeci, Elisa |
---|---|
Publication date: | 2018 |
Document type: | Master's thesis |
Language: | eng |
Source title: | Repositório Institucional do FGV (FGV Repositório Digital) |
Full text: | http://hdl.handle.net/10438/24093 |
Abstract: | We used the Infodengue database of incidence and weather time series to train predictive models for the weekly number of dengue cases in 790 cities in Brazil. To overcome the limited length of the time series available for training, we proposed using the time series of epidemiologically similar cities as predictors of each city's incidence. Because machine learning-based forecasting models have been used in recent years with reasonable success, this work compares the forecasting performance of three machine learning models, Random Forest, Lasso, and a Long Short-Term Memory (LSTM) neural network, for all cities monitored by the Infodengue Project. |
id |
FGV_166e4a6dad08831bfd7afad169ed6d5a |
---|---|
oai_identifier_str |
oai:repositorio.fgv.br:10438/24093 |
network_acronym_str |
FGV |
network_name_str |
Repositório Institucional do FGV (FGV Repositório Digital) |
repository_id_str |
3974 |
dc.title.eng.fl_str_mv |
A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso |
title |
A machine learning approach to dengue forecasting: comparing LSTM, Random Forest and Lasso |
author |
Mussumeci, Elisa |
author_facet |
Mussumeci, Elisa |
author_role |
author |
dc.contributor.unidadefgv.por.fl_str_mv |
Escolas::EMAp |
dc.contributor.member.none.fl_str_mv |
Targino, Rodrigo dos Santos; Bastos, Leonardo Soares |
dc.contributor.author.fl_str_mv |
Mussumeci, Elisa |
dc.contributor.advisor1.fl_str_mv |
Coelho, Flávio Codeço |
contributor_str_mv |
Coelho, Flávio Codeço |
dc.subject.eng.fl_str_mv |
Machine learning; Neural networks; Time series; Forecasting; Epidemiology |
topic |
Machine learning; Neural networks; Time series; Forecasting; Epidemiology; Aprendizado por máquina; Redes neurais; Matemática; Análise de séries temporais; Redes neurais (Computação); Modelagem de dados; Análise de regressão; Dengue |
dc.subject.por.fl_str_mv |
Aprendizado por máquina; Redes neurais |
dc.subject.area.por.fl_str_mv |
Matemática |
dc.subject.bibliodata.por.fl_str_mv |
Análise de séries temporais; Redes neurais (Computação); Modelagem de dados; Análise de regressão; Dengue |
description |
We used the Infodengue database of incidence and weather time series to train predictive models for the weekly number of dengue cases in 790 cities in Brazil. To overcome the limited length of the time series available for training, we proposed using the time series of epidemiologically similar cities as predictors of each city's incidence. Because machine learning-based forecasting models have been used in recent years with reasonable success, this work compares the forecasting performance of three machine learning models, Random Forest, Lasso, and a Long Short-Term Memory (LSTM) neural network, for all cities monitored by the Infodengue Project. |
publishDate |
2018 |
dc.date.accessioned.fl_str_mv |
2018-06-14T19:45:29Z |
dc.date.available.fl_str_mv |
2018-06-14T19:45:29Z |
dc.date.issued.fl_str_mv |
2018-04-12 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10438/24093 |
url |
http://hdl.handle.net/10438/24093 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional do FGV (FGV Repositório Digital) instname:Fundação Getulio Vargas (FGV) instacron:FGV |
instname_str |
Fundação Getulio Vargas (FGV) |
instacron_str |
FGV |
institution |
FGV |
reponame_str |
Repositório Institucional do FGV (FGV Repositório Digital) |
collection |
Repositório Institucional do FGV (FGV Repositório Digital) |
bitstream.url.fl_str_mv |
https://repositorio.fgv.br/bitstreams/e39d46fa-2d4f-40a1-bf11-e1d147086657/download https://repositorio.fgv.br/bitstreams/18b2ab41-7c3a-406b-a2cd-7bc642ddb124/download https://repositorio.fgv.br/bitstreams/b8d0840a-d7b1-494b-bea2-42bc7ee0cbc4/download https://repositorio.fgv.br/bitstreams/00e825f7-4581-4509-8067-220b5abbc788/download |
bitstream.checksum.fl_str_mv |
e24a879264a1778f8a8f8e8bfe70877a 52b25abf2711fdd6d1a338316c15c154 dfb340242cced38a6cca06c627998fa1 9234575c9d4ea35156b172ab8f548e0b |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional do FGV (FGV Repositório Digital) - Fundação Getulio Vargas (FGV) |
repository.mail.fl_str_mv |
|
_version_ |
1802749781893709824 |