Scholar trend learner: predicting scholar popularity as early and accurate as possible
Main author: | Masoumeh Nezhadbiglari |
---|---|
Publication date: | 2016 |
Document type: | Master's thesis |
Language: | por |
Source title: | Repositório Institucional da UFMG |
Full text: | http://hdl.handle.net/1843/48884 |
Abstract: | Prediction of scholar popularity has become an important research topic for a number of reasons. In this dissertation, we tackle the problem of predicting the popularity trend of scholars, concentrating on making predictions as early and as accurately as possible. To perform the prediction task, we first extract the popularity trends of scholars from a training set. To that end, we apply a time series clustering algorithm called K-Spectral Clustering (K-SC) to identify the popularity trends as cluster centroids. We then predict trends for scholars in a test set by solving a classification problem. Specifically, we first compute a set of measures for individual scholars based on the distance between the early points of their popularity curves and the identified centroids. We then combine those distance measures with a set of academic features (e.g., number of publications, number of venues) collected during the same monitoring period, and use them as input to a classification method. One aspect that distinguishes our method from other approaches is that the monitoring period, during which we gather information on each scholar's popularity and academic features, is determined on a per-scholar basis as part of our approach. Using total citation count as the measure of scientific popularity, we evaluate our solution on the popularity time series of more than 500,000 Computer Science scholars, gathered from Microsoft Azure Marketplace. The experimental results show that our prediction method outperforms alternative prediction methods. We also show how to apply our method jointly with regression models to improve the prediction of scholar popularity values (e.g., number of citations) at a future time. |
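
The feature-construction step described in the abstract can be illustrated with a minimal sketch. It assumes the scale- and shift-invariant distance commonly associated with K-SC, d(x, y) = min over α and q of ‖x − α·y(q)‖ / ‖x‖, where y(q) is y shifted by q time steps. The function names (`shift`, `ksc_distance`, `early_features`), the fixed prefix length `t`, and the toy centroids are hypothetical and are not taken from the dissertation, which determines the monitoring period per scholar.

```python
import math


def shift(series, q):
    """Shift a series by q steps (right if q > 0, left if q < 0), padding with zeros."""
    n = len(series)
    if q >= 0:
        return [0.0] * min(q, n) + list(series[: max(n - q, 0)])
    return list(series[-q:]) + [0.0] * min(-q, n)


def ksc_distance(x, y, max_shift=0):
    """Scale- and shift-invariant distance between two equal-length series.

    For each candidate shift q, the optimal scaling is
    alpha = <x, y_q> / <y_q, y_q>; the distance is the smallest
    relative residual ||x - alpha * y_q|| / ||x|| over all shifts.
    """
    norm_x = math.sqrt(sum(v * v for v in x))
    if norm_x == 0.0:
        raise ValueError("x must contain at least one non-zero value")
    best = float("inf")
    for q in range(-max_shift, max_shift + 1):
        y_q = shift(y, q)
        yy = sum(v * v for v in y_q)
        alpha = sum(a * b for a, b in zip(x, y_q)) / yy if yy > 0 else 0.0
        residual = math.sqrt(sum((a - alpha * b) ** 2 for a, b in zip(x, y_q)))
        best = min(best, residual / norm_x)
    return best


def early_features(curve, centroids, t, academic):
    """Build a feature vector from the first t points of a popularity curve.

    The vector holds one distance per trend centroid (hypothetical layout),
    followed by the academic features observed over the same period.
    """
    prefix = curve[:t]
    distances = [ksc_distance(prefix, c[:t]) for c in centroids]
    return distances + list(academic)


# Toy example: two made-up trend centroids (rising and falling).
centroids = [[1, 2, 3, 4], [4, 3, 2, 1]]
features = early_features([2, 4, 6, 8], centroids, t=3, academic=[5, 2])
```

The resulting vector would then be passed to a classifier of choice; the abstract does not name the classification method, so none is assumed here.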
id |
UFMG_4c88bcd9084338e5b071ef925636510d |
---|---|
oai_identifier_str |
oai:repositorio.ufmg.br:1843/48884 |
network_acronym_str |
UFMG |
network_name_str |
Repositório Institucional da UFMG |
repository_id_str |
|
dc.title.pt_BR.fl_str_mv |
Scholar trend learner: predicting scholar popularity as early and accurate as possible |
title |
Scholar trend learner: predicting scholar popularity as early and accurate as possible |
author |
Masoumeh Nezhadbiglari |
author_facet |
Masoumeh Nezhadbiglari |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
Marcos André Gonçalves |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/3457219624656691 |
dc.contributor.advisor-co1.fl_str_mv |
Jussara Marques de Almeida Gonçalves |
dc.contributor.referee1.fl_str_mv |
Alberto Henrique Frade Laender |
dc.contributor.referee2.fl_str_mv |
Fabrício Benevenuto de Souza |
dc.contributor.authorLattes.fl_str_mv |
https://doi.org/10.1145/2910896.2910905 |
dc.contributor.author.fl_str_mv |
Masoumeh Nezhadbiglari |
contributor_str_mv |
Marcos André Gonçalves; Jussara Marques de Almeida Gonçalves; Alberto Henrique Frade Laender; Fabrício Benevenuto de Souza |
dc.subject.por.fl_str_mv |
Trend Classification; Scholar’s Popularity |
topic |
Trend Classification; Scholar’s Popularity; Computing – Theses; Data mining; Academic writing – Bibliometrics |
dc.subject.other.pt_BR.fl_str_mv |
Computing – Theses; Data mining; Academic writing – Bibliometrics |
publishDate |
2016 |
dc.date.issued.fl_str_mv |
2016-10-11 |
dc.date.accessioned.fl_str_mv |
2023-01-12T13:08:39Z |
dc.date.available.fl_str_mv |
2023-01-12T13:08:39Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1843/48884 |
url |
http://hdl.handle.net/1843/48884 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/pt/ info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/pt/ |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Ciência da Computação |
dc.publisher.initials.fl_str_mv |
UFMG |
dc.publisher.country.fl_str_mv |
Brazil |
dc.publisher.department.fl_str_mv |
ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO |
publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG |
instname_str |
Universidade Federal de Minas Gerais (UFMG) |
instacron_str |
UFMG |
institution |
UFMG |
reponame_str |
Repositório Institucional da UFMG |
collection |
Repositório Institucional da UFMG |
bitstream.url.fl_str_mv |
https://repositorio.ufmg.br/bitstream/1843/48884/1/Masoumeh.pdf https://repositorio.ufmg.br/bitstream/1843/48884/2/license_rdf https://repositorio.ufmg.br/bitstream/1843/48884/3/license.txt |
bitstream.checksum.fl_str_mv |
ab9babdfa22c85caa977b9f880d66fe4 cfd6801dba008cb6adbd9838b81582ab cda590c95a0b51b4d15f60c9642ca272 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |
repository.mail.fl_str_mv |
|
_version_ |
1801676863287853056 |