Never Ending Language metaLearning: model management for CMU's ReadTheWeb project
Main author: | Tiago Miguel Martins Vieira |
---|---|
Publication date: | 2015 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | https://hdl.handle.net/10216/110259 |
Abstract: | The main goal of CMU's ReadTheWeb project is to build a new kind of machine learning system that continuously reads the web, 24 hours per day, 7 days per week. This system is called the Never-Ending Language Learner (NELL). While this goal is not necessarily unheard of, NELL stands out for being able to improve the way it learns over time: it learns to read the web better than it did the day before. To succeed in this task, NELL combines several subsystem components that implement complementary knowledge extraction methods, so the same task can be handled by different extraction methods. However, the performance of the components that use these methods, that is, the quality of the extracted knowledge, changes over time. To maximize the performance of the system as a whole, it becomes necessary to choose the best component for a task at any given time. Given the amount of data and the number of algorithms involved, traditional testing and selection methods are not a viable option. A preliminary approach that uses metalearning to address this issue was proposed by Santos. In this project, we extend that work. Our approach seeks to relate the innate (meta)features of the data to the performance of the algorithms. A first step will be to gather different sets of data (used in NELL) and test the performance of the above-mentioned subsystem components on that data. The results are used to create a metalearning system that can select the best algorithm for future sets of data. If it proves successful, this system can then be implemented in NELL's framework to improve its learning capability. |
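The selection step described in the abstract can be illustrated with a minimal sketch. It is not the method used in the dissertation: it assumes a nearest-neighbour metalearner, and the meta-features (`n_instances`, `n_features`, `class_entropy`), the component names, and every number in `META_DATASET` are hypothetical, invented only to show the idea of mapping data set meta-features to the component that performed best.

```python
import math
from collections import Counter

# Hypothetical meta-dataset (all values invented for illustration).
# Each past data set is described by three meta-features
# (n_instances, n_features, class_entropy) and labelled with the
# component that performed best on it.
META_DATASET = [
    ((1000.0, 20.0, 0.90), "component_A"),
    ((1200.0, 25.0, 0.80), "component_A"),
    ((800.0, 15.0, 0.95), "component_A"),
    ((50000.0, 300.0, 0.30), "component_B"),
    ((45000.0, 280.0, 0.40), "component_B"),
]


def feature_bounds(rows):
    """Return (min, max) per meta-feature so features share one scale."""
    return [(min(col), max(col)) for col in zip(*rows)]


def normalise(row, bounds):
    """Min-max scale one meta-feature vector using the given bounds."""
    return tuple(
        0.0 if hi == lo else (x - lo) / (hi - lo)
        for x, (lo, hi) in zip(row, bounds)
    )


def recommend(meta_features, meta_dataset=META_DATASET, k=3):
    """Recommend a component by majority vote among the k most similar
    past data sets (Euclidean distance on scaled meta-features)."""
    bounds = feature_bounds([features for features, _ in meta_dataset])
    query = normalise(meta_features, bounds)
    ranked = sorted(
        meta_dataset,
        key=lambda item: math.dist(query, normalise(item[0], bounds)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # A new, small, high-entropy data set resembles the "component_A" cases.
    print(recommend((900.0, 18.0, 0.90)))
```

In a real setting the meta-dataset would come from measuring each component on actual NELL data, and the nearest-neighbour vote could be replaced by any classifier or ranking model.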
id |
RCAP_7494c74e307e6be273f0599b86715b68 |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/110259 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Never Ending Language metaLearning: model management for CMU's ReadTheWeb project |
author |
Tiago Miguel Martins Vieira |
author_facet |
Tiago Miguel Martins Vieira |
author_role |
author |
dc.contributor.author.fl_str_mv |
Tiago Miguel Martins Vieira |
dc.subject.por.fl_str_mv |
Engenharia electrotécnica, electrónica e informática Electrical engineering, Electronic engineering, Information engineering |
topic |
Engenharia electrotécnica, electrónica e informática Electrical engineering, Electronic engineering, Information engineering |
publishDate |
2015 |
dc.date.none.fl_str_mv |
2015-07-20 2015-07-20T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/110259 TID:201322269 |
url |
https://hdl.handle.net/10216/110259 |
identifier_str_mv |
TID:201322269 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799135943991492609 |