Never Ending Language metaLearning: model management for CMU's ReadTheWeb project

Tiago Miguel Martins Vieira

Bibliographic details
Main author:	Tiago Miguel Martins Vieira
Publication date:	2015
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	https://hdl.handle.net/10216/110259
Abstract:	The main goal of CMU's ReadTheWeb project is to build a new kind of machine learning system that continuously reads the web, 24 hours per day, 7 days per week. This system is called the Never-Ending Language Learner (NELL). While this goal is not necessarily unheard of, NELL stands out for its ability to improve the way it learns over time; that is to say, it learns to read the web better than it did the day before. To succeed in such an arduous quest, NELL combines several subsystem components that implement complementary knowledge extraction methods. For the same task, NELL is able to use different extraction methods. The performance of the components that use such methods, that is, the quality of the extracted knowledge, will however change over time. In order to maximize the performance of the system as a whole, it becomes necessary to choose the best component for a task at any given time. Due to the amount of data and the number of algorithms involved, traditional testing and selection methods are not a viable option. A preliminary approach that uses metalearning to address this issue was proposed by Santos. In this project, we extend this work. Our approach seeks to relate the innate (meta)features of the data to the performance of the algorithms. A first step will be to gather different sets of data (used in NELL) and test the performance of the above-mentioned subsystem components on such data. The results are used to create a metalearning system that can select the best algorithm for future sets of data. If proven successful, this system can then be implemented in NELL's framework to improve its learning capability.

Item metadata

id	RCAP_7494c74e307e6be273f0599b86715b68
oai_identifier_str	oai:repositorio-aberto.up.pt:10216/110259
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Never Ending Language metaLearning: model management for CMU's ReadTheWeb project
author	Tiago Miguel Martins Vieira
author_facet	Tiago Miguel Martins Vieira
author_role	author
dc.contributor.author.fl_str_mv	Tiago Miguel Martins Vieira
dc.subject.por.fl_str_mv	Engenharia electrotécnica, electrónica e informática Electrical engineering, Electronic engineering, Information engineering
topic	Engenharia electrotécnica, electrónica e informática Electrical engineering, Electronic engineering, Information engineering
publishDate	2015
dc.date.none.fl_str_mv	2015-07-20 2015-07-20T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://hdl.handle.net/10216/110259 TID:201322269
url	https://hdl.handle.net/10216/110259
identifier_str_mv	TID:201322269
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799135943991492609
