Multi-scale analysis of languages and knowledge through complex networks

Detalhes bibliográficos
Autor(a) principal: Arruda, Henrique Ferraz de
Data de Publicação: 2019
Tipo de documento: Tese
Idioma: eng
Título da fonte: Biblioteca Digital de Teses e Dissertações da USP
Texto Completo: http://www.teses.usp.br/teses/disponiveis/55/55134/tde-26042019-105207/
Resumo: There any many different aspects in natural languages and their related dynamics that have been studied. In the case of languages, some quantitative analyses have been done by using stochastic models. Furthermore, natural languages can be understood as complex systems. Thus, there is a possibility to use set of tools development to analyse complex networks, which are computationally represented by graphs, also to analyse natural languages. Furthermore, these tools can be used to represent and analyse some related dynamics taking place on the networks. Observe that knowledge is intrinsically related to language, because language is the vehicle used by humans beings to transmit dicoveries, and the language itself is also a type of knowledge. This thesis is divided into two types of analyses: (i) texts and (II) dynamical aspects. In the first part, we proposed networks representations of text in different scales analyses, starting from the analysis of writing style considering word adjacency networks (co-occurence) to understand local patterns of words, to a mesoscopic representation, which is created from chunks of text and grasps information of the unfolding of the story. In the second part, we considered the structure and dynamics related to knowledge and language, in this case, starting from the larger scale, in which we studied the connectivity between applied and theoretical physics. In the following, we simulated the knowledge acquisition by researchers in a multi-agent dynamics and an intelligent machine that solves problems, which is represented by a network. At the smallest considered scale, we simulate the transmission of networks. This transmission considers the data as a series of organized symbols that is obtained from a dynamics. In order to improve the speed of transmission, the series can be compacted. For that, we considered the information theory and Huffman code. The proposed network-based approaches were found to be suitable to deal with the employed analysis for all of the tested scales.
id USP_cd1e9b1f345f5bd9eee9379b171fd773
oai_identifier_str oai:teses.usp.br:tde-26042019-105207
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
spelling Multi-scale analysis of languages and knowledge through complex networksAnálise multi-escala de línguas e conecimento por meio de redes complexasClassificação de textosComplex networksDinâmicas relacionadas ao conhecimentoDynamics related to knowledgeMineração de textosRedes complexasText classificationText miningThere any many different aspects in natural languages and their related dynamics that have been studied. In the case of languages, some quantitative analyses have been done by using stochastic models. Furthermore, natural languages can be understood as complex systems. Thus, there is a possibility to use set of tools development to analyse complex networks, which are computationally represented by graphs, also to analyse natural languages. Furthermore, these tools can be used to represent and analyse some related dynamics taking place on the networks. Observe that knowledge is intrinsically related to language, because language is the vehicle used by humans beings to transmit dicoveries, and the language itself is also a type of knowledge. This thesis is divided into two types of analyses: (i) texts and (II) dynamical aspects. In the first part, we proposed networks representations of text in different scales analyses, starting from the analysis of writing style considering word adjacency networks (co-occurence) to understand local patterns of words, to a mesoscopic representation, which is created from chunks of text and grasps information of the unfolding of the story. In the second part, we considered the structure and dynamics related to knowledge and language, in this case, starting from the larger scale, in which we studied the connectivity between applied and theoretical physics. In the following, we simulated the knowledge acquisition by researchers in a multi-agent dynamics and an intelligent machine that solves problems, which is represented by a network. At the smallest considered scale, we simulate the transmission of networks. This transmission considers the data as a series of organized symbols that is obtained from a dynamics. In order to improve the speed of transmission, the series can be compacted. For that, we considered the information theory and Huffman code. The proposed network-based approaches were found to be suitable to deal with the employed analysis for all of the tested scales.Existem diversos aspectos das linguagens naturais e de dinâmicas relacionadas que estão sendo estudadas. No caso das línguas, algumas análises quantitativas foram feitas usando modelos estocásticos. Ademais, linguagens naturais podem ser entendidas como sistemas complexos. Para analisar linguagens naturais, existe a possibilidade de utilizar o conjunto de ferramentas que já foram desenvolvidas para analisar redes complexas, que são representadas computacionalmente. Além disso, tais ferramentas podem ser utilizadas para representar e analisar algumas dinâmicas relacionadas a redes complexas. Observe que o conhecimento está intrinsecamente relacionado à linguagem, pois a linguagem é o veículo usado para transmitir novas descobertas, sendo que a própria linguagem também é um tipo de conhecimento. Esta tese é dividida em dois tipos de análise : (i) textos e (ii) aspectos dinâmicos. Na primeira parte foram propostas representações de redes de texto em diferentes escalas de análise. A partir da análise do estilo de escrita, considerando redes de adjacência de palavras (co-ocorrência) para entender padrões locais de palavras, até uma representação mesoscópica, que é criada a partir de pedaços de texto e que representa informações do texto de acordo com o desenrolar da história. Na segunda parte, foram consideradas a estrutura e dinâmica relacionadas ao conhecimento e à linguagem. Neste caso, partiu-se da escala maior, com a qual estudamos a conectividade entre física aplicada e física teórica. A seguir, simulou-se a aquisição de conhecimento por pesquisadores em uma dinâmica multi-agente e uma máquina inteligente que resolve problemas, que é representada por uma rede. Como a menor escala considerada, foi simulada a transmissão de redes. Essa transmissão considera os dados como uma série de símbolos organizados que são obtidos a partir de uma dinâmica. Para melhorar a velocidade de transmissão, a série pode ser compactada. Para tanto, foi utilizada a teoria da informação e o código de Huffman. As propostas de abordagens baseadas em rede foram consideradas adequadas para lidar com a análise empregada, em todas as escalas testadas.Biblioteca Digitais de Teses e Dissertações da USPAmancio, Diego RaphaelCosta, Luciano da FontouraArruda, Henrique Ferraz de2019-01-24info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/doctoralThesisapplication/pdfhttp://www.teses.usp.br/teses/disponiveis/55/55134/tde-26042019-105207/reponame:Biblioteca Digital de Teses e Dissertações da USPinstname:Universidade de São Paulo (USP)instacron:USPLiberar o conteúdo para acesso público.info:eu-repo/semantics/openAccesseng2019-06-07T17:53:01Zoai:teses.usp.br:tde-26042019-105207Biblioteca Digital de Teses e Dissertaçõeshttp://www.teses.usp.br/PUBhttp://www.teses.usp.br/cgi-bin/mtd2br.plvirginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.bropendoar:27212019-06-07T17:53:01Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)false
dc.title.none.fl_str_mv Multi-scale analysis of languages and knowledge through complex networks
Análise multi-escala de línguas e conecimento por meio de redes complexas
title Multi-scale analysis of languages and knowledge through complex networks
spellingShingle Multi-scale analysis of languages and knowledge through complex networks
Arruda, Henrique Ferraz de
Classificação de textos
Complex networks
Dinâmicas relacionadas ao conhecimento
Dynamics related to knowledge
Mineração de textos
Redes complexas
Text classification
Text mining
title_short Multi-scale analysis of languages and knowledge through complex networks
title_full Multi-scale analysis of languages and knowledge through complex networks
title_fullStr Multi-scale analysis of languages and knowledge through complex networks
title_full_unstemmed Multi-scale analysis of languages and knowledge through complex networks
title_sort Multi-scale analysis of languages and knowledge through complex networks
author Arruda, Henrique Ferraz de
author_facet Arruda, Henrique Ferraz de
author_role author
dc.contributor.none.fl_str_mv Amancio, Diego Raphael
Costa, Luciano da Fontoura
dc.contributor.author.fl_str_mv Arruda, Henrique Ferraz de
dc.subject.por.fl_str_mv Classificação de textos
Complex networks
Dinâmicas relacionadas ao conhecimento
Dynamics related to knowledge
Mineração de textos
Redes complexas
Text classification
Text mining
topic Classificação de textos
Complex networks
Dinâmicas relacionadas ao conhecimento
Dynamics related to knowledge
Mineração de textos
Redes complexas
Text classification
Text mining
description There any many different aspects in natural languages and their related dynamics that have been studied. In the case of languages, some quantitative analyses have been done by using stochastic models. Furthermore, natural languages can be understood as complex systems. Thus, there is a possibility to use set of tools development to analyse complex networks, which are computationally represented by graphs, also to analyse natural languages. Furthermore, these tools can be used to represent and analyse some related dynamics taking place on the networks. Observe that knowledge is intrinsically related to language, because language is the vehicle used by humans beings to transmit dicoveries, and the language itself is also a type of knowledge. This thesis is divided into two types of analyses: (i) texts and (II) dynamical aspects. In the first part, we proposed networks representations of text in different scales analyses, starting from the analysis of writing style considering word adjacency networks (co-occurence) to understand local patterns of words, to a mesoscopic representation, which is created from chunks of text and grasps information of the unfolding of the story. In the second part, we considered the structure and dynamics related to knowledge and language, in this case, starting from the larger scale, in which we studied the connectivity between applied and theoretical physics. In the following, we simulated the knowledge acquisition by researchers in a multi-agent dynamics and an intelligent machine that solves problems, which is represented by a network. At the smallest considered scale, we simulate the transmission of networks. This transmission considers the data as a series of organized symbols that is obtained from a dynamics. In order to improve the speed of transmission, the series can be compacted. For that, we considered the information theory and Huffman code. The proposed network-based approaches were found to be suitable to deal with the employed analysis for all of the tested scales.
publishDate 2019
dc.date.none.fl_str_mv 2019-01-24
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://www.teses.usp.br/teses/disponiveis/55/55134/tde-26042019-105207/
url http://www.teses.usp.br/teses/disponiveis/55/55134/tde-26042019-105207/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Liberar o conteúdo para acesso público.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Liberar o conteúdo para acesso público.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digitais de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digitais de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1815257357064601600