Data mining tool for academic data exploitation: literature review and first architecture proposal

Bibliographic details
Main author: Barbu, Marian
Publication date: 2017
Other authors: Vilanova, Ramon, Lopez Vicario, José, Pereira, Maria João, Alves, Paulo, Podpora, Michal, Ángel Prada, Miguel, Morán, Antonio, Torreburno, Aldo, Marin, Simona, Tocu, Rodica
Document type: Report
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10198/14337
Abstract: Using data to make decisions is not new; companies run complex computations on customer data for business intelligence or analytics. Business intelligence techniques can discern historical patterns and trends in data and can create models that predict future ones. Analytics, broadly defined, comprises applied techniques from computer science, mathematics, and statistics for extracting usable information from very large datasets. Data itself is not new either: it has always been generated and used to inform decision-making. Most of it, however, was structured and organised, gathered through regular data collections, surveys, and similar instruments. What is new, with the invention and dominance of the Internet and the expansion of digital systems across all sectors, is the amount of unstructured data we are generating. This is what we call the digital footprint: the traces that individuals leave behind as they interact with their increasingly digital world. Data analytics is the process by which data is collected and analysed in order to identify patterns, make predictions, and inform business decisions. Our capacity to perform increasingly sophisticated analytics is changing the way we make predictions and decisions, with huge potential to improve competitive intelligence. Such examples may suggest that the actions arising from data mining and analytics are always automatic, but that is less often the case. Educational Data Mining (EDM) and Learning Analytics (LA) have the potential to make visible data that have so far gone unseen, unnoticed, and therefore unactionable.
To help advance these fields and gain value from their practical applications, educators and administrators are recommended to:
• Develop a culture of using data for making instructional decisions;
• Involve IT departments in planning for data collection and use;
• Be smart data consumers who ask critical questions about commercial offerings and create demand for the most useful features and uses;
• Start with focused areas where data will help, show success, and then expand to new areas;
• Communicate with students and parents about where data come from and how the data are used;
• Help align state policies with technical requirements for online learning systems.
This report documents the first steps taken within the SPEET ERASMUS+ project. It describes the conceptualization of a practical tool for applying EDM/LA techniques to currently available academic data. The document is also intended to contextualise the use of Big Data within the academic sector, with special emphasis on the role that student profiles and student clustering play in supporting tutoring actions. The report describes the promise of educational data mining (seeking patterns in data across many student actions), learning analytics (applying predictive models that provide actionable information), and visual data analytics (interactive displays of analyzed data), and how they might serve the future of personalized learning and the development and continuous improvement of adaptive systems. How might they operate in an adaptive learning system? What inputs and outputs are to be expected? The following sections address these questions by giving a system-level view of how data mining and analytics could improve teaching and learning by creating feedback loops. Finally, the report proposes the key elements that make up a software application intended to support this academic data analysis.
Three key elements are presented: data, algorithms, and application architecture. First, a minimum set of data must be available; the corresponding relational database structure is presented. This basic data can always be complemented with other available data that may help to make and/or explain decisions. Second, classification algorithms are reviewed, and the report shows how they can be applied to the student clustering problem. Third, a suitable software architecture acts as an umbrella that connects the previous two parts. The document is intended to provide a first understanding of academic data analysis: what we can get from it and what we need to do. This is the first in a series of reports that, taken together, will provide a complete and consistent view of how data mining can be included as a helping hand in tutoring.
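The student clustering idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which algorithm or which student attributes the report uses, so everything here is an assumption made for illustration: the choice of plain k-means, the function name `kmeans`, the two features, and the sample records are all hypothetical.

```python
from math import dist
from statistics import fmean


def kmeans(points, k, iterations=20):
    """Plain k-means on a list of numeric tuples; returns (centroids, labels)."""
    # Deterministic initialisation (an assumption): the first k points are the
    # starting centroids, so repeated runs give the same result.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each student joins the nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(fmean(col) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical records: (average grade on a 0-10 scale, share of credits passed)
students = [(8.5, 0.95), (4.0, 0.40), (8.9, 1.00), (3.5, 0.35), (9.1, 0.90), (4.4, 0.50)]
centroids, labels = kmeans(students, k=2)
for student, label in zip(students, labels):
    print(student, "-> group", label)
```

In this toy data the two groups separate cleanly on the grade axis. With real academic records the features would normally be rescaled first, since a 0-10 grade otherwise dominates a 0-1 ratio in the distance computation.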
id RCAP_a92da6534b8d03426411cf25cbc93dd9
oai_identifier_str oai:bibliotecadigital.ipb.pt:10198/14337
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling European Union; Programme: Erasmus+; Project Reference: 2016-1-ES01-KA203-025452
dc.title.none.fl_str_mv Data mining tool for academic data exploitation: literature review and first architecture proposal
author Barbu, Marian
dc.contributor.none.fl_str_mv Biblioteca Digital do IPB
dc.contributor.author.fl_str_mv Barbu, Marian
Vilanova, Ramon
Lopez Vicario, José
Pereira, Maria João
Alves, Paulo
Podpora, Michal
Ángel Prada, Miguel
Morán, Antonio
Torreburno, Aldo
Marin, Simona
Tocu, Rodica
dc.subject.por.fl_str_mv Academic analytics
Learning analytics
Big data in education
Educational data mining
Student profile
Dropout prevention
publishDate 2017
dc.date.none.fl_str_mv 2017-07-06T10:31:04Z
2017
2017-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/report
format report
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10198/14337
url http://hdl.handle.net/10198/14337
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Barbu, Marian; Vilanova, Ramon; Lopez Vicario, José; Pereira, Maria João; Alves, Paulo; Podpora, Michal; Ángel Prada, Miguel; Morán, Antonio; Torreburno, Aldo; Marin, Simona; Tocu, Rodica (2017). Data mining tool for academic data exploitation: literature review and first architecture proposal. Bragança: Instituto Politécnico. ISBN 978-972-745-228-6
978-972-745-228-6
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Instituto Politécnico de Bragança
publisher.none.fl_str_mv Instituto Politécnico de Bragança
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação