A fast access big data approach for configurable and scalable object storage Enabling mixed fault-tolerance

Bibliographic details
Main author: Valêncio, Carlos Roberto [UNESP]
Publication date: 2017
Other authors: Caetano, André Francisco Morielo [UNESP], Colombini, Angelo Cesar, Tronco, Mário Luiz, Fortes, Márcio Zamboti
Document type: Article
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.3844/jcssp.2017.192.198
http://hdl.handle.net/11449/174933
Abstract: The progressive growth in the volume of digital data has become a technological challenge of great interest in the field of computer science. This is because, with the spread of personal computers and networks worldwide, content is being generated in larger volumes and in formats very different from what had been usual until then. Analyzing and extracting relevant knowledge from these large, complex masses of data is particularly interesting, but first it is necessary to develop techniques that support their resilient storage. Very often, storage systems use a replication scheme to preserve the integrity of stored data. This involves generating copies of all information so that losses caused by the individual hardware failures inherent in any massive storage infrastructure do not compromise access to what was stored. However, accommodating such copies often requires much more storage space than the information would originally occupy. For this reason, error correction codes, or erasure codes, have been used; they take a considerably more refined mathematical approach than simple replication and generate a smaller storage overhead than their predecessor techniques. The contribution of this work is a fully decentralized storage strategy that, on average, improves access latency by over 80% for both replicated and encoded data, while reducing the overhead by 55% for a terabyte-sized dataset when encoded, compared to related works in the literature.
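The storage trade-off described in the abstract can be made concrete with a minimal sketch of the overhead arithmetic. The replica count of 3 and the (10, 4) erasure-code parameters used in the comments are illustrative assumptions, not values reported in the article, and the function names are hypothetical.

```python
def replication_overhead(copies: int) -> float:
    """Extra storage, as a fraction of the original size, when the data is
    kept as `copies` full replicas (the original counts as one copy)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies - 1)


def erasure_overhead(k: int, m: int) -> float:
    """Extra storage for a (k, m) erasure code: the object is split into
    k data fragments and m parity fragments of the same size are added."""
    if k < 1 or m < 0:
        raise ValueError("k must be >= 1 and m must be >= 0")
    return m / k


def stored_bytes(original_bytes: int, overhead: float) -> float:
    """Total bytes on disk for a given fractional overhead."""
    return original_bytes * (1.0 + overhead)


# Assumed example: 3 full copies add 200% extra storage, while a
# (10, 4) code adds 40% and still tolerates the loss of any 4 fragments
# if the code is maximum distance separable (e.g. Reed-Solomon).
three_way = replication_overhead(3)
coded = erasure_overhead(10, 4)
```

Under these assumed parameters, replication stores three times the original volume, whereas the erasure-coded layout stores 1.4 times the original volume; the figures actually measured by the authors are the ones quoted in the abstract.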
id UNSP_9cb7bd49cd6ca10bf8dcf5ccfb2e3559
oai_identifier_str oai:repositorio.unesp.br:11449/174933
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
spelling Department of Computer Science and Statistics - DCCE, São Paulo State University (Unesp), Institute of Biosciences, Humanities and Exact Sciences (Ibilce), Campus São José do Rio Preto
Department of Computer Science and Statistics, Federal University of São Carlos (UFSCar), São Carlos
Department of Mechanical Engineering - EESC, São Paulo University (USP), São Carlos
Department of Electrical Engineering - TEE, Fluminense Federal University (UFF)
dc.title.none.fl_str_mv A fast access big data approach for configurable and scalable object storage Enabling mixed fault-tolerance
title A fast access big data approach for configurable and scalable object storage Enabling mixed fault-tolerance
spellingShingle A fast access big data approach for configurable and scalable object storage Enabling mixed fault-tolerance
Valêncio, Carlos Roberto [UNESP]
Big data
Cache
Data storage
Erasure coding
Object storage
author Valêncio, Carlos Roberto [UNESP]
author_facet Valêncio, Carlos Roberto [UNESP]
Caetano, André Francisco Morielo [UNESP]
Colombini, Angelo Cesar
Tronco, Mário Luiz
Fortes, Márcio Zamboti
author_role author
author2 Caetano, André Francisco Morielo [UNESP]
Colombini, Angelo Cesar
Tronco, Mário Luiz
Fortes, Márcio Zamboti
author2_role author
author
author
author
dc.contributor.none.fl_str_mv Universidade Estadual Paulista (Unesp)
Universidade Federal de São Carlos (UFSCar)
Universidade de São Paulo (USP)
Fluminense Federal University (UFF)
dc.contributor.author.fl_str_mv Valêncio, Carlos Roberto [UNESP]
Caetano, André Francisco Morielo [UNESP]
Colombini, Angelo Cesar
Tronco, Mário Luiz
Fortes, Márcio Zamboti
dc.subject.por.fl_str_mv Big data
Cache
Data storage
Erasure coding
Object storage
publishDate 2017
dc.date.none.fl_str_mv 2017-07-01
2018-12-11T17:13:31Z
2018-12-11T17:13:31Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.3844/jcssp.2017.192.198
Journal of Computer Science, v. 13, n. 6, p. 192-198, 2017.
1549-3636
http://hdl.handle.net/11449/174933
10.3844/jcssp.2017.192.198
2-s2.0-85025129320
2-s2.0-85025129320.pdf
4644812253875832
0000-0002-9325-3159
url http://dx.doi.org/10.3844/jcssp.2017.192.198
http://hdl.handle.net/11449/174933
identifier_str_mv Journal of Computer Science, v. 13, n. 6, p. 192-198, 2017.
1549-3636
10.3844/jcssp.2017.192.198
2-s2.0-85025129320
2-s2.0-85025129320.pdf
4644812253875832
0000-0002-9325-3159
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Journal of Computer Science
0,147
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 192-198
application/pdf
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str Universidade Estadual Paulista (UNESP)
instacron_str UNESP
institution UNESP
reponame_str Repositório Institucional da UNESP
collection Repositório Institucional da UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv
_version_ 1808128324949508096