Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus

Detalhes bibliográficos
Autor(a) principal: Almeida, Daniela
Data de Publicação: 2020
Outros Autores: Domínguez-Pérez, Dany, Matos, Ana, Agüero-Chapin, Guillermin, Castaño-Guerrero, Yuselis, Vasconcelos, Vitor, Campos, Alexandre, Antunes, Agostinho
Tipo de documento: Artigo
Idioma: eng
Título da fonte: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Texto Completo: http://hdl.handle.net/10400.22/18595
Resumo: Here we provide all datasets and details applied in the construction of a composite protein database required for the proteogenomic analyses of the article “Putative Antimicrobial Peptides of the Posterior Salivary Glands from the Cephalopod Octopus vulgaris Revealed by Exploring a Composite Protein Database”. All data, subdivided into six datasets, are deposited at the Mendeley Data repository as follows. Dataset_1 provides our composite database “All_Databases_5950827_sequences.fasta” derived from six smaller databases composed of (i) protein sequences retrieved from public databases related to cephalopods’ salivary glands, (ii) proteins identified with Proteome Discoverer software using our original data obtained by shotgun proteomic analyses of posterior salivary glands (PSGs) from three Octopus vulgaris specimens (provided as Dataset_2) and (iii) a non-redundant antimicrobial peptide (AMP) database. Dataset_3 includes the transcripts obtained by de novo assembly of 16 transcriptomes from cephalopods’ PSGs using CLC Genomics Workbench. Dataset_4 provides the proteins predicted by the TransDecoder tool from the de novo assembly of 16 transcriptomes of cephalopods’ PSGs. Further details about database construction, as well as the scripts and command lines used to construct them, are deposited within Dataset_5 and Dataset_6. The data provided in this article will assist in unravelling the role of cephalopods’ PSGs in feeding strategies, toxins and AMP production
id RCAP_da6ad4861c433561c9e365ed13136221
oai_identifier_str oai:recipp.ipp.pt:10400.22/18595
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary ApparatusOctopus vulgarisShotgun proteomicsQ-ExactiveTranscriptome de novo assemblyMass spectrometry-based proteomicsTransDecoderSix-frame translation toolCLC Genomics WorkbenchHere we provide all datasets and details applied in the construction of a composite protein database required for the proteogenomic analyses of the article “Putative Antimicrobial Peptides of the Posterior Salivary Glands from the Cephalopod Octopus vulgaris Revealed by Exploring a Composite Protein Database”. All data, subdivided into six datasets, are deposited at the Mendeley Data repository as follows. Dataset_1 provides our composite database “All_Databases_5950827_sequences.fasta” derived from six smaller databases composed of (i) protein sequences retrieved from public databases related to cephalopods’ salivary glands, (ii) proteins identified with Proteome Discoverer software using our original data obtained by shotgun proteomic analyses of posterior salivary glands (PSGs) from three Octopus vulgaris specimens (provided as Dataset_2) and (iii) a non-redundant antimicrobial peptide (AMP) database. Dataset_3 includes the transcripts obtained by de novo assembly of 16 transcriptomes from cephalopods’ PSGs using CLC Genomics Workbench. Dataset_4 provides the proteins predicted by the TransDecoder tool from the de novo assembly of 16 transcriptomes of cephalopods’ PSGs. Further details about database construction, as well as the scripts and command lines used to construct them, are deposited within Dataset_5 and Dataset_6. The data provided in this article will assist in unravelling the role of cephalopods’ PSGs in feeding strategies, toxins and AMP productionThis study was supported by the Strategic Funding UIDB/04423/2020 and UIDP/04423/2020 through national funds provided by the Foundation for Science and Technology (FCT) and the European Regional Development Fund (ERDF) in the framework of the program PT2020, by the European Structural and Investment Funds (ESIF) through the Competitiveness and Internationalization Operational Program—COMPETE 2020 and by National Funds through the FCT under the project PTDC/CTA-AMB/31774/2017 (POCI-01-0145-FEDER/031774/2017). We are grateful to Robert Carcasses (Full Stack developer at Kenkou GmbH, Germany) for language programming advice and assistance.MDPIRepositório Científico do Instituto Politécnico do PortoAlmeida, DanielaDomínguez-Pérez, DanyMatos, AnaAgüero-Chapin, GuillerminCastaño-Guerrero, YuselisVasconcelos, VitorCampos, AlexandreAntunes, Agostinho2021-09-28T09:05:54Z20202020-01-01T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articleapplication/pdfhttp://hdl.handle.net/10400.22/18595eng10.3390/data5040110info:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-03-13T13:10:13Zoai:recipp.ipp.pt:10400.22/18595Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-19T17:38:03.469995Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse
dc.title.none.fl_str_mv Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus
title Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus
spellingShingle Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus
Almeida, Daniela
Octopus vulgaris
Shotgun proteomics
Q-Exactive
Transcriptome de novo assembly
Mass spectrometry-based proteomics
TransDecoder
Six-frame translation tool
CLC Genomics Workbench
title_short Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus
title_full Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus
title_fullStr Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus
title_full_unstemmed Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus
title_sort Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus
author Almeida, Daniela
author_facet Almeida, Daniela
Domínguez-Pérez, Dany
Matos, Ana
Agüero-Chapin, Guillermin
Castaño-Guerrero, Yuselis
Vasconcelos, Vitor
Campos, Alexandre
Antunes, Agostinho
author_role author
author2 Domínguez-Pérez, Dany
Matos, Ana
Agüero-Chapin, Guillermin
Castaño-Guerrero, Yuselis
Vasconcelos, Vitor
Campos, Alexandre
Antunes, Agostinho
author2_role author
author
author
author
author
author
author
dc.contributor.none.fl_str_mv Repositório Científico do Instituto Politécnico do Porto
dc.contributor.author.fl_str_mv Almeida, Daniela
Domínguez-Pérez, Dany
Matos, Ana
Agüero-Chapin, Guillermin
Castaño-Guerrero, Yuselis
Vasconcelos, Vitor
Campos, Alexandre
Antunes, Agostinho
dc.subject.por.fl_str_mv Octopus vulgaris
Shotgun proteomics
Q-Exactive
Transcriptome de novo assembly
Mass spectrometry-based proteomics
TransDecoder
Six-frame translation tool
CLC Genomics Workbench
topic Octopus vulgaris
Shotgun proteomics
Q-Exactive
Transcriptome de novo assembly
Mass spectrometry-based proteomics
TransDecoder
Six-frame translation tool
CLC Genomics Workbench
description Here we provide all datasets and details applied in the construction of a composite protein database required for the proteogenomic analyses of the article “Putative Antimicrobial Peptides of the Posterior Salivary Glands from the Cephalopod Octopus vulgaris Revealed by Exploring a Composite Protein Database”. All data, subdivided into six datasets, are deposited at the Mendeley Data repository as follows. Dataset_1 provides our composite database “All_Databases_5950827_sequences.fasta” derived from six smaller databases composed of (i) protein sequences retrieved from public databases related to cephalopods’ salivary glands, (ii) proteins identified with Proteome Discoverer software using our original data obtained by shotgun proteomic analyses of posterior salivary glands (PSGs) from three Octopus vulgaris specimens (provided as Dataset_2) and (iii) a non-redundant antimicrobial peptide (AMP) database. Dataset_3 includes the transcripts obtained by de novo assembly of 16 transcriptomes from cephalopods’ PSGs using CLC Genomics Workbench. Dataset_4 provides the proteins predicted by the TransDecoder tool from the de novo assembly of 16 transcriptomes of cephalopods’ PSGs. Further details about database construction, as well as the scripts and command lines used to construct them, are deposited within Dataset_5 and Dataset_6. The data provided in this article will assist in unravelling the role of cephalopods’ PSGs in feeding strategies, toxins and AMP production
publishDate 2020
dc.date.none.fl_str_mv 2020
2020-01-01T00:00:00Z
2021-09-28T09:05:54Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.22/18595
url http://hdl.handle.net/10400.22/18595
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 10.3390/data5040110
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv MDPI
publisher.none.fl_str_mv MDPI
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799131470084702208