Publications – Applied Bioinformatics Laboratory (應用生物資訊研究室)

2016

Gan, R. -C.; Chen, T. -W.; Wu, T. H.; Huang, P. -J.; Lee, C. -C.; Yeh, Y. -M.; Chiu, C. -H.; Huang, H. -D.; Tang, P.

PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms Journal Article

In: BMC Bioinformatics, 17, 2016, ISSN: 1471-2105, (cited By 3).

@article{Gan2016,

title = {PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms},

author = {R. -C. Gan and T. -W. Chen and T. H. Wu and P. -J. Huang and C. -C. Lee and Y. -M. Yeh and C. -H. Chiu and H. -D. Huang and P. Tang},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85006892898&doi=10.1186%2fs12859-016-1366-1&partnerID=40&md5=4991defff4e25eeca679425824ef6437},

doi = {10.1186/s12859-016-1366-1},

issn = {1471-2105},

year = {2016},

date = {2016-01-01},

journal = {BMC Bioinformatics},

volume = {17},

publisher = {BioMed Central Ltd.},

abstract = {Background: Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interest. However, there are only a few organisms having reference genomic sequences and even fewer having well-defined or curated annotations. For transcriptome studies focusing on organisms lacking proper reference genomes, the common strategy is de novo assembly followed by functional annotation. However, things become even more complicated when multiple transcriptomes are compared. Results: Here, we propose a new analysis strategy and quantification methods for quantifying expression levels which not only generate a virtual reference from sequencing data, but also provide comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled together for de novo assembly. The assembled contigs are searched against NCBI NR databases to find potential homolog sequences. Based on the search results, a set of virtual transcripts is generated and serves as a reference transcriptome. By using the same reference, normalized quantification values including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM) can be obtained that are comparable across transcriptome datasets. In order to demonstrate the feasibility of our strategy, we implement it in the web service PARRoT. PARRoT stands for Pipeline for Analyzing RNA Reads of Transcriptomes. It analyzes gene expression profiles for two transcriptome sequencing datasets. For better understanding of the biological meaning from the comparison among transcriptomes, PARRoT further provides linkage between these virtual transcripts and their potential function by showing best hits in SwissProt and the NR database and by assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within just three hours.
Conclusions: In this study, we proposed and implemented a strategy to analyze transcriptomes from non-reference organisms which offers the opportunity to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy effectively avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on the field of comparative genomics for non-model organisms. We have implemented PARRoT as a web service which is freely available at http://parrot.cgu.edu.tw. © 2016 The Author(s).},

note = {cited By 3},

keywords = {animal; biological model; Cnidaria; comparative study; gene expression profiling; genetics; genomics; high throughput sequencing; Internet; molecular genetics; procedures; sequence analysis; software, Animals; Cnidaria; Gene Expression Profiling; Genomics; High-Throughput Nucleotide Sequencing; Internet; Models, Biological; Molecular Sequence Annotation; Sequence Analysis, Birds; Gene expression; RNA; Websites, Functional annotation; Gene expression profiles; Next-generation sequencing; Quantification methods; Transcriptome assemblies; Transcriptome profiles; Transcriptome quantifications; Transcriptomes, RNA; Software; Transcriptome, transcriptome, Web services},

pubstate = {published},

tppubtype = {article}

}
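
The abstract in this entry refers to read counts and to RPKM- and TPM-style values computed against one shared reference. As a minimal sketch, assuming the standard RPKM and TPM definitions (not PARRoT's own eRPKM/eTPM estimators, which this entry does not define), the normalization could look like the following Python. The function names and the example numbers are illustrative assumptions, not part of PARRoT.

```python
def rpkm(read_counts, lengths_bp):
    """Standard RPKM: reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(read_counts)
    if total_reads == 0:
        raise ValueError("no mapped reads")
    # count * 1e9 / (length_bp * total_reads) folds the per-kilobase (1e3)
    # and per-million (1e6) scaling into one factor.
    return [
        count * 1e9 / (length * total_reads)
        for count, length in zip(read_counts, lengths_bp)
    ]


def tpm(read_counts, lengths_bp):
    """Standard TPM: length-normalized rates rescaled to sum to one million."""
    rates = [count / length for count, length in zip(read_counts, lengths_bp)]
    total_rate = sum(rates)
    if total_rate == 0:
        raise ValueError("no mapped reads")
    return [rate * 1e6 / total_rate for rate in rates]


# Hypothetical read counts and transcript lengths (bp) for three transcripts.
counts = [100, 200, 700]
lengths = [1000, 2000, 1000]
rpkm_values = rpkm(counts, lengths)
tpm_values = tpm(counts, lengths)
```

Because TPM values always sum to one million within a sample, they are often preferred when comparing relative abundance across datasets quantified against the same reference.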

2012

Chen, T. W.; Gan, R. C.; Wu, T. H.; Huang, P. J.; Lee, C. Y.; Chen, Y. Y.; Chen, C. C.; Tang, P.

FastAnnotator--an efficient transcript annotation web tool. Journal Article

In: BMC Genomics, 13 Suppl 7, 2012, ISSN: 1471-2164, (cited By 48).

@article{Chen2012,

title = {FastAnnotator--an efficient transcript annotation web tool.},

author = {T. W. Chen and R. C. Gan and T. H. Wu and P. J. Huang and C. Y. Lee and Y. Y. Chen and C. C. Chen and P. Tang},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-84878804125&partnerID=40&md5=bcb1de3116f33ddc53ca741a32d393e9},

issn = {1471-2164},

year = {2012},

date = {2012-01-01},

journal = {BMC Genomics},

volume = {13 Suppl 7},

abstract = {Recent developments in high-throughput sequencing (HTS) technologies have made it feasible to sequence the complete transcriptomes of non-model organisms or metatranscriptomes from environmental samples. The challenge after generating hundreds of millions of sequences is to annotate these transcripts and classify the transcripts based on their putative functions. Because many biological scientists lack the knowledge to install Linux-based software packages or maintain databases used for transcript annotation, we developed an automatic annotation tool with an easy-to-use interface. To elucidate the potential functions of gene transcripts, we integrated well-established annotation tools: Blast2GO, PRIAM and RPS BLAST in a web-based service, FastAnnotator, which can assign Gene Ontology (GO) terms, Enzyme Commission numbers (EC numbers) and functional domains to query sequences. Using six transcriptome sequence datasets as examples, we demonstrated the ability of FastAnnotator to assign functional annotations. FastAnnotator annotated 88.1% and 81.3% of the transcripts from the well-studied organisms Caenorhabditis elegans and Streptococcus parasanguinis, respectively. Furthermore, FastAnnotator annotated 62.9%, 20.4%, 53.1% and 42.0% of the sequences from the transcriptomes of sweet potato, clam, amoeba, and Trichomonas vaginalis, respectively, which lack reference genomes. We demonstrated that FastAnnotator can complete the annotation process in a reasonable amount of time and is suitable for the annotation of transcriptomes from model organisms or organisms for which annotated reference genomes are not available. The sequencing process no longer represents the bottleneck in the study of genomics, and automatic annotation tools have become invaluable as the annotation procedure has become the limiting step.
We present FastAnnotator, which is an automated annotation web tool designed to efficiently annotate sequences with their gene functions, enzyme functions or domains. FastAnnotator is useful in transcriptome studies and especially for those focusing on non-model organisms or metatranscriptomes. FastAnnotator does not require local installation and is freely available at http://fastannotator.cgu.edu.tw.},

note = {cited By 48},

keywords = {animal; article; bacterial genome; Caenorhabditis elegans; computer interface; computer program; genetic database; genetics; genome; Internet; nucleotide sequence; Streptococcus, Animals; Base Sequence; Caenorhabditis elegans; Databases, Bacterial; Internet; Software; Streptococcus; Transcriptome; User-Computer Interface, Genetic; Genome; Genome, transcriptome},

pubstate = {published},

tppubtype = {article}

}
