DSAP: deep-sequencing small RNA analysis pipeline

Po-Jung Huang, Yi-Chung Liu, Chi-Ching Lee, Wei-Chen Lin, Richie Ruei-Chi Gan, Ping-Chiang Lyu, Petrus Tang

DSAP is an automated multiple-task web service designed to provide a complete solution for analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP takes a tab-delimited file as input, which holds the unique sequence reads (tags) and their corresponding numbers of copies generated by the Solexa sequencing platform. The input data pass through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels of matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP can also display miRNA expression levels from different jobs using a log2-scaled color matrix. In addition, a cross-species comparative function shows the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
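The cleanup and clustering steps can be illustrated with a minimal Python sketch. It is not DSAP's implementation: the adaptor sequence, the minimum tag length, the partial-adaptor overlap, the low-complexity threshold, and all function names are assumptions chosen for illustration. The only detail taken from the description above is the input format, a tab-delimited file with one sequence tag and its copy number per line.

```python
import csv
import io
from collections import Counter

# Assumed values for illustration only; DSAP's actual settings are not stated.
ADAPTOR = "TCGTATGCCGTCTTCTGCTTG"  # hypothetical 3' adaptor sequence
MIN_LENGTH = 15                    # hypothetical minimum tag length after trimming


def read_tags(handle):
    """Yield (sequence, copies) pairs from a tab-delimited file."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        yield row[0].strip().upper(), int(row[1])


def trim_adaptor(seq, adaptor=ADAPTOR, min_overlap=6):
    """Remove a 3' adaptor, including a partial adaptor at the read end."""
    pos = seq.find(adaptor)
    if pos != -1:
        return seq[:pos]
    # Look for an adaptor prefix that runs off the end of the read,
    # trying the longest possible overlap first.
    for k in range(min(len(adaptor) - 1, len(seq)), min_overlap - 1, -1):
        if seq.endswith(adaptor[:k]):
            return seq[:-k]
    return seq


def is_low_complexity(seq, max_fraction=0.8):
    """Flag tags dominated by one symbol (poly-A/T/C/G/N)."""
    if not seq:
        return True
    top_count = Counter(seq).most_common(1)[0][1]
    return top_count / len(seq) >= max_fraction


def cleanup_and_cluster(handle):
    """Clean every tag, then sum copy numbers of identical cleaned tags."""
    clusters = Counter()
    for seq, copies in read_tags(handle):
        cleaned = trim_adaptor(seq)
        if len(cleaned) < MIN_LENGTH or is_low_complexity(cleaned):
            continue
        clusters[cleaned] += copies
    return clusters


# Toy input: two reads of the same tag with full and partial adaptor,
# plus one poly-A read that the cleanup step discards.
EXAMPLE = (
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGCCGTCTTCTGCTTG\t10\n"
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGCC\t5\n"
    "AAAAAAAAAAAAAAAAAAAAAAAA\t7\n"
)
clusters = cleanup_and_cluster(io.StringIO(EXAMPLE))
```

In this toy input, the first two reads collapse into one cluster whose copy number is the sum of both, which is the quantity the later matching steps would report as an expression level.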

Figure 1. DSAP workflow. DSAP follows several analysis steps: (a) Cleanup to remove adaptors and poly-A/T/C/G/N nucleotides. (b) Clustering to group cleaned sequence tags into unique sequence clusters. (c) ncRNA matching to map unique sequence clusters against a transcribed sequence library of ncRNA (Rfam). (d) Known miRNA matching to detect known miRNAs in miRBase based on sequence homology. (e) Comparative miRNAomics to show differential miRNA expression profiles from different jobs, and cross-species distributions of identified miRNAs.
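The comparison of miRNA expression across jobs in step (e) can be pictured as a matrix of log2-scaled values, with one row per miRNA and one column per job. The sketch below is a hedged illustration, not DSAP's code: the function name, the pseudocount of 1 (used so that miRNAs absent from a job map to 0 rather than an undefined logarithm), the job labels, and the counts are all made up for the example. Whether DSAP normalizes counts before scaling is not stated, so none is applied here.

```python
import math


def log2_expression_matrix(jobs, pseudocount=1.0):
    """Return (mirna_names, job_names, matrix) of log2-scaled counts.

    `jobs` maps a job label to a dict of miRNA name -> copy number.
    The pseudocount is an assumption, not a documented DSAP setting.
    """
    job_names = list(jobs)
    mirnas = sorted({name for counts in jobs.values() for name in counts})
    matrix = [
        [math.log2(jobs[job].get(mirna, 0) + pseudocount) for job in job_names]
        for mirna in mirnas
    ]
    return mirnas, job_names, matrix


# Hypothetical counts for two jobs, for illustration only.
jobs = {
    "job_A": {"hsa-let-7a": 1023, "hsa-miR-21": 255},
    "job_B": {"hsa-let-7a": 511, "hsa-miR-16": 63},
}
mirnas, job_names, matrix = log2_expression_matrix(jobs)
```

Each cell of `matrix` could then be mapped to a color to obtain a display similar in spirit to the color matrix described for DSAP.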
