The program performs post-processing on the de Bruijn graph from the Velvet assembler to construct splicing graphs for RNA-Seq libraries, which preserve alternative splicing information. For each node in each splicing graph, the expression level is reported as the number of reads per kilobase of node per million reads (RPKM) with respect to each library.
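As a rough illustration of the RPKM definition above, the following Python sketch computes reads per kilobase of node per million reads for one node and one library. The function name and its parameters are hypothetical and are not taken from postprocess.c; how the program itself counts reads on a node is not described here, so the read count is simply taken as an input.

```python
def rpkm(reads_on_node: float, node_length_bp: int, total_reads: int) -> float:
    """Reads per kilobase of node per million reads in the library.

    reads_on_node : reads assigned to the node (assumed to be given)
    node_length_bp: node length in bases
    total_reads   : total number of reads in the library
    """
    if node_length_bp <= 0 or total_reads <= 0:
        raise ValueError("node length and total reads must be positive")
    # reads / (length / 1e3) / (total / 1e6) == reads * 1e9 / (length * total)
    return reads_on_node * 1e9 / (node_length_bp * total_reads)


# Hypothetical numbers: 50 reads on a 500-base node, 2 million reads in total.
print(rpkm(50, 500, 2_000_000))  # 50.0
```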
The source code consists of a single file, postprocess.c, which can be compiled with the command "gcc -O3 -o postprocess postprocess.c".
The splicing graphs are represented in an annotated fasta format, in which each potentially non-linear structure is given as a collection of nodes, with connecting edge information embedded within the node names. Different splicing graphs are separated by blank lines.
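One possible way to read such a file is sketched below in Python: nodes are collected until a blank line is seen, at which point the current splicing graph is closed. The function name and the sample records are invented for illustration, and the sketch assumes that a sequence may be wrapped over several lines, which this description does not confirm. Node names are kept as opaque strings here.

```python
from typing import Dict, Iterable, List

# One splicing graph: node name line (without the leading '>') -> sequence.
Graph = Dict[str, str]


def read_splicing_graphs(lines: Iterable[str]) -> List[Graph]:
    """Group fasta records into splicing graphs separated by blank lines."""
    graphs: List[Graph] = []
    current: Graph = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line ends the current splicing graph.
            if current:
                graphs.append(current)
            current, header = {}, None
        elif line.startswith(">"):
            header = line[1:]
            current[header] = ""
        elif header is not None:
            # Sequence line; concatenate in case the sequence is wrapped.
            current[header] += line
    if current:
        graphs.append(current)
    return graphs


# Invented sample input with two splicing graphs.
sample = """>NODE_1:2,3
ACGT
>NODE_2:3
GG
>NODE_3:
TT

>NODE_7:
TTAA
""".splitlines()

graphs = read_splicing_graphs(sample)
print(len(graphs))  # 2
```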
Each node name is given as >NODE_u:v_1,v_2,...,v_p, where u is the ID of the current node and u -> v_1, u -> v_2, ..., u -> v_p are the edges leaving it in the splicing graph. The name is followed by one RPKM value for each library, listed in the same order as the read files.
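A node name line in this form could be parsed along the following lines. This is only a sketch under assumptions: it supposes that the RPKM values are separated from the node name and from each other by whitespace, and that a node without outgoing edges has an empty list after the colon; neither detail is stated above. The names NodeHeader and parse_node_header are hypothetical, and node IDs are kept as strings.

```python
from typing import List, NamedTuple


class NodeHeader(NamedTuple):
    node_id: str           # u
    successors: List[str]  # v_1, ..., v_p (edges u -> v_i)
    rpkm: List[float]      # one value per library, in read-file order


def parse_node_header(line: str) -> NodeHeader:
    """Parse a line such as '>NODE_4:7,9 12.5 0.0' (assumed layout)."""
    prefix = ">NODE_"
    if not line.startswith(prefix):
        raise ValueError("not a node name line: %r" % line)
    fields = line[len(prefix):].split()
    if not fields:
        raise ValueError("missing node ID: %r" % line)
    name, values = fields[0], fields[1:]
    # Split 'u:v_1,...,v_p' at the first colon; the edge list may be empty.
    node_id, _, edge_part = name.partition(":")
    successors = [v for v in edge_part.split(",") if v]
    return NodeHeader(node_id, successors, [float(x) for x in values])


# Invented example with two libraries.
header = parse_node_header(">NODE_4:7,9 12.5 0.0")
print(header.node_id, header.successors, header.rpkm)
```

Keeping the successors as a list per node makes it straightforward to build an adjacency map for each splicing graph.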
SNPs are reported within the sequences as IUPAC ambiguity letters, that is, letters other than A, C, G and T.
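To locate the reported SNPs in a node sequence, one can scan for the standard IUPAC nucleotide ambiguity codes, as in the Python sketch below. The table lists the standard codes; which of them the program actually emits (for example, whether three-base codes or N ever appear) is not specified here, and the function name find_snps is hypothetical.

```python
from typing import List, Tuple

# Standard IUPAC nucleotide ambiguity codes and the bases they stand for.
IUPAC_AMBIGUITY = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_snps(sequence: str) -> List[Tuple[int, str, str]]:
    """Return (0-based position, IUPAC letter, possible bases) for each SNP."""
    return [
        (i, base, IUPAC_AMBIGUITY[base])
        for i, base in enumerate(sequence.upper())
        if base in IUPAC_AMBIGUITY
    ]


# Invented sequence with an A/G SNP at position 2 and a C/T SNP at position 5.
print(find_snps("ACRTGY"))  # [(2, 'R', 'AG'), (5, 'Y', 'CT')]
```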
Sze S.-H. and Tarone A.M. (2014) A memory-efficient algorithm to obtain splicing graphs and de novo expression estimates from de Bruijn graphs of RNA-Seq data. BMC Genomics, 15(Suppl 5), S6.
Sze S.-H., Dunham J.P., Carey B., Chang P.L., Li F., Edman R.M., Fjeldsted C., Scott M.J., Nuzhdin S.V. and Tarone A.M. (2012) A de novo transcriptome assembly of Lucilia sericata (Diptera: Calliphoridae) with predicted alternative splices, single nucleotide polymorphisms, and transcript expression estimates. Insect Molecular Biology, 21, 205-221.