Friday, July 27, 2012

TopHat-Fusion and Picard SortSam

I ran TopHat2 on my paired-end RNA-Seq reads with the "fusion-search" option turned on in order to detect fusion events. The resulting output file, "accepted_hits.bam", is coordinate-sorted by default. I tried to sort it by queryname using Picard tools' SortSam.jar, but SortSam.jar terminated with an error about a chromosome coordinate mismatch between mates. After pondering the error for a moment, I realized that SortSam expects both mates to be aligned to the same chromosome and throws an error on any inconsistency, which is in fact the case for mates spanning a fusion event.
The solution is simple: run SortSam.jar with the VALIDATION_STRINGENCY=SILENT argument.
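For reference, here is a minimal sketch of that invocation, wrapped in a small Python helper so the arguments are easy to read. The helper name sortsam_command, the output file name, and the assumption that SortSam.jar sits in the current directory are mine and only for illustration; INPUT, OUTPUT, SORT_ORDER and VALIDATION_STRINGENCY are the standard Picard options.

```python
import subprocess


def sortsam_command(in_bam, out_bam, picard_jar="SortSam.jar"):
    """Build a Picard SortSam command line for a queryname sort.

    picard_jar is a hypothetical path; point it at your own SortSam.jar.
    """
    return [
        "java", "-jar", picard_jar,
        "INPUT=" + in_bam,
        "OUTPUT=" + out_bam,
        "SORT_ORDER=queryname",
        # Relax record validation so fusion-spanning mates do not abort the run.
        "VALIDATION_STRINGENCY=SILENT",
    ]


if __name__ == "__main__":
    # The output file name is an assumption chosen for this example.
    cmd = sortsam_command("accepted_hits.bam", "accepted_hits.queryname.bam")
    print(" ".join(cmd))
    # Uncomment to actually run Picard (requires Java and the jar):
    # subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to check the arguments before committing to a long sort on a large BAM file.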
An alternative is to use samtools sort, but I prefer Picard's SortSam because, in my experience, it is much faster.
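For completeness, a sketch of the samtools route, again as a small Python helper. The helper name and the output file name are made up for the example. The -n flag requests a sort by read name; the "-o" form shown here is the samtools 1.x syntax, whereas the older 0.1.x releases took an output prefix as a positional argument instead, so check which version you have installed.

```python
import subprocess


def samtools_namesort_command(in_bam, out_bam):
    """Build a samtools command that sorts a BAM file by read name.

    Assumes samtools 1.x, where the output file is given with -o.
    """
    return ["samtools", "sort", "-n", "-o", out_bam, in_bam]


if __name__ == "__main__":
    # The output file name is an assumption chosen for this example.
    cmd = samtools_namesort_command("accepted_hits.bam", "accepted_hits.namesorted.bam")
    print(" ".join(cmd))
    # Uncomment to actually run samtools (must be on your PATH):
    # subprocess.run(cmd, check=True)
```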