Picard tools is a great set of utilities from the Broad Institute for performing sequence analysis. However, some of the utilities run on the slower side.
To speed things up, I created a new command, insert-size, as part of seq-collection. The command runs much faster, owing in part to parallelization of insert-size calculations.
insert-size does not operate in exactly the same way as Picard's
CollectInsertSizeMetrics, but the results are very close.
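To give a feel for what parallelizing an insert-size calculation can look like, here is a minimal sketch. It is not the actual seq-collection code. The function names and the toy input are hypothetical, and it assumes the template lengths (the TLEN column of a BAM/SAM record) have already been pulled out and grouped by contig. Each contig is tallied independently, the partial tallies are merged, and summary statistics are computed from the merged counts. Only positive TLEN values are kept so that each read pair is counted once.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def count_insert_sizes(tlens):
    """Tally insert sizes for one contig.

    Keeps only positive TLEN values, so each pair is counted once
    (the leftmost mate carries the positive value in SAM).
    """
    return Counter(t for t in tlens if t > 0)


def merged_insert_size_counts(tlens_by_contig, workers=4):
    """Tally each contig in its own task, then merge the partial counts.

    Threads keep this sketch self-contained; for CPU-bound work on real
    BAM files, separate processes would be the more likely choice.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_insert_sizes, tlens_by_contig.values()))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def summarize(counts):
    """Compute simple summary statistics from a size -> count mapping."""
    sizes = sorted(counts.elements())
    return {
        "n": len(sizes),
        "median": median(sizes),
        "mean": sum(sizes) / len(sizes),
        "min": sizes[0],
        "max": sizes[-1],
    }


# Hypothetical toy input: TLEN values per contig (not real data).
tlens_by_contig = {
    "chr1": [200, -200, 310, -310, 0],
    "chr2": [180, -180, 250, -250],
}
counts = merged_insert_size_counts(tlens_by_contig)
stats = summarize(counts)
```

Because the per-contig tallies are plain counters, merging them is cheap, and the same merged counter can double as an insert-size distribution.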
insert-size has some nice advantages over Picard. Its output is more interpretable and easier to parse than standard Picard output.
For example, if you run:
sc insert-size --basename --header tests/data/test.bam
The output table will be:
You can also output the distribution of insert-sizes by count by specifying the