The iter command operates on BAM/CRAM and VCF/BCF files and generates genomic ranges for processing genomic data in chunks. It works well with tools such as xargs or GNU parallel.

Example

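sc iter test.bam 100,000 # Iterate over bins of 100k base pairs
sc iter test.bam 100000 # Same width, without the thousands separator
sc iter test.bam 1e6 # Scientific notation is also valid (1 Mb bins)

# Output of the last command (1e6 bp bins)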
> I:0-999999
> I:1000000-1999999
> I:2000000-2999999
> I:3000000-3999999
> I:4000000-4999999
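
To make the coordinate convention in the output above concrete, here is a minimal bash sketch that prints bins in the same format. The `make_bins` helper is hypothetical and is not how sc is implemented; in particular, clipping the final bin at the last base of the sequence is an assumption.

```shell
# Hypothetical helper, not part of sc: print fixed-width bins for one sequence.
# Starts are 0-based and ends are inclusive, matching the output shown above.
# Assumption: the final bin is clipped to the last base of the sequence.
make_bins() {
  local chrom=$1 length=$2 width=$3
  local start=0 end
  while [ "$start" -lt "$length" ]; do
    end=$(( start + width - 1 ))
    if [ "$end" -ge "$length" ]; then
      end=$(( length - 1 ))
    fi
    printf '%s:%d-%d\n' "$chrom" "$start" "$end"
    start=$(( start + width ))
  done
}

# A 2.5 Mb sequence named I, split into 1 Mb bins
make_bins I 2500000 1000000
```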

This list of genomic ranges can be used to process a BAM or VCF in parallel:


function process_chunk {
  # Process a single chunk
  vcf=$1
  region=$2
  # e.g. bcftools call -m --regions "$region" "$vcf"
  echo bcftools call --regions "$region" "$vcf" # ...
}

# Export the function to make it available to GNU parallel
export -f process_chunk

parallel --verbose process_chunk ::: test.vcf.gz ::: $(sc iter test.vcf.gz)
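
The same pattern can be sketched with xargs. To keep the example runnable without sc installed, the `regions` function below is a hypothetical stand-in for `sc iter test.bam 1e6`, and the `samtools view -c` command is only echoed rather than executed; swap in the real commands as needed.

```shell
# Hypothetical stand-in for `sc iter test.bam 1e6`, so the sketch runs anywhere.
regions() {
  printf '%s\n' I:0-999999 I:1000000-1999999 I:2000000-2999999
}

# Run up to 4 jobs at a time; {} is replaced by one region per job.
# Drop the `echo` to actually count the reads in each region.
regions | xargs -P 4 -I {} echo samtools view -c test.bam {}
```

Because the jobs run concurrently, the order of the output lines is not guaranteed.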

You can also set the [width] option to 0 to generate a list of chromosomes.
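
A per-chromosome loop might look like the sketch below. It assumes that a width of 0 prints one sequence name per line, which the text above does not specify; `per_chromosome` is a hypothetical helper, and the names I, II and III are stand-in input.

```shell
# Hypothetical helper: read one sequence name per line from stdin and
# print a per-chromosome command (echoed here instead of executed).
per_chromosome() {
  local chrom
  while IFS= read -r chrom; do
    [ -n "$chrom" ] || continue   # skip empty lines
    echo samtools view -c test.bam "$chrom"
  done
}

# Intended use, assuming sc is installed: sc iter test.bam 0 | per_chromosome
# Stand-in input so the sketch runs anywhere:
printf '%s\n' I II III | per_chromosome
```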
