Generate FASTA sequence lengths

August 13, 2014   

This one-liner takes a FASTA file as input:

>EF491733
tcagattcaaacaccgacgacgatgacgtggcaaagtctcgacgtgtgcg
caattcgtgtatgtgtccagcaggacctcccggagaacgcggaccagtag
gaccaccaggtctacggggatcgccaggatggcct
>EF491734
tcacagggaatgaaggcactgttcgacttgatcgctttgagaccaagacc
cgtggcaattctcggagggcaatgcactgaagtgaacgagccaatagcga
tggcgctcaagtattggcaaatcgtgcaattatcctatgcggagacacat
gccaa
>EF491735
gtcttgcatgacccaaaaggctcctgctcttctgtttcttcttccaatac
atccttctaaccagttggaagggttgacgtatcaagacttcctgcatcaa
aacttcttgaatttgccttcatttgtcgcaattgtgcagc
>EF491736
taaatggaaggaatcacttggcgctgaagaatttgctctccgcacagctt
aatcagactggaactccaatggttaatccaatgatggctttacaacaaca
agcggccgcagtaaacctgattcccaacacaccaatttacccaccc
>EF491737
actctcgcaatcgtctctccccaaatgatgttaacatcactagaaatgac
aaccgaacatatagcccagtcactcctcgtatcacaacaagtgagcggac
agtaacaccggaacagcggtcgccgggtcgaaaagcgttcgaaaccattc
>EF491738
tccctcgttcattcacaacaaaggaaaagcaaactatgggccattcattg
ttgaaattatgaactatcatcagtattctgcaatgacaagtcatatggtc
aaagtaatgaaacggccccaccaggttccgccaatgaaggtcgaccctga
gg
>EF491739
tccttccaactgttgccaactttccaactacaagacacactgaaccagaa
actacgcggagacctctgtcgccttcaaaaatgacaccttctcttccttc
tcctaccaccaccactttgcctgttttctttttgtcacaaatcactgacg
gcgatgaatcagaagatgaa

It outputs each sequence's name and length:

EF491733    135
EF491734    155
EF491735    140
EF491736    146
EF491737    150
EF491738    152
EF491739    170
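
One way to get from the input above to this output is a short awk command. The following is only a sketch and not necessarily the original one-liner: the function name `fasta_lengths` and the file name `sequences.fasta` are made up for illustration, and the tab separator is an assumption. It takes the first whitespace-delimited word after `>` as the sequence name and sums the lengths of the sequence lines that follow.

```bash
# Hypothetical helper: print "name<TAB>length" for every record in a FASTA file.
# Reads the files given as arguments, or standard input if none are given.
fasta_lengths() {
  awk '
    /^>/ {                                    # header line starts a new record
      if (name != "") print name "\t" len     # flush the previous record
      name = substr($1, 2)                    # first word, without the ">"
      len = 0
      next
    }
    { len += length($0) }                     # sequence may span several lines
    END { if (name != "") print name "\t" len }  # flush the last record
  ' "$@"
}
```

Calling `fasta_lengths sequences.fasta` prints one line per sequence. The sketch does not strip Windows line endings, so a file with `\r\n` line breaks would report each length one too high per sequence line.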

I wrote this today when I needed sequence lengths for a ChIP-Seq analysis.
