I have created a simple workflow for Alfred 2 that makes it easy to create a new text file in the frontmost Finder window. Update: at the suggestion of a visitor, James Kachan, I have updated the workflow to automatically open the new file in a text editor. An alternative, more advanced workflow for Alfred 2 has also been created by Ian Isted.
Open Alfred 2, and type new followed by the name of the file. If you just type new, a file called ‘untitled.txt’ will be created.
The Variant Call Format (VCF) developed by the 1000 Genomes Project makes it easy to filter and manage large amounts of variant information for a set of subjects.
Stata offers an easy interface for sorting, filtering, and manipulating large datasets. I have developed a tool, vcf, that makes it easy to import .vcf files into Stata (no easy task!).
The program does two challenging things to prepare the file for Stata:
It splits the INFO column (delimited by ; ) into separate columns. This is necessary because Stata has a string limit of 244 characters and otherwise truncates this column.
It recodes genotypic data, showing the genotypes of each individual.
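To make these two steps concrete, here is a minimal sketch in Python rather than Stata. It is not the code the vcf command uses; the function names split_info and decode_genotype are hypothetical and only illustrate the idea of breaking INFO into key/value pairs and turning allele indexes into bases.

```python
def split_info(info):
    """Split a VCF INFO string such as 'DP=14;AF=0.5;DB' into a dict.

    Keys without '=' are flags and are stored as True.
    """
    fields = {}
    if info in ("", "."):
        return fields
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def decode_genotype(sample, ref, alt):
    """Translate a sample entry such as '0/1:35' into bases, e.g. 'A/G'.

    Assumes GT is the first colon-separated subfield of the sample column.
    """
    gt = sample.split(":")[0]
    alleles = [ref] + alt.split(",")      # index 0 is REF, 1.. are ALT
    sep = "|" if "|" in gt else "/"       # phased vs unphased
    bases = []
    for idx in gt.split(sep):
        bases.append("." if idx == "." else alleles[int(idx)])
    return sep.join(bases)
```

Each key returned by split_info would become its own column, which is what keeps any single string under Stata's length limit.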
ssc install vcf
I have only tested with Stata 12/SE. I believe it will also work with Stata 11 and perhaps earlier.
vcf using "path/to/file.vcf"
The program can read very large files, and I have successfully loaded files that are a few gigabytes, but it cannot handle enormous VCF files. Ideally, filter enormous VCF files before using it.
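One way to cut a file down before importing it is to keep only the header and the records in a region of interest. The sketch below is a rough Python illustration, not part of the vcf command; filter_vcf_lines and filter_vcf_file are hypothetical names, and it assumes an uncompressed, tab-delimited VCF.

```python
def filter_vcf_lines(lines, chrom, start, end):
    """Yield header lines plus records on `chrom` with start <= POS <= end."""
    for line in lines:
        if line.startswith("#"):
            yield line                    # keep meta-information and column header
            continue
        fields = line.split("\t", 2)      # only CHROM and POS are needed
        if len(fields) < 3:
            continue                      # skip blank or malformed lines
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            yield line


def filter_vcf_file(src, dst, chrom, start, end):
    """Stream `src` to `dst` line by line so the whole file is never in memory."""
    with open(src) as fin, open(dst, "w") as fout:
        fout.writelines(filter_vcf_lines(fin, chrom, start, end))
```

The filtered file can then be passed to vcf using in the usual way.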
If your VCF file has more than 9 alternative alleles, this program will incorrectly assign alleles beyond the 9th alternative allele.
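To check whether a file is affected before importing it, you can count the comma-separated entries in the ALT column of each record. This is a small Python sketch under the assumption of a standard tab-delimited VCF with ALT in the fifth column; records_over_alt_limit is a hypothetical helper, not something the vcf command provides.

```python
def records_over_alt_limit(lines, limit=9):
    """Return (chrom, pos, n_alt) for records with more than `limit` ALT alleles."""
    flagged = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue                      # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue                      # malformed record, no ALT column
        alt = fields[4]
        n_alt = 0 if alt == "." else len(alt.split(","))
        if n_alt > limit:
            flagged.append((fields[0], int(fields[1]), n_alt))
    return flagged
```

An empty result means no record exceeds the limit; otherwise the listed sites are the ones to drop or inspect by hand.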