December 19, 2012   

ccmatch randomly matches cases and controls on criteria you specify. For instance, to match cases and controls on age, ccmatch pairs up people of the same age. List several variables to match on several criteria at once.
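The idea of random matching within strata can be sketched by hand in Stata. The block below is an illustration only and is not ccmatch's source code; the toy data, the seed, and the variable names u, slot, paired, and pair are all assumptions made for this example.

```stata
* Illustration only: a hand-rolled random 1:1 match on age, not ccmatch itself.
clear
set seed 12345                      // arbitrary seed so the draw is repeatable
set obs 10
gen byte case_control = mod(_n, 2)  // assumed coding: 0 = control, 1 = case
gen byte age = 15 + floor(3*runiform())

* Put observations in random order within each age-by-status group.
gen double u = runiform()
bysort age case_control (u): gen long slot = _n

* A case and a control of the same age that share a slot number form a pair.
bysort age slot: gen byte paired = (_N == 2)
egen long pair = group(age slot) if paired

sort pair case_control
list pair case_control age, sepby(pair)
```

Observations left without a partner keep a missing value in pair.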


ssc install ccmatch


ccmatch variable_list, cc( ) [id( )]

*specifying an id is optional

  • variable_list – the categorical or discrete variables you want to match on (for example age, sex, or weight class).
  • cc( ) – your case/control variable. Coding 0 = control and 1 = case makes the most sense to me, but it could be the reverse as well.
  • id( ) – (optional) a variable that identifies each individual. If you specify it, ccmatch creates the match_id variable, which lists each individual's case/control partner.

ccmatch creates one or two variables:

  • match – an integer shared by a case and its matched control.
  • match_id – (optional) the ID of the individual's partner in the case/control pair, taken from the variable given in id( ).


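| match_id | match | name | case_control | age |
|----------|-------|------|--------------|-----|
| a6       | 1     | a2   |              | 15  |
| a2       | 1     | a6   | 1            | 15  |
| a7       | 2     | a4   |              | 16  |
| a4       | 2     | a7   | 1            | 16  |
| a8       | 3     | a5   |              | 17  |
| a5       | 3     | a8   | 1            | 17  |
| a10      | 4     | a1   |              | 19  |
| a1       | 4     | a10  | 1            | 19  |
|          | .     | a3   |              | 15  |
|          | .     | a9   | 1            | 18  |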

The above output is an example of what ccmatch can do. The first two variables, match_id and match, were created by ccmatch. The original data (name, case_control, age) is unchanged, except that it has been reordered. The command used was:

ccmatch age, id(name) cc(case_control)

Age was specified following ccmatch to indicate that we wanted to match cases/control who are the same age.

The case/control variable is specified as an option using cc( ), and the id of each individual is specified using id( ).
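To try this yourself, the example data can be rebuilt with input. This is a minimal sketch: the names and ages come from the output above, but coding controls as 0 is an assumption, and because the matching is random your pairs may differ from those shown.

```stata
* Rebuild the example data; coding controls as 0 is an assumption.
clear
input str3 name byte case_control byte age
"a1"  0 19
"a2"  0 15
"a3"  0 15
"a4"  0 16
"a5"  0 17
"a6"  1 15
"a7"  1 16
"a8"  1 17
"a9"  1 18
"a10" 1 19
end

* ssc install ccmatch             // run once if ccmatch is not installed
ccmatch age, id(name) cc(case_control)

* Show each pair together; unmatched individuals have a missing match.
sort match case_control
list match_id match name case_control age, sepby(match)
```

Two controls (a2 and a3) are 15 years old but only one case is, so one of them is left unmatched on each run.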

STATA Programs 

Sync STATA programs and settings with Dropbox

December 18, 2012   

With the release of STATA 12, users are allowed to install STATA across platforms (Linux, Mac, Windows) on up to three computers per user. If you frequently install or edit programs, you can sync the ado directory, where programs are stored, across your computers using Dropbox.

Step 1: Install Dropbox

Go to Dropbox, sign up for an account, and install it.

Step 2: Create an ado directory

Create the following directories within your Dropbox directory:

  1. Dropbox/ado
  2. Dropbox/ado/plus – For installed ado files.
  3. Dropbox/ado/personal – For personal ado files.
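These directories can also be created from within Stata using its mkdir command. The sketch below assumes a Mac or Linux machine whose Dropbox folder is ~/Dropbox; adjust the path if yours lives elsewhere.

```stata
* Assumes the Dropbox folder is ~/Dropbox; adjust the path if yours differs.
* capture keeps a do-file running if a directory already exists.
capture mkdir "~/Dropbox/ado"
capture mkdir "~/Dropbox/ado/plus"
capture mkdir "~/Dropbox/ado/personal"
```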

Step 3: Edit profile.do

profile.do is a file that runs every time you start up STATA. STATA will look in a variety of places for the file depending on your operating system. Type help profile for more information on where it is stored on your operating system.

This file needs to be created and edited on each system you want to sync. Here’s where you might store it on Mac and Linux:

  • **Mac** /Users/[your username]/Library/Application Support/Stata/ado/personal/
  • **Linux** /bin/

Next you need to edit this file on each system to point at the appropriate Dropbox directory. On Mac this might look like this:

sysdir set PERSONAL "~/Dropbox/ado/personal/"
sysdir set PLUS "~/Dropbox/ado/plus/"

Restart STATA and you should see something like this:

running /Users/Dan/Library/Application Support/Stata/ado/personal/ 
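To confirm that the new locations took effect, you can ask Stata where it now looks for programs. A short check, using ccmatch only as an example of an installed command:

```stata
sysdir                          // lists the system directories, including PLUS and PERSONAL
adopath                         // shows the search path Stata uses for ado-files
capture noisily which ccmatch   // reports where the command was found, if it is installed
```

PLUS and PERSONAL should both point into the Dropbox folder.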

Dropbox does the rest – syncing across your systems. You can run an additional do file located in your dropbox folder if you want to centrally edit startup settings like setting memory or turning the ‘more’ option off.
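One way to set up that central file is sketched below. The file name startup.do and its location are hypothetical; any do-file inside the synced folder would work the same way.

```stata
* profile.do on each machine (startup.do is a hypothetical file name)
sysdir set PERSONAL "~/Dropbox/ado/personal/"
sysdir set PLUS "~/Dropbox/ado/plus/"

* Run the shared settings; capture noisily reports an error without
* stopping startup if the file has not synced yet.
capture noisily do "~/Dropbox/ado/startup.do"

* Possible contents of ~/Dropbox/ado/startup.do:
*     set more off
```

With this layout the per-machine profile.do rarely changes, and edits to startup.do reach every computer through Dropbox.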



December 18, 2012   

dataplink is a simple program for importing recoded data from [plink][1]. Dataplink imports genotypic data from .ped files and also imports variable names (snp names) from .map files.


ssc install dataplink


Data must be exported from plink using the following options:

  • --recode OR --recode12
  • --tab
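As a sketch, the export can be launched from within Stata using the shell command. It assumes plink is on your system path and that mydata.ped and mydata.map sit in the working directory; mydata and mydata_recoded are hypothetical file names.

```stata
* Assumes plink is on the system path; file names are hypothetical.
* Writes mydata_recoded.ped and mydata_recoded.map, tab-delimited, alleles coded 1/2.
shell plink --file mydata --recode12 --tab --out mydata_recoded
```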


dataplink using "/path/to/file/without/extension"
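For example, assuming plink wrote mydata_recoded.ped and mydata_recoded.map (hypothetical names) to the working directory, an import might look like the sketch below. Raising maxvar first is a precaution in case dataplink does not do so itself.

```stata
clear all
* SE/MP only: raise the variable limit before importing. capture lets this
* line fail quietly under IC, where the limit is fixed.
capture set maxvar 32767
dataplink using "mydata_recoded"    // no .ped or .map extension
describe, short                     // reports the number of observations and variables
```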


When you specify the _filename_, do not use extensions (i.e. do not add __.ped__ or __.map__). Dataplink looks for a .map and .ped file of the same name.

#### Limits

STATA SE and MP flavors support a maximum of 32,767 variables while IC supports 2,047. This means you can only import ~32,000 SNPs with SE/MP or ~2,000 with IC.

[1]:
