If you do a lot of biological research and want to know whether an association exists for every pairwise combination between two sets of terms (e.g. two gene lists), you can use PubMed search result counts as a proxy for relative association.
In the example below, I show the results for organisms x diseases to give a rough estimate of how much each disease is studied in a given organism. Of course, this should all be taken with a (big) grain of salt, because these organisms and diseases have many synonyms or related terms (e.g. M. musculus is often referred to as mouse in the literature). Additionally, in many cases the result count reflects only whether the terms were found together in the title and abstract of the literature, not in the body of the text.
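As a rough illustration of the pairwise idea, here is a minimal Python sketch. It assumes the counts come from the NCBI E-utilities `esearch` endpoint and that each term is searched as a quoted phrase with the `[Title/Abstract]` field tag. The function names (`build_query`, `pairwise_queries`, `fetch_count`) are made up for this example.

```python
import json
import urllib.parse
import urllib.request
from itertools import product

# NCBI E-utilities search endpoint (assumed data source for this sketch).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(term_a, term_b):
    """Require both phrases to appear in the title or abstract."""
    return f'"{term_a}"[Title/Abstract] AND "{term_b}"[Title/Abstract]'


def pairwise_queries(set_a, set_b):
    """Map every (a, b) combination of the two term sets to its query."""
    return {(a, b): build_query(a, b) for a, b in product(set_a, set_b)}


def fetch_count(query):
    """Return the number of PubMed hits for one query (needs network access)."""
    params = urllib.parse.urlencode(
        {"db": "pubmed", "term": query, "retmode": "json", "retmax": 0}
    )
    with urllib.request.urlopen(f"{ESEARCH}?{params}") as response:
        payload = json.load(response)
    return int(payload["esearchresult"]["count"])


# Hypothetical term sets, for illustration only.
organisms = ["Mus musculus", "Danio rerio"]
diseases = ["diabetes", "asthma"]
queries = pairwise_queries(organisms, diseases)
```

Calling `fetch_count` on each value of `queries` fills in the organism x disease grid. NCBI asks for no more than three requests per second without an API key, so pause between calls when looping over many pairs.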
I’ve given bicycle touring a try. Originally I wanted to bike around Lake Michigan, but it turns out to be over 1,400 miles. So I compromised on a three-day trip around a good chunk of it, making use of the ferry from Muskegon, MI to Milwaukee, WI. This was my first time, so I also decided to stay in hotels. Next time I intend to camp. I learned a few valuable lessons along the way!
Pack less stuff! – I had way too much. In fact, I wound up breaking two spokes on the second day.
Shorten the days – Having never gone more than 40 miles in a single day, I decided to go 108 on the first day. Yeah. I probably should have gone more like 60-70 each day. By the time I got to my destination each day I was too tired to do anything. Part of the experience is seeing new places.
Get a proper touring bike – I didn’t use a touring bike because I don’t have one (yet). I used a Trek 7.2. My wrists hurt a lot for parts of the trip. Next time I’ll get a proper touring bike with the appropriate handlebars.
If you use runkeeper and pay for a yearly subscription (runkeeper elite), you can export your data and plot all of your activities simultaneously using R. I’ve written a script for doing so (special thanks to flowing data, whose tutorial helped with a few key parts of this).
The script does a few unique things.
Runkeeper exports data in GPX format. If you ever pause an activity within runkeeper or briefly lose GPS reception, the GPS path gets split into multiple paths within the same file. The script retains all paths and plots them separately.
This script will merge in the type of activities so you can plot different types of activities by color.
Finally, cluster analysis is used to segregate different locations when plotting. If you are like me and have moved around a bit – this is necessary as plotting distant locations on the same map (e.g. Chicago and Boston) is not feasible and results in distant locations being plotted as single points.
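To make the first and last points concrete, here is a minimal Python sketch (not the R script itself). It assumes each split path is stored as its own `trkseg` element in the GPX file, and it uses a small k-means on each path's starting point as a stand-in for the cluster analysis, since the exact method is not stated here. The names `read_segments` and `kmeans_labels` are made up for this example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{...GPX/1/1}trkseg' -> 'trkseg'."""
    return tag.rsplit("}", 1)[-1]


def read_segments(gpx_text):
    """Return one list of (lat, lon) points per trkseg, kept separate."""
    root = ET.fromstring(gpx_text)
    segments = []
    for elem in root.iter():
        if local_name(elem.tag) != "trkseg":
            continue
        points = [
            (float(pt.get("lat")), float(pt.get("lon")))
            for pt in elem
            if local_name(pt.tag) == "trkpt"
        ]
        if points:
            segments.append(points)
    return segments


def dist2(p, q):
    """Squared distance in raw degrees (crude, but enough to split cities)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans_labels(points, k, iterations=20):
    """Assign each point to one of k clusters with plain k-means."""
    # Deterministic start: first point, then repeatedly the farthest point.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda i: dist2(p, centers[i])) for p in points]
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centers[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels


# Tiny made-up file: one activity split in two (Chicago) plus one in Boston.
SAMPLE = """<gpx xmlns="http://www.topografix.com/GPX/1/1"><trk>
<trkseg><trkpt lat="41.88" lon="-87.63"/><trkpt lat="41.89" lon="-87.62"/></trkseg>
<trkseg><trkpt lat="41.90" lon="-87.62"/></trkseg>
</trk><trk><trkseg><trkpt lat="42.36" lon="-71.06"/></trkseg></trk></gpx>"""

segments = read_segments(SAMPLE)
labels = kmeans_labels([seg[0] for seg in segments], k=2)
```

Here `k` plays the role of the number of places you have lived or run. Each cluster of paths can then be drawn on its own map, which avoids squeezing Chicago and Boston onto a single plot.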
Export your runkeeper data. The option is available for subscribers only under the settings menu.
Place the script below within a folder containing your runkeeper data. Set the num_locations variable to the number of places you have lived/run. This will be used to pull out the number of distinct running locations automatically.
Install the necessary R packages. You can run the following code within R to do so.
Using runkeeper and with the help of a tutorial at flowing data, I was able to plot all of the running and biking I’ve been doing in Chicago since moving here two years ago. The blue is running and the black is biking.
When you have performed a sequencing project, quality control is one of the first things you will need to do. Unfortunately, sample mix-ups and other issues can and do happen. Systematic biases can also occur by machine and lane.
This script will extract basic information from a set of FASTQs and output it to a summary file (fastq_summary.txt). It works with demultiplexed FASTQs generated by Illumina machines whose read headers appear in the following format:
@HWI-EAS209_0006_FC706VJ – Machine name
5 – lane
58 – tile within flowcell lane
5894 – x coordinate of cluster within tile
21141 – y coordinate of cluster within tile
#ATCACG – index
/1 – member of pair (/1 or /2)
The script below will extract the machine name, lane, index, and pair.
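As a rough illustration of the header parsing, here is a minimal Python sketch. It assumes the fields listed above are joined in the older Illumina layout, `@machine:lane:tile:x:y#index/pair`, and that every record spans exactly four lines. The names `HEADER`, `summarize` and `format_summary` are made up for this example, and writing the result to fastq_summary.txt is left out.

```python
import re
from collections import Counter

# Assumed header layout: @machine:lane:tile:x:y#index/pair
HEADER = re.compile(
    r"^@(?P<machine>[^:]+):(?P<lane>\d+):(?P<tile>\d+):"
    r"(?P<x>\d+):(?P<y>\d+)#(?P<index>[^/]+)/(?P<pair>[12])$"
)


def summarize(lines):
    """Count reads per (machine, lane, index, pair) combination."""
    counts = Counter()
    for i, line in enumerate(lines):
        # Only every fourth line is a header. Quality strings can also start
        # with '@', so position in the record is safer than the first character.
        if i % 4 != 0:
            continue
        match = HEADER.match(line.strip())
        if match is None:
            raise ValueError(f"unexpected header on line {i + 1}: {line!r}")
        key = (match["machine"], match["lane"], match["index"], match["pair"])
        counts[key] += 1
    return counts


def format_summary(counts):
    """Render the counts as tab-separated rows with a header row."""
    rows = ["machine\tlane\tindex\tpair\treads"]
    for (machine, lane, index, pair), n in sorted(counts.items()):
        rows.append(f"{machine}\t{lane}\t{index}\t{pair}\t{n}")
    return "\n".join(rows)


# Two made-up records built from the example fields above.
records = [
    "@HWI-EAS209_0006_FC706VJ:5:58:5894:21141#ATCACG/1", "ACGT", "+", "@@@@",
    "@HWI-EAS209_0006_FC706VJ:5:58:5900:21150#ATCACG/1", "TTGA", "+", "IIII",
]
summary = summarize(records)
```

A file that shows more than one machine, lane or index in this summary is a quick hint that samples may have been mixed up.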