Python Command-line skeleton


February 2, 2017    Python

Writing a command-line interface (CLI) is an easy way to extend the functionality and ease of use of any code you write.

Python ships with a built-in module, argparse, that makes it easy to develop command-line interfaces. To speed up the process, I have developed a ‘skeleton’ application that can be forked on GitHub and used to quickly develop CLI programs in Python.

The repo has the following features added:

  • Testing with travis-ci and py.test
  • Coverage analysis using coveralls
  • A setup file that will install the command
  • A simple argparse interface
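
As a rough illustration of the kind of interface argparse makes possible, here is a minimal sketch. The command name `mycli`, its arguments, and the `greet` helper are hypothetical and are not taken from the repo.

```python
import argparse


def build_parser():
    """Build the argument parser for a hypothetical 'mycli' command."""
    parser = argparse.ArgumentParser(
        prog="mycli",  # hypothetical command name, not from the repo
        description="Greet someone from the command line.",
    )
    # One required positional argument and one optional flag.
    parser.add_argument("name", help="who to greet")
    parser.add_argument(
        "-s", "--shout", action="store_true",
        help="print the greeting in upper case",
    )
    return parser


def greet(name, shout=False):
    """Return the greeting text; kept separate so it is easy to test."""
    message = "Hello, {}!".format(name)
    return message.upper() if shout else message


def main(argv=None):
    """Entry point. When argv is None, argparse reads sys.argv[1:]."""
    args = build_parser().parse_args(argv)
    print(greet(args.name, shout=args.shout))
    return 0
```

Passing the argument list into `main` keeps the parser testable with py.test, since a test can call `main(["World", "--shout"])` without touching the real command line. A setup file can then expose `main` as the installed command.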

To get started, you should sign up for accounts on travis-ci and coveralls, and fork the repo!

Repo: python-cli-skeleton on GitHub


Introducing a Chicago Bioinformatics Slack Channel


January 31, 2017   

Today I am introducing a new Slack team for bioinformaticians in Chicago.

Sign up for the Chicago Bioinformatics Slack Channel!

Currently anyone with an email at the following domains can sign up:

  • @northwestern.edu
  • @uchicago.edu
  • @uic.edu
  • @depaul.edu
  • @luc.edu
  • @iit.edu

Members can invite anyone. I am happy to add any Chicago-area domains. Please let me know which ones I am missing!

The Slack team currently features channels for bioinformatics-help, general, introductions, meetups, and random. We can add more channels!


Alfred Image Utilities


January 15, 2017    Alfred

alfred-image-utilities

A workflow for making quick changes to image files. Alfred-image-utilities grabs any selected images in the frontmost Finder window and can apply changes to them. In most cases the original image is kept as a copy named <filename>.orig.<ext>. You can replace the original file instead by holding Command when executing most commands.

Download

Main Menu

home

Convert to png or jpg

You can convert from a large number of formats to jpg or png. The original file is retained unless you hold Command.

convert

Scale images by a maximum width/height, by percent, or generate thumbnails.

Hold Command to replace the original. This option is not available when generating thumbnails. Generating thumbnails adds .thumb to the filename (<filename>.thumb.<ext>).
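
The naming conventions used in this post (<filename>.orig.<ext> for retained originals and <filename>.thumb.<ext> for thumbnails) can be sketched in a few lines of Python. The `tagged_name` helper below is hypothetical and only illustrates the convention; it is not the workflow's actual code.

```python
from pathlib import Path


def tagged_name(path, tag):
    """Return 'path' with 'tag' inserted before the file extension.

    Hypothetical helper: illustrates the <filename>.<tag>.<ext> convention only.
    """
    path = Path(path)
    # stem is the filename without its final extension; suffix includes the dot.
    return path.with_name("{}.{}{}".format(path.stem, tag, path.suffix))


print(tagged_name("photo.png", "orig"))        # name for the retained original
print(tagged_name("/tmp/cat.jpg", "thumb"))    # name for a generated thumbnail
```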

scale

Rotate images (clockwise)

Hold command to replace original.

rotate

Convert images to black and white.

Hold command to replace original.

color


rdatastore


December 15, 2016    R R Package

I’ve developed a new package for R called rdatastore that is available at cloudyr/rdatastore. rdatastore provides an interface for Google Cloud’s Datastore service. Google Cloud Datastore is a NoSQL database, which provides a mechanism for storing and retrieving heterogeneous data. Although Google Datastore is not useful for storing large datasets, it has a number of useful applications within R. For example:

  • Saving and loading credentials for use with other services.
  • Caching data. This is implemented using datastore in my version of the memoise package.
  • Saving/loading universally used pieces of data (e.g. parameters, options, settings) across systems or between work/home.
  • Storage and retrieval of small (<10,000 row) datasets. Useful for integration of summary datasets.

The last two reasons are the primary motivation for developing rdatastore. Parallelized pipelines can simultaneously submit results to datastore (across many nodes or machines), and the results are obtainable for analysis within R. Settings can be updated on one machine and retrieved on others as well, obviating the need to modify virtual machines or scripts in many cases.


datastore

The datastore interface can be used to view and edit data.


Setup

  1. Set up a Google Cloud Platform account and create a new project.
  2. Download the Google Cloud SDK. This provides the gcloud command-line tool.
  3. Install rdatastore:
devtools::install_github("cloudyr/rdatastore")

Usage

Authentication

library(rdatastore)
authenticate_datastore("andersen-lab") # Enter your project ID here. rdatastore will authenticate using OAuth.

Storing Data

commit()

Individual entities can be stored using commit(). You have to supply a kind (which is analogous to a table in relational database systems). You may optionally supply a name. Any additional arguments supplied are added as properties. Datatypes are inferred from R datatypes. For example:

commit(kind = "Car", name = "Tesla", wheels = 4) # Stores a new entity named 'Tesla'

Result

kind  name   wheels
Car   Tesla  4

Important! Stick with basic datatypes like character vectors, integers, doubles, binary, and datetime objects. Not all datatypes are supported.

I designed rdatastore to make it easier to append data rather than overwrite it. This is a bit against the grain as far as other datastore libraries go. For example:

commit(kind = "Car", name = "Tesla", electric = TRUE) # Adds the 'electric' property to the existing 'Tesla' entity

The entity will now be:

kind  name   wheels  electric
Car   Tesla  4       TRUE

If you want to overwrite the entity, you can use keep_existing = FALSE, and the original data will be wiped and replaced.
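
To make the append-versus-overwrite behavior concrete, here is a toy in-memory model written in Python. It is not rdatastore's implementation and does not talk to Datastore: `commit_model` and the `store` dict are hypothetical stand-ins. It also assumes that a new value for an existing property replaces the old one, which this post does not state.

```python
def commit_model(store, kind, name, keep_existing=True, **properties):
    """Toy model of commit(): 'store' is a plain dict keyed by (kind, name).

    Hypothetical illustration only; not how rdatastore works internally.
    """
    key = (kind, name)
    if keep_existing:
        # Append: start from whatever is already stored, then add new properties.
        # Assumption: a repeated property name takes the new value.
        merged = dict(store.get(key, {}))
        merged.update(properties)
    else:
        # Overwrite: discard the stored properties entirely.
        merged = dict(properties)
    store[key] = merged
    return merged


store = {}
commit_model(store, "Car", "Tesla", wheels=4)
commit_model(store, "Car", "Tesla", electric=True)
print(store[("Car", "Tesla")])  # both wheels and electric are present

commit_model(store, "Car", "Tesla", keep_existing=False, electric=True)
print(store[("Car", "Tesla")])  # only electric remains
```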

When using commit() you can omit the name parameter, in which case Google Datastore will autogenerate an ID for the entity. I’m not sure where this is useful. You won’t be able to look the item up without knowing its ID or performing a query on the entity’s data.

lookup()

Retrieve data by specifying its kind and name.

lookup("Car", "Tesla")
kind  name   wheels  electric
Car   Tesla  4       TRUE

gql()

You can query items using the Google Query Language (GQL). GQL is a lot like SQL.

# Let's commit a few more items
commit("Car", "VW", electric = FALSE)
commit("Car", "Honda", make = "Odyssey", wheels = 4)
commit("Car", "Reliant", make = "Robin", wheels = 3)

gql("SELECT * FROM Car")
kind  name     make     wheels  electric
Car   Honda    Odyssey  4       NA
Car   Reliant  Robin    3       NA
Car   Tesla    NA       4       TRUE
Car   VW       NA       NA      FALSE

Notice that some properties are NA because they were never specified.

We can also query specific properties, but this will only return entities with those properties defined.

gql("SELECT make FROM Car")
kind  name     make
Car   Honda    Odyssey
Car   Reliant  Robin

You can also filter on properties with GQL:

gql("SELECT * FROM Car WHERE wheels = 3")
kind  name     make   wheels
Car   Reliant  Robin  3

A big list of favorites


November 29, 2016   

Here it is! My favorite things in life across all domains. This is a work in progress, but hopefully you’ll find one (or a few) things you like and add it to your life. It’s a bit sparse currently, but it will fill in over time.

Note: There are no referral links here and I am not being paid to advertise anything here. Any companies/products listed have earned it.

Programming/Software


Terminal

  • Homebrew - A phenomenal package manager. Use with homebrew/science!
  • Autojump - Jump among directories by typing j and their name. Even works if you type it incorrectly! Install with brew install autojump.
  • pyenv - An easy way to manage multiple installations of python, and set which versions to open globally, locally, or by directory. Install with brew install pyenv.

Software (OSX)

  • Dropbox - The best backup/syncing solution. I pay for a subscription. I’ve used Box and Google Drive as well. Both are inferior.
  • Transmit - FTP Client.

Utilities

  • Sublime Text - The best text editor. Extend its functionality with package control.
  • Alfred - Like spotlight but with a lot more functionality. Workflows extend its functionality considerably. See the ones I’ve written!

Databases

  • Sequel Pro - A MySQL GUI. The best GUI for a database I’ve seen.

Web Development

  • Jekyll - Static sites
  • Flask - Python framework
  • peewee - A very easy to use ORM for simple projects.

Programming

  • GitHub - A great place to work on projects using git.
  • Python - My favorite all-purpose programming language.

R

Python

Future

Personal


iOS Apps

  • Reeder - A great RSS reader.
  • Strava - Fitness tracker. I’ve used Runkeeper in the past. It’s worth switching. If you want to switch fitness apps and not lose everything, check out tapiriik.

Websites

News

Tech News

Financial

  • Vanguard - Retirement accounts and investing.

Blogs

Services

  • Pinboard - Simple bookmarks.
  • tapiriik - Sync fitness data across fitness tracking services.

Biking


Kitchen


Coffee

Wikipedia Pages


These are mostly a roundup of interesting ideas/facts/concepts I have come across.