Alex Holcombe's blog

open science, open access, meta-science, perception, neuroscience, …

summarizing data by combinations of variables with python

For data analysis, I switched from MATLAB to R, partly out of a desire to support open source. But my experiments nowadays are written in Python, so I decided to try analyzing the data with Python as well.

SciPy is an open-source library that helps with this, and it duplicates a lot of MATLAB functionality to make switching from MATLAB easier. IPython provides an interactive command line with tab-completion, history, and some of the other conveniences that come with MATLAB.

It's been working well for my data plotting, except that my code was becoming cumbersome when it came to extracting the data I wanted to plot. The loadtxt function easily imports my data files into a structure called a recarray, which is similar to a data.frame in R: a flat spreadsheet with a name for each column. Then I need to plot the dependent variable as a function of a subset of the independent variables in the experiment, like this:

[Figure: data plotted by eccentricity, subject, and petal/fugal direction]
Here I plotted the mean shift, and the standard deviation of the shift, by observer (columns), eccentricity, and direction of motion (colors). This requires collapsing across the other variables, the ones you can't see here. I think Excel calls this a "PivotTable". For Python, I wrote a function to which I pass a recarray and the names of the variables (data file columns) that I want to collapse by. It passes back multi-dimensional arrays giving the mean, standard deviation, and number of data points for every combination of those variables.
collapseBy(data,DV,*factors)
I hope someone finds this code as useful as I do; it seems something like this should be put into SciPy.
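
For readers who want a starting point, here is a minimal sketch of the idea. It is not the original collapseBy code: the name collapse_by, the return layout, the column names, and the toy data are all assumptions made for illustration. It reads a small whitespace-delimited table with np.genfromtxt (rather than loadtxt, because genfromtxt can take the column names from the header line) and then computes the mean, standard deviation, and count for every combination of the requested columns.

```python
import io
import numpy as np


def collapse_by(data, dv, *factors):
    """Summarize data[dv] for every combination of the given factor columns.

    Hypothetical helper, not the original collapseBy. Returns
    (levels, mean, std, n): levels[i] holds the sorted unique values of
    factors[i], and mean/std/n have one axis per factor, in the same order.
    Combinations with no data get NaN for mean and std and 0 for n.
    """
    levels, codes = [], []
    for name in factors:
        # codes[i][row] is the index into levels[i] for that row
        lev, inv = np.unique(data[name], return_inverse=True)
        levels.append(lev)
        codes.append(inv)

    shape = tuple(len(lev) for lev in levels)
    y = np.asarray(data[dv], dtype=float)
    mean = np.full(shape, np.nan)
    std = np.full(shape, np.nan)
    n = np.zeros(shape, dtype=int)

    # Visit each cell of the factor grid and select the matching rows
    for idx in np.ndindex(*shape):
        mask = np.ones(len(y), dtype=bool)
        for code, level_index in zip(codes, idx):
            mask &= code == level_index
        values = y[mask]
        n[idx] = values.size
        if values.size:
            mean[idx] = values.mean()
            std[idx] = values.std()  # population std (ddof=0), NumPy's default
    return levels, mean, std, n


# Toy data standing in for a data file; the column names are made up.
text = """subject ecc direction shift
1 2.0 0 0.10
1 2.0 0 0.30
1 2.0 1 0.50
1 8.0 0 0.20
2 2.0 0 0.40
2 8.0 1 0.60
2 8.0 1 0.80
"""
data = np.genfromtxt(io.StringIO(text), names=True)

# Collapse across direction: one row per subject, one column per eccentricity
levels, mean, std, n = collapse_by(data, "shift", "subject", "ecc")
print(levels)
print(mean)
print(n)
```

Looping over every cell is slow for big designs but easy to check by eye; the arrays it returns can be indexed directly when plotting, for example mean[0] for the first subject across eccentricities.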

Update: Josef schooled me (in a helpful way!) by writing new code for this functionality in three different ways, with each way much cleaner than mine.

Written by alexholcombe

January 26, 2009 at 2:30 pm

Posted in science

2 Responses

  1. [...] wrote something for #1, but #2 is too much for me. I have had to start using [...]

  2. Great post! Helped me get started with scipy. Thank you!

    Jakub Spaeti

    May 16, 2009 at 6:30 pm

