technical note: d-prime and proportion correct in choice experiments (signal detection theory)

If you don’t understand the title of this post, you almost certainly will regret reading further.

We’re doing an experiment in which one target is presented along with m distracters. The participant tries to determine which is the target, and must respond with their best guess as to which it is. Together, the m distracters + 1 target = the “number of alternatives”.

The plots show the predictions of vanilla signal detection theory for the relationship between probability correct, d-prime, and number of alternatives. Each distracter is assumed to be separated from the target by a discriminability of d-prime.

[Figure: signal detection theory relationship among percent correct, d-prime, and number of alternatives]
The two plots are essentially the inverse of each other.

Note that many studies use two-interval forced choice, in which the basic stimulus containing the distracters is presented twice, once with the signal added, and the participant has to choose which interval contained the signal. In contrast, here I’m showing predictions for an experiment in which the target with all its distracters is presented only once, and the participant reports which location contained the target.

I should probably add a lapse rate to these models, and generate curves using a reasonable lapse rate like 0.01.
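As a sketch of how such curves can be computed (my assumptions: equal-variance Gaussian SDT with distracters centered at 0 and the target at d-prime, and a lapse trial being a uniform guess among the m alternatives), in Python with SciPy:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def p_correct(dprime, m, lapse=0.0):
    """P(correct) for picking the target among m alternatives under
    equal-variance Gaussian SDT: integrate over the target's internal
    response x; the observer is correct when all m-1 distracters fall
    below x."""
    pc, _ = quad(lambda x: stats.norm.pdf(x - dprime) * stats.norm.cdf(x) ** (m - 1),
                 -np.inf, np.inf)
    # On a lapse trial the observer guesses uniformly among the m alternatives.
    return (1 - lapse) * pc + lapse / m

p_correct(0.0, 2)  # chance performance with two alternatives: 0.5
p_correct(1.0, 2)  # the classic 2AFC value, about 0.76
```

Inverting this numerically (e.g. with scipy.optimize.brentq on p_correct) gives the d-prime-from-percent-correct direction, which is why the two plots are inverses of each other.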

I’ll post the R code using ggplot that I made to generate these later; email me if I don’t or you want it now. UPDATE: the code, including a parameter for lapse rate.

reference: Hacker, M. J., & Ratcliff, R. (1979). A revised table of d' for M-alternative forced choice. Perception & Psychophysics, 26(2), 168-170.
# To determine the probability of the target winning, A, use the law of total probability:
#   p(A) = sum of p(A|B) p(B) over all B
# Here B ranges over all possible target TTC estimates, and p(A|B) is the probability that
# the distracters are all lower than that target TTC estimate, B.
#
# x: the target's TTC estimate, with distracter estimates distributed as a standard normal.
# The probability that a distracter's TTC estimate is less than the target's is pnorm(x):
# the area under the standard normal curve below x.
# m: number of objects, m-1 of which are distracters.
# Integrand: p(A|B) * p(B) = pnorm(x)^(m-1) * dnorm(x - dprime)
pCorrect <- function(dprime, m) {
  integrate(function(x) pnorm(x)^(m - 1) * dnorm(x - dprime), -Inf, Inf)$value
}
# Hacker & Ratcliff (1979) and Elliott (1964) derive this, as did I.
# Jakel & Wichmann say that "numerous assumptions necessary for mAFC" where m > 2,
# but it is not clear whether they are talking about bias only, or also about d'.

Data analysis with Python, SciPy and R

I’ve transitioned to all open-source software for my science. The Python language and its libraries VisionEgg and PsychoPy are more than sufficient to code my perception experiments. For data analysis, I’ve gotten pretty far with the SciPy library for Python, which has probability distributions, function minimization, Fourier transforms, etc. The Matplotlib library makes it easy to make plots in a way familiar to old MATLAB users like me. Unfortunately, however, it appears that nothing is available for taking a load of data, formatted as many entries (e.g. rows) each with several values (one for each independent and dependent variable of the experiment), and

  1. summarizing (calculating the mean, etc.) the dependent variable contingent on various independent variables (like an Excel pivot table)
  2. performing the all-important (in experimental psychology and neuroscience) multiple linear regressions and ANOVAs.

I wrote something for #1, but #2 is too much for me. I have had to start using R.
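For the record, #1 needs only a few lines of pure Python. Here is a minimal sketch; the trial layout, the field names, and the summarize function are my own hypothetical example, not the script I actually use:

```python
from collections import defaultdict

def summarize(rows, iv_names, dv_name):
    """Pivot-table-style summary: mean of the dependent variable for each
    combination of independent-variable values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in iv_names)
        groups[key].append(row[dv_name])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}

# Each row is one trial: independent variables plus a dependent variable.
trials = [
    {"contrast": 0.1, "ecc": 2, "rt": 0.41},
    {"contrast": 0.1, "ecc": 2, "rt": 0.39},
    {"contrast": 0.2, "ecc": 2, "rt": 0.35},
]
means = summarize(trials, ["contrast", "ecc"], "rt")  # mean rt per cell
```

Swapping in a different aggregation function (median, standard error) is a one-line change; it's #2, the inferential statistics, that genuinely needs R.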

R appears to be the best open-source data analysis and statistics program, and has an incredible variety of packages for all sorts of analyses, often programmed as soon as a statistics professor dreams them up. For example, there is a package for the directional statistics I need, which I don’t think you can find in SPSS or SAS. The R syntax is really clunky compared with the beauty that is Python; this is irritating, but doesn’t actually slow one down much.

Fortunately, RPy2 allows one to call R functions from Python. It’s a fairly basic interface, and it took me awhile to understand how to pass data between Python and R, but it works well. I’m very grateful to the developers, who deserve more help.

The documentation of all these Python libraries leaves a lot to be desired. The example code snippets for SciPy are still too sparse, and more are sorely needed to help users quickly do specific things without having to spend an hour figuring out exactly what some poorly-documented function’s parameters do. The same goes for RPy2. I hope to help out when I have time.
Update: some RPy help
Update: StackOverflow has some helpful answers for questions regarding how to use RPy2

if it still hasn’t happened yet, it’s likely to take a long time longer!

The Cauchy distribution is a unimodal distribution with fatter tails than a Gaussian. (Fig 1 at right)

Janssen & Shadlen (2005, Nature Neuroscience) found that monkey LIP neuron activity followed the subjective hazard function of an objective bimodal probability density function, which goes up, down, then up again. With a Gaussian distribution (bell-shaped curve), the hazard function increases monotonically with time (Fig 2): if the event has not occurred already, it becomes increasingly likely to occur in the next moment, because the hazard function is proportional to the likelihood that the event will occur in the next moment given that it has not yet occurred.

But would the neurons successfully represent a Cauchy distribution, for which the hazard rate actually decreases with time past a point soon after the peak? (Fig 3)
This hazard function is surprising to many, because it seems that for a unimodal distribution, as time elapses and the event still has not occurred, it should be increasingly likely to occur. But this won’t happen if the tails are fat enough, as pointed out by Nassim Taleb in his book The Black Swan. Hence the title of this post. This kind of hazard function applies to various real-world phenomena, like construction contractors: as time passes beyond the date they said the job would be done, every day they don’t finish suggests that the completion date is probably even further in the future. I think Taleb suggests that humans don’t usually represent this hazard function, but he’s probably referring to cognition. I don’t know whether the same is true for something more automatic than cognition, like a go/no-go learned response-time task. Probably no one has done this experiment. Maybe it is indeed very difficult to learn this kind of hazard function.
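The contrast between the two hazard functions is easy to check numerically; here is a sketch using SciPy's distribution objects (zero location and unit scale are my arbitrary choices):

```python
import numpy as np
from scipy import stats

def hazard(dist, t):
    # Hazard rate h(t) = f(t) / (1 - F(t)): density over survival function.
    return dist.pdf(t) / dist.sf(t)

t = np.array([1.0, 2.0, 4.0, 8.0])
gauss_h = hazard(stats.norm(0, 1), t)     # monotonically increasing
cauchy_h = hazard(stats.cauchy(0, 1), t)  # decreasing out in the fat tail
```

For the Cauchy, the survival function falls off only as 1/t, so the hazard behaves like 1/t at long times: the longer you have already waited, the longer you can expect to keep waiting.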

Indeed, I think someone (maybe Taleb) has shown that it is hard, and takes a lot of data even in principle, to learn the fatness of tails. Maybe our default hazard function is an increasing one. It might be easier to see this effect in a two-button experiment, where the task is to press one button or the other, with one button’s event timing drawn from a Gaussian (increasing hazard) and the other’s from a Cauchy distribution (decreasing hazard).