Science is broken; let’s fix it. This has been my mantra for some years now, and today we are launching an initiative aimed squarely at one of science’s biggest problems. The problem is called publication bias or the file-drawer problem and it’s resulted in what some have called a replicability crisis.
When researchers do a study and get negative or inconclusive results, those results usually end up in file drawers rather than published. When this is true for studies attempting to replicate already-published findings, we end up with a replicability crisis where people don’t know which published findings can be trusted.
To address the problem, Dan Simons and I are introducing a new article format at the journal Perspectives on Psychological Science (PoPS). The new article format is called Registered Replication Reports (RRR). The process will begin with a psychological scientist interested in replicating an already-published finding. They will explain to us editors why they think replicating the study would be worthwhile (perhaps it has been widely influential but had few or no published replications). If we agree with them, they will be invited to submit a methods section and analysis plan to us. The submission will be sent to reviewers, preferably the authors of the original article proposed for replication. These reviewers will be asked to help the replicating authors ensure their method is nearly identical to that of the original study. The submission will at that point be accepted or rejected, and the authors will be told to report back when the data comes in. The methods will also be made public and other laboratories will be invited to join the replication attempt. All the results will be posted in the end, with a meta-analytic estimate of the effect size combining all the data sets (including the original study’s data if it is available). The Open Science Framework website will be used to post some of this. The press release is here, and the details can be found at the PoPS website.
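To make the "meta-analytic estimate of the effect size" concrete, here is a minimal sketch of one common way to pool results from several labs: an inverse-variance-weighted (fixed-effect) average with a 95% confidence interval. The choice of a fixed-effect model, the function name and the example numbers are assumptions for illustration only; the RRR process described above does not specify a particular model, and a random-effects model may well be preferred in practice.

```python
import math

def fixed_effect_meta(effects, std_errors):
    """Pool per-lab effect sizes with inverse-variance weights.

    effects:    effect size estimate from each lab (hypothetical inputs)
    std_errors: standard error of each estimate
    Returns (pooled_effect, ci_low, ci_high) for a 95% interval.
    """
    # More precise studies (smaller standard error) get more weight.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = 1.96  # normal quantile for a two-sided 95% interval
    return pooled, pooled - z * pooled_se, pooled + z * pooled_se

# Made-up example: three labs reporting the same effect.
estimate, low, high = fixed_effect_meta([0.45, 0.10, 0.22], [0.20, 0.12, 0.15])
print(f"pooled effect = {estimate:.3f}, 95% CI [{low:.3f}, {high:.3f}]")
```

The pooled interval is narrower than any single lab's interval, which is the point of combining data sets rather than labelling each one a success or failure.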
Professor Daniel J. Simons (University of Illinois) and I are co-editors for the RRRs. The chief editor of Perspectives on Psychological Science is Barbara A. Spellman (University of Virginia), and leadership and staff at the Association for Psychological Science, especially Eric Eich and Aime Ballard, have also played an important role (see their press release).
Three features make RRRs very different from the usual way that science gets published:
1. Preregistration of replication study design and analysis plan and statistics to be conducted BEFORE the data is collected.
- Normally researchers have a disincentive to do replication studies because they usually are difficult to publish. Here we circumvent the usual obstacles to replications by giving researchers a guarantee (provided they meet the conditions agreed during the review process) that their replication will be published, before they do the study.
- There will be no experimenter degrees of freedom to analyse the data in multiple ways until a significant but likely spurious result is found. This is particularly important for complex designs or multiple outcome variables, where those degrees of freedom allow one to always achieve a significant result. Not here.
2. Study is sent for review to the original author on the basis of the plan, BEFORE the data come in.
- Unlike standard replication attempts where the author of the published, replicated study sees it only after the results come in, we will catch the replicated author at an early stage. Many will provide constructive feedback to help perfect the planned protocol so it has the best chance of replicating the already-published target effect.
3. The results will not be presented as a “successful replication” or “failed replication”. Rarely is any one data set definitive by itself, so we will concentrate on making a cumulative estimate of the relevant effect’s size, together with a confidence interval or credibility interval.
- This will encourage people to make more quantitative theories aimed at predicting a certain effect size, rather than only worrying about whether the null hypothesis can be rejected (as we know, the null hypothesis is almost never true, so can almost always be rejected if one gets enough data).
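The remark that the null hypothesis "can almost always be rejected if one gets enough data" can be illustrated with a little arithmetic. The sketch below assumes a simple one-sample z-test on a standardized effect size d, where the expected test statistic is d·√n; the function names are made up for illustration.

```python
import math

Z_CRIT = 1.96  # two-sided 5% significance threshold

def expected_z(effect_size_d, n):
    """Expected z statistic for a true standardized effect d and sample size n."""
    return effect_size_d * math.sqrt(n)

def n_to_reach_threshold(effect_size_d, z_crit=Z_CRIT):
    """Smallest n at which the expected z statistic reaches the threshold.

    At this n the test rejects roughly half the time; larger n rejects more often.
    """
    return math.ceil((z_crit / effect_size_d) ** 2)

# A tiny but nonzero true effect becomes 'significant' once n is large enough.
tiny_effect = 0.01
for n in (100, 10_000, 1_000_000):
    print(n, round(expected_z(tiny_effect, n), 2))
print("n needed:", n_to_reach_threshold(tiny_effect))
```

However small the true effect, there is always some sample size beyond which rejection becomes likely, which is why an estimate of the effect's size is more informative than the bare fact of rejection.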
This initiative is the latest in a long journey for me. Ten years ago, thinking that allowing comments to be posted on published papers would bring flaws and missed connections to light much earlier, David Eagleman and I published a letter to that effect in Nature and campaigned (unsuccessfully) for commenting to be allowed on PubMed abstracts.
Since then, we’ve seen that even where comments are allowed, few scientists make them, probably because there is little incentive to do so and doing it would risk antagonising their colleagues. In 2007 I became an academic editor and advisory board member for PLoS ONE, which poses fewer obstacles to publishing replication studies than do most journals. I’m lucky to have gone along on the ride as PLoS ONE rapidly became the largest journal in the world (I resigned my positions at PLoS ONE to make time for the gig at PoPS). But despite the general success of PLoS ONE, replication studies were still few and far between.
In 2011, Hal Pashler, Bobbie Spellman, Sean Kang and I started PsychFileDrawer, a website for researchers to post notices about replication studies. This has enjoyed some success, but it seems without the carrot of a published journal article, few researchers will upload results, or perhaps even conduct replication studies.
Finally, with this Perspectives on Psychological Science initiative, a number of things have come together to overcome the main obstacles to replication studies: fear of antagonising other researchers and the uphill battle required to get the study published. Some other worthy efforts to encourage replication studies are happening at Cortex and BMC Psychology.
If you’re interested in proposing to conduct a replication study for eventual publication, check out the instructions and then drop us a line at replicationseditor @ psychologicalscience.org!
Below are research presentations I’m involved in for Vision Sciences Society in May. If you’re attending VSS, don’t forget about the Publishing, Open Access, and Open Science satellite, which will be Friday at 11am. Let us know your opinion on the issues and what should be discussed here.
Splitting attention slows attention: poor temporal resolution in multiple object tracking
Alex O. Holcombe, Wei-Ying Chen
Session Name: Attention: Tracking (Talk session)
Session Date and Time: Sunday, May 13, 2012, 10:45 am – 12:30 pm
Location: Royal Ballroom 4-5
When attention is split into foci at disparate locations, the minimum size of the selection focus at each location is larger than if only one location is targeted (Franconeri, Alvarez, & Enns, 2007): splitting attention reduces its spatial resolution. Here we tested temporal resolution and speed limits. STIMULUS. Three concentric circular arrays (separated by large distances to avoid spatial interactions between them) of identical discs were centered on fixation. Up to three discs (one from each ring) were designated as targets. The discs orbited fixation at a constant speed, occasionally reversing direction. After the discs stopped, participants were prompted to report the location of one of the targets. DESIGN. Across trials, the speed of the discs and the number in each array were varied, which jointly determined the temporal frequency. For instance, with 9 objects in the array, a speed of 1.1 rps would be 9.9 Hz. RESULTS. With only one target, tracking was not possible above about 9 Hz, far below the limits for perceiving the direction of the motion, and consistent with Verstraten, Cavanagh, & LaBianca (2000). The data additionally suggest a speed limit, with tracking impossible above 1.8 rps, even when temporal frequency was relatively low. Tracking two targets could only be done at lower speeds (1.4 rps) and lower temporal frequencies (6 Hz). This decrease is approximately that predicted if, at high speeds and high temporal frequencies, only a single target could be tracked. Tracking three yielded still lower limits. Little impairment was seen at very slow speeds, suggesting these results were not caused by a reduction in spatial resolution. CONCLUSION. Splitting attention reduces the speed limits and the temporal frequency limits on tracking. We suggest a parallel processing resource is split among targets, with less resource on a target yielding poorer spatial and temporal precision and slower maximum speed.
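The relationship between speed, number of objects and temporal frequency in the abstract above can be written out as a short sketch. The first function is just the arithmetic stated in the abstract (objects per ring times revolutions per second). The second combines the approximate one-target limits reported there (about 9 Hz and 1.8 rps) by taking whichever limit binds first; that combination rule and the function names are assumptions for illustration, not the authors' model.

```python
def temporal_frequency_hz(num_objects, speed_rps):
    """Rate at which discs pass a given location in a ring, in Hz."""
    return num_objects * speed_rps

def max_trackable_speed_rps(num_objects, speed_limit_rps=1.8, tf_limit_hz=9.0):
    """Fastest speed that respects both a speed limit and a temporal frequency limit.

    Defaults are the approximate one-target limits from the abstract;
    taking the minimum of the two constraints is an illustrative assumption.
    """
    return min(speed_limit_rps, tf_limit_hz / num_objects)

# The abstract's example: 9 objects at 1.1 rps.
print(temporal_frequency_hz(9, 1.1))

# With few objects the speed limit binds; with many, temporal frequency does.
for n in (3, 6, 9, 12):
    print(n, max_trackable_speed_rps(n))
```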
A hemisphere-specific attentional resource supports tracking only one fast-moving object.
Wei-Ying Chen & Alex O. Holcombe
Session Name: Attention: Tracking (Talk session)
Session Date and Time: Sunday, May 13, 2012, 10:45 am – 12:30 pm
Location: Royal Ballroom 4-5
Playing a team sport or taking children to the beach involves tracking multiple moving targets. Resource theory asserts that a limited resource is divided among targets, and performance reflects the amount available per target. Holcombe and Chen (2011) validated this with evidence that tracking a fast-moving target depletes the resource. Using slow speeds Alvarez and Cavanagh (2005) found the resource consumed by additional targets is hemisphere-specific. They didn’t test the effect of speed, and here we tested whether speed also depletes a hemisphere-specific resource. To put any speed limit cost in perspective, we modeled a “total depletion” scenario- the speed limit cost if at high speeds one could not track the additional target at all and had to guess one target. Experiment 1 found that the speed limit for tracking two targets in one hemifield was similar to that predicted by total depletion, suggesting that the resource was totally depleted. If the second target was instead placed in the opposite hemifield, little decrement in speed limit occurred. Experiment 2 extended this comparison to tracking two vs. four targets. Compared to the speed limit for tracking two targets in a single hemifield, adding two more targets in the opposite hemifield left the speed limit largely unchanged. However starting with one target in both the left and right hemifields, adding another to each hemifield had a severe cost similar to that of the total depletion model. Both experiments support the theory that an object moving very fast exhausts a hemisphere-specific attentional tracking resource.
Attending to one green item while ignoring another: Costly, but with curious effects of stimulus arrangement
Shih-Yu Lo & Alex O. Holcombe
Session Name: Attention: Features I (Poster session)
Session Date and Time: Monday, May 14, 2012, 8:15 am – 12:15 pm
Location: Vista Ballroom
Splitting attention between targets of different colors is not costly by itself. As we found previously, however, monitoring a target of a particular color makes one more vulnerable to interference by distracters that share the target color. Participants monitored the changing spatial frequencies of two targets of either the same (e.g., red and red) or different colors (e.g., red and green). The changing stimuli disappeared without warning and participants reported the final spatial frequency of one of the targets. In the different-colors condition, a large cost occurs if a green distracter is superposed on the red target in the first location and a red distracter is superposed on the green target in the second location. This likely reflects a difficulty with attending to a color in one location while ignoring it in another. Here we focus on a subsidiary finding regarding perceptual lags. Participants reported spatial frequency values from the past rather than the correct final value, and such lags were greater in the different-colors condition. This “perceptual lag” cost was found when the two stimuli were horizontally arrayed but not, curiously, when they were vertically arrayed. Arrangement was confounded however with processing by separate brain hemispheres (opposite hemifields). In our new study, we unconfounded arrangement and presentation in separate hemifields with a diagonal condition- targets were not horizontally arrayed but were still presented to different hemifields. No significant different-colors lag cost was found in this diagonal arrangement (5 ms) or in the vertical arrangement (86 ms), but the cost (167 ms) was significant in the horizontal arrangement, as in previous experiments. Horizontal arrangement apparently has a special effect apart from the targets being processed by different hemispheres. To speculate, this may reflect sensitivity to bilateral symmetry and its violation when the target colors are different.
Dysmetric saccades to targets moving in predictable but nonlinear trajectories
Reza Azadi, Alex Holcombe, and Jay Edelman
A saccadic eye movement to a moving object requires taking both the object’s position and velocity into account. While recent studies have demonstrated that saccades can do this quite well for linear trajectories, their ability to do so for stimuli moving in more complex, yet predictable, trajectories is unknown. With objects moving in circular trajectories, we document failures of saccades not only to compensate for target motion, but even to saccade successfully to any location on the object trajectory. While maintaining central fixation, subjects viewed a target moving in a circular trajectory at an eccentricity of 6, 9, or 12 deg for 1-2 sec. The stimulus orbited fixation at a rate of 0.375, 0.75, or 1.5 revolutions/sec. The disappearance of the central fixation point cued the saccade. Quite unexpectedly, the circularly moving stimuli substantially compromised saccade generation. Compared with saccades to non-moving targets, saccades to circularly moving targets at all eccentricities had substantially lower amplitude gains, greater curvature, and longer reaction times. Gains decreased by 20% at 0.375 cycles/sec and more than 50% at 1.5 cycles/sec. Reaction times increased by over 100ms for 1.5 cycles/sec. In contrast, the relationship between peak velocity and amplitude was unchanged. Given the delayed nature of the saccade task, the system ought to have sufficient time to program a reasonable voluntary saccade to some particular location on the trajectory. But the abnormal gain, curvature, and increased reaction time indicate that something else is going on. The successive visual transients along the target trajectory perhaps engage elements of the reflexive system continually, possibly engaging vector averaging processes and preventing anticipation. These results indicate that motor output can be inextricably bound to sensory input even during a highly voluntary motor act, and thus suggest that current understanding of reflexive vs. voluntary saccades is incomplete.
Bradley Voytek spotted a disturbing question in an official “Responsible Conduct of Research” training program:
This defense of the status quo has no place in a “Responsible Conduct of Research” training program. It reads like the old guard self-interestedly maintaining the current system by foisting unjustified beliefs onto young researchers!
The part that bothers me the most is the sentence “It is likely that the peer review process will evolve to minimize bias and conflicts of interest”. What is the evidence for this?
Has the process been evolving to minimize bias and conflicts, or to increase them? I don’t think the answer is very clear. As counterweight to the official optimistic opinion, here are a few corrupting influences:
- Pharmaceutical companies continue to buy influence with medical journals, by buying hundreds of copies of journal issues that run studies that support their products.
- Pharmaceutical companies continue to ghostwrite journal articles for doctors, to plant their views in the medical literature.
- Scientists of every stripe often fail to disclose their conflicts of interest.
- Journals develop new revenue streams, like fast-tracking articles for a fee, that may open them to favoring the select authors who pay.
- Many reviewers are, like most humans, biased towards their own self-interest. This can yield a bias to recommend rejection of papers by rivals. Because reviewers in most journals are anonymous, they are never held to account.
- Journals don’t have the resources to investigate authors accused of fraud, and universities often try to avoid finding fault with the researchers they employ.
Many people have suggested partial remedies to these problems, but it’s an uphill battle to implement them, due to the slow pace of change in the journal system. We have to remember this and not be lulled into complacency by the propaganda seen in that training program. It was created by an organization of academics called CITI.
UPDATE: In the comments below, Jason Snyder pointed out an article from CITI in which CITI reports that over 6,000 researchers a month are taking this course — being subjected to this biased question. Some of us object not only to their characterization of the peer review process, but also to their suggestion that blogs are not a good place to do science. We don’t want thousands of researchers to continue to be forced to assent to the conservative opinion articulated by CITI, so we’re drafting a letter asking them to delete the question.
The transmission of new scientific ideas and knowledge is needlessly slow:
|Cause of slowness|Potential remedy|
|---|---|
|Journal subscription fees|Open access mandates|
|Competition to be first-to-publish motivates secrecy|Open Science mandates|
|Jargon|Increase science communication; science blogging|
|Pressure to publish high quantity means no time for learning from other areas|Reform of incentives in academia|
|Inefficient format of journal articles (e.g. prose)|Evidence charts, ?|
|Long lag time until things are published|Peer review post publication, not pre publication|
|Difficulty publishing fragmentary criticisms|Open peer review; incentivize post-publication commenting|
|Information contained in peer reviewers’ reviews is never published|Open peer review or publication of (possibly anonymous) reviews; incentivize online post-publication commenting|
|Difficulty publishing non-replications|Open Science|
UPDATE: Daniel Mietchen, in the true spirit of open science, has put up an editable version of this very incomplete table.
This is something science aspires to in evaluating manuscripts for publication. In fact it’s fundamental to the integrity of science- when you read an article in a scientific journal, the idea is that you should know that the article went through the same review process as all the others.
The science you see in a prestigious journal is not there because the authors paid more than others did, but rather because the science was evaluated as high quality on its own merits. There are flaws in the process, such as the bias that occurs because usually the reviewers know who the manuscript authors are, but at least there’s no real money involved- nothing as meretricious as some cash to grease the wheels.
But now money has started to infiltrate the system. Several journals are now accepting money for “fast-track” services. It is hard to see how this policy can be implemented without sometimes giving the monied authors an advantage over those who don’t pay. Fast-tracking seems likely to lead to shortcuts by the editor or reviewers as they seek to meet the fast-tracking deadline. And it seems these journals won’t even indicate which manuscripts benefited from fast-tracking and which didn’t.
Please join us in signing an open protest letter that we’ll soon send to these journals.
Quodlibet is an obscure word that originally referred to a medieval event that included a debate. I haven’t been able to find much information about it, but here is a brief description from Graham (2007):
Beginning in the thirteenth century, quodlibets were a part of the academic program of the theology and philosophy faculties of universities… The first of the two days of the quodlibet was a day of debate presided over by a master who proposed a question of his own for discussion.
I’m interested in this because I rue the lack of debates in modern science. Perhaps it’s only an accident of history that real debates aren’t happening much nowadays.
Also during the quodlibet, according to Graham, the presiding master “accepted questions from anyone present on any subject and answers were suggested by the master and others.” Sounds like a blend of the unconference (1,2) and what we think of as a traditional debate.
We’ve created evidencechart.com as a way debating might be revived in a compact online form. However, we’re still working on the adversarial form of evidence charts designed especially for debating. Let me know if you’re interested in participating.
Graham, BFH (2007). Review of Magistri Johannis Hus: Quodlibet, Disputationis de Quolibet Pragae in Facultate Artium Mense Ianuario anni 1411 habitae Enchiridion. The Catholic Historical Review, Volume 93, Number 3, pp. 639-640.
Richard Feynman, in his 1974 cargo-cult science commencement address:
If you make a theory, for example, and advertise it, or put it out, then you must also put down all the facts that disagree with it, as well as those that agree with it…
In summary, the idea is to try to give all of the information to help others to judge the value of your contribution; not just the information that leads to judgment in one particular direction or another.
Unfortunately, the average scientific journal article doesn’t follow this principle. I wouldn’t go so far as to say that the average article is just a sales job, but the emphasis is really on giving the information that favors the author’s theory. I say this based on my experience as a journal editor (for PLoS ONE), a reviewer (for a few dozen journals), and as a reader and author absorbing the norms of my field.
It’s a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty—a kind of leaning over backwards. For example, if you’re doing an experiment, you should report everything that you think might make it invalid—not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment, and how they worked—to make sure the other fellow can tell they have been eliminated.
Again, I don’t think most scientists follow this principle. But evidence charts can yield more balanced scientific communication. Currently, formal scientific communication occurs almost entirely through articles— a long series of paragraphs. For someone to easily digest an article, there has to be a strong storyline running throughout, and the paper cannot be too long. Those requirements can tempt even one of the highest integrity to omit some inconvenient truths, to use rhetorical devices to sweep objections under the rug, to unashamedly advance the advantages of one’s own theory and that theory alone. If you don’t make a good sales job of it, the reader will just move on to a scientist who does, and you won’t have much impact. I don’t think the situation is universally that bad, but there is definitely a lot of this going on.
An evidence chart is more like a list of the pluses and minuses of various theories, and how the apparent minuses might be reconciled with the theory. There’s less room for rhetorical devices to obscure or manipulate things, and the form may be more suited for driving the reader to make up their mind for themselves. Of course, an individual scientist making a chart may still omit contrary evidence or make straw men of the opposing theories, but the evidence chart format may make this easier to recognize. And, we’re working on collaborative and adversarial evidence charts to bring the opposing views to the same table.
Email me if you’re interested in participating. I like to think that Feynman would be in favor of it.
Scientific theories are alive. They are debated and actively questioned. Scientists have differing views, strong and informed ones. However, the system of science tends to mask the debate. In the scientific literature, differences are aired, but rarely in a way that most people would recognize as a debate.
‘Debate’ evokes a vision of two parties concisely articulating their positions, disputing points, and rebutting each other. In the courtroom, in politics, and in school debating societies, one side will say or write something, and shortly after the other side will directly address the points raised. In science, this is not at all the norm. Occasionally something like a conventional debate does happen. A journal, after publishing an article attacking one group’s theory, will sometimes publish a reply from the advocates of the attacked theory. I relish such exchanges because in the usual course of things, it’s hard to make out the debate.
Typically, when one scientist publishes an article advocating a particular theory, those who read it and disagree won’t publish anything on the topic for six months or more. That’s just how the system works- to publish an article usually requires a massive effort involving work conducted over many months, if not years. Anything one does over that timescale is unlikely to be a focused rejoinder to another’s article. In any lab, there are many fish to fry, ordinarily something else was already on the boil, and the easiest meals are made by going after different fish than your peers. Most scientists are happy to skate by each other, perhaps after pausing for a potshot. The full debate is dodged.
Even when the work of two scientists directly clashes, the debate is sometimes stamped out, and frequently heavily massaged as it passes through the research-and-publish pipelines. Debating somebody through scientific journal articles is like having an exchange with someone on another continent using 17th-century bureaucratic dispatches. When and if you hear back a year later, your target may have moved on to something else, or twisted your words, or showily pulverized a man of straw who looks a bit like you. You’re further burdened by niggling editors, meddling reviewers, irksome word limits, and the more pressing business of communicating your latest data.
The scientific literature obscures and bores with its stately rhetoric and authors writing at cross purposes. I’d like to see unadulterated points and counterpoints. With evidencecharts, we’re enabling this with a format adapted from intelligence analysis at the CIA. Most scientists won’t reshape what they do until academic institutional incentives and attitudes change. However, having good formats available to wrangle in should encourage some more debating around the edges.
If you’re a scientist ready to debate, and you think you might be able to talk a worthy opponent into joining you, send me a note! An upcoming iteration of our free evidencechart.com website will support mano a mano adversarial evidencecharts.
How do you get on top of the literature associated with a controversial scientific topic? For many empirical issues, such as the role of sleep in memory consolidation, the effect of caffeine on cognitive function, or the best theory of a particular visual illusion, the science gives a conflicted picture. To form your own opinion, you’ll need to become familiar with many studies in the area.
You might start by reading the latest review article on the topic. Review articles provide descriptions of many relevant studies. Also, they usually provide a nice tidy story that seems to bring the literature all together into a common thread- that the author’s theory is correct! Because of this bias, a review article may not help you much to make an independent evaluation of the evidence. And the evidence usually isn’t all there. Review articles very rarely describe, or even cite, all the relevant studies. Unfortunately, if you’re just getting started, you can’t recognize which relevant studies the author didn’t cite. This omission problem is the focus of today’s blog post.
Here are five reasons why a particular study might not be cited in a review article, or in the literature review section of other articles. Possibly the author:
- considers the study not relevant, or not relevant to the particular point the author was most interested in pushing
- doesn’t believe the results of the study
- doesn’t think the methodology of the study was appropriate or good enough to support the claims
- has noticed that the study seems to go against her theory, and she is trying to sweep it under the rug
- considers the study relevant but had to leave it out to make room for other things (most journals impose word limits on reviews)
For any given omission, there’s no way to know the reason. This makes it difficult for even experts to evaluate the overall conclusions of the author. The author might have some good reason to doubt that study which seems to rebut the theory. The omission problem may be a necessary evil of the article format. If an article doesn’t omit many studies, then it’s likely to be extremely difficult to digest.
These problems no doubt have something to do with the irritated and frustrated feeling I have when I finish reading a review of a topic I know a lot about. Whereas if I’m not an expert in the topic, I have a different reaction. Wow, I think, somehow in these areas of science that I’m *not* in, everything gets sorted out so beautifully!
Conventional articles can be nice, but science needs new forms of communication. Here I’ve focused on the omission problem in articles. There are other problems, some of which may be intrinsic to use of a series of paragraphs of prose. A reader’s overall take on the view advanced by an article can depend a lot on the author’s skill in exposition and with the use of rhetorical devices.
Hal Pashler and I have created, together with Chris Simon of the Scotney Group who did the actual programming, a tool that addresses these problems. It allows one to create systematic reviews of a topic, without having to write many thousands of words, and without having to weave all the studies together with a narrative unified by a single theory. You do it all in a tabular form called an ‘evidence chart’. Evidence charts are an old idea, closely related to the “analysis of competing hypotheses” technique. Our evidencechart.com website is fully functioning and free to all, but it’s in beta and we’d love any feedback.
I’ll explain more about the website in future posts, as well as laying out further advantages of the evidence chart format for both readers and writers.
First were the Climategate emails. There, lack of transparency in climate data analyses and climate models contributed to the doubts of skeptics regarding climate change, and made it easier for the skeptics to convince the public that there is good reason for skepticism.
Now, the Marc Hauser affair has cast a shadow across another sub-area of science.
How can we prevent these scientific fiascos from occurring in the future? We can’t, but a little reform can reduce them. We should shift publication practices in a way that deters fraud and sloppy data analysis and at the same time increases transparency. This will improve everyone’s confidence in reported results.
The change I’m suggesting (and I am by no means the first to advocate for this) is to require authors to post original data on the web, e.g. in the digital repositories provided by many institutions and consortiums. In the case of Marc Hauser’s work for example, there’s no reason why the original videotapes (the analysis of which was apparently one of the main disagreements between him and others) could not have been published online at the time of paper publication and provided at the time of submission. Of course, the particular data or numbers that are actually appropriate to post vary widely across fields (depending on whether the authors might have a right to further mine the data for further publications, human data privacy considerations, etc.).
In many areas posting the data, especially in a way that makes it interpretable by others, would be a lot of work for authors. However, it would be worth it: reducing fraud in science and increasing public confidence is worth a lot! Note that even in areas where scientists could ‘fake’ the data (as some small percentage of scientists always will consider doing; see the results from anonymous surveys of scientists mentioned here), requiring postings of raw numbers will be a deterrent. Making up raw numbers wholesale goes very far towards outright, unvarnished cheating, and even when people do it, they often make up numbers with systematic biases that can be detected by algorithms designed to detect accounting fraud.
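As a rough illustration of the kind of algorithm meant here, the sketch below checks leading digits against Benford's law, a test widely used in forensic accounting. The post does not name a specific algorithm, so the choice of Benford's law is an assumption, as are the function names. The check is only meaningful for data that span several orders of magnitude, and a large statistic is a reason to look more closely, not proof of fabrication.

```python
import math
from collections import Counter

def first_digit(x):
    """Leading nonzero digit of a nonzero number, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def benford_chi_square(values):
    """Chi-square statistic comparing leading digits with Benford's law.

    Benford's law expects digit d to lead with probability log10(1 + 1/d).
    With 8 degrees of freedom, values above about 15.5 are unusual at the 5% level.
    """
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        chi2 += (counts.get(d, 0) - expected) ** 2 / expected
    return chi2

# Powers of two follow Benford's law closely; evenly spread leading digits do not.
benford_like = [2 ** k for k in range(1000)]
evenly_spread = list(range(100, 1000))
print(benford_chi_square(benford_like))
print(benford_chi_square(evenly_spread))
```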
At PLoS ONE we were already discussing strengthening the guidelines for publishing data, and hopefully recent events will help push the journal to do it. But this norm needs to be built across many journals and institutions, to make it part of the culture of science generally.