cracks in the edifice of visual time?
Below is a draft of a chapter I’m writing for Subjective Time, an upcoming book from MIT Press edited by Valtteri Arstila and Dan Lloyd.
In a bowling alley, a professional player launches his ball down the lane. As the ball rolls toward the pins, our visual experience of it is smooth and seamless. The ball shifts in position continuously, and this seems to be represented with high fidelity by our brain. There are no subjective gaps, no stutter, and no noticeable blur.
One might assume that, in every instant, the brain simply processes the retinal input through various feature and shape detectors, with the results becoming available to awareness, millisecond by millisecond. This picture of a continuous system, with information continually ascending the system before being replaced by the information from the next instant, is still the predominant way that psychologists and perceptionists think about the visual brain.
However, the results of experiments have steadily chipped away at this image. Together, these findings indicate that the smooth, seemingly high temporal resolution movie we experience during the roll of the bowling ball reflects a massive construction project by the brain. Many problems of ambiguous input are resolved, processing artifacts like blur are suppressed, and missing information is guessed at. Some of the more firmly established of these processes will be described at the end of this chapter. Unfortunately, there is no agreement on the extent to which these complexities should push us to revise our simple framework of millisecond-by-millisecond processing. Here, we will focus on phenomena that are more clearly at odds with the standard view. The bulk of this chapter is devoted to one particular way in which the processes that underlie our visual experience have been proposed not to resemble experience itself.
While this proposal is surprising from the perspective of visual experience, perhaps it should not surprise those familiar with the habits of the brain.
Oscillations or rhythmic fluctuations in activity and excitability occur in many biological systems. Although such rhythms likely exist for good reasons, they have the potential to disrupt the processing of sensory information. Visual information that arises during the down phase of a rhythm, when neurons are less excitable, might be entirely missed.
Figure 1 depicts a simple fluctuation, a theoretical example created by combining an 8 cycles per second sinusoid with one that repeats 50 times per second. The faster wave is in the so-called gamma range of oscillation frequencies frequently found in visual cortex and other areas of the brain (Engel et al. 1991; Fries Nikolic & Singer 2007). As far as I know, it has not been suggested that visual information arriving in gamma’s up state is processed to a greater extent than information arriving in the downward deflection. However, it has been suggested that when many neurons oscillate together in this way, the information they represent can then become conscious (Crick & Koch 1990). Although this hypothesis is less popular than it was twenty years ago, it exemplifies the fact that some oscillations have been proposed to play an important role in computation without any proposal that they regularly disrupt the representation of visual information. In the case of the 50 times per second oscillation, perhaps it is not disruptive because it is faster than the temporal scale of perception (Holcombe 2009). The temporal smoothing that likely precedes perception would obliterate 50 Hz waves.
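The claim that temporal smoothing would remove the 50 Hz wave while largely sparing the 8 Hz wave can be illustrated with a minimal sketch. The 8 Hz and 50 Hz values come from the Figure 1 example; the Gaussian smoothing kernel and its 15 ms width (`SIGMA_S`) are assumptions chosen only for illustration, not estimates of the visual system's actual temporal filter.

```python
import math

SLOW_HZ = 8.0    # slower wave in the Figure 1 example
FAST_HZ = 50.0   # gamma-range wave in the Figure 1 example
SIGMA_S = 0.015  # ASSUMED standard deviation (s) of a Gaussian smoothing kernel

def composite(t):
    """Sum of the two sinusoids at time t (in seconds)."""
    return math.sin(2 * math.pi * SLOW_HZ * t) + math.sin(2 * math.pi * FAST_HZ * t)

def gaussian_gain(freq_hz, sigma_s=SIGMA_S):
    """Fraction of a sinusoid's amplitude that survives Gaussian smoothing.

    This is the Fourier transform of a unit-area Gaussian kernel.
    """
    return math.exp(-2 * (math.pi * sigma_s * freq_hz) ** 2)

if __name__ == "__main__":
    for f in (SLOW_HZ, FAST_HZ):
        print(f"{f:g} Hz keeps {gaussian_gain(f):.5f} of its amplitude")
```

Under this assumed kernel the 8 Hz component keeps about three quarters of its amplitude, whereas the 50 Hz component is reduced to a negligible fraction. A wider or narrower kernel would shift these numbers but not the ordering.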
In contrast, waves like the slower one depicted in Figure 1, at 8 cycles per second, are sometimes suggested to have a profound effect on the processing of incoming retinal information (vanRullen Reddy & Koch 2005). This chapter will consider the suggestion that at a particular rate, brief ‘snapshots’ are taken of the visual scene and in between these snapshots, visual information is ignored or at least given little weight. A less radical proposal is that intrinsic brain oscillations do not actually modulate the stream of incoming information, but that particular visual processes only occur once every cycle, even while other aspects occur continuously. This chapter will discuss two sources of evidence for these proposals.
To put things in perspective, it is useful to begin with a less controversial phenomenon, one that indubitably slices visual processing into temporal chunks and periods of non-responsiveness. These are eye movements, specifically saccades—sudden jumps of the eyes.
Sampling the world via eye movements
Many sorts of animals use sensors that provide very intermittent information about the environment. In rats, the whiskers are actively brushed back and forth to identify objects, whereas bats emit high-pitched cries to locate insects, and the elephant-nose fish (Gnathonemus petersii) periodically spouts electric pulses in order to find prey.
Humans also sense information intermittently. In vision, the eyes move at irregular but frequent intervals. While the center of our gaze provides very high acuity, vision to the side has much poorer spatial resolution. To build up an adequate representation of the scene in front of us, our eyes must jump from one region to another. The rate of these saccades varies, but on average they are made three times a second. The eyes move at such a high speed during a saccade that the brain represents the image as a blur. An important consequence is that we are essentially blind for a significant proportion of our waking lives. Yet we are not aware of this.
The visual brain actively inhibits visual processing during the rapid slide of the eyes. This is called “saccadic suppression” (Dodge 1900). For a stark demonstration of the phenomenon, first look in a mirror at your left eye. Then, quickly shift your gaze (saccade) to your right eye. Saccading back and forth, you never see your eyes move. The existence of saccadic suppression lends some plausibility to the possibility of intermittent processing in other contexts. In the case of saccades, certainly, visual processing is interrupted (e.g. Krekelberg 2010). Remarkably, this interruption causes no anomalies in our everyday experience of time. We do not notice the frequent black-outs of vision. The brain may simply ignore these intervals, but this is perhaps not sufficient to account for experience. To compensate for the lost time of a saccade, the brain may actively fill in the interval with the objects seen before and after (Yarrow et al. 2001).
The brain thus has the capability to accommodate rapid variation in the uptake of visual information. In addition to avoiding subjective temporal interruptions, the spatial disruptions caused by the jumping about of the eyes are also rendered inconspicuous. The brain actively anticipates the location of objects on the retina after a saccade, which helps preserve the feeling of a stable world every time an eye movement jerks the image in one direction or another (Wurtz 2008). The mechanisms mediating each of these phenomena are activated intermittently, each time a sudden shift of the eyes is executed.
The relatively well-understood example of eye movements sets the stage for two possible periodic processes that are more controversial.
Periodic processing in a jittering illusion
Imagine again the bowling ball rolling down the lane of a bowling alley. We seem to experience the ball traveling smoothly, shifting in position continuously and seamlessly. Few would suspect that our brain only occasionally resolves a conflict among cues to its position, and that it does so suddenly. However, this is precisely the conclusion that presents itself when one views a particular visual display.
The relevant display involves a red disc centered on a large green disc with the same luminance as the red. This red-green bull’s-eye moves across a dark background. Rather than appearing as smooth motion, the red circle seems to jitter as it moves (Arnold & Johnston 2003).
It is well known that equiluminant motion of a red figure against green is perceived as moving more slowly than motion involving a luminance difference, such as the larger green area against the dark background (Cavanagh Tyler & Favreau 1984). Several results support the idea that the jitter is caused by a conflict between the discrepant apparent speeds of the equiluminant figure and the larger luminance-defined figure it sits on (Arnold & Johnston 2003; 2005). This conflict may be detected when a “motion-based forward model” of the spatial pattern “is compared against new input” (Amano et al. 2008, p.8). The motion signal corresponding to the small red figure is weak compared to that of the green figure, so the green figure is expected to move farther. Every so often, the system compares this predicted difference in position against the new retinal image, which indicates no difference in position. The percept may be based on the expected shift until the comparison of model and input is done, at which time the conflict is resolved by snapping the two figures into alignment, in accordance with the retinal image.
The jitter seems to occur at a consistent rate—when a truly physically jittering stimulus was compared to the perceived jitter in the illusion, a rate of 10 or 11 Hz was found to match best (Amano Arnold Takeda & Johnston 2008). This was true regardless of speed of the moving stimuli or distance between the discrepant figures. Apparently the ~10 Hz rate corresponds to that of an intrinsic process of the visual system, rather than being triggered when the cue conflict reaches a certain value. This is surprising because from an ecological point of view, it would seem more adaptive to resolve any cue conflict continuously rather than periodically. For example, optimal cue combination over time may be embodied in a Kalman filter (Bryson & Ho 1975) that yields an estimated position for the two objects that compromises between the two estimates. But this should change smoothly over time. Therefore, the periodic jitter may reflect the particular importance of oscillations for brain-style computation (Koepsell et al. 2010).
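The contrast between a periodic comparison and a comparison triggered by the size of the conflict can be made concrete with a toy simulation. This is only a sketch under stated assumptions: the predicted offset between the two figures is assumed to grow linearly at `speed_diff`, and the function name, the 10.5 Hz default, and the threshold value are hypothetical choices for illustration, not parameters from Amano et al. (2008).

```python
def resets_per_second(speed_diff, duration=2.0, dt=0.001,
                      reset_hz=None, threshold=None):
    """Count how often the predicted offset snaps back to zero.

    speed_diff: ASSUMED difference in apparent speed between the figures.
    reset_hz:   rate of an intrinsic periodic comparison (periodic account).
    threshold:  offset that triggers a comparison (conflict-triggered account).
    """
    offset, clock, resets = 0.0, 0.0, 0
    for _ in range(int(round(duration / dt))):
        offset += speed_diff * dt   # predicted misalignment keeps growing
        clock += dt
        periodic = reset_hz is not None and clock >= 1.0 / reset_hz
        triggered = threshold is not None and offset >= threshold
        if periodic or triggered:
            offset, clock, resets = 0.0, 0.0, resets + 1
    return resets / duration

if __name__ == "__main__":
    for v in (1.0, 2.0):
        print(v,
              resets_per_second(v, reset_hz=10.5),
              resets_per_second(v, threshold=0.1))
```

In this sketch the periodic account gives the same jitter rate whatever the speed difference, while the conflict-triggered account gives a rate that scales with it. Only the first pattern matches the speed-invariant rate reported above.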
If the oscillations of the brain are indeed important for visual processing, one might expect more than one manifestation of them in perception. There is in fact another prominent candidate. It too is an anomaly or illusion of motion perception. With this second anomaly, the associated theory has been more radical than that an intermittent process occasionally refines perception. The proposal this time is that the illusion indicates regular visual sampling of the retinal image, with information in between the samples ignored.
Illusory motion reversals
While watching a movie or television program, almost everyone has seen it, although many have never noticed it. Car commercials on the television, and car chases in the cinema, often feature a fast-moving vehicle. In some cases, the wheels on the car appear to be rotating backwards even as the car moves forwards. This optical illusion is an artifact of the way a video camera captures the scene. The effect is clearest for wheels with spokes, or with a regular pattern on their hubcap, and occurs because video cameras capture the scene as a series of snapshots. For certain wheel speeds, between successive snapshots the spokes will travel far enough that in the resulting movie, the spokes appear to travel backwards. Consider spokes at 12 o’clock and 11 o’clock that are identical and on a wheel rotating clockwise. If the snapshots of the film occur at certain rates (and the wheel is rotating at certain speeds), then the spoke originally at 12 o’clock will move nearly to 1 o’clock by the next frame; let’s say it is in the position corresponding to 12:45. Of course, this also means that the spoke that was at 11 o’clock will be at the 11:45 position (nearly at 12 o’clock). The brain will naturally assume that the identical successive images of a spoke first at 12 o’clock and then at 11:45 are one and the same spoke, caused by the wheel rotating counter-clockwise. This will occur for each spoke and its successor, yielding a clear percept of counter-clockwise motion of the wheel. A good demonstration is available at http://www.michaelbach.de/ot/mot_wagonWheel/.
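The spoke example can be worked through numerically. The sketch below assumes spokes spaced one clock-hour apart (30°), as in the 12 o’clock and 11 o’clock example, and assumes a nearest-match rule in which each spoke is linked to whichever spoke image is closest in the next frame. The function name `apparent_step` is hypothetical.

```python
def apparent_step(true_step_deg, spoke_spacing_deg=30.0):
    """Apparent rotation per frame under an ASSUMED nearest-match rule.

    Because the spokes are identical, rotations that differ by a whole
    number of spoke spacings produce the same pair of frames. The rule
    picks the smallest rotation consistent with the two frames.
    Positive values are clockwise, negative values counter-clockwise.
    """
    half = spoke_spacing_deg / 2.0
    return (true_step_deg + half) % spoke_spacing_deg - half

if __name__ == "__main__":
    # 12:00 -> 12:45 is three quarters of the way to the next spoke: 22.5 degrees.
    print(apparent_step(22.5))
    # A small true step is seen veridically.
    print(apparent_step(5.0))
```

With a true clockwise step of 22.5° per frame, the nearest match is 7.5° counter-clockwise, which is the 12 o’clock to 11:45 pairing described above. A true step of exactly half the spacing is ambiguous between the two directions.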
Viewed under clear and constant illumination in the real world, one would not expect to perceive a wheel to appear to rotate contrary to its true direction. After all, the visual system is not thought to have much in common with a video camera. But with prolonged viewing, an ordinary wagon wheel is very occasionally perceived to rotate in the wrong direction. When this was reported by Dale Purves and others in 1996, they suggested that the visual system really does behave like a video camera, periodically sampling the scene to yield the illusory reversals. This notion that the visual system temporally samples the world at regular intervals was greeted with profound skepticism from many perception researchers. Some even doubted that the illusion really existed. The reaction of one prominent vision scientist to the report of the illusion was “it can’t happen”. The researcher will remain nameless, although his (or her) peremptory judgement soon received some support from an article reporting that the illusion could not be replicated (Pakarian & Yasamy 2003). Although several subsequent investigations have replicated the illusion, the resistance to the initial report and its interpretation was not entirely unreasonable. Even in those laboratories that have documented the illusion (such as my own), some people apparently never experience it. And regarding the illusion’s proposed interpretation, it seems to fly in the face of the enormous body of mainstream motion psychophysics results, which had never yielded much reason to posit a temporal sampling account (a few reports outside of mainstream psychophysics had, such as Kristofferson 1980).
The starting point for the temporal sampling theory had been the apparent similarity between the illusory motion reversals and the wagon-wheel illusion. The plausibility of the newer theory arises in part from the resemblance of the illusion to the motion aftereffect. After an extended interval of viewing a stimulus moving in a particular direction, unambiguously stationary stimuli are perceived to move in the opposite direction (for a demonstration, see http://www.michaelbach.de/ot/mot_adapt/index.html). Moreover, stimuli moving very slowly in the original direction, if they are moving slowly enough, are perceived to move in the opposite direction.
It is important to realize that these hypothetical spurious motion responses are not caused by temporally discrete sampling. There are no temporal intervals that are actually ignored, unlike in the Purves et al. (1996) theory. In the Kline et al. theory, one part of the motion detector continually responds to visual stimulation from farther in the past than does the other part. Information at all times is eventually processed by both parts of the detector. Furthermore, as motion detectors in the visual system are thought to exist at a number of spatial scales, with varying temporal delays, any spurious responses would not occur at the same time for all motion detectors.
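How a detector that processes every moment of its input could still signal the wrong direction can be sketched with a delay-and-correlate (Reichardt-style) detector. This is an illustration under strong assumptions and not the specific model of Kline et al.: the delay is a pure 50 ms lag, the stimulus is a single drifting sinusoid, and the two inputs are a quarter cycle apart. The function name and all parameter values are hypothetical.

```python
import math

def reichardt_mean_response(temporal_hz, delay_s=0.05,
                            phase_offset=math.pi / 2, n_steps=2000):
    """Time-averaged output of an opponent delay-and-correlate detector.

    Input B lags input A by phase_offset radians, so the stimulus moves
    from A to B. Positive output signals that true direction.
    delay_s is an ASSUMED pure delay on one arm of each subunit.
    """
    omega = 2 * math.pi * temporal_hz
    period = 1.0 / temporal_hz
    total = 0.0
    for i in range(n_steps):
        t = period * i / n_steps  # average over exactly one stimulus cycle
        a_now = math.sin(omega * t)
        b_now = math.sin(omega * t - phase_offset)
        a_delayed = math.sin(omega * (t - delay_s))
        b_delayed = math.sin(omega * (t - delay_s) - phase_offset)
        # Subunit tuned to A->B minus subunit tuned to B->A.
        total += a_delayed * b_now - a_now * b_delayed
    return total / n_steps

if __name__ == "__main__":
    for f in (5.0, 15.0):
        print(f, round(reichardt_mean_response(f), 3))
```

For this idealized detector the mean output equals sin(phase_offset) × sin(2π × temporal_hz × delay_s), so it is positive at 5 Hz and negative at 15 Hz. The sign flips at high temporal frequencies even though no part of the input is ever skipped, which is the sense in which such a spurious response differs from discrete sampling.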
In the years since the initial proposals of these theories of the motion reversals, the results of new experiments have weighed against the possibility that regular snapshots are taken of the visual field. According to the snapshot theory, samples from disparate parts of the visual field should occur at the same time, in synchrony. Therefore, if two identical moving stimuli are viewed together, any reversals should occur simultaneously. Simultaneous reversals are instead fairly rare in this situation (Kline, Holcombe, & Eagleman 2004). Still, this result does not categorically exclude temporally discrete sampling. Snapshots might still occur and cause the illusion, but they would have to occur at different times for different parts of the visual field. This is a position now advocated by proponents of the temporally discrete processing theory (vanRullen et al. 2010).
The mechanism that generates reversals, be it temporal sampling or an aberrant motion response, must exist in some class of mechanisms of the visual system; possibly mechanisms involved in motion, local orientation, and complex form, or perhaps only a certain class of motion detector. If the reversals are caused by temporal sampling, which visual analyzers sample?
In addition to its generality across different types of mechanisms, another aspect of the mechanism that generates reversals should be considered. The mechanism that generates reversals must operate on either the entire field of incoming visual stimulation, or some portion of it. The possibility that the visual system temporally samples the entire visual field in unison, as a video camera does, is contradicted by the existence of independent reversals in spatially separated stimuli (Kline, Holcombe, & Eagleman 2004). If temporally discrete processing causes the illusion, it seems it must be caused by mechanisms with a narrow spatial scope.
vanRullen (2006) extended this approach of examining whether concurrent moving stimuli reverse together. The results led him to conclude that motion reversals are a phenomenon “whose spatial extent is entirely determined by the global perceptual organization of the scene into objects” (p.4094). His observers viewed a rotating textured ring with a gap in the center. The gap was in the shape of a long vertical rectangle, dividing the ring into left and right halves. In one condition, both halves rotated in the same direction and at the same speed. When observers viewed this display, reversals overall occurred quite infrequently, as is usual in studies of illusory motion reversals. However, when they occurred, reversals tended to occur in both ring halves simultaneously. In fact, this was about twice as likely to occur (7% of total viewing time) as was a reversal in just one half-ring. In a second condition, the left and right halves of the ring rotated in opposite directions. Now, reversals occurred much more frequently (18% of viewing time) in only one ring than they did in both rings simultaneously (~1.5% of viewing time). vanRullen postulated that the crucial difference between these displays is that in the opposite-motion condition, the rings are represented by the visual system as two separate objects, whereas in the same-direction condition, they are grouped into a single object. If this is true and were the only difference between the stimuli, then the co-occurring reversals would imply that the scope of the motion reversals mechanism is entire objects. In the same-direction condition, the reversal-inducing mechanism processes the entire ring as a whole, causing reversals in the two ring halves to occur together. vanRullen (2006) therefore concluded that the reversal-generating mechanism is “object-based” and “restricted to the object of our attention”.
However, there is reason to question this conclusion.
In a scene of ambiguous moving stimuli, the visual system sometimes favors interpretations in which all the stimuli move in the same direction, even when these stimuli do not appear to be part of the same object. This is quickly apparent when one views a display with many two-frame apparent motion quartets. In the first frame of an apparent-motion quartet, two dots appear at opposite vertices of an imaginary square. In the second frame, the two dots have moved to the other two vertices of the square. If the two frames are set in alternation, for a time viewers experience horizontal motion, with each of two dots traversing back and forth along the top and bottom of the square. At other times, viewers experience vertical motion, with each of two dots traversing up and down along the left and right sides of the square. If several of these ambiguous stimuli are scattered about the screen, then rather than reversing direction independently, they usually do so in unison (Ramachandran & Anstis 1986; for a demonstration, see Peter Schiller’s webpage http://web.mit.edu/bcs/schillerlab/research/A-Vision/A15-24.htm).
Unlike in vanRullen’s stimulus, in the case of the dot quartets there is no reason to think that discrete temporal sampling could determine the perceived direction. Furthermore, few would suggest that dots scattered across the screen appear to be part of the same object. Nevertheless, the perceived motion directions of the stimuli are tightly linked. This link between the perceived motion direction of different stimuli may well be the principal factor that yielded vanRullen’s result. When the two ring halves were moving in the same direction, the tendency for disparate stimuli to seem to move in the same direction should push reversals of the two halves to occur together. When the two ring halves physically move in opposite directions, this tendency would push reversals to occur more independently, because only when the two halves reverse at different times are they perceived to have the same motion direction. Admittedly, we do not know whether the tendency observed with quartets should also apply to other stimuli such as rotating rings. The possibility that it does, however, yields a plausible alternative explanation for vanRullen’s results. The issue of the spatial scope of the mechanism that generates reversals remains open.
Another attempt to determine the scope of mechanisms that generate reversals was made by Kline & Eagleman (2008). In their display, two orthogonal motions at different spatial scales were spatially superposed. One was the movement of the overall shape of the figure, the other was the movement of the local texture. Reversals of these two motions frequently occurred separately, leading Kline & Eagleman to conclude that vanRullen’s conjecture that reversals occur in a mechanism that processes entire objects must be erroneous. Certainly this experiment appears to disconfirm the strong form of this hypothesis. However, the motions moved in different directions (isoeccentric vs. isoradial), and had different spatial frequencies. These factors may affect the propensity towards reversals, which might have contributed to the independence. Future experiments should be able to resolve the role of objecthood more definitively by manipulating it in a single experiment, without the motion-direction confound of vanRullen (2006). For example, using vanRullen’s divided ring display with both ring halves rotating in the same direction, one might vary whether the central dividing strip appears to be in front of or behind the depth plane of the ring. When it is behind, the rings will appear to be separate objects, whereas when in front, a single ring will be experienced. Will reversals in the two rings occur together much more frequently when the strip is in front? Such experiments should improve understanding of the nature of the unit analyzed by the motion-reversal mechanism.
Prior to Kline & Eagleman (2008), all published investigations of motion reversals used stimuli made up of repeating patterns, such as a regular array of discs or a grating. If motion reversals are caused by periodic sampling, then with a repetitive pattern the motion percept could reflect matching of each figural element with the identical element immediately behind it. This would cause the motion percept to comprise a procession of all elements stepping backwards. In the case of the wagon-wheel effect induced by a video camera, the periodic pattern of wheel spokes typically results in just this—each of the spokes is phenomenally linked to the following spoke. This linkage in phenomenology of elements in successive frames is known as ‘token matching’.
When discussing the temporally discrete sampling theory of motion reversals, Kline & Eagleman (2008) write that “the temporal sampling mechanism… requires periodicity. That is, the pattern must be composed of identical (or very similar) repeating elements for incorrect token matching to occur. If the perceptual snapshot hypothesis is responsible for [illusory motion reversals] then the illusion should not occur with stimuli such as a random texture or a periodic pattern with distinct elements” (p.1). To reach this conclusion, it appears they assumed that a requirement for motion across frames is that the motion must join identical or very similar elements. This is a strange assumption, because studies of apparent motion have consistently found that the strength of apparent motion depends little or not at all on similarity of the corresponding elements (Kolers 1972; Burt & Sperling 1981; Navon 1976; a demonstration of how apparent motion can easily link objects of very different color and shape is available at the webpage of Peter Schiller http://web.mit.edu/bcs/schillerlab/research/A-Vision/A15-33.htm). Rather than element similarity, standard motion models use ‘motion energy’, which reflects the product of luminance contrast in successive positions (van Santen & Sperling 1984; Werkhoven Sperling & Chubb 1993). The model of motion reversals used by vanRullen et al. (2005) uses Fourier first-order (luminance) energy, so it also does not predict a dependence on similarity of elements. From the model’s perspective, then, there is no surprise that reversals occur with non-periodic patterns such as a strip of black digits printed on a white background. The account proposed by Kline, Holcombe & Eagleman (2004), being based on the response of classic Reichardt detectors, also does not necessarily predict any role for similarity of elements.
The specific proposal of vanRullen et al. (2005) was that reversals occur due to periodic sampling by “attention”, followed by motion analysis. What kind of motion analysis normally follows the action of attention? Although vanRullen et al. (2005) assumed it was a conventional motion energy analysis, it may well be a distinct system (Cavanagh 1992; Lu & Sperling 2001). One specific possibility is that it relies mainly on the movement of attention—if attention shifts with a visible object in a direction, then motion will be perceived in that direction (Cavanagh 1992).
As described above, the existence of reversals with these aperiodic stimuli does not challenge the attentional temporal sampling theory. One stimulus of Kline & Eagleman (2008) has not been mentioned yet, however, and the authors put special emphasis on it in their paper. This stimulus was qualitatively different from the rest. In a wide rectangular aperture with soft edges (a sigmoidal contrast envelope to gradually decrease the contrast at the edge), they presented a random-dot pattern drifting at 17° per second. From the perspective of Kline & Eagleman, with their emphasis on whether a stimulus allowed for token matching, their finding that this stimulus occasionally is perceived to reverse was very important. Because the stimulus never repeats, there are no translational jumps in the backward direction for which the corresponding locations before and after the jump present the same elements. Given that standard motion analysis does not require such matches, this property of the stimulus does not seem important.
The random-dot stimulus presents a problem for the temporal sampling theory for a different reason. For the other stimuli, performing conventional motion analysis on a temporally-sampled version of the stimulus yields motion signals predominantly in the reverse direction for particular stimulus temporal frequency and sampling temporal frequency combinations. However, this is not true of the random dot stimulus. This result again points to the possibility that a different sort of motion analysis is responsible for the reverse direction.
At the upper reaches of the visual system, in the areas most influenced by attention, spatial resolution is coarse and capacity is limited, so each dot of a large pattern may not be represented individually. Instead, a random-dot pattern may be represented as a texture rather than as a collection of individual elements (for related ideas and evidence, see Parkes et al. 2001; Saiki & Holcombe, submitted). At this level, all the frames of the random-dot stimulus may be represented similarly, as they all have similar global statistics. In that case, for any two frames of a random-dot pattern, for the attentional system it may be equally plausible that the pattern has moved in any direction. This is entirely speculative, however, and there remains the problem of explaining why the reverse direction would ever become stronger than the forward direction. Although adaptation may weaken the motion systems’ response to the forward direction, it is unclear whether this could cause the reverse response to be even stronger (unless one is willing to posit a substantial random noise component).
This section has addressed the results that Kline & Eagleman (2008) offered as evidence against a periodic temporal sampling theory, and argued that most of the evidence does not undermine the periodic temporal sampling theory. The random-dot stimulus does present significant difficulties, but does so not only for the sampling theory, but also for the other theories.
Earlier we discussed the flaw in standard motion detectors (Reichardt detector subunits) that, if not avoided via some prefiltering of their inputs, can cause them to respond to motion in the wrong direction. This aberrant response, however, does not occur as consistently for a random-dot pattern. A moving random-dot pattern will include a particular spatial frequency that will stimulate the detector of the wrong direction, but the combination of all the spatial frequencies that the pattern comprises will yield no response (on average at least, for a white-noise pattern, Snippe & Koenderink 1994). Thus the lower incidence of reversals reported for the random-dot pattern is certainly consistent with this theory. The existence of any reversals at all, however, is a problem. Of course, there may be idiosyncrasies of biological motion filters that create the reverse response in a way not predicted by the Reichardt motion detector abstraction. Still, it is unsatisfying to appeal to some mysterious and unknown property of motion detectors, and it does not lead to new predictions. A concrete example of reverse neural responses has recently been revealed in a moth’s visual system, and further study of it may lead to more specific hypotheses (Theobald et al. 2010).
A somewhat different sort of explanation for the motion reversals was ultimately offered by Kline & Eagleman in 2008. Their theory is that “the motion aftereffect (MAE) can be superimposed on a moving stimulus, creating a motion during-effect that can lead to illusory motion reversal”. Although they did not elaborate on this statement, it raises the possibility that reversals are not a case of adaptation allowing an already-existing spurious reverse-direction response to come to the fore. Rather, it is theoretically possible that adaptation alone can yield the reverse percept without any assistance from spuriously-responding motion detectors.
But the conditions that maximize the conventional motion aftereffect (very low temporal frequencies of adapter and test; Bex et al. 1996; Pantle 1974) are not the same as those that maximize motion reversals. vanRullen (2007) compared the incidence of reversals to the duration of the motion aftereffect with the same stimuli, and found some other dissociations as well. However, these conventional motion aftereffect studies involve viewing a stationary test pattern after one adapts to a particular moving stimulus. But with motion reversals the test pattern is essentially the original moving pattern. This more closely resembles the ‘flicker test’ motion aftereffect procedure, which uses a flickering pattern during the testing interval. Motion adaptation as assessed by the flicker test can be strongest at temporal frequencies closer to the range that yields the highest incidence of reversals (Ashida & Osaka 1995). Hence, the intriguing idea of reversals being a motion aftereffect superposed on the original stimulus remains viable.
The peak at 10 Hz was documented by vanRullen Reddy & Koch (2005). The data of experiments by Simpson Shahani & Manahilov (2005) led those authors to a similar conclusion, but interpretation of their data was problematic because they lumped together reversals with other anomalous motion percepts as well as cases where no motion was perceived. vanRullen et al. (2008; 2010) consider the 10 Hz peak to be good evidence for the theory that periodic temporal sampling results in the reversal illusion. Specifically, they concluded that the rate of the putative sampling process is 13.3 Hz, because “when the system’s sampling period is three-fourths of the motion period, the evidence for the erroneous motion direction will be maximal and outweigh the evidence for the actual direction” (p.525, 2010).
If a periodic pattern were to move three-fourths of a cycle between ‘snapshots’, then from the perspective of a mechanism that sees only the snapshots, the grating might appear to jump in the opposite direction by one-fourth of a cycle at each step. This one-quarter step size is sometimes considered the ‘optimal’ stimulus for the standard quadrature model of the human motion system (e.g. Baker Baydala & Zeitouni 1989). When just two frames are presented, a one-quarter step is indeed optimal (Nakayama & Silverman 1985) but for a multi-frame stimulus, this is not generally the case (Watson 1990). Therefore, the suggestion of vanRullen et al. that a 10 Hz peak for motion reversals points to a 13.3 Hz sampling rate cannot be regarded as a by-product of a generic motion energy system. Instead, one must consider the details of a particular model.
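The snapshot arithmetic can be made concrete with a short sketch. It assumes, purely for illustration, that the post-sampling mechanism matches each snapshot to the nearest identical phase in the next one, which is the very simplification cautioned against above. The helper name `wrapped_step` is hypothetical and is not part of the model of vanRullen et al.

```python
def wrapped_step(stimulus_hz, sampling_hz):
    """Displacement between successive samples, in cycles, wrapped to [-0.5, 0.5).

    Positive values mean the nearest match lies in the true direction of
    motion; negative values mean it lies in the reverse direction.
    """
    step = stimulus_hz / sampling_hz  # cycles travelled between two samples
    return (step + 0.5) % 1.0 - 0.5


# Assumed sampling rate of 13.3 Hz (40/3), as inferred by vanRullen et al.
for stimulus_hz in (5.0, 10.0, 12.0):
    step = wrapped_step(stimulus_hz, 40.0 / 3.0)
    print(f"{stimulus_hz:4.1f} Hz stimulus: nearest-match step {step:+.3f} cycles")
```

For the 10 Hz stimulus the pattern travels three-fourths of a cycle between samples, so the nearest match is one-fourth of a cycle backwards. Whether a backward quarter-cycle step is in fact the most effective stimulus for the reverse-direction detectors is a separate question that this sketch does not address.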
In the modeling effort of vanRullen et al., they assumed that the sensitivity of the post-sampling motion detectors was the same as that of the motion system as a whole. As a measure of the sensitivity of the whole motion system, they used the proportion of times that people reported perceiving motion (forwards or backwards) as a function of temporal frequency. The shape of this curve was used for the relative sensitivity of the motion system to different stimuli. This proportion-of-responses measure is not likely to be a good measure of relative sensitivity, because proportions are not linearly related to internal response strength. For this reason, visual psychophysicists typically use the inverse of contrast at threshold to estimate sensitivity. Perhaps the predictions of the model would not be affected much by the refinement of this estimate, but more modeling is needed to be sure. The basic assumption that the spatiotemporal characteristics of the strength of the reverse motion response resemble those of overall motion sensitivity should also be questioned.
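The point about proportions can be illustrated with a minimal signal-detection sketch. It assumes, as a textbook simplification rather than anything taken from vanRullen et al., that motion is reported whenever a Gaussian-noise-perturbed internal response exceeds a fixed criterion. The function name and the parameter values are hypothetical.

```python
from statistics import NormalDist


def proportion_reported(internal_response, criterion=1.0, noise_sd=1.0):
    """Probability that a noisy internal response exceeds a fixed criterion."""
    noisy_response = NormalDist(mu=internal_response, sigma=noise_sd)
    return 1.0 - noisy_response.cdf(criterion)


# Equal steps in assumed internal response strength
responses = [0.0, 1.0, 2.0, 3.0, 4.0]
proportions = [proportion_reported(r) for r in responses]
for r, p in zip(responses, proportions):
    print(f"internal response {r:.1f} -> proportion of motion reports {p:.3f}")
```

Under these assumptions, equal increments in internal response produce large changes in the proportion near the criterion and almost none once reports approach ceiling, so the shape of a proportion curve can misrepresent relative sensitivity.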
As mentioned in the previous section that discussed the Kline & Eagleman (2008) results, the nature of the motion system responsible for the reverse direction is uncertain. Under the theory that it receives input that occurs after attentional sampling, it may be high-level, since attention has much bigger effects at higher stages of the visual system. According to some researchers, attention-based motion perception is restricted to quite low temporal frequencies (Lu, Lesmes & Sperling 1999). In the framework of vanRullen et al. (2005), a consequence of this may be to push the predicted peak temporal frequency lower relative to the putative sampling frequency. In the original model, a 10 Hz peak for reversals indicates a 13.3 Hz sampling frequency. If the responsible motion detectors are more sensitive to lower temporal frequencies, then a 10 Hz maximal response may actually be the result of a higher sampling frequency: although with the higher sampling frequency the strongest reverse motion energy may result from a temporal frequency higher than 10 Hz, the system may be so insensitive to that frequency that the 10 Hz stimulus yields the larger response. However, the finding that reversals are particularly associated with EEG signal change around 13 Hz lends some independent support to the 13 Hz estimate of sampling (vanRullen Reddy & Koch 2006).
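The argument that low-pass sensitivity would pull the observed peak below three-fourths of the sampling rate can be illustrated with a toy calculation. Every functional form here is an assumption made for illustration: the reverse evidence is taken to be largest for a quarter-cycle backward step, the sensitivity is an arbitrary exponential falloff, and the 8 Hz falloff constant is invented. None of these come from the model of vanRullen et al.

```python
import math


def reverse_evidence(stimulus_hz, sampling_hz):
    """Assumed evidence for reverse motion: maximal for a quarter-cycle backward step."""
    phase_step = 2.0 * math.pi * stimulus_hz / sampling_hz
    return max(0.0, -math.sin(phase_step))


def sensitivity(stimulus_hz, falloff_hz=8.0):
    """Assumed low-pass sensitivity of the post-sampling motion detectors."""
    return math.exp(-stimulus_hz / falloff_hz)


def peak_reversal_hz(sampling_hz, weighted):
    """Stimulus frequency (on a 0.01 Hz grid) that maximizes the reverse response."""
    grid = [i * 0.01 for i in range(1, int(sampling_hz * 100))]

    def response(stimulus_hz):
        evidence = reverse_evidence(stimulus_hz, sampling_hz)
        return evidence * sensitivity(stimulus_hz) if weighted else evidence

    return max(grid, key=response)


for sampling_hz in (40.0 / 3.0, 14.2):
    flat_peak = peak_reversal_hz(sampling_hz, weighted=False)
    lowpass_peak = peak_reversal_hz(sampling_hz, weighted=True)
    print(f"sampling {sampling_hz:.1f} Hz: flat-sensitivity peak {flat_peak:.2f} Hz, "
          f"low-pass peak {lowpass_peak:.2f} Hz")
```

With flat sensitivity the peak sits at three-fourths of the sampling rate. With the assumed low-pass weighting it falls below that value, so a sampling rate somewhat above 13.3 Hz is needed to place the peak near 10 Hz. The size of the shift depends entirely on the invented falloff constant.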
A particular rate of temporal sampling makes a specific prediction for the speed of reverse motion. A particular temporal frequency of stimulus motion, together with a sampling frequency, specifies the size of successive steps between samples and therefore a particular speed for the reverse motion. Unfortunately, to date there have been no reports of the speed perceived for motion reversals. In my experience when viewing rotating circular arrays of discs, reversals typically begin with a very flickery percept that may appear nearly stationary before reverse motion begins. This reverse motion appears to quickly accelerate to a particular peak speed before the percept returns to forward motion. One prediction of the sampling account is that the reversals should always be perceived to be slower than the motion in the forward direction. This follows from the basic notion that reversals occur when the pattern moves more than one half of a cycle between samples, causing displacement between successive samples to be shorter in the reverse direction than in the forward direction. In my experience, the perceived speed of the reverse motion is indeed always slower than the speed perceived in the forward direction. However, the fact that the reverse motion tends to begin with flickery, possibly stationary or slow motion certainly complicates interpretation, as does the dependence of speed perception on contrast. At first, the variation in speed during the life of a reversal might seem compatible with vanRullen et al.’s suggestion of a variable sampling rate. But it seems difficult for this account to explain why reversals would consistently begin with flicker or near-stationary sensation. Nevertheless, there remains the possibility that the sampling notion could be partially validated or undermined by an experiment investigating perceived speed. According to the periodic sampling account, the displacement between samples in the reverse direction should never exceed half the pattern’s period, so the speed perceived in the reverse direction should never exceed the speed perceived in the forward direction. This follows from the constraint that between samples, the pattern must travel more than half its period in the forward direction. Observer reports of speeds in excess of this would challenge the sampling account.
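The speed prediction can be written out as a small calculation. The sketch assumes that perceived speed is proportional to the displacement between successive samples, with the reverse percept following the shorter, backward displacement. The function name is hypothetical, and the 13.3 Hz value is the sampling rate proposed by vanRullen et al. rather than an established fact.

```python
def reverse_to_forward_speed_ratio(stimulus_hz, sampling_hz):
    """Predicted reverse speed divided by forward speed under periodic sampling.

    Reversals are assumed to require a forward step of more than half a
    cycle (and less than a full cycle) between successive samples.
    """
    forward_step = stimulus_hz / sampling_hz  # cycles travelled per sample
    if not 0.5 < forward_step < 1.0:
        raise ValueError("no reversal predicted for this stimulus/sampling pair")
    reverse_step = 1.0 - forward_step  # shorter route, in the opposite direction
    return reverse_step / forward_step


# Forward steps of 0.6, 0.75, and 0.9 cycles at an assumed 13.3 Hz sampling rate
for stimulus_hz in (8.0, 10.0, 12.0):
    ratio = reverse_to_forward_speed_ratio(stimulus_hz, 40.0 / 3.0)
    print(f"{stimulus_hz:4.1f} Hz stimulus: reverse speed is {ratio:.2f} of forward speed")
```

For a 10 Hz stimulus sampled at 13.3 Hz, the reverse motion would be one-third as fast as the forward motion. The ratio stays below 1 whenever the forward step exceeds half a cycle, and an experiment measuring perceived speed could test this.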
Both illusory motion reversals and the motion jitter illusion are potentially manifestations of intermittent, periodic processing of incoming visual information. However, the interpretation of illusory motion reversals seems less straightforward than interpretation of the jitter illusion.
Motion reversals could be caused by mechanisms other than sampling. This argument remained somewhat theoretical until the reports by Seizova-Cajic and colleagues of a strange illusion in the perception of one’s own body (Seizova-Cajic et al. 2007; Holcombe & Seizova-Cajic 2008). With eyes closed, participants had the muscle spindles in their biceps stimulated with a vibrator. The vibration activates the muscle spindles, which signal arm extension, and this is usually the resulting percept (Goodwin, McCloskey & Matthews 1972). However, after prolonged stimulation one occasionally experiences reversals: flexion of the arm for several seconds. Because proprioception is a very different system from vision, the relevance to the reversal mechanism in the visual case is uncertain. Nevertheless, it does provide an existence proof that biological systems can exhibit motion reversals without any role for periodic temporal sampling.
The 10 Hz motion jitter illusion discovered by Arnold & Johnston (2003) is more compelling evidence for a periodic visual process associated with perception. There are also other phenomena suggestive of periodic processes (Elliott & Muller 1998; Geissler Schebera & Kompass 1999), but these have not been studied much and their status is uncertain.
If one or more of these phenomena really are caused by periodic processes of the visual system, there remains the issue of how central they are to visual function. Are these phenomena cracks in the smooth edifice of experience that reveal the continual jerking and jittering of basic underlying machinery? The jitter seen in the displays of Arnold & Johnston (2003) may mean that perceptual position computation always follows a constant rhythm. The motion reversals illusion is perhaps a sign that attention always processes things periodically (vanRullen, Carlson & Cavanagh 2007). Alternatively, these phenomena may reflect much more restricted processes.
A second marvel of temporal experience is that our image of the moving bowling ball is crisp rather than a streak. Objects moving at the speed of bowling balls create extended strips of persisting activity in our visual cortices, which must be actively quashed (Burr 1980). A further problem is that at any individual location information is not present on the retina long enough for an accurate representation of the bowling ball’s color and shape. Through so-called ‘mobile computation’ (Cavanagh Holcombe & Chou 2008), the visual system accumulates information from the successive positions the object occupies, and only then can the familiar crisp image be constructed (Nishida et al. 2007; Lu Lesmes & Sperling 1999; Moore & Enns 2004). A moving object’s position is set in a way that may partially overcome the lag introduced by neural processing delays.
I thank David Eagleman, Christina Howard, Keith Kline, Daniel Linares, Shih-Yu Lo, and Rufin vanRullen for comments on drafts of this manuscript, and David O’Carroll for stimulating discussion.
Amano, K., Arnold, D. H., Takeda, T., and Johnston, A. (2008). Alpha band amplification during illusory jitter perception. J Vis. 8, 10, 3.1-3.8. DOI=10.1167/8.10.3.
Arnold, D. H. and Johnston, A. (2005). Motion induced spatial conflict following binocular integration. Vision Res. 45, 23, 2934-2942. DOI=10.1016/j.visres.2005.04.020.
Arnold, D. H. and Johnston, A. (2003). Motion-induced spatial conflict. Nature. 425, 6954, 181-184.
Bryson, A.E.; Ho, Y.C. (1975). Applied optimal control. Washington, DC: Hemisphere
Burt, P., & Sperling, G. (1981). Time, Distance, and Feature Trade-Offs in Visual Apparent Motion. Psychological Review, 88(2), 171-195.
Cavanagh, P. (1992). Attention-based motion perception. Science, 257(11 September), 1563-1565.
Cavanagh, P., Tyler, C. W., and Favreau, O. E. (1984). Perceived velocity of moving chromatic gratings. J Opt Soc Am A. 1, 8, 893-899.
Holcombe, A. (2009). Seeing slow and seeing fast: two limits on perception. Trends in Cognitive Sciences, 13 (5), 216-221. DOI: 10.1016/j.tics.2009.02.005
VanRullen R, Reddy L, & Koch C (2005). Attention-driven discrete sampling of motion perception. Proceedings of the National Academy of Sciences of the United States of America, 102 (14), 5291-6 PMID: 15793010