Reference Frame Transformations: Behavioral Aspects

 

Classically, reference frame transformations have been considered from the viewpoint of position coding. Target position in eye coordinates is added to eye position to compute the goal in head coordinates. The latter is then added to head position to compute the goal relative to the body, and so on. (In the real world, replace ‘added to’ with ‘rotated by’ to be mathematically correct). These accounts can be found in nearly every review on this topic.
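
To make the classical scheme concrete, here is a minimal numerical sketch in Python with numpy. The axis convention (x forward, y left, z up), the helper name rot_z, and all angles are illustrative assumptions, not values from any study; translations between the frame origins are omitted for brevity. It shows that, once frames rotate, the position-code chain composes rotations rather than simple additions:

    import numpy as np

    def rot_z(deg):
        """Rotation about the vertical (z) axis; negative = rightward here."""
        r = np.radians(deg)
        return np.array([[np.cos(r), -np.sin(r), 0],
                         [np.sin(r),  np.cos(r), 0],
                         [0,          0,         1]])

    # A target 1 m straight ahead of the eye (eye coordinates).
    target_eye = np.array([1.0, 0.0, 0.0])

    # 'Adding' positions only works for pure translations; with rotations,
    # each step of the chain rotates the vector into the next frame.
    R_eye_in_head  = rot_z(-20.0)   # eye turned 20 deg right in the head (assumed)
    R_head_in_body = rot_z(-10.0)   # head turned 10 deg right on the body (assumed)

    target_head = R_eye_in_head @ target_eye     # goal in head coordinates
    target_body = R_head_in_body @ target_head   # goal in body coordinates
    print(np.round(target_body, 2))              # same point, 30 deg right of body-forward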

 

But with the advent of thinking in terms of relative position, i.e., displacement codes, one needs to consider reference frame transformations from a different viewpoint. First, one must reject the notion that these codes are frame-independent. That would be true if the relevant frames only translated with respect to each other, but here we are dealing with frames that primarily rotate with respect to each other.

 

This is not a special case, but rather the general case. The math behind it is more esoteric than it should be, but the principle is simple to demonstrate. For example, with gaze pointed straight ahead, a forward reach is forward in eye, head, or body coordinates. But if gaze is directed 90 deg to the right, the same reach in body coordinates is now directed to the left in gaze coordinates, and a rightward reach (in body coordinates) is forward relative to gaze. Failure to account for this difference would result in errors not only in reach direction, but also in depth.
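
This example, and the frame-dependence of displacement codes described above, can be verified in a few lines. The sketch below (same illustrative numpy conventions as before) expresses a body-frame reach in gaze coordinates by applying the inverse of the gaze rotation:

    import numpy as np

    def rot_z(deg):
        r = np.radians(deg)
        return np.array([[np.cos(r), -np.sin(r), 0],
                         [np.sin(r),  np.cos(r), 0],
                         [0,          0,         1]])

    forward_reach   = np.array([1.0,  0.0, 0.0])   # body coords: x fwd, y left, z up
    rightward_reach = np.array([0.0, -1.0, 0.0])

    R_gaze_in_body = rot_z(-90.0)   # gaze directed 90 deg to the right

    # To express a body-frame vector in gaze coordinates, apply the inverse rotation.
    print(np.round(R_gaze_in_body.T @ forward_reach, 2))    # [0. 1. 0.]: leftward in gaze coords
    print(np.round(R_gaze_in_body.T @ rightward_reach, 2))  # [1. 0. 0.]: forward in gaze coords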

 

For orienting movements, the problem arises whenever the desired movement has components orthogonal to the rotation of the frame from its ‘primary position’. The effect is small for small rotations and small deviations of initial orientation, but it rises non-linearly to become huge within the range of head-unrestrained gaze shifts. For example, suppose one is looking and pointing straight ahead: a distant target 90 deg to the right in retinal coordinates can be foveated or pointed to by a 90 deg rightward rotation of gaze or the arm. Now start instead with gaze and the arm pointing straight up; the same target still stimulates a site 90 deg to the right on the retina, and yet the required movements are directed equally down and to the right.
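
A rough numerical illustration of this example (again numpy, with the same assumed axis conventions; decomposing the movement into ‘rightward’ and ‘downward’ content is a simplification for illustration):

    import numpy as np

    def rot_y(deg):
        """Rotation about the interaural (y) axis; -90 deg turns the gaze straight up."""
        r = np.radians(deg)
        return np.array([[ np.cos(r), 0, np.sin(r)],
                         [ 0,         1, 0        ],
                         [-np.sin(r), 0, np.cos(r)]])

    target = np.array([0.0, -1.0, 0.0])   # distant target on the right horizon (head coords)

    for label, R_eye in [("straight ahead", np.eye(3)), ("90 deg up", rot_y(-90.0))]:
        gaze = R_eye @ np.array([1.0, 0.0, 0.0])   # current gaze direction, head coords
        retinal = R_eye.T @ target                 # target location in eye coordinates
        # Rightward/downward content of the required movement of the gaze vector:
        right = gaze[1] - target[1]
        down  = gaze[2] - target[2]
        print(f"{label:14s} retinal={np.round(retinal, 2)} right={right:.1f} down={down:.1f}")

    # straight ahead: retinal=[0,-1,0] (90 deg right), right=1.0, down=0.0
    # 90 deg up:      retinal=[0,-1,0] (identical!),   right=1.0, down=1.0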

 

Finally, torsional rotation of the eyes disrupts the spatial relation between retinal direction and movement direction in head or body coordinates. This is not very relevant in lab conditions where the head is restrained and upright, but in real-world circumstances eye-in-space torsion is much more variable.
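
The effect of torsion can be sketched the same way (illustrative numpy code; the 30 deg of torsion is an arbitrary assumed value, larger than typical ocular counterroll, chosen to make the effect obvious):

    import numpy as np

    def rot_x(deg):
        """Torsion: rotation about the line of sight (x axis)."""
        r = np.radians(deg)
        return np.array([[1, 0,          0         ],
                         [0, np.cos(r), -np.sin(r)],
                         [0, np.sin(r),  np.cos(r)]])

    retinal_dir = np.array([0.0, -1.0, 0.0])   # target to the right on the retina

    for torsion in [0.0, 30.0]:
        head_dir = rot_x(torsion) @ retinal_dir   # same retinal site, head coordinates
        print(f"torsion={torsion:4.1f} deg -> head-frame direction {np.round(head_dir, 2)}")

    # torsion= 0.0 deg -> [0, -1, 0]        (purely rightward movement needed)
    # torsion=30.0 deg -> [0, -0.87, -0.5]  (oblique: right and down)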

 

These non-linear effects are still present if one uses position codes in the brain, but with displacement codes they become the central problem of visuomotor reference frame transformations.

 

There is no mechanical solution to these problems because neither the eye muscles nor the arm muscles work in eye-fixed coordinates. The eye muscles are thought to be driven by orientation signals and their derivatives, necessitating a head-fixed displacement command. Arm movements are more complicated, but for straight-arm pointing this simplifies to rotation in a torso-centered frame. (Note that the arm is never controlled in hand-fixed coordinates; this would make no sense at all. But a number of investigators refer to hand-centered coordinates when they really mean hand displacements in some unspecified frame of reference.)

 

Without accounting for these effects, gaze movements and arm movements would show systematic errors as a function of gaze orientation and the direction of movement. This has been tested systematically for both saccades and pointing movements. It has been shown that the human oculomotor system accounts very well for initial eye orientations within Listing’s plane during saccades in the light, saccades in the dark, saccades to remembered targets, express saccades, and smooth pursuit eye movements. Saccades compensate less completely for torsion, at about 50%. And finally, most (but not all) healthy human subjects compensate for the non-linear interaction between initial eye, arm, and target orientations.

 

This is not the only problem that arises in transforming retinal codes into motor displacement codes. It has been shown that the reach system must, and does, account for the eye-head-shoulder linkage geometry discussed in lecture 1 when pointing toward distant targets or reaching toward near targets.
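
To see why the translational part of the linkage matters for near but not far targets, consider this sketch (numpy; the 10 cm and 30 cm eye-to-shoulder offsets are made-up but plausible magnitudes):

    import numpy as np

    # Assumed offset of the eye relative to the shoulder: 10 cm medial, 30 cm up.
    eye_in_shoulder = np.array([0.0, 0.10, 0.30])   # x fwd, y left, z up (metres)

    for depth in [0.40, 10.0]:   # near graspable target vs distant target
        target_eye = np.array([depth, 0.0, 0.0])        # straight ahead of the eye
        target_shoulder = target_eye + eye_in_shoulder  # eye and shoulder frames aligned
        direction = target_shoulder / np.linalg.norm(target_shoulder)
        print(f"depth={depth:5.2f} m -> reach direction from shoulder {np.round(direction, 2)}")

    # Near: the reach must aim well up and to the left of shoulder-forward.
    # Far:  the direction converges on the retinal direction [1, 0, 0].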

 

A recent study combined all of these features, comparing two models of visually guided reach: a direct transformation from visual coordinates to shoulder coordinates, accounting only for the translational geometry of the system, versus a system with a full internal model of the eye-head-shoulder linkage and its non-linear reference frame transformations. As expected, the former model predicted errors in both reach direction and depth as a function of initial eye orientation, whereas the latter model predicted accurate reaches. Real reaches were of course noisy and showed various unrelated offsets, but they did not show any of the errors predicted by the direct transformation model in the absence of visual feedback, even in the initial stages of the movement, before proprioceptive feedback could take effect.
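
The logic of that comparison can be caricatured in a few lines. This is a hypothetical toy version, not the study’s actual model; all geometry is the simplified numpy setup used above:

    import numpy as np

    def rot_z(deg):
        r = np.radians(deg)
        return np.array([[np.cos(r), -np.sin(r), 0],
                         [np.sin(r),  np.cos(r), 0],
                         [0,          0,         1]])

    eye_in_shoulder = np.array([0.0, 0.10, 0.30])    # translation only (assumed)
    target_true = np.array([0.40, -0.10, 0.0])       # actual target, shoulder coords

    for gaze_deg in [0.0, 45.0]:
        R_eye = rot_z(-gaze_deg)                     # eye rotated right by gaze_deg
        target_retina = R_eye.T @ (target_true - eye_in_shoulder)  # what vision provides

        direct = target_retina + eye_in_shoulder          # translation only, rotation ignored
        full   = R_eye @ target_retina + eye_in_shoulder  # full internal model

        print(f"gaze={gaze_deg:4.1f} deg  direct error={np.round(direct - target_true, 3)}"
              f"  full error={np.round(full - target_true, 3)}")

    # At gaze=0 both models are exact; at gaze=45 the direct model errs in
    # both direction and depth, while the full model remains exact.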

 

Thus, although neuroscientists like to ignore complex non-linearities, the brain does not: it possesses a complete internal model of the actual geometry of the system it deals with. The really interesting question, then, is how it does this.