
Gestalt view of consciousness


The World in Your Head: A Gestalt View of the Mechanism of Conscious Experience

Steven Lehar, Schepens Eye Research Institute

Lawrence Erlbaum (2003)

INTRODUCTION:  Lehar makes a good case against the computer/AI model of the brain by highlighting the inability of computers to separate the edges needed to construct a model of the world from the mass of less important input. He contrasts this with the ability of biological vision to deduce information from very flimsy inputs. The Gestalt methods he suggests for achieving what the brain can do are not entirely convincing as a means of sorting the mass of input data, and thus of avoiding the combinatorial explosions implied by the requirements of visual perception. In this respect, a quantum computing approach might appear to have a greater chance of success. A further weakness of the book is that it makes little attempt to relate what is proposed to the physical components and processing of the brain.

Lehar approaches consciousness from the angle of the relationship between visual image processing and artificial intelligence (AI). A computer holds all the data relating to an image in numerical form. However, turning this into usable information in AI/robotics has proved an intractable problem. Computers can detect features such as edges, but the problem is that they detect too many such features. Their edge detection picks up details of texture, surface fragmentation and shadows, but fails to single out those edges that are relevant to the outlines or volumes of an object. Further, there is no apparent algorithm to deal with occlusion, where a small object obstructs the view of part of a larger object, but it can be deduced that the larger object continues behind the smaller one. This is taken to mean that the information of global significance for understanding the image is not available in the local edges.
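The over-detection problem can be illustrated with a toy example. The sketch below is not taken from the book: the image, the brightness values and the function names (`make_image`, `local_edges`, `is_outline`, `count_edges`) are all hypothetical, chosen so that a striped object sits on a plain background. A purely local detector flags every pair of neighbouring pixels whose brightness differs by at least a threshold.

```python
def make_image(size=12, lo=3, hi=9):
    """Background of brightness 0 with a square object occupying rows and
    columns lo..hi-1. The object carries vertical stripes of brightness 2 and 1
    (a stand-in for surface texture; the values are illustrative assumptions)."""
    img = [[0] * size for _ in range(size)]
    for r in range(lo, hi):
        for c in range(lo, hi):
            img[r][c] = 2 if (c - lo) % 2 == 0 else 1
    return img


def local_edges(img, threshold=1):
    """Flag every horizontally or vertically adjacent pixel pair whose
    brightness differs by at least `threshold`. Only local information is used."""
    n = len(img)
    edges = set()
    for r in range(n):
        for c in range(n):
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 < n and c2 < n and abs(img[r][c] - img[r2][c2]) >= threshold:
                    edges.add(((r, c), (r2, c2)))
    return edges


def is_outline(edge, lo, hi):
    """True when exactly one pixel of the pair lies inside the object,
    i.e. the edge belongs to the object's outline rather than its texture."""
    inside = [lo <= r < hi and lo <= c < hi for r, c in edge]
    return inside[0] != inside[1]


def count_edges(threshold, size=12, lo=3, hi=9):
    """Return (outline_edges, texture_edges) found at the given threshold."""
    edges = local_edges(make_image(size, lo, hi), threshold)
    outline = sum(1 for e in edges if is_outline(e, lo, hi))
    return outline, len(edges) - outline


print(count_edges(threshold=1))  # (24, 30)
print(count_edges(threshold=2))  # (12, 0)
```

At a threshold of 1 the detector reports more texture edges than outline edges. Raising the threshold to 2 removes the texture but also loses half of the outline, so in this toy image no single local threshold isolates the object's boundary.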

Computers have problems with the spatial structure of visual scenes, and as a result have difficulty navigating in an environment of irregular forms, which, by contrast, presents little problem for biological vision. Lehar points out that the retinal image is two-dimensional, but is perceived as three-dimensional, and that therefore the three-dimensional depth of the image must be the result of cortical processing. A basic function of visual perception is argued to be the transformation from a two-dimensional retinal image to a three-dimensional perception in the brain. Apart from inserting spatial structure into an initially two-dimensional image, the brain must also decompose this image into coherent objects with volume within the spatial structure. From this it is argued that the brain must operate a spatial algorithm in order to produce this three-dimensional image. What computers have had difficulty with is not receiving the visual data, but developing the sort of processing that allows the brain to turn this data into a conscious image.
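Why depth cannot simply be read off a two-dimensional image can be shown with a minimal sketch. It assumes an idealised pinhole projection, which is an illustrative assumption and not a model used in the book; the function name `project` and the sample points are hypothetical.

```python
def project(x, y, z, focal_length=1.0):
    """Pinhole projection of a 3-D point (with z > 0) onto a 2-D image plane.
    This is an idealised assumption, not a model of the eye."""
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different points in space: the second is twice as far away and
# twice as far off-axis as the first.
near = project(1.0, 2.0, 4.0)
far = project(2.0, 4.0, 8.0)

print(near, far, near == far)  # (0.25, 0.5) (0.25, 0.5) True
```

Both points land on the same image location, so the image alone does not say which one was seen. Any three-dimensional percept therefore has to be supplied by further processing.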

The literature on these problems concentrates on restricted domains, with separate algorithms for extracting shape from shading, from motion or from lines. However, the general problem of recovering the shape, or conformation, of objects that reflect light has remained largely unresolved. This divergence in relative performance is argued to show that the bases of biological and computer vision are very different from one another.

Conscious images:  Lehar takes the view that the conscious image is assembled in the brain, in response to data from the external world. This is described as 'indirect realism', in contrast to 'direct realism' or 'naive realism', in which it is believed that we perceive the external world as it actually is. The author thinks that discussions in neuroscience are often implicitly based on direct realism, but he argues that this view rests on false assumptions. The visual experience is at odds with scientific reality, because the subjective world is experienced as if it were outside the brain, whereas visual processing occurs inside the brain. The causal chain of vision is one in which the brain can only process material that has already been picked up by the sensory organs. Consciousness is therefore necessarily confined to the experience of internally constructed models. Lehar goes back to Kant, who distinguishes between the 'noumenal' world of light signals etc. and the phenomenal world of internal conscious perception. The 'noumenal' world is only perceived within the phenomenal world.

The author argues that the properties of subjective experience are inconsistent with present neuroscientific thinking, which is based on the semi-independent sequential operation of billions of individual neurons. In contrast, our experience is mainly of stable and solid volumes, rather than billions of abstract features. The author accuses the neuroscientific community of evading this problem by assuming the 'naive realism' view and ignoring subjective experience. This attitude is partly blamed on the mid-twentieth century advent of single-cell recording, which shifted the emphasis from assembly-wide features towards single-cell features. In the same period, the digital computer became a major part of technology, and was seen as an analogy for the brain. At this stage, AI researchers thought that they had the problem of vision solved, and that they could implement robotic vision without paying any attention to biological systems.

Famous Dalmatian:  The author discusses the well-known picture of a Dalmatian dog against a speckled background. Much of the edge of the dog is missing, and some of the edges that are there are locally indistinguishable from the background. The main point is that the local information does not allow the observer to distinguish the dog from the background, but when the picture is viewed as a whole, the dog is clearly distinguishable. Lehar argues that this indicates that perception is based on global brain activity, rather than the sequential processing of individual neurons. He claims that no algorithm has ever come close to handling the ambiguity of the Dalmatian dog picture. Furthermore, the picture is viewed as demonstrating, in exaggerated fashion, the principles that underlie biological visual processing. One counter-argument tries to evade this conclusion by suggesting that an image such as this is a special case that does not apply to normal visual processing. However, Lehar responds that studies which restrict the view of pictures to just a few edges show that humans cannot distinguish between edges that are important to the outline or form of objects, and edges that are just texture or shadows.

Kanizsa triangle:  Lehar discusses visual figures, such as the Kanizsa triangle, where the mind automatically perceives a triangle, although all that is physically there on the printed paper is three black Pac-Man-shaped features. Thus, the observer perceives edges, and a white ground brighter than the surrounding area, where neither exists on the paper. Again, this is argued to be a global processing of the image, rather than something derived from the examination of individual edges.

Rubin vase/faces:  The same is true of other well-known examples such as the Rubin face/vase illusion. A black figure on a page may be perceived as either a vase or the profiles of two faces opposite one another. The brain jumps from one perception to the other without ever offering a hybrid picture, and can as quickly reverse its perception. It is argued from this that visual recognition is not the result of feed-forward processing of a visual input leading to a perceptual output, as is often assumed in computer models of the brain, but instead involves a dynamic process that is not completely stable.

Invariant perception:  Lehar also discusses the problem of the invariance of our perception of objects, in that they can be recognised as the same objects from different angles and in different lights, in a way that is not easily achievable by the analysis of individual edges. Conventional computing could only manage this by having a detector for each possible position, which could produce a combinatorial explosion or NP-hard problem, where classical computing might only resolve the problem in a time longer than the life of the universe. There have been suggestions that local elements of the object are first recognised and later put together, but this does not take into account instances where what are actually different elements may form an image of the same object.
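The scale of the detector-per-position approach can be made concrete with some simple arithmetic. The sketch below is illustrative only: the viewing conditions and the number of values for each are hypothetical figures, not numbers given by Lehar, and `detector_count` is a made-up helper name.

```python
from math import prod


def detector_count(values_per_dimension):
    """Number of templates needed if every combination of viewing
    conditions requires its own dedicated detector."""
    return prod(values_per_dimension.values())


# Hypothetical, deliberately coarse sampling of viewing conditions.
conditions = {
    "x position": 100,
    "y position": 100,
    "rotation": 36,
    "scale": 10,
    "lighting": 10,
}

per_object = detector_count(conditions)
print(per_object)  # 36000000

# Each extra condition multiplies the total rather than adding to it.
conditions["tilt"] = 36
print(detector_count(conditions))  # 1296000000
```

Even with this coarse sampling, a single object needs tens of millions of detectors, and every additional condition multiplies the count. That multiplicative growth is what is meant by a combinatorial explosion.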

Visual agnosia:  The distinction between being able to detect individual features and gaining a practically useful model of the world can also be demonstrated from human pathology, in the form of visual agnosia. There are two forms of this. In a condition known as apperceptive agnosia, the patient can see individual features, but cannot integrate these features into a spatially coherent three-dimensional whole. The opposite condition is associative agnosia, where the patient perceives a coherent world, but cannot identify individual objects. This medical finding is argued to contradict the 'naive realism' claim that the brain is just seeing what is out in the world, in which case the whole spatial environment should be perceived.

Gestalt theory attempts to solve the problem of visual recognition by parallel processing, in which the solutions to each part of the visual recognition problem depend on one another. Each partial solution constrains the possibilities for the others, so that the system closes in on a single solution. Lehar also proposes the idea of 'harmonic resonance'. This involves resonance between different modules in the brain, with resonance ultimately being communicated to all the relevant systems in the brain. This is seen as a solution to the 'binding problem', or an explanation of the unity of different modalities in conscious experience. It also relates to EEG recordings of gamma-frequency synchrony in the brain.
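The idea of mutually constraining parts settling on one solution can be sketched with a toy relaxation. This is a generic illustration and not Lehar's model or his harmonic resonance proposal: the update rule, the threshold of 2 and the name `relax` are all assumptions made for the example. Each element along a candidate contour stays active only if its own evidence plus the current state of its two neighbours reaches the threshold, and all elements are updated together until nothing changes.

```python
def relax(evidence, max_steps=20):
    """Toy relaxation along a 1-D chain of candidate edge elements.

    evidence[i] is 1 where a local detector fired and 0 elsewhere.
    An element is active after an update when
        its own evidence + left neighbour + right neighbour >= 2,
    so a detection needs at least one active neighbour to survive, and a
    gap is filled only when both neighbours are active. The rule is an
    illustrative assumption, not a published algorithm.
    """
    belief = list(evidence)
    for _ in range(max_steps):
        updated = []
        for i, local in enumerate(evidence):
            left = belief[i - 1] if i > 0 else 0
            right = belief[i + 1] if i < len(belief) - 1 else 0
            updated.append(1 if local + left + right >= 2 else 0)
        if updated == belief:  # every element agrees with its neighbours
            break
        belief = updated
    return belief


# Two contour fragments separated by a one-element gap, plus an isolated
# false detection further along.
evidence = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
print(relax(evidence))  # [0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```

The gap is filled because both of its neighbours support it, while the isolated detection is dropped because nothing around it does. Neither decision could be made from any single element in isolation. Whether a scheme of this kind scales to real images is exactly the point the review questions.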

Conclusion:  It is not clear that these Gestalt proposals involve sufficient processing capacity to overcome the likely combinatorial explosions/NP-hard problems implied by perception. Lehar does relatively little to link his ideas to the physical components and processing of the brain. On the face of it, a quantum computing process would have more chance of bridging the gap between classical computing capacity and the requirements of visual perception as highlighted by Lehar.