Gestalt view of consciousness
The World in Your Head: A Gestalt View of the Mechanism of Conscious Experience
Steven Lehar, Schepens Eye Research Institute
Lawrence Erlbaum (2003)
INTRODUCTION:
Lehar makes a good case against the computer/AI model of the brain, by
highlighting the inability of computers to differentiate the edges
needed to construct a model of the world, from the mass of less
important input. He contrasts this with the ability of biological vision
to deduce information from very flimsy inputs. However, the Gestalt
methods he suggests for achieving what the brain can do are not entirely
convincing as a means of sorting the mass of input data, and thus of
avoiding the combinatorial explosions implied by the requirements of
visual perception. In this respect, a quantum computing approach might
appear to have a greater chance of success. A further weakness of the
book is that it makes little attempt to relate what is proposed to the
physical components and processing of the brain.
Lehar approaches
consciousness from the angle of the relationship between visual image
processing and artificial intelligence (AI). A computer holds all the
data relating to an image in numerical form. However, turning this into
usable information in AI/robotics has proved an intractable problem.
Computers can detect features such as edges, but they detect too many
of them. Their edge detection picks up details of texture, surface
fragmentation and shadows, but fails to single out those edges that are
relevant to the outline or volume of an object. Further, there is no
apparent algorithm to deal
with occluded objects, where a small object obstructs the view of part
of a larger object, but it can be deduced that the larger object
continues behind the smaller object. This is taken to mean that the
information of global significance for understanding the image is not
available in the local edges.
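The point about local edge detection can be illustrated with a minimal
sketch. It is not taken from the book: the synthetic image, the
threshold and the function names (make_image, local_edges, classify) are
all assumptions made for illustration. A purely local brightness
comparison flags the texture inside an object just as readily as the
outline of the object, and nothing in the local differences says which
is which.

```python
# Hypothetical toy example: a textured square on a flat background.

def inside(y, x):
    """Ground truth for the toy scene: is this pixel part of the object?"""
    return 4 <= y < 12 and 4 <= x < 12

def make_image(size=16):
    """Build a grayscale image as a list of rows with values in [0, 1]."""
    img = [[0.2 for _ in range(size)] for _ in range(size)]
    for y in range(size):
        for x in range(size):
            if inside(y, x):
                # Object brightness plus a fine checkerboard texture.
                img[y][x] = 0.8 + (0.1 if (x + y) % 2 == 0 else -0.1)
    return img

def local_edges(img, threshold=0.15):
    """Return every neighbouring pixel pair whose brightness differs
    by more than the threshold (a purely local edge test)."""
    size = len(img)
    found = []
    for y in range(size):
        for x in range(size):
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < size and nx < size:
                    if abs(img[ny][nx] - img[y][x]) > threshold:
                        found.append(((y, x), (ny, nx)))
    return found

def classify(edges):
    """Use the ground truth (unavailable to the detector itself) to count
    pairs that cross the object outline versus pairs inside the texture."""
    outline = sum(1 for a, b in edges if inside(*a) != inside(*b))
    texture = sum(1 for a, b in edges if inside(*a) and inside(*b))
    return outline, texture

outline_count, texture_count = classify(local_edges(make_image()))
print("outline pairs:", outline_count, "texture pairs:", texture_count)
```

In this toy scene the detector reports more texture edges than outline
edges, and only the ground-truth function, which the detector does not
have, can tell them apart.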
Computers have problems with the
spatial structure of visual scenes, and as a result difficulty in
navigating in an environment of irregular forms, which, by contrast,
present little problem for biological vision. Lehar points out that the
retinal image is two-dimensional, but is perceived as three-dimensional,
and that therefore the three-dimensional depth of the image must be the
result of cortical processing. A basic function of visual perception is
argued to be the transformation from a two-dimensional retinal image to
a three-dimensional perception in the brain. Apart from inserting
spatial structure into an initially two-dimensional image, the brain
must also decompose this image into coherent objects with volume within
the spatial structure. From this it is argued that the brain must
operate a spatial algorithm, in order to produce this three-dimensional
image. The difficulty for computers lies not in receiving the visual
data, but in developing the sort of processing that allows the brain to
turn this data into a conscious image.
The
literature relating to these problems concentrates on restricted
domains, with separate algorithms for extracting shape from shading,
from motion or from lines. However, the general problem of recovering
the shape or conformation of objects from the light they reflect has
remained largely unresolved. This divergence in performance is argued to
show that the bases of biological and computer vision are very different
from one another.
Conscious images: Lehar takes the view that the
conscious image is assembled in the brain, in response to data from the
external world. This is described as 'indirect realism,' in contrast to
'direct realism' or 'naive realism', in which it is believed that we
perceive the external world as it actually is. The author thinks that
discussions in neuroscience are often implicitly based on direct
realism, but he argues that this view is based on false assumptions.
Visual experience is at odds with scientific reality, because the
subjective world is experienced as if it were outside the brain,
whereas visual processing occurs inside the brain. The causal chain of
vision is one in which the brain can only process material that has
already been picked up by the sensory organs. Consciousness is therefore
necessarily confined to the experience of internally constructed
models. Lehar goes back to Kant, who distinguished between the
'noumenal' world of light signals etc. and the phenomenal world of
internal conscious perception. The 'noumenal' world is only perceived
within the phenomenal world.
The author argues that the
properties of subjective experience are inconsistent with the present
neuroscientific thinking, based on the semi-independent sequential
operation of billions of individual neurons. In contrast, our experience
is mainly of stable and solid volumes, rather than billions of abstract
features. The author accuses the neuroscientific community of evading
this problem by assuming the 'naive realism' view, and ignoring
subjective experience. This attitude is partly blamed on the
mid-twentieth century advent of single-cell recording, which shifted the
emphasis from assembly-wide features towards single-cell features. In
the same period, the digital computer became a major part of technology,
and was seen as an analogy for the brain. At this stage, AI researchers
thought that they had the problem of vision solved, and that they could
implement robotic vision without paying any attention to biological
systems.
Famous Dalmatian: The author discusses the well-known
picture of a Dalmatian dog against a speckled background. Much of the
edge of the dog is missing, and some of the edges that are there are
locally indistinguishable from the background. The main point is that
the local information does not allow the observer to distinguish the
dog from the background, but when the picture is viewed as a whole, the
dog is clearly distinguishable. Lehar argues that this indicates that
perception is based on global brain activity, rather than the sequential
processing of individual neurons. He claims that no algorithm has ever
come close to handling the ambiguity of the Dalmatian dog picture.
Furthermore, the picture is viewed as demonstrating, in exaggerated
fashion, the principles that underlie biological visual processing. One
counter-argument tries to evade this conclusion by suggesting that an
image such as this is a special case that does not apply to normal
visual processing. However, Lehar counters that studies restricting the
view of pictures to just a few edges show that humans cannot distinguish
between edges that are important to the outline or form of objects and
edges that are just texture or shadows.
Kanizsa triangle: Lehar
discusses visual figures, such as the Kanizsa triangle, where the mind
automatically perceives a triangle, although all that is physically
there on the printed paper is three black Pac-Man shapes. Thus, the
observer perceives edges, and a white ground brighter than the
surrounding area, where neither exists on the paper. Again, this is
argued to be the result of global processing of the image, rather than
of the examination of individual edges.
Rubin vase/faces: The same
is true of other well-known examples such as the Rubin face/vase
illusion. A black figure on a page may be perceived as either a vase or
the profiles of two faces opposite one another. The brain jumps from
one perception to the other, without ever offering a hybrid picture, and
can as quickly reverse its perception. It is argued from this that
visual recognition is not the result of feed-forward processing of a
visual input leading to a perceptual output, as is often assumed in
computer models of the brain, but instead involves a dynamic process
that is not completely stable.
Invariant perception: Lehar also
discusses the problem of the invariance of our perception of objects, in
that they can be recognised from different angles and in different
lights, as the same objects, in a way that is not easily achievable by
the analysis of individual edges. Conventional computing could only
manage this by having a detector for each possible position, which could
produce a combinatorial explosion or NP-hard problem, one that classical
computing might only resolve in a time longer than the life of the
universe. There have been suggestions that local elements of the object
are first recognised and later put together, but this does not take into
account instances where what are actually different elements form an
image of the same object.
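The scale of the detector-per-position approach can be sketched with
simple arithmetic. The figures below are hypothetical, chosen only for
illustration, and the function name detectors_needed is an assumption
rather than anything from the book. The sketch shows only that the
number of dedicated detectors grows multiplicatively with each added
viewing condition; it does not demonstrate NP-hardness in the formal
sense.

```python
from math import prod

def detectors_needed(variants_per_condition, object_classes):
    """One dedicated detector per object class for every combination
    of viewing conditions (position, orientation, scale, lighting...)."""
    return object_classes * prod(variants_per_condition.values())

# Hypothetical resolutions for each viewing condition.
conditions = {
    "x_position": 100,
    "y_position": 100,
    "orientation": 36,
    "scale": 10,
    "lighting": 10,
}

total = detectors_needed(conditions, object_classes=1000)
print(f"{total:,} detectors")

# Adding one more condition multiplies the total again.
conditions["tilt_in_depth"] = 36
print(f"{detectors_needed(conditions, object_classes=1000):,} detectors")
```

Even with these coarse, made-up resolutions the count runs into the tens
of billions, and each further condition multiplies it again.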
Visual
agnosia: The distinction between being able to detect individual
features, and gaining a practically useful model of the world can also
be demonstrated from human pathology in the form of visual agnosia.
There are two forms of this. In a condition known as apperceptive
agnosia, the patient can see individual features, but cannot integrate
these features into a spatially coherent three-dimensional whole. The
opposite condition is associative agnosia, where the patient
perceives a coherent world, but cannot identify individual objects. This
medical finding is argued to contradict the 'naive realism' claim that
the brain is just seeing what is out in the world, in which case the
whole spatial environment should be perceived.
Gestalt theory
attempts to solve the problem of visual recognition by parallel
processing, in which the solutions to each part of the visual
recognition problem depend on one another, and thus constrain the
possible solutions for one another, thus closing in on a single
solution. Lehar also proposes the idea of 'harmonic resonance'. This
involves resonance between different modules in the brain, with
resonance ultimately being communicated to all the relevant systems in
the brain. This is seen as a solution to the 'binding problem' or an
explanation of the unity of different modalities in conscious
experience. The proposal connects with EEG recordings of
gamma-frequency synchrony in the brain.
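The idea of parts of a solution constraining one another can be shown
in a very small sketch. This is not Lehar's model and makes no attempt
at harmonic resonance: it is a generic relaxation scheme, and the
function name relax, the coupling value and the evidence figures are all
assumptions made for illustration. Each unit holds a belief between -1
(background) and +1 (figure), and repeatedly updates it from its own
local evidence plus the current beliefs of its neighbours.

```python
import math

def relax(evidence, coupling=1.0, steps=50):
    """Synchronously update a chain of units. Each unit combines its own
    local evidence with the current states of its two neighbours."""
    state = [math.tanh(e) for e in evidence]  # beliefs from local evidence only
    for _ in range(steps):
        new_state = []
        for i, e in enumerate(evidence):
            left = state[i - 1] if i > 0 else 0.0
            right = state[i + 1] if i < len(state) - 1 else 0.0
            new_state.append(math.tanh(e + coupling * (left + right)))
        state = new_state
    return state

# Hypothetical local evidence: the middle unit, taken alone, weakly
# suggests "background", while its neighbours suggest "figure".
evidence = [0.8, 0.9, -0.1, 0.7, 0.8]

local_only = [math.tanh(e) for e in evidence]
settled = relax(evidence)
print("local reading of middle unit:", round(local_only[2], 3))
print("settled reading of middle unit:", round(settled[2], 3))
```

Read in isolation, the middle unit sits on the background side; once the
units constrain one another it settles with its neighbours as figure,
which is the kind of global override of ambiguous local evidence that
the Gestalt account appeals to.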
Conclusion: It is not clear
that these Gestalt proposals involve sufficient processing capacity to
overcome the likely combinatorial explosions/NP-hard problems implied by
perception. Lehar does relatively little to link his ideas to the
physical components and processing of the brain. On the face of it, a
quantum computing process would have more chance of bridging the gap
between classical computing capacity and the requirements of visual
perception as highlighted by Lehar.