Jay McClelland: Neural Networks and the Emergence of Cognition | Lex Fridman Podcast #222
Ui38ZzTymDY • 2021-09-20
The following is a conversation with Jay McClelland, a cognitive scientist at Stanford and one of the seminal figures in the history of artificial intelligence, and specifically neural networks, having written the parallel distributed processing books with David Rumelhart, who co-authored the backpropagation paper with Geoff Hinton. In their collaborations they've paved the way for many of the ideas at the center of the neural-network-based machine learning revolution of the past 15 years.

To support this podcast, please check out our sponsors in the description.

This is the Lex Fridman Podcast, and here is my conversation with Jay McClelland.
You are one of the seminal figures in the history of neural networks, at the intersection of cognitive psychology and computer science. What has, over the decades, emerged for you as the most beautiful aspect of neural networks, both artificial and biological?
The fundamental thing I think about with neural networks is how they allow us to link biology with the mysteries of thought. When I was first entering the field myself, in the late '60s and early '70s, cognitive psychology had just become a field. There was a book published in '67 called Cognitive Psychology, and the author said that the study of the nervous system was only of peripheral interest; it wasn't going to tell us anything about the mind. And I didn't agree with that. I always felt: look, I'm a physical being. From dust to dust, ashes to ashes, and somehow I emerged from that.
So that's really interesting. There was a sense within cognitive psychology that by understanding the neuronal structure of things, you're not going to be able to understand the mind. And your sense is that if we study these neural networks, we might be able to get at least very close to understanding the fundamentals of the human mind.

Yeah.
I used to talk about the idea of awakening from the Cartesian dream. Descartes thought about these things. He was walking in the gardens of Versailles one day, and he stepped on a stone, and a statue moved. He walked a little further, stepped on another stone, and another statue moved. He wondered: why did the statue move when I stepped on the stone? He went and talked to the gardeners, and he found out that they had a hydraulic system that allowed the physical contact with the stone to cause water to flow in various directions, which caused water to flow under the statue and move it.

He used this as the beginning of a theory about how animals act. He had this notion that the little fibers people had identified that weren't carrying blood were little hydraulic tubes: if you touched something, there would be pressure, and it would send a signal of pressure to the other parts of the system, and that would cause action. So he had a mechanistic theory of animal behavior. And he thought that the human had this animal body, but that some divine something else had to have come down and been placed in him to give him the ability to think. So the physical world includes the body in action, but it doesn't include thought, according to Descartes.

And so the study of physiology at that time was the study of sensory systems and motor systems, things you could directly measure when you stimulated neurons. The study of cognition was something tied in with abstract computer algorithms and things like that. But as an undergraduate I had learned about the physiological mechanisms, and so when I'm studying cognitive psychology as a first-year PhD student, I'm saying: wait a minute, the whole thing is biological.
You had that intuition right away? It seemed obvious to you?

Yeah.

Isn't it magical, though, that from just a little bit of biology can emerge the full beauty of the human experience? Why is that so obvious to you?

Well, it's obvious and not obvious at the same time.
And I think about Darwin in this context too, because Darwin knew very early on that none of the ideas anybody had ever offered gave him a sense of understanding how evolution could have worked. But he wanted to figure out how it could have worked; that was his goal. He spent a lot of time working on this idea: reading about things that gave him hints, thinking they were interesting without knowing why, drawing more and more pictures of different birds that differed slightly from each other, and so on. And then he figured it out.

But after he figured it out, he had nightmares about it. He would dream about the complexity of the eye, and the arguments people had given about how ridiculous it was to imagine that it could have emerged from some sort of unguided process, that it hadn't been the product of design. So he didn't publish for a long time, in part because he was scared of his own ideas; he didn't think they could possibly be true.

But by the time the 20th century rolls around, many people understand, or believe, that evolution produced the entire range of animals that there are. And Descartes' idea starts to seem a little wonky after a while. Wait a minute: there's the apes and the chimpanzees and the bonobos, and they're pretty smart in some ways. Then somebody says, oh, there's a certain part of the brain that's still different: there's no hippocampus in the monkey brain, it's only in the human brain. Huxley had to do a dissection in front of many, many people in the late 19th century to show them that there actually is a hippocampus in the chimpanzee's brain.

So the continuity of the species is another element that contributes to this idea that we are ourselves a total product of nature. And that, to me, is the magic and the mystery: how nature could actually give rise to organisms that have the capabilities that we have.
So it's interesting, because even the idea of evolution is hard for me to keep all together in my mind, because we think on a human time scale. The development of the human eye would give me nightmares too, because you have to think across many, many generations. It's very tempting to think about the growth of a complicated object and ask how it's possible for such a thing to be built. Also, for me, from a robotics engineering perspective, it's very hard to build these systems. How, through an undirected process, can a complex thing be designed? It seems wrong.

Yeah, that's absolutely right. A slightly different career path that would have been equally interesting to me would have been to actually study the process of embryological development, flowing on into brain development, and the exquisite laying down of pathways that occurs in the brain. I know only the slightest bit about that; it's not my field. But there are fascinating aspects to this process that eventually result in the complexity of various brains.
One thing: in the field, I think people have felt for a long time that in the study of vision, the continuity between humans and non-human animals has been second nature for a lot longer. I had this conversation with somebody who's a vision scientist, who was saying: we don't have any problem with this; the monkey's visual system and the human visual system are extremely similar. Up to certain levels, of course; they diverge after a while. But the visual pathway from the eye to the brain, and the first few layers of cortex, or cortical areas I guess one would say, are extremely similar.
Yeah, so the cognition side is where the leap seems to happen with humans; there it does seem we're kind of special. And that's a really interesting question when thinking about alien life, whether there are other intelligent alien civilizations out there: how special is this leap? One special thing seems to be the origin of life itself, however you define that; there's a gray area. And the other leap, and this is the very biased perspective of a human, is the origin of intelligence. Again, from an engineering perspective, it's a difficult question to ask, but an important one: how difficult is that leap? How special were humans? Did a monolith come down? Did aliens bring down a monolith, and some apes had to touch it?

That's a lot like Descartes' idea.

Exactly. But it just seems like one heck of a leap to get to this level of intelligence.
Yeah. And, you know, Chomsky argued that some genetic fluke occurred a hundred thousand years ago, and it just happened that some hominin precursor of current humans had this one genetic tweak that resulted in language. And language then provided this special thing that separates us from all other animals. I think there's a lot of truth to the value and importance of language, but I think it comes along with the evolution of a lot of other related things: sociality, mutual engagement with others, and the establishment of rich mechanisms for organizing an understanding of the world, which language then plugs into.

Right, so language is a tool that allows you to do this kind of collective intelligence, and whatever is at the core of the thing that allows for this collective intelligence is the main thing.
And it's interesting to think that one fluke, one mutation, could lead to the first crack in the door to human intelligence. All it takes is one: evolution just opens the door a little bit, and then time and selection take care of the rest.
There are so many fascinating aspects to these kinds of things. We think of evolution as continuous: we think, okay, over 500 million years there could have been relatively continuous change. But that's not what anthropologists and evolutionary biologists found in the fossil record. They found hundreds of millions of years of stasis, and then suddenly a change occurs. Well, "suddenly" on that scale is a million years, or even ten million years. The concept of punctuated equilibrium was a very important concept in evolutionary biology.

And that also feels somehow right about the stages of our mental abilities. We seem to have a certain kind of mindset at a certain age, and then at another age we look at that four-year-old and say, my god, how could they have thought that way? Piaget was known for this kind of stage theory of child development. You look at it closely, and the stages aren't so discrete, and neither are the transitions, but the difference between the four-year-old and the seven-year-old is profound. That's another thing that's always interested me: something happens over the course of several years of experience, and at some point we reach the point where something like an insight, or a transition, or a new stage of development occurs. These kinds of things can be understood in complex systems research, and so evolutionary biology, developmental biology, and cognitive development have all been approached in this kind of way.
Yeah, just like you said, I find both fascinating: those early years of human life, but also the early minutes and days of embryonic development, how from embryos you get the brain. Again, from the engineering perspective, that's fascinating. When you deploy the brain into the human world and it gets to explore that world and learn, that's fascinating; but just the assembly of the mechanism that is capable of learning, that's amazing. The stuff they're doing with brain organoids, where you can build mini brains and study that self-assembly of a mechanism from the DNA material: that's like, what the heck? You have literally biological programs that just generate a system, this mushy thing that's able to be robust and learn in a very unpredictable world, and learn seemingly arbitrary things, or at least a very large number of things that enable survival.
Yeah, ultimately that is a very important part of the whole process of understanding this emergence of mind from brain, and the whole thing seems to be pretty continuous.

So let me step back to neural networks for another brief minute. You wrote the Parallel Distributed Processing books that explored ideas of neural networks in the 1980s, together with a few folks, but especially the books you wrote with David Rumelhart, who is the first author on the backpropagation paper with Geoff Hinton. These are just some of the figures at the time who were thinking about these big ideas. What are some memorable moments of discovery and beautiful ideas from those early days?
I'm going to start with my own process in the mid-'70s, and then into the late '70s, when I met Geoff Hinton and he came to San Diego and we were all together. In my time in graduate school, as I've already described to you, I had this feeling of: okay, I'm really interested in human cognition, but this disembodied way of thinking about it that I'm getting from the current mode of thought isn't working fully for me.

When I got my assistant professorship, I went to UCSD; that was in 1974. Something amazing had just happened. Dave Rumelhart had written a book together with another man named Don Norman, and the book was called Explorations in Cognition. It was a series of chapters exploring interesting questions about cognition, but in a completely abstract, non-biological kind of way. And I'm saying, gee, this is amazing; I'm coming to a community where people can get together and feel like they're collectively exploring ideas. It was a book that had a lot of, I don't know, lightness to it. Don Norman, the more senior figure to Rumelhart at that time, who led that project, always created this spirit of playful exploration of ideas. And I'm like, wow, this is great.
But I was also still trying to get from the neurons to the cognition. At one point I got the opportunity to go to a conference where I heard a talk by a man named James Anderson, an engineer, but by then a professor in a psychology department, who had used linear algebra to create neural network models of perception and categorization and memory. It just blew me out of the water that one could create a model that was simulating neurons, not just engaged in a stepwise algorithmic process construed abstractly; it was simulating remembering and recalling and recognizing the prior occurrence of a stimulus. For me this was a bridge between the mind and the brain, and it just stuck. I remember I was walking across campus one day in 1977, and I almost felt like Saint Paul on the road to Damascus. I said to myself: if I think about the mind in terms of a neural network, it will help me answer the questions about the mind that I'm trying to answer. And that really excited me.
So I think a lot of people were becoming excited about that. One of those people was Jim Anderson, who I mentioned. Another was Stephen Grossberg, who had been writing about neural networks since the '60s. And Geoff Hinton was yet another: his PhD dissertation showed up in an applicant pool for a postdoctoral training program that Dave and Don, the two men I mentioned before, Rumelhart and Norman, were administering, and Rumelhart got really excited about Hinton's dissertation. So Hinton was one of the first people who came and joined this group of postdoctoral scholars, funded by this wonderful grant that they got. Another member of that group, also well known in neural network circles, is Paul Smolensky.

Anyway, Geoff and Jim Anderson organized a conference at UCSD, where we were, called Parallel Models of Associative Memory, and it brought together all the people who had been thinking about these kinds of ideas, in 1979 or 1980. And this began to really resonate with some of Rumelhart's own thinking, some of his reasons for wanting something other than the kinds of computation he'd been doing so far. So let me talk about Rumelhart now for a minute.

Okay, with that context. Well, let me also just pause, because you've said so many interesting things. Before we go to Rumelhart: for people who are not familiar, neural networks are at the core of the machine learning, deep learning revolution of today. Geoffrey Hinton, who we mentioned, is one of the figures, like yourself, who were important in the history and development of these neural networks, the artificial neural networks that are then used for machine learning applications. Like I mentioned, the backpropagation paper describes one of the optimization mechanisms by which these networks can learn. And the word "parallel" is really interesting; it's almost synonymous, from a computational perspective, with how you thought about neural networks at the time, that is, as parallel computation. Would that be fair to say?
Well, yeah. The word "parallel" comes from the idea that each neuron is an independent computational unit. It gathers data from other neurons, it integrates it in a certain way, and then it produces a result. It's a very simple little computational unit, but it's autonomous in the sense that it does its own thing. It's in a biological medium, where it's getting nutrients and various chemicals from that medium, but you can think of it as almost a little computer in and of itself. So the idea is that our brains have on the order of a hundred billion of these little neurons, and they're all capable of doing their work at the same time. Instead of a single central processor engaged in chug, chug, one step after another, we have billions of these little computational units working at the same time.
So at the time, and maybe you can comment, it seems to me, even still, quite a revolutionary way to think about computation relative to the development of theoretical computer science alongside it, where it's very much sequential: you're analyzing algorithms that run on a single computer.

That's right.

You're saying: wait a minute, why don't we take a really dumb, very simple computer and just have a lot of them interconnected together, all operating in their own little worlds and communicating with each other, and think of computation in that way; and from that kind of computation, try to understand how things like certain characteristics of the human mind can emerge. That's quite a revolutionary way of thinking, I would say.
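As a rough illustration of that parallel picture (my sketch, not anything from the conversation): each unit gathers input from the other units through weighted connections, integrates it as a weighted sum, and squashes the result. Updating a whole layer of such units "at the same time" collapses into a single matrix-vector product.

```python
import numpy as np

# Illustrative sketch (not from the conversation): each neuron-like unit
# gathers input from other units through weighted connections, integrates
# it as a weighted sum, and squashes the result. Updating every unit in
# a layer "at the same time" is one matrix-vector product.

rng = np.random.default_rng(0)

n_units = 8
sending_activity = rng.normal(size=n_units)    # outputs of the sending units
weights = rng.normal(size=(n_units, n_units))  # connection strengths (arbitrary here)

def squash(net_input):
    # one common choice of simple squashing nonlinearity
    return 1.0 / (1.0 + np.exp(-net_input))

net = weights @ sending_activity  # every unit integrates its inputs in parallel
activations = squash(net)         # every unit produces its simple result

print(activations.shape)  # prints (8,)
```

Each unit here is "dumb" on its own; the interesting behavior comes from many of them operating at once.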
Well, yes, I agree with you. And there's still this sense of not quite knowing how we get all the way there; that very much remains at the core of the questions everybody's asking about the capabilities of deep learning and all these kinds of things. But if I could just play this out a little bit: a convolutional neural network, or CNN, which many people may have heard of, is, thinking of it biologically, a set of collections of neurons. Each collection has maybe 10,000 neurons in it, and there are many layers. Some of these networks are hundreds or even a thousand layers deep; others are closer to the biological brain, maybe 20 layers deep or something like that. Within each layer we have thousands of neurons, or tens of thousands. In the brain we probably have millions in each layer, so we're getting sort of similar in a certain way.
And then we think: okay, at the bottom level there's an array of things that are like the photoreceptors in the eye. They respond to the amount of light of a certain wavelength at a certain location on the pixel array. So that's like the biological eye. And then there are several further stages going up, layers of these neuron-like units, and you go from that raw input array of pixels to a classification. You've actually built a system that can do the same kind of thing you and I do when we open our eyes, look around, and see there's a cup, there's a cell phone, there's a water bottle. And these systems are doing that now.

In terms of the parallel idea we were talking about before, they are doing this massively parallel computation, in the sense that each of the neurons in each of those layers is thought of as computing its little bit of something about the input simultaneously with all the other ones in the same layer. We get to the point of abstracting that away and thinking of it as one whole vector, one activation pattern computed in a single step, and that abstraction is useful. But it's still parallel and distributed processing: each one of these guys is just contributing a tiny bit to the whole thing.

And that's the excitement you felt: from these simple things, when you add these levels of abstraction, you can start getting all the beautiful things that we think about as cognition.

Right.
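To make that pipeline concrete, here is a toy sketch in plain NumPy of the stages just described: a pixel array at the bottom, a convolutional feature layer, a pooling stage, and a final readout to a class label. The kernel, readout weights, and class names are placeholders I've made up for illustration, not learned values from any real system.

```python
import numpy as np

rng = np.random.default_rng(1)

image = rng.random((8, 8))        # the bottom "pixel array", like photoreceptors
kernel = rng.normal(size=(3, 3))  # one feature detector (random here; learned in practice)

def conv2d(img, k):
    """Valid 2-D convolution: slide the kernel across the image."""
    kh, kw = k.shape
    h = img.shape[0] - kh + 1
    w = img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

feature_map = np.maximum(conv2d(image, kernel), 0.0)       # conv layer + ReLU: 6x6
pooled = feature_map.reshape(3, 2, 3, 2).max(axis=(1, 3))  # 2x2 max-pool: 3x3

classes = ["cup", "cell phone", "water bottle"]            # hypothetical labels
readout = rng.normal(size=(len(classes), pooled.size))     # random readout weights
scores = readout @ pooled.ravel()                          # one score per class
label = classes[int(scores.argmax())]                      # the "classification"
```

Every unit in the feature map is computed from the same kernel applied at a different location, which is the massively parallel, layer-at-a-time computation described above.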
And so, okay, you have this conference, I forgot the name already, but it's parallel-something associative memory; a very technical and exciting title. And you started talking about Dave Rumelhart. Who is this person? You've spoken very highly of him. Can you tell me about him, his ideas, his mind, who he was as a human being and as a scientist?
So, Dave came from a little tiny town in western South Dakota. His mother was the librarian, and his father was the editor of the newspaper. I know one of his brothers pretty well. There were four brothers; they grew up together, and their father encouraged them to compete with each other a lot. They competed in sports, and they competed in mind games, things like, I don't know, sudoku and chess and various things like that. Dave was a standout undergraduate. He went to college at a younger age than most people do, at the University of South Dakota, and majored in mathematics. I don't know how he got interested in psychology, but he applied to the mathematical psychology program at Stanford and was accepted as a PhD student to study mathematical psychology there. Mathematical psychology is the use of mathematics to model mental processes.

So something that I think these days might be called cognitive modeling, that whole space?

Yeah.
It's mathematical in the sense that you say: if this is true, and that is true, then I can derive that this should follow. You say, these are my stipulations about the fundamental principles, and this is my prediction about behavior, and it's all done with equations, not with a computer simulation. You solve the equations, and that tells you, say, what the probability is that the subject will be correct on the seventh trial of the experiment. So it's a use of mathematics to descriptively characterize aspects of behavior.
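One concrete example in that tradition (my illustration; no specific model is named here) is the classic all-or-none learning model: an item is either "learned", in which case the subject is always correct, or "unlearned", in which case the subject guesses correctly with probability g, and each study trial moves the item to the learned state with probability c. Solving the equations gives the probability of a correct response on any trial, exactly the kind of closed-form prediction described above. The values of c and g below are assumed.

```python
# All-or-none learning model (a classic mathematical-psychology example,
# used here as an illustration; c and g are assumed values).
def p_correct(n, c=0.2, g=0.5):
    """Probability of a correct response on trial n.

    The item is still unlearned on trial n with probability (1 - c)**(n - 1);
    if learned, the response is always correct, otherwise it is a guess.
    """
    p_unlearned = (1 - c) ** (n - 1)
    return (1 - p_unlearned) + p_unlearned * g

# e.g. the probability the subject is correct on the seventh trial
print(round(p_correct(7), 3))  # prints 0.869
```

No simulation is involved: the equation directly yields the predicted behavior, which can then be compared against data.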
Stanford at that time was the place where there were several really strong mathematical thinkers, also connected with three or four others around the country, who brought a lot of really exciting ideas onto the table. It was a very prestigious part of the field of psychology at that time.
So Rumelhart comes into this as a very strong student within that program, and he gets a job at a brand-new university in San Diego in 1967: he's one of the first assistant professors in the Department of Psychology at UCSD. I got there in '74, seven years later. Rumelhart at that time was still doing mathematical modeling, but he had gotten interested in cognition; he'd gotten interested in understanding. And understanding, I think, remains an interesting, curious question: what does it mean to understand, anyway? How would we know if we really understood something? But he was interested in building machines that would hear a couple of sentences and have an insight about what was going on. For example, one of his favorite examples at that time was: "Margie was sitting on the front step when she heard the familiar jingle of the Good Humor man. She remembered her birthday money and ran into the house."
What is Margie doing, and why? Well, there are a couple of ideas you could have, but the most natural one is that the Good Humor man brings ice cream; she likes ice cream; she knows she needs money to buy ice cream; so she's going to run into the house and get her money so she can buy herself an ice cream. There's a huge amount of inference that has to happen to get those things to link up with each other, and he was interested in how the hell that could happen. He was trying to build good old-fashioned AI-style models of the representation of language and content, things like "has money."

So, like formal logic and knowledge bases, that kind of stuff?

Yeah.

So he was integrating that with his thinking about cognition, the mechanisms of cognition: how can they mechanistically be applied to build something that looks like a web of knowledge, and thereby, from there, emerges something like understanding, whatever the heck that is?

Yeah.
He was grappling with this; it was something they grappled with at the end of that book I was describing, Explorations in Cognition. But he was realizing that the paradigm of good old-fashioned AI wasn't giving him the answers to these questions.

And by the way, it's called "good old-fashioned AI" now; was it called that then?

Well, it was beginning to be called that, because it was from the '60s, and by the late '70s it was kind of old-fashioned and it hadn't really panned out. People were beginning to recognize that, and Rumelhart was part of the recognition that this wasn't all working. Anyway, he started thinking in terms of the idea that we needed systems that allowed us to integrate multiple simultaneous constraints, in a way that would be mutually influencing each other.
So he wrote a paper that, the first time I read it, I said, oh well, is this important? But after a while it just got under my skin. It was called "An Interactive Model of Reading," and in this paper he laid out the idea that every aspect of our interpretation of what's coming off the page when we read, at every level of analysis you can think of, actually depends on all the other levels of analysis. What are the actual pixels making up each letter? What do those pixels signify about which letters they are? What do those letters tell us about which words are there? And what do those words tell us about what ideas the author is trying to convey?

So he had this model where we have little elements that represent each of the pixels of each of the letters, then others that represent the line segments in them, others that represent the letters, and others that represent the words. At that time, his idea was that there's a set of experts: an expert about how to construct a line out of pixels, another expert about which sets of lines go together to make which letters, another one about which letters go together to make which words, another one about what the meanings of the words are, another one about how the meanings fit together, and so on. All these experts are looking at the data, and they're updating hypotheses at the other levels. So the word expert can tell the letter expert: I think there should be a T here, because I think there should be the word THE here. And the bottom-up, feature-to-letter expert could say: I think there should be a T there too. And if they agree, then you see a T. So there's a top-down, bottom-up interactive process, but it's going on at all layers simultaneously, so everything can filter all the way down from the top as well as all the way up from the bottom, and it's a completely interactive, bidirectional, parallel distributed process.
And somehow, because of the abstractions, it's hierarchical: there are different layers, different levels of responsibility. First of all, it's fascinating to think about it in this kind of mechanistic way: not thinking purely in terms of the structure of a neural network, or something like a neural network, but thinking about these little guys that work on letters, and then the letters become words and the words become sentences. And that's a very interesting hypothesis, that from that kind of hierarchical structure can emerge understanding.

Yeah. But the thing is, I want to relate this to the earlier part of the conversation.
When Rumelhart was first thinking about it, there were these experts on the side: one for the features, one for the letters, one for how the letters make the words, and so on, and each would be evaluating various propositions, like whether this combination of features here looks like the letter T. What he realized, after reading Hinton's dissertation and hearing about Jim Anderson's linear-algebra-based neural network models that I was telling you about before, was that he could replace those experts with neuron-like processing units whose connection weights would do this job.

So what ended up happening was that Rumelhart and I got together and created a model called the interactive activation model of letter perception. It takes these little pixel-level inputs and constructs line-segment features, letters, and words, but now we built it out of a set of neuron-like processing units that are just connected to each other with connection weights. So the unit for the word TIME has a connection to the unit for the letter T in the first position, the letter I in the second position, and so on.
and
because these connections are
bi-directional
if you have prior knowledge that it
might be the word time that starts to
prime the feature to the letters and the
features and if you don't then it's it
has to start bottom up but the
directionality just depends on where the
information comes in first and
and if you have context together with
features at the same time they can
convergently result in an emergent
perception and that
um that was the
um
the piece of work that we did together
that uh
sort of got us both completely convinced
that you know this neural network way of
thinking
was going to be able to
actually address the questions that we
were interested in as cognitive scientists so
the algorithmic side the optimization
side those are all details like when you
first start
the idea that you can get far with this
kind of way of thinking that in itself
is a profound idea so do you like the
term uh connectionism
to describe this kind of set of ideas
i think it's useful
it highlights
the
notion that the knowledge
that the system exploits is
in the connections between the units
right there isn't a separate
dictionary
the connections between the units
so
i already sort of
laid that on the table with the
connections from the letter units to the
unit for the word time right the unit
for the word time isn't a unit for the
word time for any other reason then it's
got the connections to the letters that
make up the word time
those are the units on the input that
excite it when it's excited that
it in a sense represents in the system
that
there's support for the hypothesis that
the word time is present in the input
um
but it's not
there there's the word time isn't
written anywhere inside the model it's
only written there in the picture we
drew of the model to say that's the unit
for the word time right yeah and um if
if if somebody wants to tell me well
what are the how do you spell that word
you have to use the connections from
that out to
to then get those letters for example
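the interactive activation dynamics described here can be sketched as a toy model. this is a hypothetical two-word lexicon with made-up parameters, not the published model, which coded letters by position and included inhibitory connections:

```python
# Toy sketch of interactive activation: letter units and word units
# linked by bidirectional excitatory connections, so prior context
# (top-down) and sensory input (bottom-up) converge on a percept.

WORDS = {"time": "time", "tame": "tame"}   # word unit -> its letters
LETTERS = "timae"

def step(letter_act, word_act, letter_input, rate=0.2):
    # bottom-up: letters excite the words that contain them
    new_words = {}
    for word, spelling in WORDS.items():
        bottom_up = sum(letter_act[l] for l in spelling) / len(spelling)
        new_words[word] = word_act[word] + rate * (bottom_up - word_act[word])
    # top-down: active word units excite their own letters, alongside input
    new_letters = {}
    for l in LETTERS:
        top_down = sum(word_act[w] for w, s in WORDS.items() if l in s)
        drive = letter_input.get(l, 0.0) + top_down
        new_letters[l] = letter_act[l] + rate * (drive - letter_act[l])
    return new_letters, new_words

# ambiguous input: t, m, e are clearly visible, the second letter is not
letter_act = {l: 0.0 for l in LETTERS}
word_act = {"time": 0.5, "tame": 0.0}     # prior context primes "time"
inp = {"t": 1.0, "m": 1.0, "e": 1.0}
for _ in range(20):
    letter_act, word_act = step(letter_act, word_act, inp)

# top-down support from "time" has filled in the unseen letter
print(letter_act["i"] > letter_act["a"])   # True
```

because the connections run in both directions, the same mechanism handles purely bottom-up recognition (set both word priors to zero) and context-driven completion.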
that's such a
that's a counter-intuitive idea
we humans want to think in this logic
way
this this idea of connectionism
it doesn't it's weird it's weird that
this is how it all works yeah but let's
go back to that cnn right that cnn with
all those layers of neuron like
processing units that we were talking
about before
it's going to come out and say this is a
cat that's a dog
but it has no idea why it said that it's
just got all these connections between
all these
layers of neurons like from the
very first layer to the you know the
like whatever these layers are they just
get numbered after a while because they
you know they they
somehow further in you go the more
the more abstract the features are but
it's a graded and continuous sort of
process of abstraction anyway and
you know it goes from very local very
very specific to much more sort of
global
but it's still
you know another sort of pattern of
activation over an array of units and
then at the output side it says it's cat
or it's a dog and when i
open my eyes and say oh that's lex
or
um
oh
you know there's my own dog and i
recognize my dog
which is a member of the same species as
many other dogs but
i know this one because of some slightly
unique characteristics i don't know how
to describe you know what it is that
makes me know that i'm looking at lex or
at my particular dog right yeah or even
that i'm looking at a particular brand
of car like i could say a few words
about it but if i wrote you a paragraph
about the car you you would have trouble
figuring out which car is he talking
about right so the idea that we have
propositional knowledge of what
it is that allows us to recognize that
this is an actual instance of this
particular natural kind is um has always
been you know something that
uh
it never worked right you couldn't ever
write down a set of propositions for you
know visual recognition
and and and so
in that space it sort of always seemed
very natural that something more
implicit um
you know
you don't have access to what the
details of the computation were in
between you just get the result so
that's the other part of connectionism
you cannot
you don't read the contents of the
connections the connections only
cause
outputs to occur based on inputs
yeah it's it's and for us that like
final layer or
some particular layer is very important
the one that tells us that it's our dog
or like it's a cat or a dog but
you know each layer is probably equally
as important in the grand scheme of
things
like
there's no reason why the cat versus dog
is more important than the lower level
activations it doesn't really matter i
mean all of it is just this beautiful
stacking on top of each other and we
humans live in this particular layers
for us for us it's useful to
to survive to to use those
cat versus dog predator versus prey all
those kinds of things it's fascinating
that it's all continuous but then you
then ask
you know the history of artificial
intelligence you ask are we able to
introspect and convert
the very things that allow us to tell
the difference to cat and dog
into
logic into formal logic that's been the
dream
i would say that's still part of the the
dream of symbolic ai and
i've recently
talked to uh doug
lenat who created cyc
and that's that's a project that lasted
for many decades
and still carries a sort of dream in it
right
um
but we still don't know the answer right
it seems like connectionism is really
powerful
but it also seems like there's this
building of knowledge
and so how do we
how do you square those two like do you
think the connections can contain the
depth of human knowledge and the depth
of what uh dave rumelhart was thinking
about of understanding
well uh that remains the 64 thousand dollar question and
um
with inflation that number yeah
maybe it's the 64 billion dollar
question now
uh
you know i think that
um
from
the emergence side which you know
uh
i placed myself on um
so i i used to sometimes tell people i
was a radical eliminative connectionist
because
i didn't want them to
think
that i wanted to build like anything
into the machine
but um i
don't like the word eliminative
uh anymore because it makes it seem like
it's wrong to think that there is this
emergent level of
understanding and
um
i disagree with that so i think you know
i would call myself a radical
emergentist
uh connectionist rather than eliminative
connectionist right because i want to
acknowledge
that
that these higher level kinds of
aspects of our cognition are
are real but they're not
they're they don't
they don't exist as
such and so there was an example that uh
doug hofstadter used to use that i
thought was helpful in this respect
just the idea that
we could think about sand dunes
as entities
and talk about like how many there are
even
um
but we also
know that a sand dune is a very fluid
thing it's it's it's a it's a
it's a pile of sand that is capable of
moving around under the wind and the and
and um
you know
reforming itself in somewhat different
ways and and if we think about our
thoughts it's like sand dunes as being
things that
you know emerge from
uh
just the the way all the lower level
elements sort of work together and and
are constrained by external forces
then we can we can say yes they exist as
such but they they also
you know
we shouldn't treat them as completely
monolithic
entities that we
we can understand without understanding
sort of all of the stuff that
allows them to change in the ways that
they do and that's where i think the
connectionist feeds into the
into the cognitive it's like okay so if
the under if the substrate is parallel
distributed connectionist
um then it doesn't mean that the
contents of thought isn't you know like
abstract and symbolic and um
but it's more fluid maybe then uh
is easier to capture with a set of
logical expressions yeah that's a heck
of a sort of thing to put
at the top of
a resume radical emergentist
connectionist
so i there is
just like you said a beautiful dance
between that between the machinery of
intelligence
like the neural network side of it and
the stuff that emerges
i mean the stuff that emerges
seems to be um
i don't know
i don't know what that is
that it seems like maybe all
of reality is emergent
what i what i think about
this is made most distinctly
rich to me when i look at cellular
automata look at game of life
where from very very simple things
very rich complex things emerge that
start looking very quickly like
organisms
that you forget how the
actual thing operates they start looking
like they're moving around they're
eating each other some of them are
generating
offspring
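the game of life dynamics being described can be sketched in a few lines using the standard conway rules; the glider is the classic pattern that looks like an organism gliding across the grid even though only simple local rules are operating:

```python
# Minimal Game of Life step over a sparse set of live cells,
# demonstrating the glider: after four generations the pattern
# reproduces its own shape shifted one cell diagonally.
from collections import Counter

def life_step(live):
    """live: set of (x, y) cells. Returns the next generation."""
    # count how many live neighbors each cell has
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # birth on exactly 3 neighbors; survival on 2 or 3
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
state = glider
for _ in range(4):
    state = life_step(state)

# the glider has "moved": same shape, translated by (1, 1)
print(state == {(x + 1, y + 1) for (x, y) in glider})   # True
```

nothing in the rules mentions movement or organisms; the "creature" exists only at the emergent level of description.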
it you forget very quickly and it seems
like maybe it's something about the
human mind that wants to operate in some
layer of the emergent
and forget about the the mechanism of
how that emergence happens so just
like you are in your radicalness
i'm uh
also it seems like unfair to
eliminate the magic of that emergent
like eliminate the
the fact that that the emergence is real
yeah no i agree i'm not
that's why i got rid of eliminative
right yeah yeah because it seemed like
that was trying to say that you know
it's all
completely like
an illusion of some kind well it it
you know who knows whether there isn't
there aren't some illusory
characteristics there
um and and
i i think that uh philosophically um
many people have have confronted that
possibility over time but
but uh
it it's still
important to
um you know accept it as magic right so
you know i think of fellini in this
context i think of
um others who have
appreciated uh the role of magic uh of
actual trickery in creating illusions
that
that move
that move us
you know plato was on to this too
it's like somehow or other these shadows
you know
give rise to something
much deeper than that and and that's
that's
so you know we won't try to figure out
what it is we'll just accept it as given
that that that occurs and um
you know but he was still on to the
magic of it yeah yeah we won't try to
really really really deeply understand
how it works we just enjoy the fact that
it's kind of fun
okay but you uh worked closely with dave
rumelhart
he passed away
as a human being what do you remember
about him
do you miss the guy
absolutely
you know he passed away um
15 ish years ago now
and um
his
his demise was actually one of the most
poignant and
um
you know like relevant
uh tragedies um
relevant to our conversation he
started to
undergo a progressive
neurological condition
that
isn't fully understood that is to say
his particular
course isn't fully understood um
because
certain you know brain scans weren't
done in certain stages
and no autopsy was done or anything like
that
the wishes of the family um
so we don't know as much about the
underlying pathology as we might but
um
i had begun to get interested in this
neurological condition that might have
been the very one that he was succumbing
to
as my own efforts to uh understand
another aspect of this mystery that
we've been discussing
while he was beginning to get
progressively more and more affected
so
i'm going to talk about the disorder and
not about rumelhart for a second
okay sure the disorder is something my
colleagues and collaborators have chosen
to call
semantic dementia
so
it's a specific form of
loss of mind
related to meaning
semantic dementia
and it's progressive
in the sense that the patient
loses the ability
to
appreciate the meaning of the
experiences that they have either from
touch from sight from sound
from language
they i hear sounds but i don't know what
they mean kind of thing
um
the
so as as this illness progresses it
starts with
the patient being unable to
um
differentiate like similar breeds of dog
or
remember
you know the the lower frequency
unfamiliar categories that they used to
be able to remember
but as it progresses
it
it it becomes more and more striking and
and
you know the the patient loses the
ability to recognize
um
you know things like
pigs and goats and sheep and calls all
middle-sized animals dogs and can't
recognize rabbits and
and rodents anymore they call all the
little ones cats and they can't
recognize
hippopotamuses and and cows anymore they
call them all horses you know so
there was this one patient who
went through this progression where uh
at a certain point
any four-legged animal he would call it
either a horse or a dog or a cat
and if it was big he would tend to call
it a horse if it was small he'd tend to
call it a cat middle-sized ones he called
dogs
this is just a part of the syndrome
though it
the the patient loses the ability to
relate
uh concepts to each other so my my
collaborator in this work karalyn
patterson developed
a test called the pyramids and palm
trees test
so
you give the patient a picture of
pyramids and they have a choice which
goes with the pyramids
palm trees or pine trees
and
you know she showed that this wasn't
just a matter of language because
the patient's
loss of this ability shows up whether
you present the material with words or
with pictures the pictures
they can't put the pictures together
with each other properly anymore they
can't relate the pictures to the words
either they can't do word picture
matching but they've lost the conceptual
grounding
from either modality of input and
um so it's that's why it's called
semantic dementia the very semantics is
disintegrating
and and we we understand this in terms
of our
idea that distributed representation a
pattern of activation represents the
concepts really similar ones as you
degrade them they start being
you lose the differences and
and then um so the difference between
the dog and the goat sort of is no
longer part of the pattern anymore and
since dog is really familiar that's the
thing that remains and and we understand
that in the way the models work and
learn
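the degradation idea can be illustrated with a toy sketch. this is a hypothetical nearest-prototype illustration with invented features and parameters, not the actual trained network: damage erodes the distinctive parts of a distributed pattern first, and a familiarity bias then leaves the high-frequency name behind:

```python
# Toy illustration of losing distinctions in distributed representations:
# as distinctive features degrade, similar concepts collapse onto the
# most familiar category ("dog"), as in the semantic dementia pattern.

# features: [animal, four_legs, medium_size, barks, bleats, oinks]
PROTOTYPES = {
    "dog":  [1, 1, 1, 1, 0, 0],
    "goat": [1, 1, 1, 0, 1, 0],
    "pig":  [1, 1, 1, 0, 0, 1],
}
N_SHARED = 3                       # first three features are shared
FAMILIARITY = {"dog": 1.0, "goat": 0.3, "pig": 0.3}  # dog is high-frequency

def degrade(pattern, damage):
    """Damage erodes the distinctive part of the pattern first."""
    return [v if i < N_SHARED else v * (1.0 - damage)
            for i, v in enumerate(pattern)]

def name(percept, familiarity_bias=0.5):
    """Pick the prototype that best matches the degraded percept,
    with a bias toward frequently experienced names."""
    def score(label):
        proto = PROTOTYPES[label]
        sq_dist = sum((p - q) ** 2 for p, q in zip(percept, proto))
        return -sq_dist + familiarity_bias * FAMILIARITY[label]
    return max(PROTOTYPES, key=score)

# intact system: goat is named correctly
print(name(degrade(PROTOTYPES["goat"], damage=0.0)))   # goat
# heavily damaged system: the distinction is gone, "dog" remains
print(name(degrade(PROTOTYPES["goat"], damage=1.0)))   # dog
```

the shared "middle-sized animal" features survive the damage, so everything in that neighborhood gets pulled to the most familiar label, matching the clinical progression described above.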
but remember heart underwent this this
condition so on the one hand it's a
fascinating aspect of parallel
distributed processing to me
uh and it reveals this uh this sort of
texture of distributed representation
in a very nice way i've always felt but
at the same time it was extremely
poignant because
this is exactly the condition that
rumelhart was undergoing and there was a
period of time when he was
this man who had been the most
focused
um
goal-directed
competitive
um
thoughtful
person who was willing to work for years
to solve a hard problem you know he
he he starts to disappear
and
there was a period of time when it was
like
hard for any of us to really appreciate
that he was sort of in some sense
not
fully there anymore do you know if he
was able to introspect
this um
the solution of this you know the the
understanding mind
was he i mean this is one of the big
scientists that thinks about this yeah
was he able to look at himself and
understand the fading mind
you know
um we can we can contrast um
hawking and rumelhart in this way and
i i like to do that to honor rumelhart
because i think rumelhart is sort of
like the hawking of
you know cognitive science to me in some
ways um
but both of them
suffered from a degenerative
condition
and in hawking's case it affected the
motor system
in rumelhart's case it's
affecting the semantics
uh and
um
not
not just the pure uh object semantics
but maybe the self semantics as well and
we don't understand that
broadly but but but it's
so i would say uh he didn't and this was
part of what from the outside was a
profound tragedy
but
but on the other hand at some level he
sort of did because
you know there was a period of time when
it finally was realized that he had
really become
profoundly impaired this was clearly a
biological condition and he wasn't you
know it wasn't just like he was
distracted that day or something like
that
so he retired
uh you know from his professorship at
stanford and he became
um he he
uh lived with his brother for a couple
years and then he moved into a
a facility for people with um
cognitive impairments
um
a
one that
you know many elderly people end up in
when they have cognitive impairments and
i
would spend time with him during that
period this was like in the late 90s
around 2000 even
and
you know i would we would go
bowling
and he could still bowl
uh and um
i after bowling i took him to lunch and
i i said
where would you like to go you want to
go to wendy's and he said nah
and i said okay well where you want to
go and he just pointed he said turn here
you know so
he still had a certain amount of spatial
cognition and he could get me to the
restaurant
and then when we got to the restaurant
i i said what do you want to order and
um
he couldn't
come up with any of the words but he
knew where on the menu the thing was
that he wanted so
so fascinating it's it you know and
he couldn't say what it was but he knew
that that's what he wanted to eat and
and so there was you know that
it's it's it's like it isn't monolithic
at all this the our cognition is
is
you know first of all graded in certain
kinds of ways but also
multipartite there's many elements to it
and things
uh
certain sort of
partial competencies still exist in the
absence
of other aspects of these competencies
so this is what always fascinated me
about
what
uh
used to be called cognitive
neuropsychology
you know the effects of brain damage on
cognition but in particular this gradual
disintegration part you know i'm a big
believer that the loss of a
human being that you value is as
powerful as you know first falling in
love with that human being i think
it's all a celebration of the human
being so the disintegration itself too
is a celebration yeah
yeah yeah and
but just to say something more about
the scientists and and the back
propagation idea that you mentioned um
so
in in 1982
hinton had been there as a postdoc and
organized that conference he'd actually
gone away and gotten an assistant
professorship and then
um there was this opportunity to bring
him back so jeff hinton was back
on a sabbatical in san diego
and uh rumelhart and i had decided
we wanted to do this
you know we thought it was really
exciting and
um our the papers on the interactive
activation model that i was telling you
about had just been published and we
both sort of saw a huge potential for
this work and
and and jeff was there and so the three
of us
uh
started a research group which we called
the pdp research group
and
several other people
came um francis crick
who was at the salk institute heard
about it from jeff
um and because jeff was known among
brits to be brilliant and francis was
well connected with his british c