Stuart Russell: Long-Term Future of Artificial Intelligence | Lex Fridman Podcast #9
KsZI5oXBC0k • 2018-12-09
The following is a conversation with Stuart Russell. He's a professor of computer science at UC Berkeley and a co-author of the book that introduced me and millions of other people to the amazing world of AI, called Artificial Intelligence: A Modern Approach. So it was an honor for me to have this conversation as part of the MIT course on Artificial General Intelligence and the Artificial Intelligence Podcast. If you enjoy it, please subscribe on YouTube, iTunes, or your podcast provider of choice, or simply connect with me on Twitter @lexfridman, spelled F-R-I-D. And now, here's my conversation with
Stuart Russell.

So, you've mentioned that in 1975, in high school, you created one of your first AI programs, which played chess. Were you ever able to build a program that beat you at chess, or at another board game?

So, my program never beat me at chess. I actually wrote the program at Imperial College, so I used to take the bus every Wednesday with a box of cards this big and shove them into the card reader, and they gave us eight seconds of CPU time. It took about five seconds to read the cards in and compile the code, so we had three seconds of CPU time, which was enough to make one move, you know, with a not-very-deep search. And then we would print that move out, and then we'd have to go to the back of the queue and wait to feed the cards in again.

How deep a search were you able to get?

Well, I think we got a depth-eight search with alpha-beta, and we had some tricks of our own about move ordering and some pruning of the tree.

And you were still able to beat that program?

Yeah. Yeah, I was a reasonable chess player in my youth. I did an Othello program and a backgammon program.
So when I got to Berkeley, I worked a lot on what we call metareasoning, which really means reasoning about reasoning. And in the case of a game-playing program, you need to reason about what parts of the search tree you're actually going to explore, because the search tree is enormous, you know, bigger than the number of atoms in the universe. And the way programs succeed, and the way humans succeed, is by only looking at a small fraction of the search tree. And if you look at the right fraction, you play really well; if you look at the wrong fraction, if you waste your time thinking about things that are never going to happen, the moves that no one's ever going to make, then you're going to lose, because you won't be able to figure out the right decision.

So that question of how machines can manage their own computation, how they decide what to think about, is the metareasoning question. We developed some methods for doing that, and very simply, a machine should think about whatever thoughts are going to improve its decision quality. We were able to show that both for Othello, which is a standard two-player game, and for backgammon, which includes dice, so it's a two-player game with uncertainty, for both of those cases we could come up with algorithms that were actually much more efficient than the standard alpha-beta search, which chess programs at the time were using, and that those programs could beat me. And I think you can see the same basic ideas in AlphaGo and AlphaZero today: the way they explore the tree is using a form of metareasoning to select what to think about based on how useful it is to think about it.

Are there any insights you can describe, without Greek symbols, of how we select which paths to go down?

There are really two
kinds of learning going on. So, as you say, AlphaGo learns to evaluate board positions, so it can look at a Go board, and it actually has probably a superhuman ability to instantly tell how promising that situation is. To me, the amazing thing about AlphaGo is not that it can beat the world champion with its hands tied behind its back, but the fact that if you stop it from searching altogether, so you say, okay, you're not allowed to do any thinking ahead, right, you can just consider each of your legal moves and then look at the resulting situation and evaluate it, what we call a depth-one search, so just the immediate outcome of your moves, and decide if that's good or bad, that version of AlphaGo can still play at a professional level. Right? And human professionals are sitting there for five, ten minutes deciding what to do, and AlphaGo, in less than a second, instantly intuits what the right move is to make, based on its ability to evaluate positions. And that is remarkable, because, you know, we don't have that level of intuition about Go; we actually have to think about the situation. So anyway, that capability that AlphaGo has is one big part of why it beats humans. The other big part is that it's able to look ahead 40, 50, 60 moves into the future. And, you know, if it was considering all possibilities 40 or 50 or 60 moves into the future, that would be, you know, 10 to the 200 possibilities, so way, way more than, you know, atoms in the universe, and so on. So it's very, very selective about what it looks at.
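A depth-one search as described takes only a few lines: evaluate the position after each legal move and pick the best. The toy evaluator and move representation below are stand-ins for illustration; in AlphaGo the evaluator is the learned position-evaluation network:

```python
def depth_one_move(state, legal_moves, apply_move, evaluate):
    """Depth-one search: no look-ahead, just score the immediate
    resulting position of each legal move and pick the best one."""
    return max(legal_moves, key=lambda m: evaluate(apply_move(state, m)))

# Toy stand-ins: states are numbers, moves add to them, and the
# "value network" just measures distance from a winning target of 10.
evaluate = lambda s: -abs(10 - s)
apply_move = lambda s, m: s + m
print(depth_one_move(3, [1, 6, 9], apply_move, evaluate))  # 6, since 3+6=9 is closest to 10
```

All of the strength of such a player lives in the quality of `evaluate`; the search itself is trivial.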
So let me try to give you an intuition about how you decide what to think about. It's a combination of two things. One is how promising it is, right? So if you're already convinced that a move is terrible, there's no point spending a lot more time convincing yourself that it's terrible, because it's probably not going to change your mind. The real reason you think is because there's some possibility of changing your mind about what to do, right? And it's that change of mind that would then result in a better final action in the real world. So that's the purpose of thinking: to improve the final action in the real world. And so if you think about a move that is guaranteed to be terrible, you can convince yourself it's terrible, and you're still not going to change your mind, right?

But on the other hand, suppose you had a choice between two moves. One of them you've already figured out is guaranteed to be a draw, let's say, and the other one looks a little bit worse: it looks fairly likely that if you make that move you're going to lose. But there's still some uncertainty about the value of that move; there's still some possibility that it will turn out to be a win. All right? Then it's worth thinking about that. So even though it's less promising on average than the other move, which is guaranteed to be a draw, there's still some purpose in thinking about it, because there's a chance that you will change your mind and discover that, in fact, it's a better move. So it's a combination of how good the move appears to be and how much uncertainty there is about its value. The more uncertainty, the more it's worth thinking about, because there's a higher upside, if you want to think of it that way.

And of course, in the beginning, especially in the AlphaGo Zero formulation, everything is shrouded in uncertainty, so you're really swimming in a sea of uncertainty, and it benefits you to, I mean, you're actually following the same process as you described, but because you're so uncertain about everything, you basically have to try a lot of different directions.

Yeah.
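This "promise plus uncertainty" rule is, in spirit, the selection rule used by Monte Carlo tree search programs. Here is a toy sketch in the UCB style (the function name, the statistics, and the constant are illustrative choices, not AlphaGo's exact PUCT formula):

```python
import math

def select_move(stats, exploration=1.4):
    """Pick which move to think about next: balance how promising a
    move looks (its mean value) against how uncertain we still are
    about it (how few times it has been examined)."""
    total = sum(visits for _, visits in stats.values())
    best, best_score = None, float("-inf")
    for move, (mean_value, visits) in stats.items():
        # Uncertainty bonus shrinks as a move is examined more often.
        bonus = exploration * math.sqrt(math.log(total) / (visits + 1e-9))
        score = mean_value + bonus
        if score > best_score:
            best, best_score = move, score
    return best

# move -> (estimated value, number of times examined)
stats = {"a": (0.5, 100), "b": (0.45, 2), "c": (-0.9, 50)}
print(select_move(stats))  # "b": slightly worse on average, but barely examined
```

Note how "b" gets selected even though "a" looks better on average: "b" has only been examined twice, so there is still a real chance that thinking about it changes the final decision, exactly the criterion described above. Move "c" looks terrible with high confidence, so no more time is spent on it.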
So the early parts of the search tree are fairly bushy; it will look at a lot of different possibilities. But fairly quickly the degree of certainty about some of the moves rises. I mean, if a move is really terrible, you'll pretty quickly find out, right? You lose half your pieces or half your territory, and then you'll say, okay, this is not worth thinking about anymore. And then, further down, the tree becomes very long and narrow, and you're following various lines of play, you know, 10, 20, 30, 40, 50 moves into the future. And, you know, that's again something that human beings have a very hard time doing, mainly because they just lack the short-term memory. You just can't remember a sequence of moves that's 50 moves long, and you can't imagine the board correctly that many moves into the future.

Of course, the top players, I'm much more familiar with chess, but the top players probably have echoes of the same kind of intuition, instinct, that in a moment's time AlphaGo applies when they see a board. I mean, they've seen those patterns; human beings have seen those patterns before. At the top, at the grandmaster level, it seems that there are some similarities, or maybe it's our imagination that creates a vision of those similarities, but it feels like this kind of pattern recognition that the AlphaGo approaches are using is similar to what human beings at the top level are using.

I think
there's some truth to that, but not entirely. I mean, I think the extent to which a human grandmaster can reliably, instantly recognize the right move, instantly recognize the value of a position, I think that's a little bit overrated.

But if you sacrifice a queen, for example, I mean, there are these beautiful games of chess with Bobby Fischer or somebody, where it's seemingly a bad move, and I'm not sure there's a perfect degree of calculation involved, where they've calculated all the possible things that happen, but there's an instinct there, right, that somehow adds up.

Yeah, so I think what happens is you get a sense that there's some possibility in the position, even if you make a weird-looking move, that it opens up some lines of calculation that otherwise would be definitely bad. And it's that intuition, that there's something here in this position that might yield a win down the line, and then you follow that, right? And in some sense, when a chess player is following a line, in his or her mind they're mentally simulating what the other person is going to do, what the opponent is going to do. And they can do that as long as the moves are kind of forced, right? As long as there's, you know, what we call a forcing variation, where the opponent doesn't really have much choice how to respond, and then you see if you can force them into a situation where you win. You know, we see plenty of mistakes, even in grandmaster games, where they just miss some simple three-, four-, five-move combination that, you know, wasn't particularly apparent in the position, but was still there.

That's the thing that makes us human.
Yeah. So, when you mentioned that in Othello those programs, after some metareasoning improvements and research, were able to beat you, how did that make you feel?

Part of the metareasoning capability that it had was based on learning, and you could sit down the next day and you could just feel that it had gotten a lot smarter, you know, and all of a sudden you really felt like you were sort of pressed against the wall, because it was much more aggressive and was totally unforgiving of any minor mistake that you might make. And it actually seemed to understand the game better than I did. And, you know, Garry Kasparov has this quote where, during his match against Deep Blue, he said he suddenly felt that there was a new kind of intelligence across the board.

Do you think that's a scary or an exciting possibility for yourself, in the context of chess, purely, sort of, in that feeling, whatever that is?

I think it's definitely
an exciting feeling. You know, this is what made me work on AI in the first place: as soon as I really understood what a computer was, I wanted to make it smart. You know, the first program I wrote was for the Sinclair programmable calculator, and I think you could write a 21-step algorithm, that was the biggest program you could write, something like that, and do little arithmetic calculations. So I think I implemented Newton's method for square roots, and a few other things like that.
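Newton's method for square roots fits comfortably in a program that small; in modern terms it is just a few lines (a sketch of the method, not the original calculator code):

```python
def newton_sqrt(a, tolerance=1e-12):
    """Newton's method for square roots: repeatedly replace a guess x
    with the average of x and a/x, which converges quadratically to
    sqrt(a)."""
    if a < 0:
        raise ValueError("square root of a negative number")
    x = a or 1.0  # any positive starting guess works
    while abs(x * x - a) > tolerance * max(a, 1.0):
        x = 0.5 * (x + a / x)
    return x

print(newton_sqrt(2.0))  # ≈ 1.4142135623...
```

Each iteration roughly doubles the number of correct digits, which is why a handful of steps is enough even on a tiny machine.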
But then, you know, I thought, okay, if I just had more space, I could make this thing intelligent. And so I started thinking about AI. And I think the thing that's scary is not the chess program, because, you know, chess programs, they're not in the taking-over-the-world business. But if you extrapolate, you know, there are things about chess that don't resemble the real world, right? We know the rules of chess; the chessboard is completely visible to the program. Of course, the real world is not; most of the real world is not visible from wherever you're sitting, so to speak. And to overcome those kinds of problems, you need qualitatively different algorithms. Another thing about the real world is that, you know, we regularly plan ahead on timescales involving billions or trillions of steps. Now, we don't plan those in detail, but, you know, when you choose to do a PhD at Berkeley, that's a five-year commitment, and that amounts to about a trillion motor control steps that you will eventually be committed to, including going up the stairs, opening doors, drinking water, typing.

Yeah, I mean, every finger movement while you're typing every character of every paper and the thesis and everything else.

So you're not committing in advance to the specific motor control steps, but you're still reasoning on a timescale that will eventually reduce to trillions of motor control actions.
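The arithmetic behind that estimate is easy to check. The per-second control rate below is my own rough assumption (hundreds of muscles receiving tens of low-level control signals per second), not a figure from the conversation:

```python
# Back-of-envelope check of the "about a trillion motor control steps"
# estimate for a five-year commitment.
SECONDS_PER_YEAR = 365 * 24 * 3600             # ~3.15e7 seconds
years = 5                                      # a PhD-length commitment
control_signals_per_second = 6_000             # assumed rate, not from the interview
total_steps = years * SECONDS_PER_YEAR * control_signals_per_second
print(f"{total_steps:.1e}")                    # prints 9.5e+11, i.e. on the order of a trillion
```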
So, for all these reasons, you know, AlphaGo and Deep Blue and so on don't represent any kind of threat to humanity, but they are a step towards it, right? And progress in AI occurs by essentially removing, one by one, these assumptions that make problems easy, like the assumption of complete observability of the situation. Right? When we remove that assumption, you need a much more complicated kind of computing design, and you need something that actually keeps track of all the things you can't see and tries to estimate what's going on, and there's inevitable uncertainty in that. So it becomes a much more complicated problem. But, you know, we are removing those assumptions: we are starting to have algorithms that can cope with much longer timescales, that can cope with uncertainty, that can cope with partial observability. And so each of those steps sort of magnifies by a thousand the range of things that we can do with AI systems.

So,
the way I started: I wanted to be a psychiatrist for a long time, to understand the mind, in high school, and of course I programmed and so on. And then I showed up at the University of Illinois, to an AI lab, and they said, okay, I don't have time for you, but here's a book, AI: A Modern Approach, I think it was the first edition at the time, here, go learn this. And I remember the lay of the land was: well, it's incredible that we solved chess, but we'll never solve Go. I mean, it was pretty certain that Go, in the way we thought about systems that reason, was impossible to solve, and now we've solved it.

So, I think I would have said that it's unlikely we could take the kind of algorithm that was used for chess and just get it to scale up and work well for Go.
And at the time, what we thought was that in order to solve Go, we would have to do something similar to the way humans manage the complexity of Go, which is to break it down into kind of subgames. So when a human thinks about a Go board, they think about different parts of the board as sort of weakly connected to each other, and they think about, okay, within this part of the board, here's how things could go, and in that part of the board, here's how things could go, and now you try to sort of couple those two analyses together and deal with the interactions, and maybe revise your views of how things are going to go in each part. And then you've got maybe five, six, seven, ten parts of the board. And that actually resembles the real world much more than chess does, because in the real world, you know, we have work, we have home life, we have sport, you know, whatever, different kinds of activities, shopping. These are all connected to each other, but they're weakly connected. So when I'm typing a paper, you know, I don't simultaneously have to decide which order I'm going to get, you know, the milk and the butter; that doesn't affect the typing. But I do need to realize, okay, better finish this before the shops close, because I don't have any food at home, right? So there's some weak connection, but not in the way that chess works, where everything is tied into a single stream of thought. So the thought was that to solve Go, we'd have to make progress on stuff that would be useful for the real world. And in a way, AlphaGo is a little bit disappointing, right, because the program design for AlphaGo was actually not that different from Deep Blue, or even from Arthur Samuel's checker-playing program from the 1950s.
And in fact, the two things that make AlphaGo work are: one is its amazing ability to evaluate positions, and the other is the metareasoning capability, which allows it to explore some paths in the tree very deeply and to abandon other paths very quickly.

So this word, metareasoning, while technically correct, inspires perhaps the wrong degree of power that AlphaGo has; for example, the word reasoning is a powerful word. Let me ask you, sort of: you were part of the symbolic AI world for a while, like, whatever AI was then, and there's a lot of excellent, interesting ideas there that unfortunately met a winter. Do you think it will re-emerge?

Well, I would
say, yeah, it's not quite as simple as that. So the AI winter, the first winter that was actually named as such, was the one in the late 80s, and that came about because in the mid-80s there was really a concerted attempt to push AI out into the real world using what was called expert system technology, and for the most part that technology was just not ready for prime time. They were trying, in many cases, to do a form of uncertain reasoning, judgment, combinations of evidence, diagnosis, those kinds of things, which was simply invalid. And when you try to apply invalid reasoning methods to real problems, you can fudge it for small versions of the problem, but when it starts to get larger, the thing just falls apart. So many companies found that the stuff just didn't work, and they were spending tons of money on consultants to try to make it work. And there were, you know, other practical reasons, like, you know, they were asking the companies to buy incredibly expensive Lisp machine workstations, which were literally between fifty and a hundred thousand dollars in, you know, 1980s money, which would be like between a hundred and fifty and three hundred thousand dollars per workstation in current prices.

So, the bottom line: they weren't seeing a profit from it.

Yeah. In many cases, I think there were some successes, there's no doubt about that, but people, I would say, over-invested. Every major company was starting an AI department, just like now, and I worry a bit that we might see similar disappointments, not because the technology is invalid, but because it's limited in its scope, and it's almost the dual of, you know, the scope problems that expert systems had.

So what have you learned from that hype cycle, and what can we do to prevent another winter, for example?

Yeah, so when I'm giving talks
these days, that's one of the warnings that I give. Warning slide one is that, you know, rather than data being the new oil, data is the new snake oil.

That's a good line.

And then the other is that we might see a kind of very visible failure in some of the major application areas, and I think self-driving cars would be the flagship. And I think when you look at the history: the first self-driving car was on the freeway, driving itself, changing lanes, overtaking, in 1987. And so it's more than 30 years, and that kind of looks like where we are today, right? You know, prototypes on the freeway, changing lanes and overtaking. Now, I think significant progress has been made, particularly on the perception side. So we worked a lot on autonomous vehicles in the early-to-mid 90s at Berkeley, you know, and we had our own big demonstrations; you know, we put congressmen into self-driving cars and had them zooming along the freeway.
and the problem was clearly perception
at the time the problem that perception
yeah so in simulation with perfect
perception you could actually show that
you can drive safely for a long time
even if the other cars are misbehaving
and and so on but simultaneously we
worked on machine vision for detecting
cars and tracking pedestrians and so on
and we couldn't get the reliability of
detection and tracking up to a high
enough particular level particularly in
bad weather conditions nighttime
rainfall good enough for demos but
perhaps not good enough to cover the
general the general yeah the thing about
driving is you know suppose you're a
taxi driver you know and you drive every
day eight hours a day for ten years
right that's a hundred million seconds
of driving you know and any one of those
seconds you can make a fatal mistake so
you're talking about eight nines of
reliability right now if your vision
system only detects ninety eight point
three percent of the vehicles right and
that's sort of you know one on a bit
nines and reliability so you have
another seven orders of magnitude to go
and and this is what people don't
understand they think oh because I had a
successful demo I'm pretty much done but
you know you're not even within seven
orders of magnitude of being done and
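The "nines" arithmetic can be checked directly from the figures quoted above (note that the gap from "one and a bit nines" to eight nines is what gets rounded to "another seven orders of magnitude"):

```python
import math

# Reliability arithmetic from the taxi-driver example.
seconds_of_driving = 8 * 3600 * 365 * 10           # 8 h/day for 10 years: ~1e8 seconds
nines_needed = math.log10(seconds_of_driving)      # ~8.0 "nines" of per-second reliability
detection_rate = 0.983                             # vision detects 98.3% of vehicles
nines_achieved = -math.log10(1 - detection_rate)   # ~1.8: "one and a bit nines"
print(round(nines_needed, 1), round(nines_achieved, 1))
```

The gap is what matters: each missing "nine" is a factor of ten in failure rate, and demos say nothing about the last several factors of ten.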
That's the difficulty. And it's not "can I follow a white line?" That's not the problem, right? We can follow a white line all the way across the country. It's the weird stuff that happens.

It's some of the edge cases, yeah.

The edge cases, other drivers doing weird things. You know, so if you talk to Google, right, they actually had a very classical architecture, where, you know, you had machine vision, which would detect all the other cars and pedestrians and the white lines and the road signs, and then basically that was fed into a logical database, and then you had a classical 1970s rule-based expert system telling you, okay, if you're in the middle lane and there's a bicyclist in the right lane who is signaling this, then don't do that, right? And what they found was that every day they'd go out and there'd be another situation that the rules didn't cover. You know, so they'd come to a traffic circle and there's a little girl riding her bicycle the wrong way around the traffic circle. Okay, what do you do? We don't have a rule. Oh my god. Okay, stop. And then, you know, they'd come back and add more rules, and they just found that this was not really converging.
And if you think about it, right, how do you deal with an unexpected situation, meaning one that you've never previously encountered, and the sort of reasoning required to figure out the solution for that situation has never been done? It doesn't match any previous situation in terms of the kind of reasoning you have to do. Well, you know, in chess programs this happens all the time. You're constantly coming up with situations you haven't seen before, and you have to reason about them. You have to think about, okay, here are the possible things I could do, here are the outcomes, here's how desirable the outcomes are, and then pick the right one. You know, in the 90s we were saying, okay, this is how you're going to have to do automated vehicles: they're going to have to have a look-ahead capability. But the look-ahead for driving is more difficult than it is for chess.

Because of the humans?

Right, there's humans, and they're less predictable than a standard...

Well, you have an opponent in chess who's also somewhat unpredictable.

But, for example, in chess you always know the opponent's intention: they're trying to beat you, right? Whereas in driving, you don't know: is this guy trying to turn left, or has he just forgotten to turn off his turn signal, or is he drunk, or is he, you know, changing the channel on his radio, or whatever it might be? You've got to try and figure out the mental state, the intent, of the other drivers to forecast the possible evolutions of their trajectories, and then you've got to figure out, okay, which is the trajectory for me that's going to be safest. And those all interact with each other, because the other drivers are going to react to your trajectory, and so on. So, you know, you've got the classic merging-onto-the-freeway problem, where you're kind of racing a vehicle that's already on the freeway, and are you going to pull ahead of them, or are you going to let them go first and pull in behind, and you get this sort of uncertainty about who's going first.
So all those kinds of things mean that you need a decision-making architecture that's very different from either a rule-based system or, it seems to me, a kind of end-to-end neural network system. You know, just as AlphaGo is pretty good when it doesn't do any look-ahead, but it's way, way, way better when it does, I think the same is going to be true for driving: you can have a driving system that's pretty good when it doesn't do any look-ahead, but that's not good enough. You know, and we've already seen multiple deaths caused by poorly designed machine learning algorithms that don't really understand what they're doing.

Yeah, and on several levels. I think on the perception side there are mistakes being made by those algorithms where the perception is very shallow; on the planning side there's the look-ahead, like you said. And the thing that we come up against, that's really interesting when you try to deploy systems in the real world, is you can't think of an artificial intelligence system as a thing that only responds to the world; you have to realize that it's an agent that others will respond to as well. So in order to drive successfully, you can't just try to do obstacle avoidance.

You can't pretend that you're invisible.

Right, you're the invisible car, it doesn't work that way. I mean, you have to assert yourself; others have to be scared of you. There's this tension, there's this game. So we've studied this, a lot of work with pedestrians: if you approach pedestrians as purely an obstacle-avoidance problem, so your look-ahead isn't modeling their intent, they're going to take advantage of you; they're not going to respect you at all. There has to be a tension, a fear, some amount of uncertainty. That's how we...

Or at least just a kind of resoluteness, right? You have to display a certain amount of resoluteness; you can't be too tentative.
And, yeah, so the solutions then become pretty complicated, right? You get into game-theoretic analyses. And so at Berkeley now we're working a lot on this kind of interaction between machines and humans.

And that's exciting.

Yeah. And so my colleague Anca Dragan actually, you know, if you formulate the problem game-theoretically and you just let the system figure out the solution, it does interesting, unexpected things. Like, sometimes at a stop sign, if no one is going first, right, the car will actually back up a little, just to indicate to the other cars that they should go. And that's something it invented entirely by itself.

That's interesting.

You know, we didn't say "this is the language of communication at stop signs"; it figured it out.

That's really interesting.
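The stop-sign interaction can be pictured as a tiny game. The payoff numbers below are invented purely for illustration (this is not the Berkeley group's actual model): each driver chooses to go or to yield; both going is a crash, both yielding is a standoff. Once one driver credibly signals yielding, say by backing up slightly, the other's best response is to go:

```python
# Toy "who goes first?" game at a stop sign. Payoffs are illustrative
# assumptions: (my action, their action) -> (my payoff, their payoff).
payoffs = {
    ("go", "go"): (-100, -100),     # collision: very bad for both
    ("go", "yield"): (2, 1),        # I pass; they wait briefly
    ("yield", "go"): (1, 2),
    ("yield", "yield"): (-1, -1),   # standoff: everyone just waits
}

def best_response(their_action):
    """My best action, given what I believe the other driver will do."""
    return max(("go", "yield"),
               key=lambda mine: payoffs[(mine, their_action)][0])

print(best_response("yield"))  # "go": if they clearly yield, I should go
print(best_response("go"))     # "yield": if they're committed, I wait
```

The backing-up behavior is, in effect, a way of making "I yield" common knowledge, which collapses the standoff equilibrium.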
So let me just step back for a second to a beautiful philosophical notion. Pamela McCorduck, in 1979, wrote: "AI began with the ancient wish to forge the gods." So when you think about the history of our civilization, do you think that there is an inherent desire to create, let's not say gods, but to create superintelligence? Is it inherent to us? Is it in our genes that the natural arc of human civilization is to create things that are of greater and greater power, and that are perhaps echoes of ourselves? To create the gods, as Pamela said?

Maybe. I mean, you know, we're all individuals, but certainly we see, over and over again in history, individuals who thought about this possibility.

Hopefully I'm not being too philosophical here, but if you look at the arc of this, you know, where this is going, and we'll talk about AI safety, we'll talk about greater and greater intelligence: when you created the Othello program and you felt this excitement, what was that excitement? Was it the excitement of a tinkerer who created something cool, like a clock, or was there magic, or was it more like a child being born?

Yeah. So, I
mean, I certainly understand that viewpoint. And if you look at the Lighthill report... so, in the 70s there was a lot of controversy in the UK about AI, you know, whether it was for real and how much money the government should invest, and, long story short, the government commissioned a report by Lighthill, who was a physicist, and he wrote a very damning report about AI, which I think was the point. And he said that these are, you know, frustrated men who, unable to have children, would like to create life as a kind of replacement, which I think is really pretty unfair.

But there is a kind of magic, I would say, when you build something. What you're building in is really just some understanding of the principles of learning and decision-making, and to see those principles actually then turn into intelligent behavior in specific situations, it's an incredible thing. And, you know, that is naturally going to make you think, okay, where does this end?

And so there are magical, optimistic views of where it ends, and whatever your view of optimism is, whatever your view of utopia is, it's probably different for everybody.
Yeah. But you've often talked about concerns you have about how things might go wrong. So I've talked to Max Tegmark; there's a lot of interesting ways to think about AI safety. You're one of the seminal people thinking about this problem, among, sort of, being in the weeds of actually solving specific AI problems, you also think about the big picture of where we're going. So can you talk about several elements of it? Let's just talk about, maybe, the control problem: this idea of losing the ability to control the behavior of an AI system. How do you see that? How do you see that coming about? What do you think we can do to manage it?

Well, so it doesn't take a genius to realize that if you make something that's smarter than you, you might have a problem. You know, Turing, Alan Turing, wrote about this, gave lectures about this; in 1951 he gave a lecture on the radio, and he basically says, you know, once the machine thinking method starts, you know, very quickly they'll outstrip humanity. And, you know, if we're lucky, I think he says, we might be able to turn off the power at strategic moments, but even so, our species would be humbled. And actually, I think he was wrong about that, right, because, you know, a sufficiently intelligent machine is not going to let you switch it off; it's actually in competition with you.

So what do you think he meant, just for a quick tangent: if we shut off a superintelligent machine, that our species would be humbled?

I think he means that we would realize that we are inferior, right? That we only survive by the skin of our teeth, because we happened to get to the off switch just in time, you know, and if we hadn't, then we would have lost control over the Earth.
So are you more worried, when you think about this stuff, about superintelligent AI, or are you more worried about super-powerful AI that's not aligned with our values, the paperclip scenario kind of thing?
I think, so, the main problem I'm working on is the control problem: the problem of machines pursuing objectives that are, as you say, not aligned with human objectives. And this has been the way we've thought about AIs since the beginning. You build a machine for optimizing, then you put in some objective, and it optimizes, right? And we can think of this as the King Midas problem, because King Midas put in this objective, everything I touch turns to gold, and the gods, that's like the machine, said, okay, done, you now have this power. And of course his food and his drink and his family all turned to gold, and then he dies in misery and starvation. It's a warning; it's a failure mode that pretty much every culture in history has had some story along the same lines: there's the genie that gives you three wishes, and the third wish is always, please undo the first two wishes, because I messed up.
And when Arthur Samuel wrote his checkers-playing program, which learned to play checkers considerably better than Arthur Samuel himself could play, and actually reached a pretty decent standard, Norbert Wiener, who was one of the major mathematicians of the 20th century and sort of a father of modern automation and control systems, saw this and basically extrapolated, as Turing did, and said, okay, this is how we could lose control: specifically, that we have to be certain that the purpose we put into the machine is the purpose which we really desire. And the problem is, we can't do that.
You mean it's very difficult to encode, to put our values on paper is really difficult, or are you saying it's impossible? Where do you draw the line?
So, theoretically it's possible, but in practice it's extremely unlikely that we could specify correctly, in advance, the full range of concerns of humanity.
You talked about cultural transmission of values; I think that's how human-to-human transmission of values happens, right?
Right, what we learn, yeah. I mean, as we grow up, we learn about the values that matter, how things should go, what is reasonable to pursue and what isn't reasonable to pursue.
Machines could learn in the same kind of
way?
Yeah, so I think what we need to do is get away from this idea that you build an optimizing machine and then put the objective into it. Because if it's possible that you might put in a wrong objective, and we already know this is possible because it's happened lots of times, right, that means that the machine should never take an objective that's given as gospel truth. Because once it takes the objective as gospel truth, then it believes that whatever actions it's taking in pursuit of that objective are the correct things to do. So you could be jumping up and down and saying, no, no, no, you're going to destroy the world, but the machine knows what the true objective is, and it's pursuing it, and tough luck to you, you
know. And this is not restricted to AI, right? This is, I think, true of many of the 20th-century technologies. In statistics, you minimize a loss function; the loss function is exogenously specified. In control theory, you minimize a cost function. In operations research, you maximize a reward function. And so on. In all these disciplines, this is how we conceive of the problem, and it's the wrong problem, because we cannot specify with certainty the correct objective. We need uncertainty; we need the machine to be uncertain about what the objective is that it's pursuing.
It's my favorite idea of yours. I've heard you say somewhere, well, I shouldn't pick favorites, but it just sounds beautiful: we need to teach machines humility.
Yeah, I mean, it's a beautiful way to put it. I love it. That they're humble, that they know that they don't know what it is they're supposed to be doing. And those objectives, I mean, they exist, they are within us, but we may not be able to explicate them; we may not even know how we want our future to go.
Exactly. And a machine that's uncertain is going to be deferential to us. So if we say, don't do that, well, now the machine's learned something a bit more about our true objectives, because something that it thought was reasonable in pursuit of our objectives turns out not to be. So now it's learned something, and it's going to defer, because it wants to be doing what we really want.
And, you know, that point, I think, is absolutely central to solving the control problem. And it's a different kind of AI. When you take away this idea that the objective is known, then in fact a lot of the theoretical frameworks that we're so familiar with, Markov decision processes, goal-based planning, standard game research, all of these techniques actually become inapplicable, and you get a more complicated problem, because now the interaction with the human becomes part of the problem. Because the human, by making choices, is giving you more information about the true objective, and that information helps you achieve the objective better. And so that really means that you're mostly dealing with game-theoretic problems, where you've got the machine and the human and they're coupled together, rather than a machine going off by itself with a fixed objective.
Which is fascinating, on the machine and the human level: that when you don't have an objective, it means you're together coming up with an objective. I mean, there's a lot of philosophy where you could argue that life doesn't really have meaning; we together agree on what gives it meaning, and we kind of culturally create the reason why the heck we are on this earth anyway. We together, as a society, create that meaning, and you have to learn that objective. And one of the biggest...
I thought that's where you were going to go for a second.
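The deferral argument above, that a machine which is genuinely uncertain about the objective prefers to let the human correct or switch it off, can be sketched as a toy calculation. This is only a sketch in the spirit of the "off-switch game" studied by Russell and colleagues; the Gaussian belief and all numbers are illustrative assumptions, not anything from the conversation:

```python
import random

# The robot is uncertain about the human's utility U for its proposed action:
# it does not know whether the action is good (U > 0) or bad (U < 0).
random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]  # belief over U

# Option 1: act immediately, payoff U, so the expected value is E[U].
act_now = sum(samples) / len(samples)

# Option 2: defer to the human, who permits the action iff U > 0 and
# otherwise switches the robot off (payoff 0), so the value is E[max(U, 0)].
defer = sum(max(u, 0.0) for u in samples) / len(samples)

print(f"act now: {act_now:+.3f}  defer: {defer:+.3f}")
```

Deferring is worth E[max(U, 0)] >= E[U], with a strict gain whenever there is real uncertainty; if the robot instead treats its objective as certain, the gap vanishes and so does its incentive to leave the off switch alone.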
One of the biggest troubles we've run into, outside of statistics and machine learning and AI, in just human civilization, is, when you look at, I was born in the Soviet Union, and in the history of the 20th century we ran into the most trouble, us humans, when there was certainty about the objective, and you do whatever it takes to achieve that objective, whether you're talking about Germany or communist Russia.
Yeah, and I'd say you get the same trouble with, you know, corporations. In fact, some people argue that we don't have to look forward to a time when AI systems take over the world: they already have, and they're called corporations. Corporations happen to be using people as components right now, but they are effectively algorithmic machines, and they're optimizing an objective, which is quarterly profit, that isn't aligned with the overall well-being of the human race, and they are destroying the world. They are primarily responsible for our inability to tackle climate change, right? So I think that's one way of thinking about what's going on with corporations. But I think the point you're making is valid: that there are many systems in the real world where we've sort of prematurely fixed on the objective and then decoupled the machine from those it's supposed to be serving. And I
think you see this with government, right? Government is supposed to be a machine that serves people, but instead it tends to be taken over by people who have their own objective and use government to optimize that objective, regardless of what people want.
Do you find appealing the idea of almost arguing machines, where you have multiple AI systems with a clear, fixed objective? We have in government the red team and the blue team that are very fixed on their objectives, and they argue, and maybe they disagree, but it kind of seems to make it work, somewhat, the duality of it. Okay, let's go a hundred years back, or to the founding of this country: there was disagreement, and that disagreement is where, so there's a balance between certainty and forced humility, because the power was
distributed.
Yeah, I think that the nature of debate and disagreement and argument takes as a premise the idea that you could be wrong, which means that you're not necessarily absolutely convinced that your objective is the correct one. If you were absolutely certain, there would be no point in having any discussion or argument, because you would never change your mind, and there wouldn't be any sort of synthesis or anything like that. So I think you can think of argumentation as an implementation of a form of uncertain reasoning.
And, you know, I've been reading recently about utilitarianism and the history of efforts to define, in a sort of clear mathematical way, a formula for moral or political decision-making. And it's really interesting, the parallels between the philosophical discussions going back 200 years and what you see now in discussions about existential risk, because it's almost exactly the same. So someone would say, okay, well, here's a formula for how we should make decisions. So, utilitarianism: each person has a utility function, and then we make decisions to maximize the sum of everybody's utility. And then people point out, well, in that case, the best policy is one that leads to an enormously vast population, all of whom are living a life that's barely worth living. And this is called the repugnant conclusion. Another version is that we should maximize pleasure, and that's what we mean by utility, and then you get people effectively saying, well, in that case, we might as well just have everyone hooked up to a heroin drip.
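The repugnant-conclusion objection can be made concrete with toy numbers (all figures here are my own, purely for illustration):

```python
def total_utility(population: int, per_person: float) -> float:
    """Sum-of-utilities social welfare: everyone's utility, added up."""
    return population * per_person

# A small, very happy world versus an enormous world of lives
# that are barely worth living.
small_happy = total_utility(10_000_000, 90.0)        # 10M people at utility 90
huge_barely = total_utility(100_000_000_000, 0.01)   # 100B people at utility 0.01

# The sum-of-utilities formula ranks the barely-worth-living world higher.
print(huge_barely > small_happy)  # True
```

Any formula that only adds utilities can be beaten by piling on more and more marginal lives, which is exactly the kind of misspecified-objective failure the conversation is about.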
And they didn't use those words, but that debate, which was happening in the 19th century, is the same debate happening now about AI: that if we get the formula wrong, we're going to have AI systems working towards an outcome that, in retrospect, would be exactly wrong.
That's beautifully put. So the echoes are there. But do you think, I mean, if you look at Sam Harris, there's worry about the AI version of that because of the speed at which things going wrong in the utilitarian context could happen. Is that a worry for you?
Yeah. I think, in most cases, not in all, if we have a wrong political idea, we see it starting to go wrong, and we're not completely stupid, so we say, okay, maybe that was a mistake, let's try something different. And also, we're very slow and inefficient about implementing these things, and so on. So you have to worry when you have corporations or political systems that are extremely efficient.
But when we look at AI systems, or even just computers in general, they have this different characteristic from ordinary human activity in the past. Let's say you were a surgeon, and you had some idea about how to do some operation, and let's say you were wrong, that that way of doing the operation would mostly kill the patient. Well, you'd find out pretty quickly, like after three or four tries, right? But that isn't true for pharmaceutical companies, because they don't do three or four operations; they manufacture three or four billion pills and they sell them, and then they find out, maybe six months or a year later, that, oh, people are dying of heart attacks or getting cancer from this drug. And so that's why we have the FDA, because of the scalability of pharmaceutical production, and there have been some unbelievably bad episodes in the history of pharmaceuticals, and adulteration of products, and so on, that have killed tens of thousands or paralyzed hundreds of thousands of people. Now, with computers, we have that same scalability problem: you can sit there and type "for i equals 1 to 5 billion do", and all of a sudden you're having an impact on a global scale. And yet we have no FDA, right?
There are absolutely no controls at all over what a bunch of undergraduates with too much caffeine can do to the world. And, you know, we look at what happened with Facebook, or social media in general, and click-through optimization. So you have a simple feedback algorithm that's trying to just optimize click-through. That sounds reasonable, right? Because you don't want to be feeding people ads that they don't care about or aren't interested in. And you might even think of that process as simply adjusting the feeding of ads, or news articles, or whatever it might be, to match people's preferences, which sounds like a good idea. But in fact, that isn't how the algorithm works. The algorithm makes more money if it can better predict what people are going to click on, because then it can feed them exactly that. So the way to maximize click-through is actually to modify the people, to make them more predictable, and one way to do that is to feed them information which will change their behavior and preferences towards extremes that make them predictable. Now, whatever is the nearest extreme, or the nearest predictable point, that's where you're going to end up.
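That dynamic can be simulated in a few lines. This is a minimal sketch of the feedback loop being described; the update rule, the step size, and the -1..+1 preference scale are all my own assumptions, not how any real recommender is built:

```python
def nudge(preference: float, item: float, step: float = 0.1) -> float:
    """Consuming an item pulls the user's preference a little toward it."""
    return preference + step * (item - preference)

user = 0.1  # mildly positive preference on a -1..+1 spectrum
for _ in range(50):
    # A pure click-through maximizer shows the nearest extreme, since a user
    # parked at an extreme is maximally predictable (reliably clicks).
    shown = 1.0 if user >= 0 else -1.0
    user = nudge(user, shown)

print(f"final preference: {user:.3f}")  # driven close to +1.0
```

Starting from a mild preference of 0.1, repeated exposure to the nearest extreme drags the user almost all the way to +1.0: the objective "maximize clicks" is satisfied by changing the person, not by serving them.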
The machines will force you there. And I think there's a reasonable argument to say that this, among other things, is contributing to the destruction of democracy in the world. And where was the oversight of this process? Where were the people saying, okay, you would like to apply this algorithm to five billion people on the face of the earth, can you show me that it's safe? Can you show me that it won't have various kinds of negative effects? No, there was no one asking that question. There was no one placed between, you know, the undergrads with too much caffeine and the human race. They just did it.
But, in some way outside the scope of my knowledge, economists would argue that, what is it, the invisible hand, so the capitalist system, was the oversight: if you're going to corrupt society with whatever decision you make as a company, then that's going to be reflected in people not using your product. That's one model of oversight.
We shall see. But, you know, in the meantime, you might even have broken the political system that enables capitalism to function.
Well, you've changed it.
And so we shall see. Yeah, change is often painful. So my question is...
Absolutely, it's fascinating.
You're absolutely right that there is zero oversight on algorithms that can have a profound, civilization-changing effect. So do you think it's possible, I mean, have you seen government, do you think it's possible to create regulatory bodies, oversight over AI algorithms, which are inherently such a cutting-edge set of ideas and technologies?
Yeah, but I think it takes time to figure out what kind of oversight, what kinds of controls. I mean, it took time to design the FDA regime, you know, and some people still don't like it and they want to fix it, and I think there are clear ways that it could be improved. But the whole notion that you have stage 1, stage 2, stage 3, and here are the criteria for what you have to do to pass a stage 1 trial, right, we haven't even thought about what those would be
for algorithms. So, I mean, I think there are things we could do right now with regard to bias, for example. We have a pretty good technical handle on how to detect algorithms that are propagating bias that exists in data sets, how to de-bias those algorithms, and even what it's going to cost you to do that. So I think we could start having some standards on that. I think there are things to do with impersonation and falsification that we could work on.
So, you know, a very simple point: impersonation is a machine acting as if it were a person. I can't see a real justification for why we shouldn't insist that machines self-identify as machines. Where is the social benefit in fooling people into thinking that this is really a person when it isn't? I don't mind if it uses a human-like voice that's easy to understand, that's fine, but it should just say, I'm a machine, in some form.
To the people it's speaking to. That, I would think, is relatively obvious.
I think mostly, yeah. I mean, there is actually a law in California that bans impersonation, but only in certain restricted circumstances: for the purpose of engaging in a fraudulent transaction, and for the purpose of modifying someone's voting behavior. So those are the circumstances where machines have to self-identify, but arguably it should be in all circumstances. And then when you talk about deepfakes, we're just beginning, but already it's possible to make a movie of anybody saying anything in ways that are pretty hard to detect, including yourself, because you're on camera now and your voice is coming through with high resolution. So you could take what I'm saying and replace it with pretty much anything else you wanted me to be saying, and even it will change my lips and expressions to fit. And there's actually not much in the way of real legal protection against that. I think in the commercial area you could say, yeah, you're using my brand, and so on; there are rules about that. But in the political sphere, I think, at the moment, anything goes. So that could be
really, really damaging. And let me just try to make, not an argument, but try to look back at history and say something dark. In essence, while regulation, oversight, seems to be exactly the right thing to do here, it seems that what human beings naturally do is wait for something to go wrong. If you're talking about nuclear weapons, you can't talk about nuclear weapons being dangerous until somebody actually, like the United States, drops the bomb, or Chernobyl melts down. Do you think we will have to wait for things to go wrong in a way that's obviously damaging to society, not an existential risk, but obviously damaging? Or do you have faith that...
I hope not, but I mean, I think we do have to look at history, and, you know, the two examples you gave, nuclear weapons and nuclear power, are very, very