Transcript

dEv99vxKjVI • Elon Musk: Tesla Autopilot | Lex Fridman Podcast #18
Kind: captions
Language: en
the following is a conversation with
Elon Musk he's the CEO of Tesla SpaceX
Neuralink and a co-founder of several
other companies this conversation is
part of the artificial intelligence
podcast the series includes leading
researchers in academia and industry
including CEOs and CTOs of automotive
robotics AI and technology companies
this conversation happened after the
release of the paper from our group at
MIT on driver functional vigilance
during use of Tesla's autopilot the
tesla team reached out to me offering a
podcast conversation with Mr. Musk I
accepted with full control of questions
that I could ask and the choice of what is
released publicly I ended up editing out
nothing of substance I've never spoken
with Elon before this conversation
publicly or privately neither he nor his
companies have any influence on my
opinion nor on the rigor and integrity
of the scientific method that I practice
in my position at MIT Tesla has never
financially supported my research and
I've never owned a Tesla vehicle I've
never owned Tesla stock this podcast is
not a scientific paper it is a
conversation I respect Elon as I do all
other leaders and engineers I've spoken
with we agree on some things and
disagree on others my goal always
with these conversations is to
understand the way the guest sees the
world one particular point of
disagreement in this conversation was the
extent to which camera-based
driver monitoring will improve outcomes
and for how long it will remain relevant
for AI-assisted driving as someone who
works on and is fascinated by human
centered artificial intelligence I
believe that if implemented and
integrated effectively camera-based
driver monitoring is likely to be of
benefit in both the short term and the
long term
in contrast Elon and Tesla's focus is on
the improvement of autopilot such that
its statistical safety benefits
override any concern of human behavior
and psychology Elon and I may not agree
on everything
but I deeply respect the engineering and
innovation behind the efforts that he
leads my goal here is to catalyze a
rigorous nuanced and objective
discussion in industry and academia
on AI-assisted driving one that
ultimately makes for a safer and better
world and now here's my conversation
with Elon Musk what was the vision the
dream of autopilot when in the beginning
the big-picture system level when it was
first conceived and started being
installed in 2014 the hardware in the
cars what was the vision the dream I would
characterize the vision or dream
simply that there are obviously two
massive revolutions in the
automobile industry one is the
transition to electrification and then
the other is autonomy
and it became obvious to me that in
the future any car that does not
have autonomy would be about as useful
as a horse which is not to say that
there's no use it's just rare and
somewhat idiosyncratic if somebody has a
horse at this point it's just obvious
that cars will drive themselves
completely it's just a question of time
and if we did not participate in the
autonomy revolution then our cars would
not be useful to people relative to cars
that are autonomous I mean an autonomous
car is arguably worth five to ten times
more than a car which is not
autonomous in the long term depends what
you mean by long term but I would say at
least for the next five years perhaps
ten years so there are a lot of very
interesting design choices with
autopilot early on first is showing on
the instrument cluster or in the Model
3 on the center stack display what
the combined sensor suite sees what was
the thinking behind that choice was
there debate what was the process the
whole point of the display is to provide
a health check on the vehicle's
perception of reality so the vehicle's
taking in information from a bunch of
sensors primarily cameras but also
radar and ultrasonics GPS and so forth
and then that information is then
rendered into vector space and that you
know with a bunch of objects with
properties like lane lines
and traffic lights and other cars and
then in vector space that is re-rendered
onto your display so you can confirm
whether the car knows what's going on or
not by looking out the window right I
think that's an extremely powerful thing
for people to get an understanding sort
of become one with the system and
understanding what the system is capable
of now have you considered showing more
so if we look at the computer vision you
know like road segmentation lane
detection vehicle detection object
detection underlying the system there is
at the edges some uncertainty have you
considered revealing the parts where there is
uncertainty in the system the sort of
probabilities associated with say image
recognition
yeah right now it shows like the
vehicles in the vicinity a very clean
crisp image and people do confirm
there's a car in front of me and the
system sees there's a car in front of me
but to help people build an intuition of
what computer vision is by showing some
of the uncertainty well I think in my
car I always look at sort of
the debug view and there's two debug
views one is augmented vision
which I'm sure you've seen where it's
basically we draw boxes and labels
around objects that are recognized
and then there's what we call the
visualizer which is basically a vector
space representation
summing up the input from all sensors
that does not show any pictures
but it basically
shows the car's view of the world in
vector space but I think this is very
difficult for normal
people to understand they would not know
what thing they're looking at so it's
almost an HMI challenge the
current things that are being displayed
are optimized for the general public
understanding of what the system is
capable of it's like if you have no idea
how computer vision works or anything
you can still look at the screen and see
if the car knows what's going on and
then if you're you know if you're a
development engineer or if you're you
know if you're if you have the
development build like I do then you can
see you know all the debug information
but those would just be like total
gibberish to most people what's your
view on how to best distribute effort so
there's three I would say technical
aspects of autopilot that are really
important there's the underlying
algorithms like the neural network
architecture there's the data that
it's trained on and then there's the hardware
development there may be others but so
look algorithm data hardware
you only have so much money only have so
much time what do you think is the most
important thing to to allocate resources
to do you see it as pretty evenly
distributed between those three we
automatically get a vast amount of data
because all of our cars have
eight external facing cameras and radar
and usually twelve ultrasonic sensors
GPS obviously and IMU and so we
basically have a fleet that has we're
about four hundred thousand cars on the
road that have that level of data I
think you keep quite close track of it
actually yes yeah so we're we're
approaching half a million cars on the
road that have the full sensor suite
yeah this so this is I'm I'm not sure
how many other cars on the road have the
sensor suite but I'll be surprised if
it's more than five thousand which means
that we were we have 99% of all the data
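
A quick check of the 99% figure can be sketched as below. It assumes that data volume scales with the number of sensor-equipped cars, which is an assumption and not something stated in the conversation. The fleet sizes are the rough figures quoted above: about four hundred thousand Tesla cars and an upper guess of five thousand others.

```python
# Rough figures quoted in the conversation, not audited fleet counts.
tesla_fleet = 400_000   # "about four hundred thousand cars"
other_fleet = 5_000     # Musk's upper guess for everyone else

# Assumption: each car contributes the same amount of data.
share = tesla_fleet / (tesla_fleet + other_fleet)
print(f"Tesla share of sensor-suite cars: {share:.1%}")
```

Under that assumption the share comes out just under 99%, consistent with the rounded figure given in the conversation.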
so there's this huge inflow of data
absolutely massive inflow of data and
then we it's it's taken us about three
years but now we've finally developed a
full self-driving computer which can
process an order of magnitude as much
as the Nvidia system that we currently
have in the cars and it's really
just to use it you unplug the
Nvidia computer and plug the Tesla
computer in and that's it and
in fact we're still
exploring the boundaries of its
capabilities we're able to run the
cameras at full frame rate full
resolution not even crop the images
and it's still got headroom even on one
of the systems the full
self-driving computer is really two
computers two systems on a chip that are
fully redundant so you could put a bolt
through basically any part of that
system and it still works the redundancy
are they perfect copies of each other or
yeah
also it's purely for redundancy as
opposed to an arguing machine kind of
architecture where they're both making
this is this is purely for redundancy
think of it more like if you have
a twin-engine aircraft commercial
aircraft
this system will operate best if both
systems are operating but it's it's
capable of operating safely on one so
but as it is right now we can just run we
haven't even hit the edge of
performance so there's no need to
actually distribute the functionality
across both SoCs we can actually just
run a full duplicate on each
one you haven't really explored or hit
the limit of this not yet at the limit
so the magic of deep learning is
that it gets better with data you said
there's a huge inflow of data but yeah
the thing about driving the really
valuable data to learn from is the edge
cases so how do you I mean I've heard
you talk somewhere about autopilot
disengagements being an important
moment of time yes to use are there other
edge cases or perhaps can you speak to
those edge cases what aspects of that
might be valuable or if you have other
ideas how to discover more and more and
more edge cases in driving well there's a
lot of things that are learned there are
certainly edge cases where say
somebody's on autopilot and they
take over and then okay that's
a trigger that goes to our system and
says okay did they take over for
convenience or did they take over because
the autopilot wasn't working properly
there's also like let's say we're trying
to figure out what is the optimal spline
for traversing an intersection then
the ones where there are no
interventions are the right ones
so you then say okay when it looks like
this do the following
and then you get the optimal spline for
navigating a complex
intersection so this is kind
of the common case you're trying to
capture a huge amount of samples of a
particular intersection how when things
went right and then there's the edge
case where as you said not for
convenience but something
somebody took over somebody asserted
manual control from autopilot and really
the way to look at this is view
all input as error if the user had to
do input there's something all input
is error that's a powerful line to think
of it that way because it may very well
be error but if you want to exit the
highway or if it's a
navigation decision that autopilot
is not currently designed to do then the
driver takes over how do you know yes
that's gonna change with navigate on
autopilot which we've just released
and without stalk confirm so the
navigation-based lane change like
asserting control in order to
do a lane change or exit a freeway
or do a highway interchange the
vast majority of that will go away with the
release that just went out yeah
that I don't think people quite
understand how big of a step that is
yeah they don't so if you drive the car
then you do so you still have to keep
your hands on the steering wheel
currently when it does the automatic
lane change what are so there's these
these big leaps through the development
of autopilot through its history and
what stands out to you as the big leaps
I would say this one navigate on
autopilot without confirm without having
to confirm that's a huge leap it is a
huge leap but it also automatically
overtakes slow cars so it's both
navigation and seeking the fastest lane
so it'll you know overtake
slower cars and exit the freeway and
take highway interchanges
and then we have traffic
light recognition which is introduced
initially as a warning I mean
on the development version that I'm
driving the car fully fully stops and
goes at traffic lights so those are the
steps right you've just mentioned
some things that are an inkling of a step
towards full autonomy what would you say
are the biggest technological roadblocks
to full self-driving actually I don't
think we I think we're just the full
self-driving computer that we just
what we call the FSD computer
that's now in production so if you
order any Model S or X or any Model
3 that has the full self-driving
package you'll get the FSD computer
that's important to have
enough base computation then refining
the neural net and the control software
but all of that can just be provided as an
over-the-air update the thing that's really
profound and what I'll be emphasizing
at the sort of investor
day that we're having focused on autonomy
is that the cars currently being
produced with the hardware currently
being produced is capable of full
self-driving but capable is an
interesting word because like the
hardware is yeah and as we refine the
software the capabilities will increase
dramatically and then the reliability
will increase dramatically and then it
will receive regulatory approval so it's
actually buying a car today is an
investment in the future what you're
essentially buying a you're buying the I
think the most profound thing is that if
you buy a Tesla today I believe you are
buying an appreciating asset not a
depreciating asset so that's a really
important statement there because if
hardware is capable enough that's the
hard thing to upgrade yes exactly
so then the rest is a software problem
yes software has no marginal cost
really
but what's your intuition on the
software side how hard are the remaining
steps to get it to where you know the
the experience not just the safety but
the full experience is something that
people would enjoy I think people will enjoy
it very much on the
highways it's a total game changer
for quality of life for using you know
Tesla autopilot on the highways so
it's really just extending that
functionality to city streets adding in
the traffic light
recognition navigating complex
intersections and then being able to
navigate complicated parking lots so the
car can exit a parking space and come
and find you even if it's in a complete
maze of a parking lot and
then it can just drop you
off and find a parking spot by itself
yeah in terms of enjoyability and
something that people would
actually find a lot of use from the
parking lot is a really you know it's
a rich source of annoyance when you have to
do it manually so there's a lot of
benefit to be gained from automation
there so let me start injecting the
human into this discussion a little bit
so let's talk about full autonomy if you
look at the current level four vehicles
being tested on road like Waymo and so on
they're only technically autonomous
they're really level two systems
with just the different design
philosophy because there's always a
safety driver in almost all cases and
they're monitoring the system right do
you see Tesla's full self-driving as
still for a time to come
requiring supervision of the human
being so its capabilities are powerful
enough to drive but nevertheless
requires a human to still be supervising
just like a safety driver is in
other fully autonomous vehicles I think it
will require detecting hands on wheel
for at least six months or something
like that from here really it's a
question of like from a regulatory
standpoint how much safer than a
person does autopilot need to be for it
to be okay to not monitor the car
you know and this is a debate that
one can have and then you need
a large sample a large amount of
data so you can prove with high
confidence statistically speaking
that the car is dramatically safer
than a person and that adding in the
person monitoring does not materially
affect the safety so it might need
to be like two or three hundred percent
safer than a person and how do you prove
that incidents per mile incidents per
mile you know crashes and fatalities
yeah fatality would be a factor but
there are just not enough fatalities
to be statistically significant at scale
but there are enough crashes you know
there are far more crashes than there
are fatalities so you can assess
what is the probability of a crash
then there's another step which is
probability of injury and probability of
permanent injury the probability of death
and all of those need to be much better
than a person by at least perhaps two
hundred percent and you think there's a
the ability to have a healthy discourse
with the regulatory bodies on this topic
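
The incidents-per-mile argument made just before this question (crashes are frequent enough to compare statistically, fatalities are not) can be illustrated with a small sketch. It treats incident counts as Poisson and uses the standard approximate Wald interval on the log of the rate ratio. All counts and mileages below are hypothetical, chosen only to show the effect of sample size; they are not Tesla or national statistics.

```python
import math

def rate_ratio_ci(events_a, miles_a, events_b, miles_b, z=1.96):
    """Ratio of incident rates (a / b) with an approximate 95% interval.

    Counts are treated as Poisson; the interval is a Wald interval
    computed on the log scale, so it needs non-zero counts.
    """
    ratio = (events_a / miles_a) / (events_b / miles_b)
    se = math.sqrt(1 / events_a + 1 / events_b)
    return ratio, ratio * math.exp(-z * se), ratio * math.exp(z * se)

# Hypothetical numbers: system (a) vs. human baseline (b), same mileage.
crashes = rate_ratio_ci(300, 1e9, 1000, 1e9)  # many events
deaths = rate_ratio_ci(3, 1e9, 10, 1e9)       # few events, same ratio

for name, (ratio, low, high) in (("crashes", crashes), ("deaths", deaths)):
    verdict = "significant" if high < 1 else "not significant"
    print(f"{name}: ratio={ratio:.2f} interval=({low:.2f}, {high:.2f}) {verdict}")
```

With the same underlying ratio, the crash comparison yields an interval that stays below 1, while the fatality comparison yields an interval that spans 1, which is the sense in which there are "not enough fatalities to be statistically significant".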
I mean there's no question that the
regulators pay a disproportionate
amount of attention to that which
generates press this is just an
objective fact
and Tesla generates a lot of press so
you know in the United States there's I
think almost 40,000 automotive deaths
per year but if there are four in Tesla
they will probably receive a thousand
times more press than anyone else so the
psychology of that is actually
fascinating I don't think we'll have
enough time to talk about that but I
have to talk to you about the human side
of things so myself and our team at MIT
recently released a paper on
functional vigilance of drivers while
using autopilot this is work we've been
doing since autopilot was first released
publicly over three years ago collecting
video of driver faces and driver body so I
saw that you tweeted a quote from the
abstract so I can at least guess that
you've glanced at it yeah all right can
I talk you through what we found sure
okay so it appears that in the data that
we've collected that drivers are
maintaining functional vigilance such
that we're looking at 18,000
disengagements from autopilot
18,900 and annotating were they able to
take over control in a timely manner so
they were there present looking at the
road to take over control okay
so this goes against what what many
would predict from the body of
literature on vigilance with automation
now the question is do you think these
results hold across the broader
population so ours is just a small
subset do you think one of the criticisms
is
that you know there's a small minority
of drivers that may be highly
responsible where their vigilance
decrement would increase with autopilot
use I think this is all really gonna be
swept I mean the system's
improving so much so fast that this is
gonna be a moot point very soon where
vigilance is like if something's many
times safer than a person then adding a
person the effect on safety is
limited and in fact it could be
negative that's really interesting so
the the fact that a human may some
percent of the population may exhibit a
vigilance decrement will not affect the
overall status of safety no in fact I
think it will become very very quickly
maybe even towards the end of this year but I'd say
I'll be shocked if it's not next year at
the latest that having a
human intervene will decrease safety
decrease it's like imagine if you're in an
elevator
it used to be that there were elevator
operators and you couldn't go in an
elevator by yourself and work the
lever to move between floors and now
nobody wants an elevator operator
because the automated elevator that
stops at the floors is much safer than the
elevator operator and in fact it would
be quite dangerous to have someone with
the lever that can move the elevator
between floors so that's a that's a
really powerful statement and really
interesting one but I also have to ask
from a user experience and from a safety
perspective one of the passions for me
algorithmically is camera-based
detection of just sensing the human
but detecting what the driver is looking
at cognitive load body pose on the
computer vision side that's a
fascinating problem but do you think
there's many in industry who believe you
have to have camera-based driver
monitoring do you think there could be
benefit gained from driver monitoring if
you have a system that's
at or below human level
reliability then driver monitoring
makes sense but if your system is
dramatically better more reliable than
a human then driver monitoring
does not help much and like I said
you wouldn't want
someone in the elevator if you're in an
elevator do you really want someone with
a big lever some random person operating
the elevator between floors
I wouldn't trust that I'd rather
have the buttons okay you're optimistic
about the pace of improvement of the
system from what you've seen with a full
self-driving car computer the rate of
improvement is exponential so one of the
other very interesting design choices
early on that connects to this is the
operational design domain of autopilot
so where autopilot is able to be turned
on so in contrast another vehicle
system that we are studying is the
Cadillac Super Cruise system that's in
terms of ODD very constrained to
particular kinds of highways
well mapped tested but it's much narrower
than the ODD of Tesla vehicles
it's like ADD yeah that's good
that's a good line what was the
design decision in that different
philosophy of thinking where there's
pros and cons what we see with a wide ODD
is Tesla drivers are able to
explore more the limitations of the
system at least early on and they
understand together with the instrument
cluster display they start to understand
what are the capabilities so that's a
benefit the con is you're letting
drivers use it basically anywhere
anywhere it can detect lanes with confidence
was there a philosophy design decisions
that were challenging that were being
made there or from the very beginning
was that done on purpose with intent
well I mean I think it's frankly it's
pretty crazy letting people
drive a two-ton death machine manually
that's crazy like in the future
people will be like I can't believe anyone
was just allowed to drive one of these
two-ton death machines and they just
drove wherever they wanted just like
elevators you could just like move the elevator
with the lever wherever you want it can
stop it halfway between floors if you
want
it's pretty crazy so it's gonna seem
like a mad thing in the future that
people were driving cars so I have a
bunch of questions about the human
psychology about behavior and so on
that would become moot totally
moot yeah because you have faith in the AI
system not faith but both on the
hardware side and the deep learning
approach of learning from data will
make it just far safer than humans yeah
exactly
recently there were a few hackers who
tricked autopilot to act in unexpected
ways with adversarial examples so we
all know that neural network systems are
very sensitive to minor disturbances to
these adversarial examples on input do
you think it's possible to defend
against something like this for
the industry sure yeah can you elaborate on
the confidence behind that answer
well you know a neural net is
just like basically a bunch of matrix
math you have to be like a very
sophisticated somebody who really
understands neural nets and like
basically reverse engineer how the
matrix is being built and then create a
little thing that's just exactly causes
the matrix math to be slightly off but
it's very easy to then block
that by having what would basically be
negative recognition it's like if
the system sees something that looks
like a matrix hack exclude it
it's such an easy thing to do so
learn both on the valid data and the
invalid data so basically learn on the
adversarial examples to be able to
exclude them yeah you
basically want to both know what is
a car and what is definitely not a
car and you train for this is a car
and this is definitely not a car those
are two different things
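
The idea of training on explicit negatives ("this is a car" and "this is definitely not a car") can be shown with a toy sketch. This is not Tesla's method or any real perception stack: it is a 1-nearest-neighbour classifier over made-up 2-D feature vectors, and the labels, points, and function name are all hypothetical. It only illustrates that adding known adversarial patterns to the training data as negatives lets the classifier exclude them.

```python
import math

def nearest_label(point, training_set):
    """Return the label of the training example closest to `point` (1-NN)."""
    return min(training_set, key=lambda ex: math.dist(point, ex[0]))[1]

# Hypothetical 2-D features; real perception features are far larger.
train = [((1.0, 1.0), "car"), ((1.2, 0.9), "car"), ((5.0, 5.0), "not_car")]

hack = (1.6, 1.6)                     # made-up adversarial pattern near the car cluster
before = nearest_label(hack, train)   # misread as a car

# Add a known adversarial pattern as an explicit negative example.
train.append(((1.5, 1.5), "not_car"))
after = nearest_label(hack, train)    # now excluded

real_car = nearest_label((1.1, 1.0), train)  # genuine cars are still recognized
print(before, after, real_car)
```

Whether this kind of defense holds up against new, unseen adversarial patterns is a separate question that the toy example does not address.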
people have no idea what neural nets really are
they probably think neural nets involve
like you know a fishing net or something
so as you know so taking a step beyond
just Tesla and autopilot current deep
learning approaches still seem in some
ways to be far from general intelligent
systems do you think the current
approaches will take us to general
intelligence or do totally new ideas
need to be invented I think we're
missing a few key ideas for general
intelligence general artificial general
intelligence
but it's going to be upon us very
quickly and then we'll need to figure
out what shall we do if we even have
that choice good but it's amazing how
people can't differentiate between say
the narrow AI that you know allows a car
to figure out what a lane line is and
and and you know and navigate streets
versus general intelligence like these
are just very different things like your
toaster and your computer are both
machines but one's much more
sophisticated than another you're
confident with Tesla you can create the
world's best toaster world's best
toaster yes but with the world's best
self-driving I'm I yes I do to me right
now this seems game set match I don't
know I mean I don't want to be
complacent or overconfident but that's what
it appears that is just literally what
it how it appears right now I could be
wrong but it appears to be the case that
Tesla is vastly ahead of everyone do you
think we'll ever create an AI system
that we can love and loves us back in a
deep meaningful way like in the movie
Her I think AI will be capable of
convincing you to fall in love with it
very well and that's different than us
humans you know we start getting into a
metaphysical question of like do
emotions and thoughts exist in a
different realm than the physical and
maybe they do maybe they don't
I don't know but but from a physics
standpoint I tend to
think of things you know like physics
was my main sort of training and and
from a physics standpoint essentially if
it loves you in a way that is that you
can't tell whether it's real or not it
is real it's a physics view of love yeah
if there's no if you if you cannot just
if you can't prove that it does not if
there's no test that you can apply that
would make it
may allow you to tell the difference
then there is no difference right and
it's similar to seeing our world as a
simulation there may not be a test to
tell the difference between the
real world and a simulation and therefore
from a physics perspective it might as
well be the same thing yes and there may
be ways to test whether it's a
simulation there might be I'm not saying
there aren't but you could certainly
imagine that a simulation could could
correct
that once an entity in the simulation
found a way to detect the simulation it
could either restart the you know pause
that simulation start a new simulation
or do one of many other things that then
corrects for that error so when maybe
you or somebody else creates an AGI
system and you get to ask her one
question what would that question be
what's outside the simulation Elon thank
you so much for talking today it was a
pleasure all right thank you