Transcript
dEv99vxKjVI • Elon Musk: Tesla Autopilot | Lex Fridman Podcast #18
The following is a conversation with Elon Musk. He's the CEO of Tesla, SpaceX, Neuralink, and a co-founder of several other companies. This conversation is part of the Artificial Intelligence Podcast. The series includes leading researchers in academia and industry, including CEOs and CTOs of automotive, robotics, AI, and technology companies. This conversation happened after the release of the paper from our group at MIT on driver functional vigilance during use of Tesla's Autopilot. The Tesla team reached out to me, offering a podcast conversation with Mr. Musk. I accepted, with full control of the questions I could ask and the choice of what is released publicly. I ended up editing out nothing of substance. I've never spoken with Elon before this conversation, publicly or privately. Neither he nor his companies have any influence on my opinion, nor on the rigor and integrity of the scientific method that I practice in my position at MIT. Tesla has never financially supported my research, and I've never owned a Tesla vehicle. I've never owned Tesla stock. This podcast is not a scientific paper. It is a conversation. I respect Elon as I do all other leaders and engineers I've spoken with. We agree on some things and disagree on others. My goal with these conversations is always to understand the way the guest sees the world. One particular point of disagreement in this conversation was the extent to which camera-based driver monitoring will improve outcomes, and for how long it will remain relevant for AI-assisted driving. As someone who works on and is fascinated by human-centered artificial intelligence, I believe that, if implemented and integrated effectively, camera-based driver monitoring is likely to be of benefit in both the short term and the long term. In contrast, Elon and Tesla's focus is on the improvement of Autopilot such that its statistical safety benefits override any concern of human behavior and psychology. Elon and I may not agree on everything,
but I deeply respect the engineering and innovation behind the efforts that he leads. My goal here is to catalyze a rigorous, nuanced, and objective discussion in industry and academia on AI-assisted driving, one that ultimately makes for a safer and better world. And now, here's my conversation with Elon Musk.

What was the vision, the dream, of Autopilot in the beginning, the big-picture system level, when it was first conceived and started being installed in 2014, the hardware in the cars? What was the vision, the dream?

I would characterize the vision or dream simply: there are obviously two massive revolutions in the automobile industry. One is the transition to electrification, and the other is autonomy. It became obvious to me that in the future, any car that does not have autonomy would be about as useful as a horse. Which is not to say that there's no use; it's just rare and somewhat idiosyncratic if somebody has a horse at this point. It's just obvious that cars will drive themselves completely. It's just a question of time. And if we did not participate in the autonomy revolution, then our cars would not be useful to people relative to cars that are autonomous. I mean, an autonomous car is arguably worth five to ten times more than a car which is not autonomous.

In the long term?

Depends what you mean by long term, but let's say at least for the next five years, perhaps ten years.

So there are a lot of very interesting design choices with Autopilot early on. First is showing on the instrument cluster, or in the Model 3 on the center stack display, what the combined sensor suite sees. What was the thinking behind that choice? Was there a debate? What was the process?

The whole point of the display is to provide a health check on the vehicle's perception of reality. So the vehicle is taking in information from a bunch of sensors, primarily cameras, but also radar and ultrasonics, GPS and so forth. And then that information is rendered into vector space, and that,
you know, with a bunch of objects with properties, like lane lines and traffic lights and other cars. And then, in vector space, that is re-rendered onto your display so you can confirm whether the car knows what's going on or not by looking out the window.

Right. I think that's an extremely powerful thing for people, to get an understanding, to sort of become one with the system and understand what the system is capable of. Now, have you considered showing more? If we look at the computer vision, you know, road segmentation, lane detection, vehicle detection, object detection underlying the system, there is at the edges some uncertainty. Have you considered revealing the parts, the uncertainty in the system, the sort of...

Probabilities associated with, say, image recognition?

Yeah. So right now it shows the vehicles in the vicinity as a very clean, crisp image, and people do confirm there's a car in front of me and the system sees there's a car in front of me. But to help people build an intuition of what computer vision is, by showing some of the uncertainty?

Well, in my car I always look at the sort of debug view. And there are two debug views. One is augmented vision, which I'm sure you've seen, where basically we draw boxes and labels around objects that are recognized. And then there's what we call the visualizer, which is basically a vector-space representation summing up the input from all sensors. That does not show any pictures, but it basically shows the car's view of the world in vector space. But I think this is very difficult for normal people to understand. They would not know what they're looking at.

So it's almost an HMI challenge. The current things that are being displayed are optimized for the general public's understanding of what the system is capable of.

It's like, if you have no idea how computer vision works or anything, you can still look at the screen and see if the car
knows what's going on. And then if you're a development engineer, or if you have the development build like I do, then you can see all the debug information. But that would just be total gibberish to most people.

What's your view on how to best distribute effort? There are three, I would say, technical aspects of Autopilot that are really important. There's the underlying algorithms, like the neural network architecture. There's the data that it's trained on. And then there's the hardware development. There may be others, but look: algorithm, data, hardware. You only have so much money, only have so much time. What do you think is the most important thing to allocate resources to? Or do you see it as pretty evenly distributed between those three?

We automatically get a vast amount of data because all of our cars have eight external-facing cameras and radar, and usually twelve ultrasonic sensors, GPS obviously, and IMU. And so we basically have a fleet, we've got about four hundred thousand cars on the road that have that level of data.

I think you keep quite close track of it, actually.

Yes. Yeah, so we're approaching half a million cars on the road that have the full sensor suite. I'm not sure how many other cars on the road have this sensor suite, but I'd be surprised if it's more than five thousand, which means that we have 99% of all the data.

So there's this huge inflow of data.

Absolutely, a massive inflow of data. And then it's taken us about three years, but now we've finally developed our full self-driving computer, which can process an order of magnitude as much as the NVIDIA system that we currently have in the cars. And to use it, you unplug the NVIDIA computer and plug the Tesla computer in, and that's it. In fact, we're still exploring the boundaries of its capabilities. We're able to run the cameras at
full frame rate, full resolution, not even cropping the images, and it's still got headroom, even on one of the systems. The full self-driving computer is really two computers, two systems on a chip, that are fully redundant. So you could put a bolt through basically any part of that system and it still works.

The redundancy, are they perfect copies of each other?

Yeah.

So it's purely for redundancy, as opposed to an arguing-machine kind of architecture where they're both making decisions? This is purely for redundancy?

Think of it more like a twin-engine commercial aircraft. The system will operate best if both systems are operating, but it's capable of operating safely on one. As it is right now, we haven't even hit the edge of performance, so there's no need to actually distribute the functionality across both SoCs. We can actually just run a full duplicate on each one.

So you haven't really explored or hit the limit of the system?

Not yet hit the limit, no.

So the magic of deep learning is that it gets better with data. You said there's a huge inflow of data, but the thing about driving is that the really valuable data to learn from is the edge cases. I've heard you talk somewhere about Autopilot disengagements being an important moment of time to use. Are there other edge cases, or perhaps can you speak to those edge cases, what aspects of them might be valuable, or if you have other ideas on how to discover more and more edge cases in driving?

Well, there are a lot of things that are learned. There are certainly edge cases where, say, somebody's on Autopilot and they take over. And then, okay, that's a trigger that goes to our system that says, okay, did they take over for convenience, or did they take over because Autopilot wasn't working properly? There's also, let's say we're trying to figure out what is the optimal spline for traversing an intersection. Then the ones
where there are no interventions are the right ones. So you then say, okay, when it looks like this, do the following. And then you get the optimal spline for navigating a complex intersection.

So that's the common case. You're trying to capture a huge amount of samples of a particular intersection when things went right. And then there's the edge case where, as you said, not for convenience, but something...

Somebody took over, somebody asserted manual control from Autopilot. And really, the way to look at this is to view all input as error. If the user had to do input, there's something... All input is error.

That's a powerful line, to think of it that way, because it may very well be error. But if you want to exit the highway, or if it's a navigation decision that Autopilot is not currently designed to do, then the driver takes over. How do you know?

Yes, that's going to change with Navigate on Autopilot, which we've just released, and without stalk confirm. So the navigation, like lane change based, like asserting control in order to do a lane change, or exit a freeway, or do a highway interchange, the vast majority of that will go away with the release that just went out.

Yeah, I don't think people quite understand how big of a step that is.

Yeah, they don't. If you drive the car, then you do.

So you still have to keep your hands on the steering wheel currently when it does the automatic lane change. So there are these big leaps through the development of Autopilot, through its history. What stands out to you as the big leaps? I would say this one, Navigate on Autopilot without having to confirm, is a huge leap.

It is a huge leap. It also automatically overtakes slow cars. So it's both navigation and seeking the fastest lane. So it'll overtake slower cars, and exit the freeway, and take highway interchanges. And then we have
traffic light recognition, which is introduced initially as a warning. I mean, on the development version that I'm driving, the car fully stops and goes at traffic lights.

So those are the steps, right? You've just mentioned some things that are an inkling of a step towards full autonomy. What would you say are the biggest technological roadblocks to full self-driving?

Actually, I don't think... I think we just, the full self-driving computer that we just, what we call the FSD computer, that's now in production. So if you order any Model S or X, or any Model 3 that has the full self-driving package, you'll get the FSD computer. That's important, to have enough base computation. Then refining the neural net and the control software. But all of that can just be provided as an over-the-air update. The thing that's really profound, and what I'll be emphasizing at the investor day that we're having focused on autonomy, is that the car currently being produced, with the hardware currently being produced, is capable of full self-driving.

But capable is an interesting word, because...

Like, the hardware is. And as we refine the software, the capabilities will increase dramatically, and then the reliability will increase dramatically, and then it will receive regulatory approval. So essentially, buying a car today is an investment in the future. I think the most profound thing is that if you buy a Tesla today, I believe you are buying an appreciating asset, not a depreciating asset.

So that's a really important statement there, because if the hardware is capable enough, that's the hard thing to upgrade usually.

Yes, exactly.

So then the rest is a software problem.

Yes. Software has no marginal cost, really.

But what's your intuition on the software side? How hard are the remaining steps to get it to where the experience, not just the safety, but the full experience is something that people would
enjoy?

I think people enjoy it very much on the highways. It's a total game changer for quality of life, using Tesla Autopilot on the highways. So it's really just extending that functionality to city streets, adding in traffic light recognition, navigating complex intersections, and then being able to navigate complicated parking lots so the car can exit a parking space and come and find you, even if it's in a complete maze of a parking lot. And then it can just drop you off and find a parking spot by itself.

Yeah, in terms of enjoyability and something that people would actually find a lot of use from, the parking lot is a really rich source of annoyance when you have to do it manually, so there's a lot of benefit to be gained from automation there. So let me start injecting the human into this discussion a little bit. Let's talk about full autonomy. If you look at the current Level 4 vehicles being tested on the road, like Waymo and so on, they're only technically autonomous. They're really Level 2 systems with just a different design philosophy, because there's always a safety driver in almost all cases, and they're monitoring the system. Do you see Tesla's full self-driving as still, for a time to come, requiring supervision of the human being? So its capabilities are powerful enough to drive, but it nevertheless requires a human to still be supervising, just like a safety driver is in other fully autonomous vehicles?

I think it will require detecting hands on wheel for at least six months or something like that from here. Really, it's a question of, from a regulatory standpoint, how much safer than a person does Autopilot need to be for it to be okay to not monitor the car? And this is a debate that one can have. And then you need a large amount of data so you can prove with high confidence, statistically speaking, that the
car is dramatically safer than a person, and that adding in the person monitoring does not materially affect the safety. So it might need to be like two or three hundred percent safer than a person.

And how do you prove that? Incidents per mile?

Incidents per mile, so crashes and fatalities.

Yeah, fatalities would be a factor, but there are just not enough fatalities to be statistically significant at scale. But there are enough crashes. There are far more crashes than there are fatalities. So you can assess what is the probability of a crash. Then there's another step, which is the probability of injury, and the probability of permanent injury, and the probability of death. And all of those need to be much better than a person, by at least perhaps two hundred percent.

And you think there's the ability to have a healthy discourse with the regulatory bodies on this topic?

I mean, there's no question that regulators pay a disproportionate amount of attention to that which generates press. This is just an objective fact, and Tesla generates a lot of press. So in the United States there are, I think, almost 40,000 automotive deaths per year. But if there are four in a Tesla, they'll probably receive a thousand times more press than anyone else.

So the psychology of that is actually fascinating. I don't think we'll have enough time to talk about that, but I have to talk to you about the human side of things. So myself and our team at MIT recently released a paper on functional vigilance of drivers while using Autopilot. This is work we've been doing since Autopilot was first released publicly over three years ago, collecting video of driver faces and driver body. So I saw that you tweeted a quote from the abstract, so I can at least guess that you've glanced at it.

Yeah.

Can I talk you through what we found?

Sure.

Okay. So it appears, in the data that we've collected, that drivers are maintaining functional vigilance, such that, we're looking at 18,000 disengagements
from Autopilot, 18,900, and annotating: were they able to take over control in a timely manner? So they were present, looking at the road, to take over control. Okay, so this goes against what many would predict from the body of literature on vigilance with automation. Now, the question is, do you think these results hold across the broader population? Ours is just a small subset. One of the criticisms is that there's a small minority of drivers that may be highly responsible, where their vigilance decrement would increase with Autopilot use.

I think this is all really going to be swept... I mean, the system's improving so much, so fast, that this is going to be a moot point very soon, where vigilance is... If something's many times safer than a person, then adding a person, the effect on safety is limited. And in fact, it could be negative.

That's really interesting. So the fact that a human, some percent of the population, may exhibit a vigilance decrement will not affect the overall statistics of safety?

No. In fact, I think it will become very, very quickly, maybe even towards the end of this year, but I'd say I'd be shocked if it's not next year at the latest, that having a human intervene will decrease safety. Decrease. It's like, imagine if you're in an elevator. It used to be that there were elevator operators, and you couldn't go in an elevator by yourself and work the lever to move between floors. And now nobody wants an elevator operator, because the automated elevator that stops at the floors is much safer than the elevator operator. And in fact, it would be quite dangerous to have someone with a lever that can move the elevator between floors.

So that's a really powerful statement, and a really interesting one. But I also have to ask, from a user experience and from a safety perspective: one of the passions for me, algorithmically, is camera-based detection of, just sensing the human, but detecting what the driver is looking at, cognitive
load, body pose. On the computer vision side, that's a fascinating problem. And there are many in industry who believe you have to have camera-based driver monitoring. Do you think there could be benefit gained from driver monitoring?

If you have a system that's at or below human-level reliability, then driver monitoring makes sense. But if your system is dramatically better, more reliable than a human, then driver monitoring does not help much. And like I said, you wouldn't want someone in the elevator. If you're in an elevator, do you really want someone with a big lever, some random person, operating the elevator between floors? I wouldn't trust that. I'd rather have the buttons.

Okay, you're optimistic about the pace of improvement of the system, from what you've seen with the full self-driving car computer.

The rate of improvement is exponential.

So one of the other very interesting design choices early on that connects to this is the operational design domain of Autopilot, so where Autopilot is able to be turned on. In contrast, another vehicle system that we're studying is the Cadillac Super Cruise system. That's, in terms of ODD, very constrained to particular kinds of highways, well mapped, tested, but it's much narrower than the ODD of Tesla vehicles.

It's like ADD.

Yeah, that's good, that's a good line. What was the design decision in that different philosophy of thinking, where there are pros and cons? What we see with a wide ODD is that Tesla drivers are able to explore more the limitations of the system, at least early on, and they understand, together with the instrument cluster display, they start to understand what the capabilities are. So that's a benefit. The con is you're letting drivers use it basically anywhere.

Anywhere it can detect lanes with confidence.

Were there philosophy, design decisions that were challenging that were being made there? Or from the very
beginning, was that done on purpose, with intent?

Well, I mean, I think it's frankly pretty crazy letting people drive a two-ton death machine manually. That's crazy. In the future, people will be like, I can't believe anyone was just allowed to drive one of these two-ton death machines, and they just drove wherever they wanted. Just like elevators: you could just move the elevator with the lever wherever you wanted, it could stop halfway between floors if you wanted. It's pretty crazy. So it's going to seem like a mad thing in the future that people were driving cars.

So I have a bunch of questions about the human psychology, about behavior and so on, that would become moot, because you have faith in the AI system. Not faith, but both on the hardware side and the deep learning approach of learning from data will make it just far safer than humans.

Yeah, exactly.

Recently there were a few hackers who tricked Autopilot to act in unexpected ways with adversarial examples. So we all know that neural network systems are very sensitive to minor disturbances, to these adversarial examples on input. Do you think it's possible to defend against something like this for the industry?

Sure. Yeah.

Can you elaborate on the confidence behind that answer?

Well, you know, a neural net is just basically a bunch of matrix math. You have to be a very sophisticated somebody who really understands neural nets and basically reverse-engineer how the matrix is being built, and then create a little thing that just exactly causes the matrix math to be slightly off. But it's very easy to then block that by having, basically, negative recognition. It's like, if the system sees something that looks like a matrix hack, exclude it. It's such an easy thing to do.

So learn both on the valid data and the invalid data. So basically learn on the adversarial examples to be able to exclude them.

Yeah, you basically
want to both know what is a car and what is definitely not a car. And you train for "this is a car" and "this is definitely not a car." Those are two different things. People have no idea about neural nets, really. They probably think neural nets involve, like, you know, a fishing net or something.

So, taking a step beyond just Tesla and Autopilot: current deep learning approaches still seem in some ways to be far from general intelligence systems. Do you think the current approaches will take us to general intelligence, or do totally new ideas need to be invented?

I think we're missing a few key ideas for general intelligence, artificial general intelligence. But it's going to be upon us very quickly, and then we'll need to figure out what shall we do, if we even have that choice. But it's amazing how people can't differentiate between, say, the narrow AI that allows a car to figure out what a lane line is and navigate streets, versus general intelligence. These are just very different things. Like, your toaster and your computer are both machines, but one's much more sophisticated than another.

You're confident with Tesla you can create the world's best toaster?

The world's best toaster, yes. The world's best self-driving. Yes. To me, right now, this seems game, set, match. I don't want to be complacent or overconfident, but that is just literally how it appears right now. I could be wrong, but it appears to be the case that Tesla is vastly ahead of everyone.

Do you think we will ever create an AI system that we can love, and that loves us back in a deep, meaningful way, like in the movie Her?

I think AI will be capable of convincing you to fall in love with it very well.

And that's different than us humans?

You know, we start getting into a metaphysical question of, do emotions and thoughts exist in a different realm than the physical? And maybe they do, maybe they don't. I don't know. But from a
physics standpoint, I tend to think of things... you know, physics was my main sort of training. And from a physics standpoint, essentially, if it loves you in a way that you can't tell whether it's real or not, it is real.

That's a physics view of love.

Yeah. If you cannot prove that it does not, if there's no test that you can apply that would allow you to tell the difference, then there is no difference.

Right. And it's similar to seeing our world as a simulation. There may not be a test to tell the difference between what the real world is and a simulation, and therefore, from a physics perspective, it might as well be the same thing.

Yes. And there may be ways to test whether it's a simulation. There might be, I'm not saying there aren't. But you could certainly imagine that a simulation could correct that: once an entity in the simulation found a way to detect the simulation, it could either pause the simulation, start a new simulation, or do one of many other things that then corrects for that error.

So when, maybe you or somebody else, creates an AGI system and you get to ask her one question, what would that question be?

What's outside the simulation?

Elon, thank you so much for talking today. It was a pleasure.

All right, thank you.
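
A note on the incidents-per-mile argument from the conversation: the point that fatalities are too rare to be statistically significant, while crashes are frequent enough, can be illustrated with a rough rate-ratio calculation. The sketch below is not from the conversation and is not Tesla's or MIT's method. It assumes event counts are Poisson distributed and uses the common log-scale approximation for the confidence interval of a ratio of two rates. Every count and mileage is hypothetical, and the function name rate_ratio_ci is made up for illustration.

```python
import math


def rate_ratio_ci(events_a, miles_a, events_b, miles_b, z=1.96):
    """Ratio of two event rates (a / b) with an approximate 95% interval.

    Assumes Poisson-distributed counts. The interval is built on the log
    scale with standard error sqrt(1/events_a + 1/events_b).
    """
    ratio = (events_a / miles_a) / (events_b / miles_b)
    std_err = math.sqrt(1.0 / events_a + 1.0 / events_b)
    low = ratio * math.exp(-z * std_err)
    high = ratio * math.exp(z * std_err)
    return ratio, low, high


# Hypothetical numbers: the same 3x advantage for group "a",
# observed over one billion miles in each group.
MILES = 1e9
fatal = rate_ratio_ci(3, MILES, 9, MILES)      # rare events
crash = rate_ratio_ci(300, MILES, 900, MILES)  # events 100x more common

for label, (ratio, low, high) in (("fatalities", fatal), ("crashes", crash)):
    # An interval that contains 1.0 cannot separate "safer" from "not safer".
    verdict = "inconclusive" if low < 1.0 < high else "significant"
    print(f"{label}: ratio={ratio:.2f}, CI=({low:.2f}, {high:.2f}) -> {verdict}")
```

With only a handful of events the interval straddles 1, so the data cannot show that one group is safer. With a hundred times as many events, the same ratio is clearly below 1. That is the sense in which crash counts can support a safety claim when fatality counts cannot.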